  1. #1
    SitePoint Member
    Join Date
    Jun 2004
    Location
    Grafton
    Posts
    2

    Regex to detect filename.pdf in string

    I have a string of text and want to be able to detect when a PDF filename exists in the string.

    For example: '... and you can get more info by downloading the pdf file moreinfo.pdf ...'

    I want to match the 'moreinfo.pdf' so I can create a hyperlink to the file. All PDFs are in the same directory, so there won't be full paths to the file, just the file name.

    I already have this code to match urls:
    $text = preg_replace("/([a-zA-Z]+:\/\/[a-z0-9\_\.\-]+".
    "[a-z]{2,6}[a-zA-Z0-9\/\*\-\?\&\%\=\,\.]+)/",
    " <a href=\"$1\" target=\"_blank\" title=\"$1\">link >>></a> ", $text);

    But I don't know how to modify it to do what I need. Can someone help me please?

  2. #2
    SitePoint Evangelist catweasel's Avatar
    Join Date
    Apr 2007
    Location
    Goldfields, VIC, Australia
    Posts
    518
    try this -
    PHP Code:
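    <?php
    $text = "...and you can get more info by downloading the pdf file moreinfo.pdf or another file_name.pdf or perhaps one/with/a/directory/path.pdf or one-with-dashes.pdf";
    // Match a run of letters, digits, underscores, dashes and slashes followed by ".pdf".
    // The dash sits last in the character class so it is read as a literal dash.
    $pattern = '#([a-zA-Z0-9_/-]+\.pdf)#';
    // $1 in the replacement is the captured file name.
    $text = preg_replace($pattern, '<a href="$1" target="_blank" title="$1">$1</a>', $text);
    echo $text;
    ?>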

  3. #3
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Or you could use: ~([\w/]\S*?\.[pP][dD][fF])~
    The match starts with a letter, digit, underscore or slash, and can contain any number of non-whitespace characters up to .pdf (in any letter case).
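
    A minimal sketch of how this pattern might be dropped into preg_replace. The helper name linkPdfs and the simplified link markup are assumptions made for illustration; they are not from the thread.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the thread).
function linkPdfs(string $text): string
{
    // [\w/]          : first character must be a word character or a slash
    // \S*?           : then as few non-whitespace characters as possible
    // \.[pP][dD][fF] : ending in ".pdf" in any letter case
    $pattern = '~([\w/]\S*?\.[pP][dD][fF])~';

    // $1 in the replacement refers to the captured file name.
    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

echo linkPdfs('Download moreinfo.pdf or one-with-dashes.PDF today.');
```

    Both file names should come out wrapped in links, including the upper-case .PDF one, while the trailing "today." is left alone because nothing after its dot spells "pdf".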
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  4. #4
    SitePoint Member
    Join Date
    Jun 2004
    Location
    Grafton
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks catweasel that works great.

    How is yours different, logic_earth (in behaviour)?

  5. #5
    SitePoint Evangelist simshaun's Avatar
    Join Date
    Apr 2008
    Location
    North Carolina
    Posts
    438
    Or more simply,

    Code:
    ~(\S+?\.pdf)~i

  6. #6
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Quote, originally posted by dr_snapid:
    Thanks catweasel that works great.

    How is yours different, logic_earth (in behaviour)?
    It allows any character that is not whitespace (spaces, newlines) between two boundary parts: the match must start with a letter, digit, _ or / and must end with .pdf (case-insensitive).

    The boundary is there for a reason. While the pattern simshaun posted is simpler, it has a flaw. Example:

    Code:
    blah blah blah file.pdf more blahs go here then call another file (another_file.pdf) some more text.
    Now if you use it without boundaries, the link for the last file will be invalid because it will contain the opening parenthesis "(".
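
    To make the difference concrete, here is a small sketch that runs both patterns over a shortened version of that sample text. The variable names $loose and $bounded are made up for this example.

```php
<?php
$text = 'blah file.pdf then call another file (another_file.pdf) some more text.';

// simshaun's pattern: any run of non-whitespace characters ending in .pdf
preg_match_all('~(\S+?\.pdf)~i', $text, $loose);

// logic_earth's pattern: the first character must be a word character or a slash
preg_match_all('~([\w/]\S*?\.[pP][dD][fF])~', $text, $bounded);

print_r($loose[1]);   // file.pdf and (another_file.pdf
print_r($bounded[1]); // file.pdf and another_file.pdf
```

    The second match of the looser pattern keeps the "(" in front of the file name, which is the broken link described above.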


  7. #7
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Since the OP said there won't be full paths, just the filenames, this should work best:

    ~\w+\.pdf~
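
    A quick sketch of what this shorter pattern picks up, using file names borrowed from the earlier reply. The sample sentence is made up for the example. Note that \w covers only letters, digits and underscores, so a name containing dashes is matched only from its last dash onward, and an upper-case .PDF is skipped unless the i flag is added.

```php
<?php
$text = 'See moreinfo.pdf, file_name.pdf and one-with-dashes.pdf for details.';

// \w+ matches letters, digits and underscores only, so a dash ends the name.
preg_match_all('~\w+\.pdf~', $text, $matches);

print_r($matches[0]); // moreinfo.pdf, file_name.pdf, dashes.pdf
```

    If dashed file names are expected, a character class such as [\w-]+ in place of \w+ is one possible adjustment.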

