  1. #1
    SitePoint Zealot thetzfreak's Avatar
    Join Date
    Aug 2004
    Location
    United States
    Posts
    154

    Some Regex Help Needed

    Hi,

    I have a URL resembling this:

    Code:
    http://www.domain.com/folder/?S=Gfile%20name.jpg
    Sometimes, I find online URLs that have this: ?J=E or ?W=B;R=O

    It would always be a question mark, followed by an uppercase letter, followed by an equals sign, followed by an uppercase letter, and in some cases a repetition with a ";" in the middle.

    I'd like to use regex on this URL to take out the "?S=A" parts. What is the regex to do this? I'm really bad at it, so I need some help. This is the only code I could come up with myself:

    Code:
    $newlink = preg_replace("/(?){0,1}(A-Z){0,1}(=){0,1}(A-Z){0,1}(;){0,1}/", "", $newlink);
    I have to make it so that it doesn't HAVE to find it there, because sometimes it isn't present. I'm sure my code is wrong and very inefficient.

    Thanks

  2. #2
    SitePoint Addict Trent Reimer's Avatar
    Join Date
    Sep 2005
    Location
    Canada
    Posts
    228
    Anything after the ? is a query string. So as long as you are removing the ? and everything after it your job is fairly simple. In fact you probably don't need a regex.

    These examples should each return the same thing:

    PHP Code:
    // regex
    $newlink = preg_replace('/\?.*$/', '', $newlink); // ? is a regex character so here we have to escape it with a backslash

    // substring - I would be more inclined toward this.
    // First see if we even have a query string to parse. If not we're done.
    if (($pos = strpos($newlink, '?')) !== false) $newlink = substr($newlink, 0, $pos);
    http://www.php.net/strpos
    http://www.php.net/substr
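
As a minimal sketch of the two approaches described in this post, the snippet below wraps each one in a function and runs both on the URL from the first post. The function names `stripQueryRegex` and `stripQuerySubstr` are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper names; both remove the "?" and everything after it.

function stripQueryRegex(string $url): string
{
    // \? is a literal question mark; .*$ consumes the rest of the string.
    return preg_replace('/\?.*$/', '', $url);
}

function stripQuerySubstr(string $url): string
{
    $pos = strpos($url, '?');
    // strpos() returns false when there is no "?", so compare strictly.
    return $pos === false ? $url : substr($url, 0, $pos);
}

$url = 'http://www.domain.com/folder/?S=Gfile%20name.jpg';
echo stripQueryRegex($url), "\n";  // http://www.domain.com/folder/
echo stripQuerySubstr($url), "\n"; // http://www.domain.com/folder/
```

Both leave a URL without a "?" unchanged. The strict `=== false` check matters because `strpos()` can legitimately return 0.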

  3. #3
    SitePoint Zealot thetzfreak's Avatar
    Thanks for the reply! But, I think you misunderstood me a bit. I gave a link:

    http://www.domain.com/folder/?S=Gfilename.jpg

    I JUST want to remove the ?S=G. I still need the file name and extension there. So removing it from the above url would give:

    http://www.domain.com/folder/filename.jpg

    Another example:

    http://www.domain.com/folder/?W=A;B=Ffilename.jpg

    to change to:

    http://www.domain.com/folder/filename.jpg

    Thanks for helping

  4. #4
    SitePoint Addict Trent Reimer's Avatar
    Ok, thanks for clarifying that for me. So it sounds like the minimum replacement logic would be to eliminate the "?" and everything up to and including the first capital after the final "=".

    PHP Code:
    $newlink = preg_replace('/\?.*=[A-Z]/', '', $newlink);
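
Because `.*` is greedy, the pattern above removes everything up to the last "=" that is followed by a capital letter, which could eat into a file name that itself contains "=" plus a capital. A stricter alternative, offered only as a sketch, spells out the exact shape described in the first post: "?", letter, "=", letter, optionally repeated with ";". The function name `stripLetterMarker` is hypothetical.

```php
<?php
// Hypothetical helper: remove a "?X=Y" marker, optionally repeated with ";".
function stripLetterMarker(string $url): string
{
    // \?            a literal question mark
    // [A-Z]=[A-Z]   one uppercase letter, "=", one uppercase letter
    // (?:;...)*     zero or more ";X=Y" repetitions (non-capturing group)
    return preg_replace('/\?[A-Z]=[A-Z](?:;[A-Z]=[A-Z])*/', '', $url);
}

// Each of these prints http://www.domain.com/folder/filename.jpg
echo stripLetterMarker('http://www.domain.com/folder/?S=Gfilename.jpg'), "\n";
echo stripLetterMarker('http://www.domain.com/folder/?W=A;B=Ffilename.jpg'), "\n";
echo stripLetterMarker('http://www.domain.com/folder/filename.jpg'), "\n";
```

If the marker is absent, nothing matches and the URL comes back unchanged, so no separate "is it there?" check is needed.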

  5. #5
    SitePoint Zealot thetzfreak's Avatar
    Thanks, that should do it. I'll try it. May I ask what the period is for? What does it mean?

  6. #6
    SitePoint Addict Trent Reimer's Avatar
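    Glad to hear it worked. The "." is a wildcard that matches any single character (except a newline, by default). The "*" after it means "zero or more times", so ".*" matches any run of characters.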

    Here's a Sitepoint article on regular expressions which might possibly be helpful:

    http://www.sitepoint.com/blogs/2006/...expressions-1/
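
To illustrate the "." wildcard explained in this post, here is a small sketch using `preg_match()`. The test strings are made up for the example.

```php
<?php
// "." matches exactly one character, whatever it is (newline excluded by default).
var_dump(preg_match('/a.c/', 'abc'));   // int(1)
var_dump(preg_match('/a.c/', 'a-c'));   // int(1)
var_dump(preg_match('/a.c/', 'ac'));    // int(0): "." needs one character
var_dump(preg_match('/a.c/', "a\nc"));  // int(0): newline is not matched by default
var_dump(preg_match('/a.c/s', "a\nc")); // int(1): the "s" modifier lets "." match newline

// ".*" is greedy: it grabs as much as it can while still allowing a match.
preg_match('/\?.*=[A-Z]/', '?W=A;B=Ffilename.jpg', $greedy);
echo $greedy[0], "\n"; // ?W=A;B=F

// ".*?" is lazy: it stops at the first place a match is possible.
preg_match('/\?.*?=[A-Z]/', '?W=A;B=Ffilename.jpg', $lazy);
echo $lazy[0], "\n"; // ?W=A
```

The greedy behaviour is why the pattern from the earlier post removes both "W=A" and "B=F" in one pass.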

