  1. #1
    SitePoint Guru davedibiase's Avatar
    Join Date
    Aug 2001
    Location
    Toronto, Canada
    Posts
    829
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    regex image finding

    Hi there,

    I have a regex that currently finds a list of images in a string, but it isn't designed to handle cases where the src value has no quotes around it. Take for example:

    <img src=http://www.sitepointforums.com/images/main.gif>

    Believe it or not some people are crazy enough to work that into their code. So this is what I currently have:

    preg_match_all("/src=[\"']([^\"']+)/", $fixedText, $sub, PREG_SET_ORDER);

    The trouble is, how do I account for quoteless tags? I've tried a few variations, but quite honestly I'm still learning regex. I feel like I'm crawling, actually, despite the material I'm reading.

    I'm thinking somehow I need to check for the quotes if they are there, but disregard them if necessary. The other thing is I'm afraid to change it as this line is really important to a piece of software.

    If I change it I'll need to be 100% positive it's right.

    *gets on knees and begs* haha.
    ||Dave Di Biase||
    ----------------------------------
    "There are 2 secrets in life. 1) Never say everything you know."
    GFXWARS - The ultimate graphics battle!

  2. #2
    I meant that to happen Raffles's Avatar
    Join Date
    Sep 2005
    Location
    Tanzania
    Posts
    4,662
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    PHP Code:
    preg_match_all("/src=[\"']?([^\"']+)[\"']?/", $fixedText, $sub, PREG_SET_ORDER);
    The question mark means "0 or 1 of the preceding character/range". I'm no genius at regex, but I tried it and it seems to work.
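    To make the "0 or 1" behaviour concrete, here is a minimal sketch that only checks whether each pattern finds a match at all; what ends up in the capture group is a separate question. The quoted sample strings and the variable names ($samples, $strict, $optional, $results) are made up for illustration; only the unquoted tag comes from the first post.

```php
<?php
// Illustrative inputs: the unquoted tag is from the thread, the quoted ones are assumed.
$samples = [
    'double'   => '<img src="http://www.sitepointforums.com/images/main.gif">',
    'single'   => "<img src='http://www.sitepointforums.com/images/main.gif'>",
    'unquoted' => '<img src=http://www.sitepointforums.com/images/main.gif>',
];

$strict   = '/src=["\']([^"\']+)/';        // an opening quote is required
$optional = '/src=["\']?([^"\']+)["\']?/'; // "?" makes each quote optional

$results = [];
foreach ($samples as $name => $html) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    $results[$name] = [
        'strict'   => preg_match($strict, $html),
        'optional' => preg_match($optional, $html),
    ];
}

print_r($results);
```

    The strict pattern should miss the unquoted tag, while the optional-quote version should match all three.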

  3. #3
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally, the "opening quote up to the next quote" limitation was what kept the regex from overmatching.

    I'm just eyeballing this so I might be off, but without quotes I suspect [^\"']+ will zip straight past spaces and the closing >, all the way to the next quote. I think if you toss a > and a space into the "not these characters" block, you might be good.
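    A quick sketch of that suspicion, using the unquoted tag from the first post. The variable names ($html, $loose, $tight) are only for illustration, and the second pattern is one possible reading of "toss a > and a space into the block", not a tested fix for the original software.

```php
<?php
$html = '<img src=http://www.sitepointforums.com/images/main.gif>';

// Only quotes are excluded, so the capture runs on through the closing ">".
preg_match('/src=["\']?([^"\']+)/', $html, $loose);

// Excluding ">" and a space as well makes an unquoted value stop there.
preg_match('/src=["\']?([^"\'> ]+)/', $html, $tight);

echo $loose[1], "\n";
echo $tight[1], "\n";
```

    With this input the first capture should keep the trailing >, and the second should stop just before it.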
    Using your unpaid time to add free content to SitePoint Pty Ltd's portfolio?

  4. #4
    SitePoint Guru davedibiase's Avatar
    Ok so I got this:

    PHP Code:
    preg_match_all("/src=[\"']?([^\"']+)[\"'> ]?/", $fixedText, $sub, PREG_SET_ORDER);
    Adding the space to the "not these characters" block would be dangerous, wouldn't it? What if a URL comes in with a space? E.g. <img src=http://www.sitepointforums.com/images/this image is cool.gif>

    Or what about this scenario:

    <img src=http://www.sitepointforums.com/images/this image is cool.gif alt=The cool image>

    It's bad HTML structure, I agree, and I really don't have to add it because not even 1% of users do it, but for the sake of allowing for greater support I was hoping to perfect the function. hehe.

    Any ideas?

  5. #5
    SitePoint Wizard samsm's Avatar
    I see what you mean now ... you want to support spaces when they are in quotes, that's reasonable.
    Supporting spaces without quotes would be ridiculous.

    This might actually be a job for reversed greed (a lazy, non-greedy quantifier).

    /src=['"]?([^\"']+?)['"]?[> ]/

    I don't know.
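    For anyone wanting to try the lazy version, here is a rough sketch that runs it against three inputs. The quoted sample tags and the names $pattern, $samples and $captured are assumptions for illustration. As far as I can tell, the lazy group stops at the first place where an optional quote followed by a space or > can match, which means a quoted URL containing spaces still gets cut at the first space.

```php
<?php
// The lazy pattern from the post above, written as a PHP single-quoted string.
$pattern = '/src=[\'"]?([^"\']+?)[\'"]?[> ]/';

$samples = [
    'unquoted'      => '<img src=http://www.sitepointforums.com/images/main.gif>',
    'quoted'        => '<img src="http://www.sitepointforums.com/images/main.gif" alt="x">',
    'quoted_spaces' => '<img src="http://www.sitepointforums.com/images/this image is cool.gif">',
];

$captured = [];
foreach ($samples as $name => $html) {
    // Store the first capture group, or null when nothing matched.
    $captured[$name] = preg_match($pattern, $html, $m) ? $m[1] : null;
}

print_r($captured);
```

    The first two inputs should give the full URL; the third should come back truncated at ".../images/this".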

  6. #6
    SitePoint Guru davedibiase's Avatar
    You mean when they aren't in quotes don't you?

  7. #7
    SitePoint Wizard samsm's Avatar
    Originally posted by davedibiase:
    You mean when they aren't in quotes don't you?
    Is there any browser that will actually render this?

    <img src=http://www.sitepointforums.com/images/this image is cool.gif alt=The cool image>

    I'm guessing every browser out there will, at best, try for:
    http://www.sitepointforums.com/images/this
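    If that guess is right, one way to handle both situations (spaces allowed inside quotes, unquoted values ending at the first space or >) is to give each quoting style its own alternative. This is a different technique from the patterns posted so far in the thread, and the function name extractSrc is hypothetical; treat it as a sketch, not a drop-in replacement.

```php
<?php
/**
 * Collect src values from a chunk of HTML (hypothetical helper).
 * Branch 1: double-quoted value, branch 2: single-quoted value,
 * branch 3: unquoted value that ends at whitespace, a quote or ">".
 */
function extractSrc(string $html): array
{
    $pattern = '/src=(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))/';
    preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);

    $urls = [];
    foreach ($sets as $set) {
        // Only one of the three groups can be non-empty for a given match,
        // so joining the groups yields the captured value.
        $urls[] = implode('', array_slice($set, 1));
    }
    return $urls;
}

$html = '<img src="a b.gif"><img src=\'c d.gif\'>'
      . '<img src=http://www.sitepointforums.com/images/this image is cool.gif alt=The cool image>';

$urls = extractSrc($html);
print_r($urls);
```

    The quoted values keep their spaces, and the unquoted one stops at ".../images/this", matching the behaviour guessed at above.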

  8. #8
    SitePoint Guru davedibiase's Avatar
    Yeah, good point - haha. Ok, I've found another case that I technically should account for: src= can be in either upper or lower case. I've been looking for some way to switch off case sensitivity, if that makes sense.

    How would I do that?

    Thanks again for your help my friend!

  9. #9
    Worship the Krome kromey's Avatar
    Join Date
    Sep 2006
    Location
    Fairbanks, AK
    Posts
    1,621
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Simply add 'i' after your pattern's closing delimiter:
    /src=['"]?([^\"']+?)['"]?[> ]/i
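    A small sketch of the difference the flag makes, using an upper-case variant of the tag from the first post (the upper-case input and the variable names are assumed for illustration):

```php
<?php
$html = '<IMG SRC=http://www.sitepointforums.com/images/main.gif>';

$caseSensitive   = '/src=[\'"]?([^"\']+?)[\'"]?[> ]/';
$caseInsensitive = '/src=[\'"]?([^"\']+?)[\'"]?[> ]/i'; // same pattern plus the "i" modifier

// Without the modifier, lower-case "src=" does not match "SRC=".
$withoutFlag = preg_match($caseSensitive, $html);

// With the modifier, the match succeeds and group 1 holds the URL.
$withFlag = preg_match($caseInsensitive, $html, $m);

var_dump($withoutFlag, $withFlag, $m[1] ?? null);
```

    Note that the modifier applies to the whole pattern, so any other letters in it become case-insensitive too.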
    PHP questions? RTFM
    MySQL questions? RTFM

