  1. #1
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Need some help with scope of programming

    Hi Guys

    I need some advice on how to code something. I am using cURL to retrieve the HTML of a page, and on that page is a <select> field. I would like PHP to extract the highest value from the select box. Take the markup below as an example:

    Code:
    <select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" class="inputTextBox provideVrmWidth" size="1" name="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
    <option selected="selected" value="0">Select a vehicle</option>
    <option value="1">REG1</option>
    <option value="2">REG2</option>
    </select>
    How can I extract the highest option value? I.e., in this instance the value I want returned is 2 (value="2").

    Please help!

  2. #2
    From Italy with love
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,493
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    You can extract all values into an array with preg_match_all and a regular expression, something like
    PHP Code:
    preg_match_all('%<option[^>]* value="([^"]+)"%', $yourdata, $matches, PREG_PATTERN_ORDER);
    where $yourdata contains the HTML you got with cURL.

    Do a var_dump of $matches to see the result.

    Then get the highest value from the array (take a look at rsort)
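
    A minimal sketch of this approach, assuming $yourdata holds the fetched HTML (a hard-coded sample stands in for the cURL response here) and that the option values are numeric. The pattern uses [^>]* to skip any other attributes, and max() is swapped in for rsort because only the largest value is needed; both are illustrative choices.

```php
<?php
// Sample markup standing in for the HTML fetched with cURL (assumption).
$yourdata = '<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" size="1">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>';

// With PREG_PATTERN_ORDER, $matches[1] holds every captured value attribute.
preg_match_all('%<option[^>]* value="([^"]+)"%', $yourdata, $matches, PREG_PATTERN_ORDER);

// Cast to integers so the comparison is numeric rather than string-based.
$values = array_map('intval', $matches[1]);

// Guard against an empty result, since max() rejects an empty array.
$highest = $values ? max($values) : null;

var_dump($highest); // int(2)
```

    Note that this collects the options of every select on the page, so it only fits when the page has a single select box.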

  3. #3
    Keeper of the SFL StarLion's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA, USA
    Posts
    3,748
    Mentioned
    69 Post(s)
    Tagged
    0 Thread(s)
    er... be careful doing that, Guido - if there's more than one select box on the page (Like... a language dropdown?), that could end up giving some very bad responses.

    Let's make sure we get the -specific- box we're after.
    Something a bit more like...
    PHP Code:
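    preg_match_all('%<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered".*?(<option.*? value="([^"]+)">.+?</option>\s*)+</select>%s', $yourdata, $matches, PREG_PATTERN_ORDER);
    Perhaps?
    (Note: This will change the location of your desired values in the $matches array, because we added another subpattern. The s modifier lets . span line breaks, and since a repeated group keeps only its last iteration, $matches[2] holds the value of the final option.)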
    Never grow up. The instant you do, you lose all ability to imagine great things, for fear of reality crashing in.

  4. #4
    From Italy with love
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,493
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by StarLion View Post
    er... be careful doing that, Guido - if there's more than one select box on the page (Like... a language dropdown?), that could end up giving some very bad responses.
    I know, I based my answer on the info in the OP
    on that page is a <select> field

  5. #5
    Hosting Team Leader
    cpradio's Avatar
    Join Date
    Jun 2002
    Location
    Ohio
    Posts
    5,053
    Mentioned
    152 Post(s)
    Tagged
    0 Thread(s)
    I believe there are DOM methods you could use in PHP to walk through the HTML hierarchy to get to the exact select box too.
    http://fr.php.net/manual/en/domdocum...sbytagname.php
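
    A rough sketch of what walking the hierarchy could look like with PHP's bundled DOM extension, assuming the fetched page is in $html (a made-up sample with a second, unrelated select is used here). The variable names are illustrative only.

```php
<?php
// Sample page with two select boxes; only the second one is wanted (assumption).
$html = '<html><body><form>
<select id="lang"><option value="9">English</option></select>
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>
</form></body></html>';

$targetId = 'provide_vrm:prVRMfrag:prVRMCon:vrmRegistered';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$highest = null;
foreach ($dom->getElementsByTagName('select') as $select) {
    // Skip every select box except the one with the wanted id.
    if ($select->getAttribute('id') !== $targetId) {
        continue;
    }
    // Look only at the options nested inside the matching select.
    foreach ($select->getElementsByTagName('option') as $option) {
        $value = (int) $option->getAttribute('value');
        if ($highest === null || $value > $highest) {
            $highest = $value;
        }
    }
}

var_dump($highest); // int(2)
```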
    Be sure to congratulate Patche on earning July's Member of the Month
    Go ahead and blame me, I still won't lose any sleep over it
    My Blog | My Technical Notes

  6. #6
    SitePoint Guru
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by cpradio View Post
    I believe there are DOM methods you could use in PHP to walk through the HTML hierarchy to get to the exact select box too.
    http://fr.php.net/manual/en/domdocum...sbytagname.php
    Can DOM be used to parse HTML that is not XHTML?

  7. #7
    Hosting Team Leader
    cpradio's Avatar
    Join Date
    Jun 2002
    Location
    Ohio
    Posts
    5,053
    Mentioned
    152 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    Can DOM be used to parse HTML that is not XHTML?
    Based on the comments, I would say yes.
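
    As a hedged illustration, DOMDocument::loadHTML() goes through libxml's HTML parser rather than its XML parser, so it tolerates markup that is not well-formed. The deliberately sloppy sample below is invented for the example.

```php
<?php
// Invented tag soup: uppercase tags, unquoted attributes, unclosed <P> and <OPTION>.
$tagSoup = '<P>Pick one
<SELECT id=cars>
<OPTION value=1>REG1
<OPTION value=2>REG2
</SELECT>';

// Keep parser complaints in a buffer so they can be inspected instead of printed.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$loaded = $dom->loadHTML($tagSoup);
$errors = libxml_get_errors();
libxml_clear_errors();

// Tag names are normalised to lowercase by the HTML parser.
$values = [];
foreach ($dom->getElementsByTagName('option') as $option) {
    $values[] = $option->getAttribute('value');
}

var_dump($loaded, $values);
```

    Any problems the parser noticed end up in $errors, which can be logged if the recovered tree looks wrong.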

  8. #8
    SitePoint Guru
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by cpradio View Post
    Based on the comments, I would say yes.
    OK, thanks.

  9. #9
    om nom nom nom Stomme poes's Avatar
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,266
    Mentioned
    50 Post(s)
    Tagged
    2 Thread(s)
    If a browser can do it, you can too. With all the mistakes browsers also make when the HTML is bad :P

  10. #10
    SitePoint Wizard
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,135
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    I would recommend QueryPath, which makes this and a whole lot more super simple when it comes to crawling strings of markup.
    The only code I hate more than my own is everyone else's.

  11. #11
    SitePoint Zealot
    Join Date
    Jan 2011
    Location
    Portland
    Posts
    148
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I would advise against using regex to match HTML attributes, as that leaves you prone to many errors. As suggested, use a DOM parser; PHP has several bolted on.

    http://php.net/manual/en/refs.xml.php
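
    One possible sketch using DOMXPath from PHP's DOM extension, assuming the page source is in $html (a sample is hard-coded here). The XPath expression is my own illustration of how to target the select by its id; it is not taken from the thread.

```php
<?php
// Sample markup standing in for the fetched page (assumption).
$html = '<html><body>
<select id="lang"><option value="9">English</option></select>
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>
</body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select the value attributes of options directly under the wanted select.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//select[@id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered"]/option/@value');

$values = [];
foreach ($nodes as $attr) {
    // Each node is an attribute node; nodeValue is its string content.
    $values[] = (int) $attr->nodeValue;
}

$highest = $values ? max($values) : null;

var_dump($highest); // int(2)
```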
    coming soon sitejuju.com my new development portfolio

  12. #12
    om nom nom nom Stomme poes's Avatar
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,266
    Mentioned
    50 Post(s)
    Tagged
    2 Thread(s)
    Using regex to parse HTML? Oh my. This calls for some Zalgo.

    http://stackoverflow.com/questions/1...732454#1732454

    See this as a ++ to jgetner's suggestion of using a parser to parse. Lives will be saved. Hair will remain on head. Orphan children will simply grow old without fulfilling prophecies of wizardry, and instead will marry overweight suburbanites and work in insurance until they retire.

    Though querypath reminds me of Python's libxml, also sounds good.

  13. #13
    From Italy with love
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,493
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by Stomme poes View Post
    Using regex to parse HTML? Oh my.

    See this as a ++ to jgetner's suggestion of using a parser to parse. Lives will be saved. Hair will remain on head. Orphan children will simply grow old without fulfilling prophecies of wizardry, and instead will marry overweight suburbanites and work in insurance until they retire.
    Yeah yeah, I got it...

  14. #14
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    6 Thread(s)
    Quote Originally Posted by Stomme poes View Post
    Using regex to parse HTML? Oh my. This calls for some Zalgo.

    http://stackoverflow.com/questions/1...732454#1732454

    See this as a ++ to jgetner's suggestion of using a parser to parse. Lives will be saved. Hair will remain on head. Orphan children will simply grow old without fulfilling prophecies of wizardry, and instead will marry overweight suburbanites and work in insurance until they retire.

    Though querypath reminds me of Python's libxml, also sounds good.
    Wow, that is funny stuff. Talk about beating a dead horse.
    ictus==""

  15. #15
    Floridiot joebert's Avatar
    Join Date
    Mar 2004
    Location
    Kenneth City, FL
    Posts
    823
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The first thing I'd do, since the element has a proper ID attribute, is use simple string methods to extract that <select> element from the source. strpos to find the start position of that particular <select>, strpos to find the position of the <select> element's closing tag, and substr to extract it.

    Then I'd pass the extracted string to one of the DOM libraries mentioned.
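
    The two steps above could be sketched as follows, assuming the page source is in $page (a sample is used here) and that the id attribute comes straight after "<select " exactly as in the original markup. If the attribute order or spacing differs, the strpos search will miss it.

```php
<?php
// Sample page standing in for the cURL response (assumption).
$page = '<html><body><select id="lang"><option value="9">EN</option></select>
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" size="1">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select></body></html>';

$openTag  = '<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered"';
$closeTag = '</select>';

// Step 1: cut the wanted <select>...</select> out of the page with string functions.
$fragment = null;
$start = strpos($page, $openTag);
if ($start !== false) {
    // Search for the closing tag only after the opening tag.
    $end = strpos($page, $closeTag, $start);
    if ($end !== false) {
        $fragment = substr($page, $start, $end - $start + strlen($closeTag));
    }
}

// Step 2: hand the small fragment to the DOM parser.
$highest = null;
if ($fragment !== null) {
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($fragment);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('option') as $option) {
        $value = (int) $option->getAttribute('value');
        if ($highest === null || $value > $highest) {
            $highest = $value;
        }
    }
}

var_dump($highest); // int(2)
```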

  16. #16
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for your answers guys but one final question:

    How can I extract an iframe from HTML? I.e., I just want to return the src of the iframe. Let's take the following example:

    <iframe title="paymentServicesiframe" id="paymentServicesiframe" src ="https://ips.ihost.com/hpp/checkout.hpp?sessionId=ADWSGET716SJWY2" frameborder="0" align="middle" scrolling="no" height="460px" width="709px"> Your browser does not support in-line frames or is currently configured not to display in-line frames. </iframe>

  17. #17
    om nom nom nom Stomme poes's Avatar
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,266
    Mentioned
    50 Post(s)
    Tagged
    2 Thread(s)
    src is but an attribute of the iframe tag. You would grab it the same way you would grab any other element's attributes.
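
    For instance, a sketch with PHP's DOM extension, assuming the page source is in $html (a shortened copy of the iframe from the question is used here):

```php
<?php
// Shortened copy of the iframe markup from the question.
$html = '<iframe title="paymentServicesiframe" id="paymentServicesiframe" src ="https://ips.ihost.com/hpp/checkout.hpp?sessionId=ADWSGET716SJWY2" frameborder="0" scrolling="no"> Your browser does not support in-line frames. </iframe>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$src = null;
foreach ($dom->getElementsByTagName('iframe') as $iframe) {
    // Pick the iframe by its id, then read src like any other attribute.
    if ($iframe->getAttribute('id') === 'paymentServicesiframe') {
        $src = $iframe->getAttribute('src');
        break;
    }
}

var_dump($src);
```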

