Need some help with scope of programming

Zaggs · December 12, 2012, 12:30pm

Hi Guys

I need some advice on how to code something. I am using cURL to retrieve the HTML of a page, and on that page is a <select> field. I would like PHP to extract the highest value from the select box. Please take the markup below as an example:

<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" class="inputTextBox provideVrmWidth" size="1" name="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>

How can I extract the highest option value? I.e. in this instance the value I want returned is 2 (value="2").

Please help!

DarthGuido · December 12, 2012, 12:47pm

You can extract all values in an array with preg_match_all and a regular expression, something like


preg_match_all('%<option[^>]*? value="([^"]+)"%', $yourdata, $matches, PREG_PATTERN_ORDER);

where $yourdata contains the html code you got with curl.

Do a var_dump of $matches to see the result.

Then get the highest value from the array (take a look at rsort)
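
As a rough sketch of the two steps above, assuming the option values are always integers: the sample markup is the one from the first post, and the pattern uses [^>]* (rather than [.]*, which only matches literal dots) so that options carrying other attributes before value are picked up too. max is used here instead of rsort, which is just one way to do it.

```php
<?php
// Sample markup from the first post; in practice this would be the cURL response body.
$yourdata = <<<'HTML'
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" class="inputTextBox provideVrmWidth" size="1" name="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>
HTML;

// Collect every option value on the page (all select boxes, not just one).
preg_match_all('%<option[^>]*? value="([^"]+)"%', $yourdata, $matches, PREG_PATTERN_ORDER);

// $matches[1] holds the captured value attributes as strings; cast them to integers.
$values = array_map('intval', $matches[1]);

// max() needs a non-empty array, so fall back to null when nothing matched.
$highest = $values ? max($values) : null;

echo $highest, PHP_EOL; // prints 2
```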

StarLion · December 12, 2012, 1:41pm

Er… be careful doing that, Guido: if there’s more than one select box on the page (like… a language dropdown?), that could end up giving some very wrong results.

Let’s make sure we get the -specific- box we’re after.
Something a bit more like…

preg_match_all('%<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered".*?(<option.*? value="([^"]+)">.+?</option>\s*)+</select>%s', $yourdata, $matches, PREG_PATTERN_ORDER);

Perhaps?
(Note: This will change the location of your desired values in the $matches array, because we added another subpattern. The s modifier lets . match newlines, and a repeated group only keeps its last match, so $matches[2] holds just the last option's value.)

DarthGuido · December 12, 2012, 2:47pm

I know, I based my answer on the info in the OP:

"on that page is a <select> field"

cpradio · December 12, 2012, 4:22pm

I believe there are DOM methods you could use in PHP to walk through the HTML hierarchy to get to the exact select box too.
http://fr.php.net/manual/en/domdocument.getelementsbytagname.php
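
A minimal sketch of that idea, assuming integer option values. The second "lang" select and its value of 99 are made up for illustration, to show that only the box with the wanted id is considered. The id check is done by comparing getAttribute('id') on each select returned by getElementsByTagName.

```php
<?php
// Hypothetical page: the "lang" select is invented, the other one is from the first post.
$yourdata = <<<'HTML'
<html><body>
<select id="lang" name="lang"><option value="99">English</option></select>
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" class="inputTextBox provideVrmWidth" size="1" name="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>
</body></html>
HTML;

$targetId = 'provide_vrm:prVRMfrag:prVRMCon:vrmRegistered';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($yourdata);
libxml_clear_errors();

$highest = null;
foreach ($doc->getElementsByTagName('select') as $select) {
    // Skip every select box except the one we are after.
    if ($select->getAttribute('id') !== $targetId) {
        continue;
    }
    foreach ($select->getElementsByTagName('option') as $option) {
        $value = (int) $option->getAttribute('value');
        if ($highest === null || $value > $highest) {
            $highest = $value;
        }
    }
}

echo $highest, PHP_EOL;
```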

oddz · December 12, 2012, 5:29pm

I would recommend using QueryPath, which makes this and a whole lot more super simple when it comes to crawling strings of markup.

jgetner · December 12, 2012, 6:36pm

I would advise against using regex for matching HTML attributes, as that leaves you prone to many errors. As suggested, use a DOM parser, of which PHP has several bolted on.

http://php.net/manual/en/refs.xml.php

Stomme_poes · December 13, 2012, 10:20pm

Using regex to parse HTML? Oh my. This calls for some Zalgo.

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

See this as a ++ to jgetner’s suggestion of using a parser to parse. Lives will be saved. Hair will remain on head. Orphan children will simply grow old without fulfilling prophecies of wizardry, and instead will marry overweight suburbanites and work in insurance until they retire.

Though QueryPath reminds me of Python’s libxml, so that also sounds good.

joebert · December 15, 2012, 6:50pm

The first thing I’d do, since the element has a proper ID attribute, is use simple string methods to extract that <select> element from the source. strpos to find the start position of that particular <select>, strpos to find the position of the <select> element’s closing tag, and substr to extract it.

Then I’d pass the extracted string to one of the DOM libraries mentioned.
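
Roughly, that could look like the sketch below. It assumes the id is the first attribute of the tag, written exactly as in the sample, and that the option values are integers. The "lang" select is invented to show that it gets left out.

```php
<?php
// Hypothetical page: the "lang" select is invented, the other one is from the first post.
$yourdata = <<<'HTML'
<html><body>
<select id="lang" name="lang"><option value="99">English</option></select>
<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered" class="inputTextBox provideVrmWidth" size="1" name="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered">
<option selected="selected" value="0">Select a vehicle</option>
<option value="1">REG1</option>
<option value="2">REG2</option>
</select>
</body></html>
HTML;

$openTag  = '<select id="provide_vrm:prVRMfrag:prVRMCon:vrmRegistered"';
$closeTag = '</select>';

// Find where the wanted <select> starts, then the first closing tag after it.
$start = strpos($yourdata, $openTag);
$end   = ($start === false) ? false : strpos($yourdata, $closeTag, $start);

$highest = null;
if ($start !== false && $end !== false) {
    // Cut out just that element, closing tag included.
    $fragment = substr($yourdata, $start, $end + strlen($closeTag) - $start);

    // Hand the small fragment to DOMDocument; warnings are collected, not printed.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($fragment);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('option') as $option) {
        $highest = max($highest ?? PHP_INT_MIN, (int) $option->getAttribute('value'));
    }
}

echo $highest, PHP_EOL;
```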

DarthGuido · December 15, 2012, 8:19pm

Yeah yeah, I got it…

ServerStorm · December 16, 2012, 2:08pm

Wow that is funny stuff. Talk about beating a dead horse

Lemon_Juice · December 16, 2012, 5:53pm

Can DOM be used to parse HTML that is not XHTML?

cpradio · December 16, 2012, 6:01pm

Based on the comments, I would say yes.
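
For what it's worth, here is a small sketch with DOMDocument::loadHTML on markup that is fine as HTML but would not pass as XHTML (unclosed option tags and a bare selected attribute). The markup and the "cars" id are made up for illustration.

```php
<?php
// Made-up markup: valid HTML, but not well-formed XHTML.
$html = '<select id="cars"><option value="1">REG1<option value="2" selected>REG2</select>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// The parser closes the open <option> tags itself, so both options come back.
$values = [];
foreach ($doc->getElementsByTagName('option') as $option) {
    $values[] = $option->getAttribute('value');
}

print_r($values);
```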

Lemon_Juice · December 16, 2012, 7:45pm

OK, thanks.

Stomme_poes · December 17, 2012, 8:52pm

If a browser can do it, you can too, along with all the mistakes browsers make when the HTML is bad.

Zaggs · December 19, 2012, 10:20am

Thanks for your answers guys but one final question:

How can I extract an iframe from HTML? I.e. I just want to return the src of the iframe. Let's take the following example:

Stomme_poes · December 19, 2012, 2:13pm

src is but an attribute of the iframe tag. You would grab it the same way you would grab any other element’s attributes.
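
A minimal sketch of that with DOMDocument. The markup and the example.com URL are placeholders, since the example from the question is not shown.

```php
<?php
// Placeholder markup; the real page would come from the cURL response.
$html = '<div><iframe src="http://example.com/frame.html" width="300" height="200"></iframe></div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Read the src attribute of every iframe on the page.
$sources = [];
foreach ($doc->getElementsByTagName('iframe') as $iframe) {
    $sources[] = $iframe->getAttribute('src');
}

print_r($sources);
```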
