DomDocument

Up until now I’ve been using cURL and a regex to extract data from a webpage, and while it works well, I believe it might run faster and more efficiently with DOMDocument. The HTML below is an extract of the data I am working with; on the page it renders as “City: Cheshire, UK”.

		<div class="user-details-narrow">
			<div class="profileheadtitle">
				<span class="headline txtBlue size15">
					City
				</span>
			</div>
			<div class="profileheadcontent-narrow txtGrey size15">
				Cheshire, UK
			</div>
		</div>

However, in all the tutorials I have seen, the field name (in this case, city) is part of the class name, which makes it easy to capture. On this site, every field uses the same class names and the field name only appears as text. If that makes sense?

Here’s another example…

		<div class="user-details-narrow">
			<div class="profileheadtitle">
				<span class="headline txtBlue size15">
					Age
				</span>
			</div>
			<div class="profileheadcontent-narrow txtGrey size15">
				22
			</div>
		</div>

How can I capture the city name, or age, or whatever else is displayed?

I see you’ve been working on this for the better part of a year.

Maybe you need to rethink how you’re going about doing this.

What site is providing the information?

Well, I have already finished the application in question and it’s working 99% of the time. However, because it uses regex, it only takes one user entering something stupid into their “city” to mess up what is being captured. If I could do this with DOMDocument, it should know exactly where each value starts and stops.
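
For what it’s worth, here is a minimal sketch of one way this could be done with DOMDocument and DOMXPath. It assumes every field on the page follows the same structure as the two extracts above: a `user-details-narrow` block containing one `profileheadtitle` div for the label and one `profileheadcontent-narrow` div for the value. The helper names `hasClass` and `extractProfileFields` are made up for this example and are not part of PHP, and the markup is pasted in as a string where the real application would pass in the cURL response.

```php
<?php
// Sample markup copied from the two extracts in the post.
$html = <<<'HTML'
<div class="user-details-narrow">
	<div class="profileheadtitle">
		<span class="headline txtBlue size15">
			City
		</span>
	</div>
	<div class="profileheadcontent-narrow txtGrey size15">
		Cheshire, UK
	</div>
</div>
<div class="user-details-narrow">
	<div class="profileheadtitle">
		<span class="headline txtBlue size15">
			Age
		</span>
	</div>
	<div class="profileheadcontent-narrow txtGrey size15">
		22
	</div>
</div>
HTML;

// Hypothetical helper: XPath 1.0 has no class selector, so this builds a
// predicate that matches $class as a whole space-separated token.
function hasClass(string $class): string
{
    return sprintf('contains(concat(" ", normalize-space(@class), " "), " %s ")', $class);
}

// Hypothetical helper: returns [label => value] for every details block found.
function extractProfileFields(string $html): array
{
    $doc = new DOMDocument();

    // Real pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $fields = [];

    foreach ($xpath->query('//div[' . hasClass('user-details-narrow') . ']') as $block) {
        // normalize-space() trims the tabs and newlines around the text.
        $label = $xpath->evaluate('normalize-space(.//div[' . hasClass('profileheadtitle') . '])', $block);
        $value = $xpath->evaluate('normalize-space(.//div[' . hasClass('profileheadcontent-narrow') . '])', $block);

        if ($label !== '') {
            $fields[$label] = $value;
        }
    }

    return $fields;
}

$fields = extractProfileFields($html);
echo $fields['City'] ?? 'n/a', PHP_EOL; // Cheshire, UK
echo $fields['Age'] ?? 'n/a', PHP_EOL;  // 22
```

Because each label is looked up inside its own block, the value is whatever text sits in the matching content div, so odd characters in a user’s “city” should not shift where the capture starts or stops. If the real page nests things differently, the XPath expressions would need adjusting.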
