Getting Data from within the Document

alexp91k · February 7, 2010, 4:40am

I’m studying JavaScript right now and learned that you can, for example, read the data within all the <h1> tags on a page and create links to them somewhere else in the page.

Perhaps I’m having a brain burp, but can you do this with PHP too? That is, find all the <h1> tags in a page (the JavaScript equivalent is document.getElementsByTagName("h1")), get their content (innerHTML), and then use those strings somewhere else in the page (to make links, for example)?

Hopefully my question is clear. This is probably possible; I think I’m just forgetting how to do it.

Help is appreciated; thank you.

Dan_Grossman · February 7, 2010, 4:51am

Yes, you can parse text in any programming language. You can either use an XML parser and treat the webpage as an XML document, or you can use regular expressions and just look for patterns in the text (like the letters <h1> followed by some text followed by </h1>).

http://php.net/manual/en/book.simplexml.php
http://php.net/manual/en/book.xml.php
http://php.net/manual/en/function.preg-match-all.php
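
As a rough sketch of the regular-expression route with preg_match_all(), assuming simple, non-nested <h1> markup (the sample $html string and the variable names are made up for illustration):

```php
<?php
// Hypothetical sample markup; a real page would be read from a file or output buffer.
$html = '<h1>First</h1><p>Body</p><h1 class="x">Second</h1>';

// Capture whatever sits between each <h1 ...> and its closing </h1>.
// "i" makes the match case-insensitive, "s" lets "." span newlines,
// and ".*?" is non-greedy so each heading is matched separately.
preg_match_all('/<h1\b[^>]*>(.*?)<\/h1>/is', $html, $matches);

// $matches[1] holds the text captured by the first parenthesised group.
$headings = $matches[1];

print_r($headings);
```

A pattern like this is fragile: it can misfire on commented-out markup or on "<h1>" appearing inside a script or attribute value, so it is only reasonable for markup you control.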

alexp91k · February 7, 2010, 4:58am

Oh okay, yeah. I was hoping there’d be an easier way to interact with the DOM than scanning all of the text in the whole document.

Thanks for the quick answer.

AlienDev · February 7, 2010, 5:35am

Don’t use regex for a task like this. Instead, use the DOMDocument class. It has many of the same methods as the JavaScript DOM.


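$doc = new DOMDocument();
$doc->loadHTML('.......'); // loadHTML() parses an HTML string into a DOM tree
$headings = $doc->getElementsByTagName('h1'); // DOMNodeList of every <h1> element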
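
To tie this back to the original question (turning headings into links), here is a minimal sketch of one way it might look. It assumes each <h1> already carries an id attribute to link to; the sample $html string and the $links variable are made up for illustration.

```php
<?php
// Hypothetical sample page; in practice this would be the HTML you generate or load.
$html = '<html><body><h1 id="intro">Intro</h1><p>Text</p>'
      . '<h1 id="usage">Usage</h1></body></html>';

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is rarely perfectly well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$links = [];
foreach ($doc->getElementsByTagName('h1') as $h1) {
    // textContent is the plain text of the heading (tags stripped),
    // which differs slightly from JavaScript's innerHTML.
    $text = trim($h1->textContent);
    $id   = $h1->getAttribute('id'); // assumption: every heading has an id

    $links[] = '<a href="#' . htmlspecialchars($id) . '">'
             . htmlspecialchars($text) . '</a>';
}

echo implode("\n", $links), "\n";
```

If the headings have no id attributes, you would need to generate them yourself and write them back into the document before output.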

alexp91k · February 7, 2010, 5:42am

Ah, now that’s what I was looking for. Thanks a lot!
