XMLReader coming to PHP5 / PECL

Via Christian, looks like there’s going to be a fast XML Pull implementation coming to PHP5, based on libxml2 (the XML library behind PHP5).

Eventually this should be published via PECL (and I guess, for the near future, not something you can expect your shared host to support) but right now the source is online at http://cvs.php.net/cvs.php/pecl/xmlreader/.

Essentially it means an (arguably) easier way of parsing XML than with SAX. Philip’s article Back to Basics: XML In .NET shows .NET’s XmlReader in action, and in PHP it would be much the same.

Based on looking at the source code for the PHP xmlreader, using it would look something like this:


// Sample XML to parse (placeholder; any XML string will do)
$xml = '<list><item>one</item></list>';

// Create the parser
$reader = new XMLReader();

// Load some XML from a string
$reader->XML($xml);

// Loop through the contents
while ($reader->read()) {
    switch ($reader->nodeType) {

        // Tag start
        case XMLReader::ELEMENT:
            echo 'Got opening tag for ' . $reader->name . "\n";
            break;

        // Tag end
        case XMLReader::END_ELEMENT:
            echo 'Got closing tag for ' . $reader->name . "\n";
            break;
    }
}

In other words, rather than registering SAX callbacks, you can handle the entire document in a single loop. Great for hacking!
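For contrast, here is a rough sketch of the callback (push) style that the single loop replaces, using PHP's existing xml_parser functions. The sample document and the $events array are made up for illustration; they are not from the xmlreader source.

```php
<?php
// Hypothetical sample document, invented for this sketch
$xml = '<list><item>one</item><item>two</item></list>';

// Collected by the callbacks below (illustrative name)
$events = [];

$parser = xml_parser_create();

// Keep tag names as written instead of upper-casing them
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// With SAX the parser drives: it calls these handlers as it meets tags
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "open:$name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "close:$name";
    }
);

// Parse the whole string in one go (true marks the final chunk)
xml_parse($parser, $xml, true);
unset($parser);

echo implode(' ', $events), PHP_EOL;
```

The logic ends up split across handlers that share state through $events, whereas the pull loop keeps everything in one place and lets your code decide when to advance.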

Also interesting: according to the original C examples of xmlreader, it can validate XML against a DTD or a Relax NG schema. Right now I believe the latest DOM extensions can validate an entire XML document and give you a “yes / no”, but this would seem to allow you to validate individual nodes in the document.
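As a minimal sketch of what node-by-node validation might look like, assuming the XMLReader API as it exists in current PHP releases (setParserProperty() with XMLReader::VALIDATE, plus isValid()): the function name isValidAgainstDtd and both sample documents are hypothetical, and the DTD is embedded in the document itself to keep the example self-contained.

```php
<?php
// Hypothetical helper: returns true if the document matches its own DTD
function isValidAgainstDtd(string $xml): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);

    $reader = new XMLReader();
    $reader->XML($xml);

    // Must be set after XML() and before the first read()
    $reader->setParserProperty(XMLReader::VALIDATE, true);

    $valid = true;
    while ($reader->read()) {
        // isValid() reflects the document as parsed so far
        if (!$reader->isValid()) {
            $valid = false;
            break;
        }
    }

    $reader->close();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $valid;
}

// Sample documents invented for this sketch
$dtd = '<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>';
$validXml = $dtd . '<note><to>Bob</to></note>';
$invalidXml = $dtd . '<note><from>Al</from></note>';

var_dump(isValidAgainstDtd($validXml));
var_dump(isValidAgainstDtd($invalidXml));
```

Because the check runs inside the read loop, the code can stop at the first offending node rather than waiting for a verdict on the whole document.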

According to A Survey of APIs and Techniques for Processing XML, there are essentially five common techniques for parsing XML. Here’s how it will look with PHP5:

- Push: the SAX extension

- Pull: this new xmlreader extension

- Tree: the DOM extension

- Object Mapping: Simple XML

- Cursor: nothing yet I believe

All of these are built on libxml2 in PHP5. According to benchmarks like http://xmlbench.sourceforge.net/, libxml2 is more or less the fastest XML parser out there.
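To make the tree, object mapping, and pull styles a little more concrete, here is a small sketch that pulls the same values out of one document three ways. The sample XML and variable names are made up for illustration, and the XMLReader part assumes the API as it exists in current PHP releases.

```php
<?php
// Hypothetical sample document, invented for this sketch
$xml = '<list><item>one</item><item>two</item></list>';

// Tree: DOM loads the whole document, then you walk or query it
$dom = new DOMDocument();
$dom->loadXML($xml);
$fromDom = [];
foreach ($dom->getElementsByTagName('item') as $node) {
    $fromDom[] = $node->textContent;
}

// Object mapping: SimpleXML exposes child elements as properties
$list = simplexml_load_string($xml);
$fromSimpleXml = [];
foreach ($list->item as $item) {
    $fromSimpleXml[] = (string) $item;
}

// Pull: XMLReader streams through the nodes one at a time
$reader = new XMLReader();
$reader->XML($xml);
$fromReader = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        // readString() returns the text content of the current element
        $fromReader[] = $reader->readString();
    }
}
$reader->close();

print_r([$fromDom, $fromSimpleXml, $fromReader]);
```

DOM and SimpleXML hold the whole document in memory, which is convenient for small files; the pull loop only looks at the current node, which tends to matter once documents get large.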

Compared to the state of XML in PHP4, where developers have done amazing things with SAX, PHP5 looks like a dream come true…


  • http://www.realistanew.com Travis

    Yeah, that XPath class is incredible! I try to use it whenever I’m dealing with XML. I even snuck it into my latest project, phpTunes (don’t think I should give a link). It makes things much easier. Instead of the complex regex and SAX I was using, it’s just a couple of commands and I have all the info I need. I might have to check out this XMLReader though.

  • kosso

    these new functions just saved my life!!!

    I have been having great difficulty with the new simplexml functions and reading namespaced elements (and attributes).

    now, with XMLReader, it is so easy to get the info I need! Yay!

    :)