XMLReader coming to PHP5 / PECL

    Harry Fuecks

    Via Christian, looks like there’s going to be a fast XML Pull implementation coming to PHP5, based on libxml2 (the XML library behind PHP5).

    Eventually this should be published via PECL (so, I guess, not something you can expect your shared host to support in the near future), but right now the source is online at http://cvs.php.net/cvs.php/pecl/xmlreader/.

    Essentially it means an (arguably) easier way of parsing XML than with SAX. Philip’s article Back to Basics: XML In .NET shows .NET’s XMLReader in action, and in PHP it would be much the same.

    Based on a look at the source code for the PHP xmlreader, using it would be something like this:

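    // Create the parser
    $reader = new XMLReader();

    // Load some XML in a string ($xml is a sample document for illustration)
    $xml = '<greeting><to>World</to></greeting>';
    $reader->XML($xml);

    // Loop through the contents
    while ($reader->read()) {
        switch ($reader->nodeType) {

            // Tag start
            case XMLReader::ELEMENT:
                echo 'Got opening tag for ' . $reader->name . "\n";
                break;

            // Tag end
            case XMLReader::END_ELEMENT:
                echo 'Got closing tag for ' . $reader->name . "\n";
                break;
        }
    }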

    In other words, rather than SAX callbacks, you can handle the entire document in a single loop – great for hacking!
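
    For contrast, here is roughly what the callback style looks like with PHP's existing SAX-based xml extension. It's only a sketch: the sample document and the $events array are made up for illustration, and character data is ignored. The point is that the logic ends up split across two handler functions rather than sitting in one loop.

```php
<?php
// Sample document, assumed for illustration.
$xml = '<greeting><to>World</to></greeting>';

// Collected here so the handlers have somewhere to report to.
$events = [];

$parser = xml_parser_create();

// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// One callback per kind of event: the parser "pushes" them at us.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attributes) use (&$events) {
        $events[] = 'open ' . $name;
    },
    function ($parser, $name) use (&$events) {
        $events[] = 'close ' . $name;
    }
);

// Parse the whole string in one go (true = this is the final chunk).
xml_parse($parser, $xml, true);

echo implode("\n", $events), "\n";
```

    Any state you need between the opening and closing tag has to live outside the handlers, which is exactly the bookkeeping the pull loop lets you avoid.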

    Also interesting: according to the original C examples of xmlreader, it has the ability to validate XML against a DTD or a Relax NG schema. Right now I believe the latest DOM extensions can validate an entire XML document and give you a “yes / no”, but this would seem to allow you to validate individual nodes in the document.
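
    As a rough idea of what node-by-node validation might look like, here's a sketch that assumes the XMLReader class as documented in the PHP manual, with its setRelaxNGSchemaSource() and isValid() methods. The schema, the sample documents and the firstInvalidNode() helper are all made up for illustration.

```php
<?php
// Hypothetical Relax NG schema: a <feed> holding one or more <item> elements.
$schema = <<<'RNG'
<element name="feed" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="item"><text/></element>
  </oneOrMore>
</element>
RNG;

// Hypothetical helper: returns the name of the node at which validation
// first fails, or null if the whole document passes.
function firstInvalidNode($xml, $schema)
{
    // Collect libxml errors quietly instead of emitting PHP warnings.
    libxml_use_internal_errors(true);

    $reader = new XMLReader();
    $reader->XML($xml);

    // Attach the schema after loading the document, before the first read().
    $reader->setRelaxNGSchemaSource($schema);

    $failedAt = null;
    while ($reader->read()) {
        // isValid() reflects what has been validated so far.
        if (!$reader->isValid()) {
            $failedAt = $reader->name;
            break;
        }
    }

    // Catch a failure that only shows up once reading has stopped.
    if ($failedAt === null && !$reader->isValid()) {
        $failedAt = '(end of document)';
    }

    $reader->close();
    libxml_clear_errors();

    return $failedAt;
}

var_dump(firstInvalidNode('<feed><item>a</item></feed>', $schema));
var_dump(firstInvalidNode('<feed><item>a</item><bogus/></feed>', $schema));
```

    Because the check happens inside the read loop, you can bail out at the first problem instead of parsing the whole document just to get a single yes or no.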

    According to A Survey of APIs and Techniques for Processing XML, there are essentially 5 common techniques for parsing XML. Here’s how they map onto PHP5:

    – Push: the SAX extension

    – Pull: this new xmlreader extension

    – Tree: the DOM extension

    – Object Mapping: Simple XML

    – Cursor: nothing yet I believe

    All of these are built on libxml2 in PHP5. According to benchmarks like http://xmlbench.sourceforge.net/, libxml2 is more or less the fastest XML parser out there.
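
    To give a feel for how three of the techniques in that list differ in practice, here's a small sketch that pulls the same value out of the same document with the pull, tree and object mapping approaches. The sample document and variable names are made up, and the XMLReader part assumes the class as documented in the PHP manual.

```php
<?php
// Sample document, assumed for illustration.
$xml = '<greeting><to>World</to></greeting>';

// Pull: walk the stream and stop as soon as we reach the node we want.
$reader = new XMLReader();
$reader->XML($xml);
$pulled = null;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'to') {
        $pulled = $reader->readString();
        break;
    }
}
$reader->close();

// Tree: load the whole document into memory, then navigate it.
$dom = new DOMDocument();
$dom->loadXML($xml);
$fromTree = $dom->getElementsByTagName('to')->item(0)->textContent;

// Object mapping: elements show up as properties of a PHP object.
$greeting = simplexml_load_string($xml);
$fromMapping = (string) $greeting->to;

echo $pulled, "\n", $fromTree, "\n", $fromMapping, "\n";
```

    The tree and object mapping versions are shorter, but both build the full document in memory first. The pull version never holds more than the current node, which is what makes it attractive for large files.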

    Compared to the state of XML in PHP4, where developers have done amazing things with SAX, PHP5 looks like a dream come true…