This is what I have. I was tasked with creating an object that takes namespaced XML and strips every occurrence of the namespace prefixes.
It works, and it works really well. I've extended DOMDocument so the cleaned document can be manipulated with its methods.
I've tested it on internal, YouTube and Yahoo namespaced XML and it works a treat, but I think there's still work to be done.
class.NamespaceNulledXMLDocument.php
PHP Code:
<?php
/**
* @author Anthony David Sterling
*
* @copyright Anthony David Sterling
*
* @version 1.0
*
* @desc Removes namespace prefixes
* programmatically based on the
* declared namespaces contained
* within the document itself.
*
* @todo
*/
class NamespaceNulledXMLDocument extends DOMDocument
{
/**
* @var Boolean Indicates whether we found any namespaces to remove.
*/
public $foundNamespaces = false;
/**
* @var Array Holds all namespaces we found and processed.
*/
public $removedNamespaces = array();
/**
* @param String $sXML An XML document in string form.
* @return Void
*/
public function __construct( $sXML )
{
//--> Call the DOMDocument constructor to instantiate.
parent::__construct();
//--> Load our 'cleaned' XML into the DOMDocument.
$this->loadXML( $this->removeNamespaces($sXML) );
}
/**
* @param String $sXML An XML document in string form.
* @return String
*/
private function removeNamespaces( $sXML )
{
//--> Collecting all 'declared' namespace prefixes from within the XML document itself.
$aNamespaces = ( preg_match_all( '/(?<=xmlns:)[\w.-]+(?=\s*=)/' , $sXML , $aMatches , PREG_PATTERN_ORDER ) ) ? array_unique( $aMatches[0] ) : array() ;
//--> Before continuing, we'll check if we actually have any prefixes to remove.
if( count($aNamespaces) > 0 )
{
//--> Logging the fact we found some namespaces to remove.
$this->foundNamespaces = true;
//--> Walking through each of the matched namespace prefixes and removing the prefix from every element.
foreach ( $aNamespaces as $sNamespace )
{
//--> Log the namespace we are processing.
$this->removedNamespaces[] = $sNamespace;
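//--> Escape regex metacharacters; a prefix may legally contain '.' or '-'.
$sNamespace = preg_quote( $sNamespace , '/' );
//--> Replace any node with a namespace.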
$sXML = preg_replace( "%(?<=<|</){$sNamespace}:%" , '' , $sXML );
//--> Replace any attribute with a namespace.
$sXML = preg_replace( "/(?<=\\s){$sNamespace}:(?=.*?=\".*?\")/" , '' , $sXML );
}
//--> Return the new, clean, all singing, all dancing namespace nulled XML.
return $sXML;
}
else
{
//--> Hmm, no namespaces to remove, so we'll just return the original XML that was supplied.
return $sXML;
}
}
}
?>
Usage.php
PHP Code:
<?php
//--> Load the class.
require_once( 'class.NamespaceNulledXMLDocument.php' );
//--> Obtain the namespaced XML.
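$sXMLData = file_get_contents('sample.xml');
//--> Stop early if the file could not be read, rather than parsing an empty value.
if ( $sXMLData === false ) { die( 'Could not read sample.xml' ); }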
//--> Create a NamespaceNulledXMLDocument, passing it the namespaced XML.
$oXML = new NamespaceNulledXMLDocument( $sXMLData );
//--> Using DOMDocument methods, format the output.
$oXML->formatOutput = true;
//--> Output the XML.
echo $oXML->saveXML();
?>
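If you want to try the prefix-collection step on its own, here's a minimal sketch. The function name collectNamespacePrefixes() and the sample URNs are made up for illustration and are not part of the class above; the pattern is also a little stricter than the one in the class (it only accepts name characters and allows either quote style), which is my assumption about what real feeds need.

```php
<?php
// Hypothetical helper, not part of NamespaceNulledXMLDocument.
function collectNamespacePrefixes($sXML)
{
    // Capture "prefix" from xmlns:prefix="..." declarations, allowing
    // whitespace around '=' and either single or double quotes.
    if (!preg_match_all('/\bxmlns:([\w.-]+)\s*=\s*["\']/', $sXML, $aMatches)) {
        return array();
    }
    // A prefix can be declared more than once; keep each one only once.
    return array_values(array_unique($aMatches[1]));
}

// Made-up sample: one default namespace, two prefixed ones, one repeated.
$sSample = '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:yt="urn:example:yt">'
         . '<entry xmlns:media="urn:example:media" xmlns:yt="urn:example:yt"/></feed>';

$aPrefixes = collectNamespacePrefixes($sSample);
```

The default namespace (plain xmlns="...") has no prefix, so it is not collected and there is nothing to strip for it.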
There are a couple of niggles. I'd prefer to use one single regular expression in place of the following two:
PHP Code:
//--> Replace any node with a namespace.
$sXML = preg_replace( "%(?<=<|</){$sNamespace}:%" , '' , $sXML );
//--> Replace any attribute with a namespace.
$sXML = preg_replace( "/(?<=\\s){$sNamespace}:(?=.*?=\".*?\")/" , '' , $sXML );
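One way the two replacements might be folded into a single expression is with an alternation, and by handling all prefixes in one pass. This is only a sketch under my own assumptions: stripNamespacePrefixes() is a hypothetical name, and the attribute lookahead is tighter than the original (name, then '=', then a quote of either style). Like the original, it works on the raw string, so text content that happens to look like a prefixed attribute would be changed too.

```php
<?php
// Hypothetical helper: strips the given prefixes from tags and attributes.
function stripNamespacePrefixes($sXML, array $aPrefixes)
{
    if (count($aPrefixes) === 0) {
        return $sXML;
    }
    // Escape each prefix so '.' and '-' are matched literally.
    $aQuoted = array();
    foreach ($aPrefixes as $sPrefix) {
        $aQuoted[] = preg_quote($sPrefix, '%');
    }
    $sGroup = '(?:' . implode('|', $aQuoted) . ')';
    // First branch: prefix directly after '<' or '</' (element tags).
    // Second branch: prefix after whitespace, followed by name="..." or name='...'.
    $sPattern = '%(?<=<|</)' . $sGroup . ':'
              . '|(?<=\s)' . $sGroup . ':(?=[\w.-]+\s*=\s*["\'])%';
    return preg_replace($sPattern, '', $sXML);
}

// Made-up sample input.
$sIn  = '<yt:feed xmlns:yt="urn:example:yt" yt:id="1"><yt:item media:k=\'v\'/></yt:feed>';
$sOut = stripNamespacePrefixes($sIn, array('yt', 'media'));
```

The xmlns:yt declaration itself is left in place, same as in the class; only the prefixes on elements and attributes are removed.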
Maybe this will be of use to somebody. I think the other guys are importing the object into SimpleXML for easier traversal, which is possible because it extends DOMDocument.
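For anyone who wants to do the same, the hand-off to SimpleXML might look roughly like this. To keep the sketch self-contained it uses a plain DOMDocument with made-up, already prefix-free XML; a NamespaceNulledXMLDocument instance should be usable in its place since it is a DOMDocument subclass.

```php
<?php
// Stand-in for a NamespaceNulledXMLDocument; the XML here is invented.
$oDoc = new DOMDocument();
$oDoc->loadXML('<feed><entry><title>First</title></entry><entry><title>Second</title></entry></feed>');

// simplexml_import_dom() accepts any DOM node, including a document.
$oSimple = simplexml_import_dom($oDoc);

// With the prefixes gone, children can be reached by plain property access.
$aTitles = array();
foreach ($oSimple->entry as $oEntry) {
    $aTitles[] = (string) $oEntry->title;
}
```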
SilverB.