SimpleXML and namespaces

There’s a lot about SimpleXML, PHP5’s new API for accessing the contents of XML documents, in SitePoint’s recently-published book No Nonsense XML Web Development With PHP, but one thing it doesn’t cover is how to use SimpleXML with a document that makes use of XML Namespaces.

Take this document, for example, a simplified RSS 1.0 feed:


<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://www.sitepoint.com/">
    <title>SitePoint.com</title>
    <link>http://www.sitepoint.com/</link>
    <description>SitePoint is the natural place to go to grow your online business.</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.sitepoint.com/article/take-command-ajax" />
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.sitepoint.com/article/take-command-ajax">
    <title>Take Command with AJAX</title>
    <link>http://www.sitepoint.com/article/take-command-ajax</link>
    <description>Want to get a bang out of your AJAX artillery?</description>
    <dc:date>2005-10-14T04:00:00Z</dc:date>
  </item>
</rdf:RDF>

In PHP5, here’s how you might think to use SimpleXML’s API to get at the date of every item in the feed:


$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
foreach ($feed->item as $item) {
  echo $item->date;
}

But this won’t work: the date element has a namespace prefix (<dc:date>), and SimpleXML’s property syntax doesn’t see prefixed elements, so $item->date comes back empty.

Here’s the solution. First, check what the URI is for the namespace. In this case, the dc: prefix maps to the URI http://purl.org/dc/elements/1.1/:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">

Then use the children method of the SimpleXML object, passing it that URI:


$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
foreach ($feed->item as $item) {
  $ns_dc = $item->children('http://purl.org/dc/elements/1.1/');
  echo $ns_dc->date;
}

When you pass the namespace URI to the children method, you get a SimpleXML collection of the child elements belonging to that namespace. You can work with that collection the same way you would with any SimpleXML collection.
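As a rough, self-contained sketch of the same approach, the snippet below loads a trimmed inline copy of the feed with simplexml_load_string instead of fetching it over the network. The trimmed XML and the $xml and $dates variables are illustrative assumptions, not part of the original example.

```php
<?php
// Trimmed inline copy of the sample feed (an assumption, so the sketch
// runs without network access).
$xml = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://www.sitepoint.com/article/take-command-ajax">
    <title>Take Command with AJAX</title>
    <dc:date>2005-10-14T04:00:00Z</dc:date>
  </item>
</rdf:RDF>
XML;

$feed = simplexml_load_string($xml);
$dates = array();
foreach ($feed->item as $item) {
  // Unprefixed children such as <title> are reachable directly.
  $title = (string) $item->title;
  // <dc:date> is only reachable through children() with the Dublin Core URI.
  $ns_dc = $item->children('http://purl.org/dc/elements/1.1/');
  // Cast to string to get the text content rather than a SimpleXMLElement.
  $dates[$title] = (string) $ns_dc->date;
}
print_r($dates);
```

Casting to string is worth the extra keystrokes whenever you store or compare a value, since SimpleXML otherwise hands back an object.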

You can use the attributes method in the same way to obtain attributes with namespace prefixes.
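For instance, here is a minimal sketch that reads the rdf:about attribute of each item. It assumes a trimmed inline copy of the feed; the $rdf_uri and $abouts names are made up for illustration.

```php
<?php
// Trimmed inline copy of the sample feed (an assumption for this sketch).
$xml = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://www.sitepoint.com/article/take-command-ajax">
    <title>Take Command with AJAX</title>
  </item>
</rdf:RDF>
XML;

$rdf_uri = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
$feed = simplexml_load_string($xml);
$abouts = array();
foreach ($feed->item as $item) {
  // attributes() with a namespace URI returns the attributes in that namespace.
  $attr_rdf = $item->attributes($rdf_uri);
  // Index by the attribute's local name, without the rdf: prefix.
  $abouts[] = (string) $attr_rdf['about'];
}
print_r($abouts);
```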

  • Ian

    Could we please have an example for the attributes? I have failed to use the attributes method to get at a namespaced attribute contained in a namespaced element (and caused my scalp to bleed in the process!!!). In the end I regex’ed out the attribute to its own element so I could get SimpleXML to work; not so simple!

  • http://www.sitepoint.com/ Kevin Yank

    Hmm okay. To get at the rdf:resource attribute in the rdf:li tag in the example above:

    
    $feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
    $items = $feed->channel->items;
    $items_rdf = $items->children('http://www.w3.org/1999/02/22-rdf-syntax-ns#');
    $Seq_rdf = $items_rdf->Seq->children('http://www.w3.org/1999/02/22-rdf-syntax-ns#');
    $lis = $Seq_rdf->li;
    foreach ($lis as $li) { 
      $li_attr_rdf = $li->attributes('http://www.w3.org/1999/02/22-rdf-syntax-ns#');
      $resource = $li_attr_rdf['resource'];
      echo $resource . "\n";
    }
    
  • Ian

    Thanks for the pointers Kevin!

  • joeblow

    What about a line that has both a namespace and an attribute such as the following:

    [broken code sample removed -Ed]

    How would I grab the itunes:image reference in the preceding xml?
    I’ve tried the following code snippet to no avail (and many others):

    foreach ($rss->channel as $channel)
    {
      $channel_itunes = $channel->children('http://www.itunes.com/dtds/podcast-1.0.dtd');
      $image = $channel_itunes->image['url'];
    }

    Any help would be greatly appreciated.

  • http://www.sitepoint.com/ Kevin Yank

    joeblow,

    Could you post your code sample again and escape your special characters (e.g. &lt;)?

  • joeblow

    <channel>
    <itunes:image href="some.link.com" type="video">
    <

    I actually figured out the solution using your documentation and some experimentation.

    foreach ($rss->channel as $channel)
    {
      $channel_itunes = $channel->children('http://www.itunes.com/dtds/podcast-1.0.dtd');
      $image_items = $channel_itunes->attributes();
      $image = $image_items['href'];
    }

  • joeblow

    Is there any way to see what is contained in the buffer for $channel_itunes in the above example? I have tried the Zend debugger, but it just states “Object of: SimpleXMLElement”, and print_r gives me an empty SimpleXMLElement Object. If I could see the data contained in the buffer, I could troubleshoot my problems more accurately instead of guessing as I am now.

    Thanks.

  • Anonymous

    Is there any way to (automatically) get the xmlns:dc URL from the code? I’d like to find the URI in code, but all examples have it hard-coded.

  • http://www.sitepoint.com/ Kevin Yank

    Anonymous,

    Sure — just use the getNamespaces method: http://www.php.net/manual/en/function.simplexml-element-getNamespaces.php

  • greg bass

    Kevin,
    Thank you for this article. I bought the book previously and this missing topic is just what I’m stuck on. If you could indulge my ignorance, I am still stuck on how to parse multiple namespace items with multiple attributes on the same level. I am trying to parse the yahoo weather rss feed:

    <channel>

    <yweather:location city="Tombstone" region="Arizona" country="US"></yweather:location>

    <yweather:astronomy sunrise="7:02am" sunset="4:51pm"></yweather:astronomy>

    </channel>

    how do I get to sunrise for instance? $sunrise = ?
    Thanks!

  • Anonymous

    SimpleXML’s namespace handling is needlessly awkward. Namespaced attributes should be handled the same as non-namespaced attributes.

  • whatever

    is there a way to escape “” in xml file only. I want those characters to be recognized in html though.

  • Crashdaddy

    That’s definitely dugg. Thank you!

  • Tim

    Great article. I have been trying to get a value from a node with a namespace for quite some time now and couldn’t quite figure it out.

    Thanks.

  • http://www.philly.com mtorbin

    Kevin, great article! This helped a lot. Question for you: how would I handle multiple namespace prefixes, such as the following:

    [item:group]
    [widget:typeA attr1="" attr2=""/]
    [/item:group]

    If I wanted to get the values of attribute one and two of the widget, how would that be done?

    Thanks,

    – MT

  • Adrian Smith

    That’s brilliant, thanks a lot!
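
[Several questions above went unanswered: how to read unprefixed attributes on a prefixed element, how to find a namespace URI in code, and how to inspect a SimpleXMLElement that print_r shows as empty. The sketch below is one possible way to handle all three, not a definitive answer. The inline feed is a made-up stand-in for the weather feed, and the yweather namespace URI and all variable names are assumptions. -Ed]

```php
<?php
// Hypothetical stand-in for the weather feed; the namespace URI is an assumption.
$xml = <<<XML
<rss xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <yweather:location city="Tombstone" region="Arizona" country="US"/>
    <yweather:astronomy sunrise="7:02am" sunset="4:51pm"/>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Look up the URI bound to the yweather prefix instead of hard-coding it.
// Passing true also collects namespaces used by descendant elements.
$namespaces = $rss->getNamespaces(true);
$yweather_uri = $namespaces['yweather'];

// Select the channel's children that belong to the yweather namespace.
$channel_yweather = $rss->channel->children($yweather_uri);

// sunrise and city carry no prefix, so call attributes() with no argument
// to read each element's unqualified attributes.
$astronomy_attr = $channel_yweather->astronomy->attributes();
$sunrise = (string) $astronomy_attr['sunrise'];
$location_attr = $channel_yweather->location->attributes();
$city = (string) $location_attr['city'];

echo $city . ': ' . $sunrise . "\n";

// asXML() returns the markup an element wraps, which can help when
// print_r() or a debugger shows what looks like an empty object.
$astronomy_markup = $channel_yweather->astronomy->asXML();
echo $astronomy_markup . "\n";
```

The same pattern should apply to any prefixed element with plain attributes: children() with the element's namespace URI to reach the element, then attributes() with no argument to reach its attributes.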