I am experimenting with XPath through PHP's SimpleXMLElement class, using queries to retrieve various elements. However, I run into a problem when an element contains adjacent text and element nodes, as in this markup:
<p>The color <span class="color">orange</span> has always been my favorite color.</p>
I have looked at the W3C specification for XML (just to verify that this is valid markup, even though I know it is) and found this definition:
3.2.2 Mixed Content
[Definition: An element type has mixed content when elements of that type may contain character data, optionally interspersed with child elements.]
Ok, so maybe the problem is with my XPath queries? This is what I tried ($p is an instance of SimpleXMLElement representing the p element):
$content = $p->xpath('child::*'); // get all children of p; returns a SimpleXMLElement object containing the text 'orange'
$content = $p->xpath('child::text()'); // get all text nodes that are children of p; returns 2 SimpleXMLElement objects representing just the span element!
Similar queries targeting the same nodes return the same results. So in the first case (get all children), only a text node containing 'orange' seems to be recognized, but in the second (all children that are text nodes), two copies of the span element itself seem to be the only things returned! The rest of the text, which I expected to be held in two text nodes, is never returned. I am really confused right now. Thoughts?
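Here is a minimal script that should reproduce what I am seeing. It is only a sketch: it assumes the p element is loaded on its own with simplexml_load_string(), and the variable names $xml, $children, and $texts are just for illustration, not my original code.

```php
<?php
// Minimal reproduction. Loading the fragment with simplexml_load_string()
// and the variable names used here are assumptions for illustration.
$xml = '<p>The color <span class="color">orange</span> has always been my favorite color.</p>';
$p = simplexml_load_string($xml);

// Query 1: child axis with the * node test.
$children = $p->xpath('child::*');
echo count($children), " result(s) for child::*\n";
foreach ($children as $node) {
    // Print the element name and serialized markup of each returned object.
    echo $node->getName(), ': ', $node->asXML(), "\n";
}

// Query 2: child axis with the text() node test.
$texts = $p->xpath('child::text()');
echo count($texts), " result(s) for child::text()\n";
foreach ($texts as $node) {
    echo $node->getName(), ': ', $node->asXML(), "\n";
}
```

Printing getName() and asXML() for each result, rather than relying on print_r() alone, should make it easier to see which node each returned object actually wraps.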