Parsing XML using a PHP class - small help!

Hi folks,

I am using this (http://www.phpclasses.org/package/4-PHP-Arbitrary-XML-parser-.html#description) PHP class to parse XML elements.
The function below, from the class, outputs the entire XML document.
But I only want it to print the data that resides in the <wp:firstname> tag. I am struggling with where to put the condition and what the condition should be.

Function DumpStructure(&$structure,&$positions,$path)
{
	//echo "[".$positions[$path]["Line"].",".$positions[$path]["Column"].",".$positions[$path]["Byte"]."]";
	
	if(GetType($structure[$path])=="array")
	{
		
		echo "<".$structure[$path]["Tag"].">"; 
		
		for($element=0;$element<$structure[$path]["Elements"];$element++)
			DumpStructure($structure,$positions,$path.",$element");
	
		echo "</".$structure[$path]["Tag"].">"; 
	}
	else
		echo $structure[$path];
}

Sample xml data

    <wp:listing>
      <wp:people>
        <wp:person>
          <wp:firstname>Jeffrey</wp:firstname>Array 0,5,1,1,1,1
          <wp:middlename>E</wp:middlename>Array 0,5,1,1,1,3
          <wp:lastname>Thompson</wp:lastname>Array 0,5,1,1,1,5
        </wp:person>Array 0,5,1,1,1
        <wp:person>
          <wp:firstname>Kelsey</wp:firstname>Array 0,5,1,1,3,1
          <wp:middlename>A</wp:middlename>Array 0,5,1,1,3,3
          <wp:lastname>Thompson</wp:lastname>Array 0,5,1,1,3,5
        </wp:person>Array 0,5,1,1,3
        <wp:person>

Why not use SimpleXML? It is generally available in most PHP installs.

From: http://www.php.net/manual/en/function.gettype.php

"Warning

Never use gettype() to test for a certain type, since the returned string may be subject to change in a future version. In addition, it is slow too, as it involves string comparison.

Instead, use the is_* functions."

Use is_array() instead.

Recursively calling a function without a depth tracker is a bad idea. If a hacker figures out how to get your application to read really deep XML structures, your application could crash the web server. PHP has no defense against recursion gone crazy, unless an extension such as XDebug is installed.

http://ilia.ws/archives/5_Top_10_ways_to_crash_PHP.html
(Some of those aren’t valid any more but recursion has never been fixed.)
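To tie this back to the original question, here is a minimal sketch of where the condition could go, with a depth limit added. It assumes the structure array has the shape implied by DumpStructure() above: array nodes with "Tag" and "Elements" keys, and plain strings for text. The function name collectTagText, the $maxDepth parameter, and the hand-built $structure are made up for illustration and are not part of the class.

```php
<?php
// Hypothetical helper: collect the text found inside every <$wanted> element,
// refusing to recurse deeper than $maxDepth levels.
function collectTagText(array $structure, string $path, string $wanted,
                        int $maxDepth = 64, int $depth = 0, bool $inside = false): array
{
    if ($depth > $maxDepth) {
        throw new RuntimeException("XML nested deeper than $maxDepth levels");
    }
    if (!isset($structure[$path])) {
        return [];
    }
    $node = $structure[$path];
    if (!is_array($node)) {
        // Text node: keep it only when an enclosing element matched.
        return $inside ? [$node] : [];
    }
    // This is the condition: remember whether we are inside the wanted tag.
    $inside = $inside || $node["Tag"] === $wanted;
    $found = [];
    for ($i = 0; $i < $node["Elements"]; $i++) {
        $found = array_merge(
            $found,
            collectTagText($structure, "$path,$i", $wanted, $maxDepth, $depth + 1, $inside)
        );
    }
    return $found;
}

// Hand-built stand-in for the parser's output (assumed shape, not real data).
$structure = [
    "0"     => ["Tag" => "wp:person", "Elements" => 2],
    "0,0"   => ["Tag" => "wp:firstname", "Elements" => 1],
    "0,0,0" => "Jeffrey",
    "0,1"   => ["Tag" => "wp:lastname", "Elements" => 1],
    "0,1,0" => "Thompson",
];

$names = collectTagText($structure, "0", "wp:firstname");
echo implode("\n", $names), "\n";
```

Returning the matches instead of echoing them keeps the traversal separate from the output, so the caller decides how to print them.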

Anyway, based on the example, it looks like the package is generating an array mapping using numbers as the keys. That’s going to be really messy to try to pick out just the wp:firstname. With SimpleXML, you could write something like:

$data = file_get_contents("somefile.xml");
$xml = simplexml_load_string($data);
foreach ($xml->{'wp:people'}->{'wp:person'} as $person)
{
  $firstname = (string)$person->{'wp:firstname'};
  echo $firstname . "\n";
}

Don’t forget to cast elements from object types to native types (e.g. using the ‘string’ cast above). Otherwise weird stuff typically happens later.
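A small sketch of what the cast changes, using made-up XML without namespaces so it stays simple. The variable names are only for illustration.

```php
<?php
// Made-up, namespace-free XML purely for illustration.
$xml = simplexml_load_string(
    '<people><person><firstname>Jeffrey</firstname></person></people>'
);

$element = $xml->person->firstname;  // still a SimpleXMLElement object
$name    = (string) $element;        // plain PHP string

var_dump($element instanceof SimpleXMLElement); // bool(true)
var_dump(is_string($name));                     // bool(true)

// The string can be used as an array key; the object cannot.
$seen = [];
$seen[$name] = true;
```

Without the cast, the object ends up wherever the value is stored, which is where the "weird stuff" tends to come from later.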

90% of the stuff on the PHP Classes website either doesn’t work, doesn’t work well, or is just written badly. I only trust the code on PECL and PEAR. The use of gettype() alone sends warning bells to my mind to avoid that class.

Thanks for the awesome reply, thruska!
I'll try your suggestions!

It prints every piece of data correctly (though not line by line), with an error at the end:

Warning: Invalid argument supplied for foreach() in /homepages/24/d232211843/htdocs/xxxx.com/xxxxxx/searching.php on line 54

Line 54 is:

 foreach ($xml->{'wp:people'}->{'wp:person'} as $person)

First of all, your XML is not valid, because some text, like 'Array…', sits outside the element tags. Once your XML is valid, it is easier to parse with SimpleXML's xpath().

I was somewhat guessing as to the XML file you are parsing. Apparently I got close because you have data displaying correctly. (Do a “View -> Source” to see it “line by line”.)

Sounds like you might have weird XML being parsed or I messed up a little. Could you post the EXACT XML somewhere? The “Array” stuff in your first example shouldn’t validate at all and I’m not sure where it came from. Also post some of the output you are getting and the code you are using (if possible). I know I’m asking for a lot but there’s not much to go on at this point. Provide whatever information you are comfortable with but having a working example to use goes a long way to figuring out a solution.

Thanks rajug,
I'll consider SimpleXML xpath as well.

Sorry, I messed up.
Your code in the first post did not produce any output.
The output was the result of an echo or print_r in my own code. I only just noticed. Sorry.

Your code gave the message below.

$data = file_get_contents("response.xml");
$xml = simplexml_load_string($data);
foreach ($xml->{'wp:people'}->{'wp:person'} as $person)
{
  $firstname = (string)$person->{'wp:firstname'};
  echo $firstname . "\n";
}

Warning: Invalid argument supplied for foreach() in /homepages/24/d232211843/htdocs/numberscout.com/whitepages/searching.php on line 44

Current update:


// get wp response data
$data = file_get_contents($url);

// Write the contents back to the file
$file = 'wpres.xml';
file_put_contents($file, $data,LOCK_EX);
$xml = simplexml_load_string("wpres.xml");
/*foreach ($xml->{'wp:people'}->{'wp:person'} as $person)
{
  $firstname = (string)$person->{'wp:firstname'};
  echo $firstname . "\n";
}*/

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Start tag expected, ‘<’ not found in /homepages/24/d232211843/htdocs/x.com/x/searching.php on line 44

Warning: simplexml_load_string() [function.simplexml-load-string]: wpres.xml in /homepages/24/d232211843/htdocs/x.com/x/searching.php on line 44

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /homepages/24/d232211843/htdocs/x.com/x/searching.php on line 44
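Those warnings are consistent with simplexml_load_string() being handed the file name itself: it tries to parse the literal text "wpres.xml" as XML and finds no start tag. A minimal sketch of the difference, assuming a made-up XML snippet and a temporary file (neither is from the original post):

```php
<?php
// Made-up XML and a temporary file, purely for illustration.
$xmlText = '<?xml version="1.0" encoding="UTF-8"?><root><item>ok</item></root>';
$file = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($file, $xmlText, LOCK_EX);

// Collect parse errors quietly instead of emitting warnings.
libxml_use_internal_errors(true);

// Wrong: the *path* is parsed as if it were XML, so this returns false.
$wrong = simplexml_load_string($file);

// Right: pass XML text to load_string, or a path to load_file.
$fromString = simplexml_load_string($xmlText);
$fromFile   = simplexml_load_file($file);

var_dump($wrong === false);
echo (string) $fromString->item, "\n";
echo (string) $fromFile->item, "\n";

unlink($file);
```

Checking the return value against false before the foreach also avoids the "Invalid argument supplied for foreach()" warning when parsing fails.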

Can you show us the contents of the wpres.xml file (the top few lines should suffice)?

Also, you cannot access the wp: elements the way you are trying to, since the wp: prefix denotes a namespace (for the WhitePages.com API). You can access them via the children() method (http://php.net/simplexmlelement.children) or xpath() (http://php.net/simplexmlelement.xpath), but first things first: let's take a look at the broken XML file.
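A rough sketch of the children() route. The XML here is a guess pieced together from the fragments posted in this thread (the wp:listing / wp:people / wp:person nesting and the http://api.x.com/schema/ URI); the real response may be laid out differently.

```php
<?php
// Assumed document layout, reconstructed from the snippets in this thread.
$xmlText = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<wp:wp xmlns:wp="http://api.x.com/schema/">
  <wp:listing>
    <wp:people>
      <wp:person><wp:firstname>Jeffrey</wp:firstname></wp:person>
      <wp:person><wp:firstname>Kelsey</wp:firstname></wp:person>
    </wp:people>
  </wp:listing>
</wp:wp>
XML;

$xml = simplexml_load_string($xmlText);
$ns  = 'http://api.x.com/schema/';

// children($ns) selects child elements that belong to the given namespace URI.
// It is called at every level here to keep each step explicit.
$people = $xml->children($ns)->listing->children($ns)->people;

$firstnames = [];
foreach ($people->children($ns)->person as $person) {
    $firstnames[] = (string) $person->children($ns)->firstname;
}

echo implode("\n", $firstnames), "\n";
```

Passing the namespace URI rather than the prefix means the code keeps working even if the document switches to a different prefix for the same namespace.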

Are you trying to parse this php code into your webpages, so you can display more search engine friendly content?

Hi salathe,

Here is a view of wpres.xml.

wpres.xml page source view:

<?xml version="1.0" encoding="UTF-8"?>
<wp:wp xmlns:wp="http://api.x.com/schema/">
<wp:result wp:type="success" wp:message=" " wp:code="Found Data"/>
<wp:meta>
<wp:linkexpiration>2010-11-30</wp:linkexpiration>
<wp:recordrange wp:lastrecord="1" wp:firstrecord="1" wp:totalavailable="1"/>
<wp:apiversion>1.0</wp:apiversion>
<wp:searchid>21251392647260698283</wp:searchid>

<wp:searchlinks>
  <wp:link wp:linktext="x.com" wp:type="homepage">http://www.x.com/16176/</wp:link>
  <wp:link wp:linktext="Link to this api call" wp:type="self">http://api.x.com/find_person/1.0/?firstname=.....

I did not follow you, sir.

Meaning: are you trying to write some special PHP code so you can embed RSS feeds into your web pages, to make them stickier for search engines?

No, no. What I am trying to do is this: I get an XML response from a third-party site for an API data query, and I want to manipulate the response (grab the data from the XML document).

Oh, I see what you're saying now. I haven't studied that kind of code, so I wouldn't be able to tell you.

Done!

My code now works like a charm, folks!


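$xml = simplexml_load_file($url);
// getDocNamespaces() is keyed by prefix, so the wp: URI sits under 'wp'
// (the '' key only exists when the document declares a default namespace).
$namespaces = $xml->getDocNamespaces();
$xml->registerXPathNamespace('wp', $namespaces['wp']);
$result = $xml->xpath('//wp:firstname');
//var_dump($result);
foreach ($result as $key => $value){
	echo $value . "<br>";
}
?>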

I got started with thruska's code,

and figured out more PHP/XML parsing details from salathe's post. I never knew what a namespace was until this:

"Also, you cannot access the wp: elements the way you are trying to, since the wp: prefix denotes a namespace (for the WhitePages.com API). You can access them via the children() method or xpath(), but first things first: let's take a look at the broken XML file."
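For anyone else new to namespaces: the prefix is only a local alias for the namespace URI. A minimal sketch, assuming a cut-down made-up document that uses the same URI as the response above, where the XPath query uses a different prefix ("p", chosen arbitrarily) and still matches:

```php
<?php
// Cut-down, made-up document using the namespace URI from the thread.
$xmlText = '<wp:wp xmlns:wp="http://api.x.com/schema/">'
         . '<wp:person><wp:firstname>Jeffrey</wp:firstname></wp:person>'
         . '</wp:wp>';
$xml = simplexml_load_string($xmlText);

// Bind our own prefix "p" to the same URI; the prefix name is arbitrary.
$xml->registerXPathNamespace('p', 'http://api.x.com/schema/');
$result = $xml->xpath('//p:firstname');

$names = [];
foreach ($result as $element) {
    $names[] = (string) $element;  // cast each SimpleXMLElement to a string
}

echo implode("\n", $names), "\n";
```

What has to match is the URI, not the prefix text, which is why the prefix-style property access in the earlier attempts found nothing.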

Thanks a lot, folks.

Thank you to drewrey media and rajug as well.