SimpleXML different results in PHP 5 & 7

I created an XML product feed for Google Merchant Centre some time last year. It pulls data from a site's database and formats it as XML using the SimpleXML extension.
It all worked at the time, but I've now noticed some errors, and the XML output is incorrect.
Testing locally, I found the problem was the PHP version changing. In 5.6 it works as expected; in 7+ the XML comes out wrong.

This is an excerpt of Google’s sample data showing how the xml should look:-

<?xml version="1.0"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0">
	<channel>
		<title>Example - Online Store</title>
		<link>http://www.example.com</link>
		<description>This is a sample feed containing the required and recommended attributes for a variety of different products</description>
		
		<!-- First example shows what attributes are required and recommended for items that are not in the apparel category -->
		<item>
			<!-- The following attributes are always required -->
			<g:id>TV_123456</g:id>
			<g:title>LG 22LB4510 - 22" LED TV - 1080p (FullHD)</g:title>
			<g:description>Attractively styled and boasting stunning picture quality, the LG 22LB4510 - 22&quot; LED TV - 1080p (FullHD) is an excellent television/monitor. The LG 22LB4510 - 22&quot; LED TV - 1080p (FullHD) sports a widescreen 1080p panel, perfect for watching movies in their original format, whilst also providing plenty of working space for your other applications.</g:description>
			<g:link>http://www.example.com/electronics/tv/22LB4510.html</g:link>
			<g:image_link>http://images.example.com/TV_123456.png</g:image_link>

Note the g: namespace on the product properties.

What I get with PHP 7 looks like this:-

		<item>
			<g:id/>
			<id>TV_123456</id>
			<g:title/>
			<title>LG 22LB4510 - 22" LED TV - 1080p (FullHD)</title>

Each namespaced property comes out as a self-closing tag with the namespace prefix, followed by the value inside a tag pair without the namespace, which Google does not seem to like.

Any ideas how to get the expected xml format in php7?

This is a method I’m using to build the xml:-

		public function makexml(){
			$ns = 'http://base.google.com/ns/1.0' ; // Set namespace
			
			$xml = new \SimpleXMLElement('<?xml version="1.0"?><rss xmlns:g="'.$ns.'" version="2.0"></rss>');
			
			$channel = $xml->addChild('channel');
			
			$channel->addChild('title', 'Store Name');
			$channel->addChild('link', 'https://www.example.com');
			
			foreach($this->products as $prod){	// Add each product
				$prod->format();
				
				$item = $channel->addChild('item');
				
				foreach($prod as $prop => $value){		// Add each product property
					if((!is_null($value)) && (!is_array($value))){
						$item->addChild($prop, null, $ns);
						$item->$prop = htmlspecialchars_decode($value);
					}
				}
				if($prod->shipping){
					foreach($prod->shipping as $ship){	// Add Shipping info
						$shipopt = $item->addChild('g:shipping', null, $ns);
						foreach($ship as $prop => $value){
							$shipopt->addChild('g:'.$prop, $value, $ns);
						}
					}
				}
			}
			return $xml ;
		}

(Fair warning: I'm guessing here. I don't have an environment to test with until this evening.)
Don't put the g: on your prop names. PHP takes care of that with the third parameter.
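
Roughly what I mean, as a minimal sketch rather than tested production code. The element name and value are placeholders borrowed from Google's sample. The assumption is that, because xmlns:g is already declared on the root, passing the namespace URI as the third argument is enough for the element to be serialized with the g: prefix.

```php
<?php
// Namespace URI from the feed; the g: prefix is declared once on the root.
$ns = 'http://base.google.com/ns/1.0';

$xml = new SimpleXMLElement('<?xml version="1.0"?><rss xmlns:g="' . $ns . '" version="2.0"></rss>');
$item = $xml->addChild('channel')->addChild('item');

// No "g:" in the name: the third argument ties the element to the namespace,
// and the prefix already bound to that URI should be reused on output.
$item->addChild('id', 'TV_123456', $ns);

$out = $xml->asXML();
echo $out;
```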

EDIT:
It’s something to do with this line:

			$item->$prop = htmlspecialchars_decode($value);

referencing the property by name here…

maybe you need to do

$item["g:".$prop] = ... 

instead?

I see it does.
I did have the g: on the item properties before and tried removing it to see if it makes any difference, but got the same result.

I’ll give that a try, but it will have to wait until tomorrow.

Alternative thought: rather than trying to reference the property in the second line, make the value the second parameter of your addChild call, because that’s what it’s for.

$item->addChild($prop, htmlspecialchars_decode($value), $ns);

Already tried that, I got errors. I don’t recall exactly what error; this was at work and I’m not there now.
There was probably a reason I added the value in a second line, but I have slept a few times since writing it…
I think maybe it’s URLs as values that it doesn’t like.

When I do this it adds the properties as attributes to the item.

The error is “unterminated entity reference” where a URL has variables with escaped & in it.
I can’t run htmlspecialchars() on all the values because OpenCart for some reason escapes strings before saving them to the DB, so I would get double-escaped values.
I made an edit to the format() method in the product class (which formats the data as required) so it escapes the URL string and that seems to work now with just the single line to add properties: $item->addChild($prop, $value, $ns);
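
For reference, a rough sketch of that kind of change. The real format() method isn't shown in this thread, so the helper name escape_ampersands() and the sample URL are made up for illustration, and the sketch assumes the URL arrives with bare & characters in its query string.

```php
<?php
// Hypothetical helper: turn bare ampersands into &amp; so that addChild()
// does not treat them as the start of an entity reference.
function escape_ampersands(string $value): string
{
    return str_replace('&', '&amp;', $value);
}

$ns = 'http://base.google.com/ns/1.0';
$xml = new SimpleXMLElement('<rss xmlns:g="' . $ns . '" version="2.0"><channel><item/></channel></rss>');
$item = $xml->channel->item;

// Placeholder URL with query variables, the kind of value that triggered the warning.
$url = 'http://www.example.com/index.php?route=product&product_id=42';

// Single call: name, pre-escaped value, namespace URI.
$item->addChild('link', escape_ampersands($url), $ns);

$out = $xml->asXML();
echo $out;
```

This should only be applied to values known to be unescaped. A string that already contains &amp; would come out double escaped, which is the problem with escaping every value from the DB.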

If you urlencode() the value, wouldn’t that fix it? It shouldn’t choke on %'s…

That should say unescaped.
It seems to work OK with the & escaped to &amp;
The products have all crawled now.

@SamA74
I have not delved into the details of this topic, but I’m curious to know whether it is possible to use PHP’s built-in XML to Array and Array to XML functions?

https://secure.php.net/manual/en/function.xml-parse-into-struct.php

It’s the other way around, I’m creating XML from data.
But I think this is solved now.

Yes, apparently it was a deliberate design decision by PHP that they would not automatically translate ampersands into their HTMLEntity form to prevent issues with already-converted strings.

This behavior was apparently already covered as a bug and dismissed by the documentation team.
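
To illustrate the difference, here is a small sketch that is not taken from the code earlier in the thread. The second argument of addChild() leaves a bare & alone (hence the warning), whereas assigning a value through SimpleXML's property or array syntax escapes it. The `$child[0] = ...` form is a commonly used idiom for setting the text of the element you already hold, and the sample value is a placeholder.

```php
<?php
$ns = 'http://base.google.com/ns/1.0';
$xml = new SimpleXMLElement('<rss xmlns:g="' . $ns . '" version="2.0"><channel><item/></channel></rss>');
$item = $xml->channel->item;

// Create the namespaced element empty, then assign its text through the
// object that addChild() returns. The assignment escapes the ampersand;
// passing the same string as addChild()'s second argument would not.
$child = $item->addChild('title', null, $ns);
$child[0] = 'Cables & Adapters';

$out = $xml->asXML();
echo $out;
```

Values that are already entity-escaped in the database would need decoding first (for example with htmlspecialchars_decode(), as in the original method), otherwise they would end up double escaped.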
