simpleXML no longer reliably parses feed

Ok I have found the problem. One of the properties has no characteristics just a closing tag <characteristics/> It would seem that it is not required to have this information in the feed contents.

I now need to find a way to skip over any tags that do not contain data. Oh happy days, dont you just love working with something you have very little control over.

From the docs, it looks like you can use the isEmptyElement property to check for empty XML elements. Here’s a section from the add_characteristics function with the check added in:


foreach($property->characteristics->category as $category):
    if(!$category->isEmptyElement): // Added check for empty element
        $cat = (string)$category->name->uk;
        $sqlxml = "('".$propertyid."','".$cat."',";

        foreach($category->value as $value):
            $uid = "$propertyid-$number";
            $val = (string)$value->uk;

            $sql .= $sqlxml."'".$val."','".$uid."') , ";
            $number++;
        endforeach;
    endif;  // Close of new if statement
endforeach;

Good day!!

I did actually go down this route but it does not work because the element is not empty is just not there. The feed does not have an opening tag, just a closing tag, for any empty elements.

Lousy design but it is what I have to work with…

Hi Colin, how’s it going?

I’ve just realised that we can’t use the isEmptyElement check because we’re not working with XMLReader at that point - $property is a SimpleXML object. I’ve just done a quick test on my machine, and this should work:


foreach($property->characteristics->category as $category):
    if($category):
        $cat = (string)$category->name->uk;
        $sqlxml = "('".$propertyid."','".$cat."',";

        foreach($category->value as $value):
            $uid = "$propertyid-$number";
            $val = (string)$value->uk;

            $sql .= $sqlxml."'".$val."','".$uid."') , ";
            $number++;
        endforeach;
    endif;
endforeach;

Hi It is not going so well.

Yep tried that this morning, no good. I have tried isset also, still no good! I have just noticed that the feed does not contain an endtag it has <characteristics/> not </characteristics>

Sorry bud, my fault again… I misread what you said earlier and thought we were dealing with empty <category/> tags. So, try this instead:


if ($property->characteristics):
    foreach($property->characteristics->category as $category): 
        $cat = (string)$category->name->uk;   
        $sqlxml = "('".$propertyid."','".$cat."',";   

        foreach($category->value as $value):   
            $uid = "$propertyid-$number";    
            $val = (string)$value->uk;   
               
            $sql .= $sqlxml."'".$val."','".$uid."') , ";   
            $number++;   
        endforeach;   
        
    endforeach;
endif; 

If I take out the line ON Duplicate etc I get the following: There was an error adding characteristics [Duplicate entry ‘2208-1’ for key ‘uid’]

Are you replacing the entire function with just the code I posted? I was just showing the relevant section of the code… it still needs all the rest of the code:


function add_characteristics($db, $property){
    if ($property-&gt;characteristics):
        $sql = "INSERT INTO feed_characteristics (`propertyref`, `category`, `value`, `uid` ) VALUES ";

        $propertyid = (string)$property-&gt;id;
        $number = 1;

        foreach($property-&gt;characteristics-&gt;category as $category):
            $cat = (string)$category-&gt;name-&gt;uk;
            $sqlxml = "('".$propertyid."','".$cat."',";

            foreach($category-&gt;value as $value):
                $uid = "$propertyid-$number";
                $val = (string)$value-&gt;uk;

                $sql .= $sqlxml."'".$val."','".$uid."') , ";
                $number++;
            endforeach;

        endforeach;

        $sql .= rtrim($sql, ' , ') . ' ON DUPLICATE KEY UPDATE category = VALUES (category), value = VALUES (value)';

        return $db-&gt;query($sql);
    endif;

    return TRUE;
}

Note that I’ve moved the position of the DB query here, as there’s no point executing the query with no values.

[h=1]Error[/h]There was an error adding characteristics [You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘INSERT INTO feed_characteristics (propertyref, category, value, uid ) VA’ at line 1]

It still chokes on that property without any characteristics I cannot understand why?

I’m unable to replicate this on my machine - I have a very simple test set up as follows:


$xml = "
<properties>
    <property>
        <characteristics>
            <category><name>category1</name></category>
            <category><name>category2</name></category>
        </characteristics>
    </property>
    <property>
        <characteristics/>
    </property>
    <property>
        <characteristics>
            <category><name>category5</name></category>
            <category><name>category6</name></category>
        </characteristics>
    </property>
</properties>";

$properties = new SimpleXMLElement($xml);

// This represents the main loop
foreach($properties as $property)
{
    // This represents the add_characteristics function
    if ($property->characteristics) {
        foreach ($property->characteristics->category as $category)
        {
            echo $category->name . "<br>";
        }
    }
}

The empty <characteristics/> element doesn’t give me any trouble at all here. Are you able to share the entire XML section for the property that causes the problem?

<?xml version="1.0" encoding="UTF-8"?>
<root>
<property>
      <id>39279</id>
      <status_date>2012-07-24 13:22:05</status_date>
      <status>Available</status>
      <listed_date>2007-10-20 13:25:10</listed_date>
      <last_updated>2012-07-24 13:22:05</last_updated>
      <ref>1027</ref>
      <price>595000</price>
      <currency>EUR</currency>
      <type>
        <uk>Townhouse</uk>
        <es>Unifamiliar</es>
      </type>
      <subtype>
        <uk>Terraced</uk>
        <es>Adosada</es>
      </subtype>
      <country>Spain</country>
      <province>Málaga</province>
      <town>Marbella</town>
      <area>
      Costa del Sol
      </area>
      <beds>3</beds>
      <baths>2</baths>
      <levels>4</levels>
      <surface_area>
        <built>167</built>
        <terrace>110</terrace>
        <plot>0</plot>
      </surface_area>
      <energy_rating>
        <value>0</value>
        <letter/>
      </energy_rating>
      <own_property>0</own_property>
      <has_pool>1</has_pool>
      <has_garden>1</has_garden>
      <has_garage>1</has_garage>
      
<!-- THIS PROPERTY HAS NO CHARACTERISTICS -->
      <characteristics/>
      
      <description>
        <uk>Oasis de Nagüeles. </uk>
        <es>Oasis de Nagüeles.</es>
      </description>
      <images>
        <image id="1">
          <url>http://Image.asp</url>
        </image>
      </images>
    </property>
    
</root>

Abridged version without characteristics

<characteristics>

      <category id="1">
        <name>
          <uk>Setting</uk>
          <es>Posición</es>
        </name>
        <value id="1">
          <uk>Close To Sweet Shop</uk>
          <es>Cerca de Golf</es>
        </value>
      </category>

      <category id="2">
        <name>
          <uk>Orientation</uk>
          <es>Orientación</es>
        </name>
        <value id="1">
          <uk>South West</uk>
          <es>Suroeste</es>
        </value>
      </category>

</characteristics>

Characteristics

Thanks for sharing that Colin, it made it a lot easier to figure out what’s going on. It seems that using an empty() check works, so try this:


function add_characteristics($db, $property){
    if ( !empty($property-&gt;characteristics) ):
        $sql = "INSERT INTO feed_characteristics (`propertyref`, `category`, `value`, `uid` ) VALUES ";

        $propertyid = (string)$property-&gt;id;
        $number = 1;

        foreach($property-&gt;characteristics-&gt;category as $category):
            $cat = (string)$category-&gt;name-&gt;uk;
            $sqlxml = "('".$propertyid."','".$cat."',";

            foreach($category-&gt;value as $value):
                $uid = "$propertyid-$number";
                $val = (string)$value-&gt;uk;

                $sql .= $sqlxml."'".$val."','".$uid."') , ";
                $number++;
            endforeach;

        endforeach;

        $sql .= rtrim($sql, ' , ') . ' ON DUPLICATE KEY UPDATE category = VALUES (category), value = VALUES (value)';

        return $db-&gt;query($sql);
    endif;

    return TRUE;
}

YES! I believe we have a result.

That has done the trick, I had to remove the . from this though

 $sql .= rtrim($sql, ' , ')

else it threw an insert error. Did this work for you with the . in place?

I have also added this to images in case there is a property without images.

function add_images($db, $property) {
    if ( !empty($property-&gt;images) ): 
    $sql = "INSERT INTO feed_images (`propertyref`, `url`, `prime`, `pid`) VALUES ";
      
       
    $property_id = (string)$property-&gt;id;
    foreach ($property-&gt;images-&gt;image as $image):
    
        
        $url     = (string)$image-&gt;url;
        $prime   = (int)$image-&gt;primary;
        $imageid = (int)$image['id'];
        $pid     = "$property_id-$imageid";
        
        $sql .= "('".$property_id." ', '".$url."', '".$prime."', '".$pid."') , ";
        
    endforeach;
    
    $sql = rtrim($sql, ' , ') .  ' ON DUPLICATE KEY UPDATE url = VALUES (url), prime = VALUES (prime)';
    
    

    return $db-&gt;query($sql);
    
     endif;   
      
    return TRUE; 
}

Thanks so much for your help it is most appreciated. I will put it on the live server and check it out later.

Colin

That’s great news

I didn’t test the SQL parts, just SimpleXML and the loops, but I realise now why it should be without a dot: I was being monumentally stupid and not thinking about how rtrim works!

No worries at all bud - let me know how it performs in production, as I’d be interested to know :slight_smile:

Aarrgh! Now I have put it on the prod server the delete seems to be playing up it is not deleting the Off Market property from the feed_property table only.

Not a problem locally just on the Prod version. Now how can that be???

If you’re testing with just a subset of the data on your local machine, were there actually properties to be deleted so you could verify the function was working?
One last thing before I head off for the night - it occurs to me that you can simplify your delete function and remove all the off market properties with just one query:


function remove_property($db) {
    $sql = "DELETE `feed_property`.*, `feed_images`.*, `feed_characteristics`.*"
        . "FROM `feed_property`"
        . "INNER JOIN `feed_characteristics` ON feed_characteristics.propertyref = feed_property.propertyref"
        . "INNER JOIN `feed_images` ON feed_images.propertyref = feed_property.propertyref"
        . "WHERE feed_property.`status` = 'Sold' OR feed_property.`status` = 'Off Market'";

    return $db-&gt;query($sql);
}