  1. #1
    ko pročita magarac :) boccio's Avatar
    Join Date
    Oct 2003
    Location
    belgrade
    Posts
    354

    Parsing XML - problem with escaping

    Hi all,

    I have a problem parsing some XML data. This is a snippet of the file I'm handling:
    Code:
    <Listing 
       rank="2"
       title="L'accès au DVD sans limite pour 1 EUR le premier mois" 
       description="Glowria.fr : la location de DVD sur Internet en toute simplicité. Plus de 4 000 titres livrés chez vous en 48 heures. L'accès au DVD sans limite pour 1 EUR le premier mois."
       siteHost="www.glowria.fr"
       biddedListing="true"
       adultRating="G">
    <ClickUrl type="body">http://www20.overture.com/d/sr/?xargs=05u3hs9yoaUE1uuzDD%2FAxommg0slFBxBoogD8ZbBu6rHJtNqr8B2TfzrVXpAsO3N48bx2KeWgFppJcW72Yt7O3ne4m%2BuXwOBwsQLBRlSOqEbDNcBa1YuzAOJdmRxISUG0EyhE4AjkeuWVZBnoV9wvNy8GvYgBTyuPJGfqV%2F8FQFLHJWVBR%2FyVvNr2fsDNVTbLlQqKcyfF1Y0BSDvMgIMGkZXzZ0JpjNGD7OxfDjGm3Le%2BnC0mzUSElCdiKAN1bOw8%2Fm85%2ByfC4l2WFwWz49r39Xevutd3XGY567FXyJcW9bk2n%2Blvdc3%2FBs0QfYj2n&yargs=www.glowria.fr</ClickUrl> 
    </Listing>
    The problem arises when I try to parse ClickUrl: my handler picks up only the last part of the element text, after the '&'. In this example, instead of the whole URL, http://www20.overture.com/d/sr/?xargs=05u3hs(...)0j2n&yargs=www.glowria.fr, it gets only the last part: &yargs=www.glowria.fr

    Is this happening because the '&' sign is unescaped? I pick up ClickUrl with the standard callback:
    PHP Code:
    xml_set_character_data_handler($xml_parser, "characterDataHandler");
    //
    //...
    function characterDataHandler ($parser, $data) {

        // (some code...)

        if ($state=="CLICKURL") {$userdata[$usercount]["ClickUrl"] = $data;}

        // (other code...)


    Any ideas?
    Vivvo CMS - Web publishing at your fingertips
    Mile voli disko, a ja belo kolumbijsko

  2. #2
    ********* wombat firepages's Avatar
    Join Date
    Jul 2000
    Location
    Perth Australia
    Posts
    1,717
    Have you tried str_replace-ing &amp; in the URL?

  3. #3
    ko pročita magarac :) boccio's Avatar
    Join Date
    Oct 2003
    Location
    belgrade
    Posts
    354
    Quote Originally Posted by firepages
    Have you tried str_replace-ing &amp; in the URL?
    You mean like this?
    if ($state=="CLICKURL") {$userdata[$usercount]["ClickUrl"] = str_replace("&amp;","&",$data);}

    It's not working... I did some analysis, and it seems the parser is splitting the element text, treating "&" as a delimiter... so instead of delivering one string, it delivers:
    - http://www20.overture.com/d/sr/?xargs=05u3hs(...)0j2n
    - &yargs=www.glowria.fr

    and I'm picking up only the last part... I could merge them, but that is quite cumbersome. Does anybody have an idea why this is happening? Should I post the complete code?
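A likely explanation, offered as an assumption rather than a diagnosis: the underlying Expat library documents that one contiguous run of text may be delivered through several calls to the character data handler, and an entity such as &amp; is a common place for the split. If so, assigning $data overwrites the earlier pieces, while appending keeps them. The sketch below shows the merge with a buffer that is reset when the element opens. The handler names startElement and endElement, the $clickUrl buffer, and the shortened URL are hypothetical stand-ins, and the sample assumes the real feed writes the ampersand as &amp;.

```php
<?php
// Hypothetical stand-in for the feed, with a shortened URL.
// The ampersand is written as &amp;, as well-formed XML requires.
$xml = '<Listing rank="2"><ClickUrl type="body">'
     . 'http://www20.overture.com/d/sr/?xargs=abc&amp;yargs=www.glowria.fr'
     . '</ClickUrl></Listing>';

$state    = '';   // name of the element currently open
$clickUrl = '';   // text collected so far for the current ClickUrl

function startElement($parser, $name, $attrs) {
    global $state, $clickUrl;
    $state = $name;             // case folding is on by default, so this is "CLICKURL"
    if ($name == 'CLICKURL') {
        $clickUrl = '';         // start a fresh buffer for each ClickUrl
    }
}

function endElement($parser, $name) {
    global $state;
    $state = '';                // no element of interest is open any more
}

function characterDataHandler($parser, $data) {
    global $state, $clickUrl;
    if ($state == 'CLICKURL') {
        $clickUrl .= $data;     // append each chunk instead of overwriting
    }
}

$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, 'startElement', 'endElement');
xml_set_character_data_handler($xml_parser, 'characterDataHandler');
xml_parse($xml_parser, $xml, true);

echo $clickUrl, "\n";   // http://www20.overture.com/d/sr/?xargs=abc&yargs=www.glowria.fr
```

With this shape the merge costs one extra character (`.=` instead of `=`), and the finished value can be copied into $userdata from the end-element handler.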

  4. #4
    ********* wombat firepages's Avatar
    Join Date
    Jul 2000
    Location
    Perth Australia
    Posts
    1,717
    Is this a typo? >> str_replace("&amp;","&",$data);
    It should be >> str_replace("&","&amp;",$data);
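The direction of the replacement only helps if it is applied to the raw XML before it reaches the parser; inside the character data handler the entities have already been decoded. A plain str_replace("&","&amp;",...) on the raw text also double-escapes ampersands that are already written as entities. The sketch below is one heuristic way around that; escapeBareAmpersands is a hypothetical helper and the sample URL is made up.

```php
<?php
// Hypothetical helper: escape only ampersands that do not already begin
// an entity or character reference (&name;, &#123;, &#x1F;).
function escapeBareAmpersands($xml) {
    return preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $xml
    );
}

// Made-up input with one bare ampersand and one that is already escaped.
$raw = '<ClickUrl>http://example.com/?a=1&b=2&amp;c=3</ClickUrl>';

echo str_replace('&', '&amp;', $raw), "\n";  // double-escapes: ...a=1&amp;b=2&amp;amp;c=3...
echo escapeBareAmpersands($raw), "\n";       // ...a=1&amp;b=2&amp;c=3...
```

This is a pattern-based guess, not a validator: a query string that happens to contain something shaped like an entity (for example `&b;`) would be left alone.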

  5. #5
    ☆★☆★ silver trophy vgarcia's Avatar
    Join Date
    Jan 2002
    Location
    in transition
    Posts
    21,235
    Where are you getting this XML from? If it's from a third party, let them know that it has to be well-formed in the first place for anyone to use it properly.
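One way to check whether a feed is well-formed before blaming the handler is to look at the return value of xml_parse and ask the parser for its error. The sketch below assumes PHP's xml extension is loaded; firstXmlError is a hypothetical helper and both sample strings are made up. The exact error wording depends on the underlying library, so none is shown.

```php
<?php
// Hypothetical helper: returns null when $xml parses cleanly,
// otherwise a short description of the first error the parser reports.
function firstXmlError($xml) {
    $parser = xml_parser_create();
    if (xml_parse($parser, $xml, true)) {
        return null;
    }
    return sprintf(
        '%s at line %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
}

// A bare "&" in element text is not well-formed; "&amp;" is.
$bad  = '<ClickUrl>http://example.com/?a=1&b=2</ClickUrl>';
$good = '<ClickUrl>http://example.com/?a=1&amp;b=2</ClickUrl>';

var_dump(firstXmlError($bad));
var_dump(firstXmlError($good));
```

If the real feed parses without an error, the ampersand is most likely already escaped in the source and only looks bare when the snippet is displayed.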

  6. #6
    ko pročita magarac :) boccio's Avatar
    Join Date
    Oct 2003
    Location
    belgrade
    Posts
    354
    Yes, a typo; it doesn't work...

    I'm getting the results from Overture, so I suppose the XML is well-formed. The problem is definitely that the "&" sign is treated as a delimiter; I have no idea why.

