  1. #1
    say, SitePoint Addict | Join Date: Sep 2003 | Location: At work | Posts: 371

    XML Parsing - &

    Hi,

    I currently have an XML document like so:

    PHP Code:
    <industries>
        <element value="">--- Select One ---</element>
        <element value="US">Advertising &amp; Media</element>
        <element value="AF">Finance/Accountancy</element>
    </industries>

    I have a PHP script that parses the XML data from the above file, but the problem is that text containing the special character &amp; is being truncated. For example, "Advertising &amp; Media" becomes "Media".

    My XML parsing script is like so:

    PHP Code:
    class XMLParser
    {
        function XMLParser( )
        {
            $this->stack   = array( );
            $this->element = array( );
            $this->parser  = xml_parser_create( );

            // Set XML parser to take the case of tags in to account
            xml_parser_set_option( $this->parser, XML_OPTION_CASE_FOLDING, false );

            xml_set_object( $this->parser, $this );
            xml_set_element_handler( $this->parser, 'tagOpen', 'tagClose' );
            xml_set_character_data_handler( $this->parser, 'cdata' );
        }

        function loadFromFile( $filepath )
        {
            if( !( $fp = @fopen( $filepath, 'r' ) ) )
            {
                $this->errno   = '';
                $this->errmsg  = "Invalid file: $filepath";
                $this->errline = __LINE__;
                return false;
            }
            else
            {
                while( $xmlData = fread( $fp, 4096 ) )
                {
                    if( !xml_parse( $this->parser, $xmlData, feof( $fp ) ) )
                    {
                        $this->errno   = xml_get_error_code( $this->parser );
                        $this->errmsg  = xml_error_string( $this->errno );
                        $this->errline = xml_get_current_line_number( $this->parser );
                        return false;
                    }
                }

                fclose($fp);
                xml_parser_free( $this->parser );
                return true;
            }
        }

        function tagOpen( $parser, $name, $attrs )
        {
            if( isset( $attrs['value'] ) && trim( $attrs['value'] ) != '' )
            {
                $this->current = $attrs['value'];
                $this->arrayList[$attrs['value']] = '';
            }
        }

        function tagClose( $parser, $name )
        {
            if( $name == 'element' )
            {
                $this->arrayList[$this->current] = $this->content;
            }
        }

        function cdata( $parser, $cdata )
        {
            $this->content = $cdata;
        }
    }

    I would really appreciate it if someone could help. Thanks.

  2. #2
    djdykes, SitePoint Evangelist | Join Date: Feb 2005 | Location: Chester, Cheshire | Posts: 565

    Have you tried using the SimpleXML module in PHP5?

    http://www.php.net/manual/en/ref.simplexml.php
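
For readers who can use PHP 5 or later, a minimal sketch of the SimpleXML approach might look like this. It assumes the XML from the first post is available as a string; the variable names ($xml, $list) are made up for illustration and are not from the thread.

```php
<?php
// Assumption: the document from the first post, held in a string.
$xml = '<industries>'
     . '<element value="">--- Select One ---</element>'
     . '<element value="US">Advertising &amp; Media</element>'
     . '<element value="AF">Finance/Accountancy</element>'
     . '</industries>';

$industries = simplexml_load_string($xml);
if ($industries === false) {
    die('Could not parse the XML');
}

// Build a value => label map, skipping the placeholder entry
// whose value attribute is empty.
$list = array();
foreach ($industries->element as $element) {
    $value = (string) $element['value'];
    if (trim($value) != '') {
        // Casting the element to string yields its full text,
        // with &amp; already decoded to "&".
        $list[$value] = (string) $element;
    }
}

print_r($list);
```

SimpleXML hands back the complete text of each element in one piece, so there is no per-chunk handler to get wrong.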

  3. #3
    say, SitePoint Addict | Join Date: Sep 2003 | Location: At work | Posts: 371

    Hi, thanks for your reply, but I'm using PHP4; PHP5 is not an option.

  4. #4
    boccio, "whoever reads this is a donkey :)" | Join Date: Oct 2003 | Location: belgrade | Posts: 354

    XML parsers usually split the text of an element into several chunks when they encounter an entity like &amp; ('&'), so the character data handler can be called more than once per element and each call overwrites the previous chunk. Try appending with .= instead of assigning with =.

    i.e.
    $this->arrayList[$this->current] .= $this->content;
    $this->content .= $cdata;

    Tell me if this worked.
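
To make the chunking visible, here is a small sketch that records every call to the character data handler and compares assigning with appending. The variable names ($chunks, $overwritten, $appended) are hypothetical, and the closures assume a current PHP version rather than the PHP4 setup from the first post. How many chunks arrive depends on the underlying parser library, so the sketch only relies on the chunks adding up to the full text.

```php
<?php
// Collect each piece of text the parser hands to the handler.
$chunks = array();

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_character_data_handler($parser, function ($parser, $cdata) use (&$chunks) {
    $chunks[] = $cdata;
});

$xml = '<element value="US">Advertising &amp; Media</element>';
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

$overwritten = ''; // what "=" leaves behind: only the last chunk
$appended    = ''; // what ".=" builds up: every chunk in order
foreach ($chunks as $chunk) {
    $overwritten = $chunk;
    $appended   .= $chunk;
}

var_dump($chunks);      // shows how the text was split
var_dump($overwritten);
var_dump($appended);
```

If the text was split at the entity, $overwritten holds just the tail of the string, which matches the "Media" symptom described in the first post.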

  5. #5
    say, SitePoint Addict | Join Date: Sep 2003 | Location: At work | Posts: 371

    Hi, thanks for the reply. I got the ampersand working fine now! But there is a new problem: my data now looks like this:

    PHP Code:
    --- Select One ---
    --- Select One ---Advertising &amp; Media
    --- Select One ---Advertising &amp; MediaFinance/Accountancy

    Each value seems to include all the previous ones. Any ideas?
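
One plausible cause, given the .= suggestion in post #4, is that the text buffer is appended to but never cleared, so each element inherits the text of the ones before it. A hedged sketch of one way around that: empty the buffer whenever a tag opens, append in the character data handler, and assign (not append) when the tag closes. The class name IndustryListParser and the parse() method are hypothetical, and the array callbacks assume a current PHP version; under PHP4 the xml_set_object() setup from the first post would take their place.

```php
<?php
// Sketch only: names are illustrative, not the original poster's code.
class IndustryListParser
{
    public $arrayList = array();
    public $current   = null;
    public $content   = '';

    public function parse($xml)
    {
        $parser = xml_parser_create();
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
        xml_set_element_handler($parser, array($this, 'tagOpen'), array($this, 'tagClose'));
        xml_set_character_data_handler($parser, array($this, 'cdata'));
        return (bool) xml_parse($parser, $xml, true);
    }

    public function tagOpen($parser, $name, $attrs)
    {
        // Start every element with an empty buffer so that text from
        // earlier elements cannot leak into this one.
        $this->content = '';
        $hasValue = isset($attrs['value']) && trim($attrs['value']) != '';
        $this->current = $hasValue ? $attrs['value'] : null;
    }

    public function tagClose($parser, $name)
    {
        // Assign once, after all chunks for this element have arrived.
        if ($name == 'element' && $this->current !== null) {
            $this->arrayList[$this->current] = $this->content;
        }
    }

    public function cdata($parser, $cdata)
    {
        // May be called several times per element; keep appending.
        $this->content .= $cdata;
    }
}

$industryParser = new IndustryListParser();
$ok = $industryParser->parse(
    '<industries>'
    . '<element value="">--- Select One ---</element>'
    . '<element value="US">Advertising &amp; Media</element>'
    . '<element value="AF">Finance/Accountancy</element>'
    . '</industries>'
);
print_r($industryParser->arrayList);
```

Resetting in tagOpen rather than tagClose also discards the whitespace between elements, which would otherwise end up at the front of the next value.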

  6. #6
    say, SitePoint Addict | Join Date: Sep 2003 | Location: At work | Posts: 371

    bump

