  1. #1
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60

    Problem with XML in CDATA tag (swf related)

    Hello all.
    I did a search but my problem is kind of complicated to explain so I'm not sure I used the correct keywords.
    So if there is a thread related to my - following - problem please, point me to that.

    So here it goes.
    I've been working (as in: starting to learn how to work) with XML documents, and I have a problem with the text inside a CDATA section. I'm using Dreamweaver, by the way.
    This XML file is loaded by a .swf file, since the whole web page is in Flash, and the text appears.
    The problem is that when I press Enter inside the CDATA section to start a new line, the web page shows it as if I had inserted two <br> tags, which ruins the layout. (I'm using <br> metaphorically to describe how it looks; I'm not inserting anything like that.)

    So do you guys have any idea what might cause the problem? Any hints?

    I hope I explained things correctly. If not, please tell me so and I'll try to rephrase it.

    Thank you in advance

  2. #2
    SitePoint Member
    Join Date
    Nov 2006
    Posts
    14
    Have you checked the html source to see what is actually being inserted?

  3. #3
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    Note the format of CDATA is:

    <![CDATA[... content here ...]]>

    CDATA lets you write any valid Unicode characters literally. The parser does not interpret the contents of a CDATA section as markup, which makes it a good fit for content containing markup that should be taken literally rather than processed as part of the document. You should not use CDATA for binary data, however, since there is no guarantee that the data will not contain the characters ]]>, which mark the end of a CDATA section (see above). Binary data should instead be encoded in a format such as Base64.
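    The two cases above can be sketched in PHP with the DOM extension. This is a minimal illustration, not code from the project in question: the element names and the sample byte string are made up for the example. Markup-like text goes into a CDATA section, while the binary payload is stored as Base64 text so that it can never contain the ]]> terminator.

```php
<?php
// Hypothetical payload: raw bytes that happen to contain the CDATA terminator.
$binary = "\x00\x01]]>\xff";

$dom = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('root'));

// Markup-like text is safe inside CDATA as long as it does not contain "]]>".
$markup = $root->appendChild($dom->createElement('markup'));
$markup->appendChild($dom->createCDATASection('<b>bold</b> & more'));

// Binary data is stored as Base64 text instead of being placed in CDATA.
$payload = $root->appendChild($dom->createElement('payload'));
$payload->appendChild($dom->createTextNode(base64_encode($binary)));

$xml = $dom->saveXML();
echo $xml;

// Round trip: parse the document again and decode the payload.
$check = new DOMDocument();
$check->loadXML($xml);
$decoded = base64_decode($check->getElementsByTagName('payload')->item(0)->textContent);
```

    After the round trip, $decoded holds the same bytes as $binary, including the ]]> sequence that would have ended a CDATA section early.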

    If that does not solve your problem, please paste in your code and explain the problem in more detail.

  4. #4
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60
    Hey guys, thanks for the replies, and sorry for my delayed one.

    kgun, you talk about binary data, but I'm not using binary data. As you say, I'm using valid Unicode characters. If I understand correctly, you are saying that the text should appear the way I actually type and format it inside the CDATA section? It normally does, but not when I press Enter. To give another example: to finish one sentence and start a new one directly below it, I would have to press the spacebar until the text wraps onto the next line, so that no gap appears between the lines.

    <![CDATA[text text text (I press enter here to change the line)
    text text text
    text text text]]>



    it will appear like:

    text text text

    text text text

    text text text

    and not:

    text text text
    text text text
    text text text

    AppSol, "view source" won't show anything helpful, since the whole website is in Flash.

  5. #5
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    Quote: Originally posted by morgy

    <![CDATA[text text text (I press enter here to change the line)
    text text text
    text text text]]>



    it will appear like:

    text text text

    text text text

    text text text

    and not:

    text text text
    text text text
    text text text
    I have never tried this, but I know that in software like Word there is a difference between a hard line break (Enter) and a soft one (Shift + Enter).
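    One thing worth ruling out, although nobody in this thread confirmed it, is the line-ending style of the saved file: an editor on Windows typically stores each Enter as a carriage return plus a line feed (CR+LF), and a consumer that treats each of the two characters as its own break would show a blank line between lines. The sketch below is only an assumption about the cause. It uses a made-up sample string to count the carriage returns in the raw XML text and to normalize every line ending to a single LF.

```php
<?php
// Assumption: the file was saved with Windows line endings (CR+LF).
$raw = "<text><![CDATA[text text text\r\ntext text text\r\ntext text text]]></text>";

// Count carriage returns to see what pressing Enter actually stored.
$crCount = substr_count($raw, "\r");

// Convert CR+LF pairs first, then any lone CR, to a single LF.
$normalized = str_replace(["\r\n", "\r"], "\n", $raw);

echo "carriage returns before: ", $crCount, "\n";
echo "carriage returns after: ", substr_count($normalized, "\r"), "\n";
```

    If the count is zero for the real file, line endings are not the culprit and the cause has to be looked for elsewhere, for example in how the Flash movie renders the text.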

  6. #6
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60
    Yes, I know, and I've tried it, but unfortunately that doesn't work in Dreamweaver.
    I've also tried typing the text in Word with soft line breaks and copying it from there into DW, but still nothing.
    It's OK kgun, you've been really nice, but I guess the problem goes deeper.
    My fear is that there may be some variable or specific setting in the .fla template of the web page (or whatever; I'm not familiar with Flash) that ruins the layout.

  7. #7
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    Then try to contact Robert Richards, author of the book: "Pro PHP XML and Web Services." He is an expert.

    Email: rrichards@php.net

    Post the solution here if you find it.

    My last thought: could it be a version or configuration problem?

    Quote: Originally posted by morgy
    My fear is that there may be some variable or specific setting in the .fla template of the web page (or whatever; I'm not familiar with Flash) that ruins the layout.

    Flash is not my strong suit either. You may be correct.

  8. #8
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    Quote: Originally posted by morgy
    This XML file is loaded by a .swf file, since the whole web page is in Flash, and the text appears.
    The problem is that when I press Enter inside the CDATA section to start a new line, the web page shows it as if I had inserted two <br> tags, which ruins the layout. (I'm using <br> metaphorically to describe how it looks; I'm not inserting anything like that.)

    I thought of this later this afternoon:

    I do not know how much you know about the XML family, but note that without XPath, XLink, XPointer, XInclude, XSLT and so on, you are not using XML effectively. Many problems could be avoided if people knew the XML family better.

    Example:
    By using XSL(T) (XML styling) it is fairly easy to convert an XML document to the most popular formats, such as PDF and (X)HTML, and to other XML documents such as RSS.

    If you are also parsing XML documents with a scripting language like PHP, you should be aware of the libxml library, which I have written about in another post in this forum. You should use the latest versions of PHP and libxml, and you should write code that is easy to upgrade to newer versions, since so much is happening around namespaces in XML (it is extremely important to understand and use them correctly) and in future versions of PHP, which may reduce a lot of library collisions and name clashes in PHP.

    Namespaces for different URIs are where XML becomes really efficient, although Robert Richards lists the following important points to remember about namespaces in XML:
    • If you don't need namespaces, don't use them.
    • If you have the choice, use QNames rather than default namespaces.
    • Attributes are not bound to default namespaces.
    • DTDs and namespaces are not all that compatible and can lead to invalid documents.


    My bolding, in my view very important if you need a forward-compatible document.

    As a minimum, I can recommend the above book. There is also a good SitePoint book that shows how to build a CMS in XML.

    Note the following constant, which is used in parser options in the latest version of libxml.

    LIBXML_NOCDATA: Merges CDATA nodes into text nodes. A document using CDATA sections will be created with no CDATA nodes, because these are converted into plain-text nodes. This flag is useful when loading a document to be used for an XSL transformation.
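    A minimal sketch of what this flag does, using PHP's DOM extension and a made-up sample document: the same XML is parsed twice, once with default options and once with LIBXML_NOCDATA, and the type of the node holding the text is compared.

```php
<?php
$xml = "<root><description><![CDATA[first line\nsecond line]]></description></root>";

// Default parse: the CDATA section stays a CDATA node in the tree.
$plain = new DOMDocument();
$plain->loadXML($xml);
$before = $plain->documentElement->firstChild->firstChild->nodeType;

// With LIBXML_NOCDATA the same content arrives as an ordinary text node.
$merged = new DOMDocument();
$merged->loadXML($xml, LIBXML_NOCDATA);
$node = $merged->documentElement->firstChild->firstChild;
$after = $node->nodeType;

var_dump($before === XML_CDATA_SECTION_NODE);
var_dump($after === XML_TEXT_NODE);
echo $node->nodeValue, "\n";
```

    Note that the flag only changes the node type. The characters inside the section, line breaks included, are carried over unchanged.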

    Question: if you use PHP and libxml to parse documents, is it possible to work around the problem by converting your problematic document to another XML document and then converting it back to the format you want using XSLT?

    If you want more literature, go to Amazon and search for books with the following keywords:

    XML family

    XSLT

    XPath

    XQuery

    XLink

    XMLSchema

    XML

    etc.

    You have just started on your XML journey. This man from Holland knows how to use XML.

    Python and XML Processing for those that use this OO language.

    Look on my own site DigitalStart.net and KjellBleivik.com for additional information.

    By using the correct technology, you can filter an XML document for nearly everything, such as whitespace, and then transform it to the format you desire. By using an XML Schema you can, for example, preserve, replace or collapse whitespace.

    Example from RRichards page 82:

    <xsd:element name="description">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:whiteSpace value="collapse" />
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>

    I leave it to you to figure out what the XML document and its description element look like.
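    To make the "collapse" rule concrete, here is a small PHP sketch that applies it by hand to a string. The helper name collapseWhiteSpace and the sample value are invented for this illustration, and the sketch does not validate anything against a schema. It only mirrors the two steps the facet describes: tabs, line feeds and carriage returns become spaces, then runs of spaces are squeezed to one and the ends are trimmed.

```php
<?php
// Hypothetical helper that mimics xsd:whiteSpace value="collapse" on a string.
function collapseWhiteSpace(string $value): string
{
    // Step 1 ("replace"): tabs, line feeds and carriage returns become spaces.
    $replaced = strtr($value, "\t\n\r", "   ");
    // Step 2 ("collapse"): squeeze runs of spaces, then trim both ends.
    return trim(preg_replace('/ +/', ' ', $replaced), ' ');
}

// Made-up sample: leading spaces, a blank line in the middle, a trailing tab.
$description = "  text text text\r\n\r\n   text text text\t";
$collapsed = collapseWhiteSpace($description);
echo $collapsed, "\n";
```

    Collapsing removes every line break, so it suits single-line values such as titles better than text in which line breaks are meant to survive.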

    Note: a URI (Uniform Resource Identifier) is much more than a URL and will, in my view, be extremely important in Web 2.0 applications and XLink.

    I had a fantastic teacher from Greece on my course in "Nonlinear, chaos and fractal mathematics" at the graduate level at the University of Oslo. I hope that I have not confused you more. If you study what I have written here, maybe the XML family of technologies will be less chaotic to you :-)

    I strongly recommend that you start using libxml if you use PHP, since there is no need to reinvent the wheel.
    Last edited by kgun; Jul 11, 2007 at 05:28.

  9. #9
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60
    Wow kgun!!! .
    Thank you very much for all the information.
    I will definitely look through all the things you recommend, and yes, a book is always a good thing.

    Well, let me see what I can do with all the tips you gave me. But it's too much information right now and I need to take one step at a time, hehe.
    Of course I will post back with - hopefully - the solution to my problem

    Again, thank you so much

    /me has lots of reading

  10. #10
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    No problem.

    Note that by using the Document Object Model and its classes you have full control over a document. Under DOM, a document is loaded into memory as a tree of nodes that can be manipulated: nodes can be moved, deleted, and new ones inserted into the tree. Whitespace such as line feeds and tabs can be removed when loading the document and reintroduced when saving by setting formatOutput = true. You should use the latest version of PHP to do this, since it has the best options and functionality. By default, insignificant whitespace such as line feeds and tabs is loaded into the tree along with everything else. Let us say that your document is my.xml. Then you can load that file into the DOM tree (memory) and delete insignificant whitespace at the same time, like this:

    $dom = new DOMDocument();
    $dom->load('my.xml', LIBXML_NOBLANKS);

    assuming that the file is located in the same directory as the script.

    The second argument to load() removes all insignificant whitespace.

    Now you have complete control over the document.
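    A short sketch of the load-and-reformat cycle, using an inline sample document instead of my.xml so that it runs on its own (the element names are invented for the example). It also shows a limit that matters for the original question: as far as the parser is concerned, whitespace inside a CDATA section is content, so LIBXML_NOBLANKS leaves it alone.

```php
<?php
$xml = "<root>\n  <item>one</item>\n  <item><![CDATA[two\n\nlines]]></item>\n</root>";

$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOBLANKS);

// Only the two <item> elements remain; the indentation-only text nodes are gone.
$childCount = $dom->documentElement->childNodes->length;

// Whitespace inside the CDATA section is content and is left untouched.
$cdataText = $dom->getElementsByTagName('item')->item(1)->textContent;

// Re-indent the markup when serializing.
$dom->formatOutput = true;
echo $dom->saveXML();
```

    So this option tidies the markup around the text but would not, on its own, change the line breaks typed inside the CDATA section.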

    Other classes from the DOM extension that may be of interest for solving your problem are

    DOMCharacterData, DOMCDATASection.

    You must be aware of encoding, as I have mentioned above. The tree is stored internally as UTF-8, so you must encode and decode data properly when manipulating the tree. It is also important to be aware of the parser options, such as LIBXML_NOBLANKS used above and LIBXML_NOCDATA. As far as I know, the DOM API currently offers the most functionality.

    There is a simpler API, SimpleXML, that is more XML specific. Together, this API and the Document Object Model (DOM) give you complete control over an (X)HTML and/or XML document. It is up to your imagination how you manipulate the document(s). Yes, you can manipulate multiple documents at the same time.
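    As a rough illustration of using the two APIs together, the sketch below reads a CDATA section with SimpleXML and then views the same node through DOM. The sample document and its element names are assumptions made for the example; dom_import_simplexml() and simplexml_import_dom() are the standard PHP functions for crossing between the two.

```php
<?php
$xml = "<page><body><![CDATA[text text text\ntext text text]]></body></page>";

// SimpleXML: quick read access; casting to string yields the CDATA text.
$sxe = new SimpleXMLElement($xml);
$text = (string) $sxe->body;

// The same element viewed through DOM for node-level detail.
$domBody = dom_import_simplexml($sxe->body);
$isCdata = $domBody->firstChild->nodeType === XML_CDATA_SECTION_NODE;

// And back again: wrap the DOM document as a SimpleXML element.
$again = simplexml_import_dom($domBody->ownerDocument);

echo $again->getName(), "\n";
echo $text, "\n";
```

    Both views share one underlying document, so a change made through one API is visible through the other.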

    I recommend the following books, in order of reading:

    Stuart Langridge: "DHTML Utopia: Modern Web Design Using JavaScript & DOM"

    Thomas Myer: "No Nonsense XML Web Development With PHP"

    Be sure that you get the latest editions.

    When you have mastered these, I would recommend Robert Richards's "Pro PHP XML and Web Services", mentioned above.

  11. #11
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60
    Well, kgun, you have been more than helpful.
    However, I can't handle all of this at the same time;
    it's too much information for little me.
    However I'll note the "complete control over the document"
    That's what I need.

    Unfortunately, I'm not actually creating the code here, it's already been done and I have to make some alterations.
    So I need to understand what the previous coder did, I need to get into his/her mind, and try to solve this particular problem.
    - still working on it though, haven't figured it out yet -

    However, again, I can't thank you enough for all these resources

    oh and btw you said
    I hope that I have not confused you more
    hehe, of course you did, but it's ok, I think that's the point.
    getting confused and trying to figure things out.

  12. #12
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    I will show how easy it is to create a CDATA section with the DOM API.

    Method 1. Indirectly, by first instantiating DOMDocument.

    $dom = new DOMDocument; // Creates an instance of the DOMDocument class.

    $cdata = $dom->createCDATASection("<html> ... tagging here ... </html>");

    Method 2. Directly, by creating an instance of DOMCDATASection.

    $cdata = new DOMCDATASection("<html> ... tagging here ... </html>");

    Result:

    <![CDATA[<html> ... tagging here ... </html>]]>

    Then you have to insert this $cdata node (variable) into the DOM tree by using the methods that come with the API's classes.
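    For the insertion step, a minimal sketch along the lines of Method 1 could look like this. The element names page and content are placeholders chosen for the example. The CDATA node is attached with appendChild(), and the subtree is then serialized to show where it ended up.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');

// Build a small tree: <page><content>...</content></page>
$root = $dom->appendChild($dom->createElement('page'));
$content = $root->appendChild($dom->createElement('content'));

// Create the CDATA section and attach it to the <content> element.
$cdata = $dom->createCDATASection('<html> ... tagging here ... </html>');
$content->appendChild($cdata);

// Serialize only the root element (no XML declaration).
$xml = $dom->saveXML($root);
echo $xml, "\n";
```

    Passing a node to saveXML() serializes just that subtree, which is handy for checking a fragment without the XML declaration.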

    If you use the much simpler SimpleXML API, here is a soft start:

    PHP Code:
    <?php
    print 'PHP Version on this web server is: ' . PHP_VERSION . '<br /> <br />';
    // Parsing starts here
    $tagsXML = "<root><node1>This is parsed in the same way on the latest versions of FireFox, IE and Opera</node1></root>";
    $sxe = new SimpleXMLElement($tagsXML);
    print $sxe->asXML();
    print '<br /> <br />';
    print 'You need permission to save a file to the disk';
    print '<br />';
    $sxe->asXML('morgy.xml');  // No permission to save to file
    ?>
    View result

    http://www.kjellbleivik.com/SpHelp/morgy1

    on my site, where I have no permission to save a file on the shared server, so this error was generated on July 15, 2007:

    Warning: SimpleXMLElement::asXML(morgy.xml) [function.SimpleXMLElement-asXML]: failed to open stream: Permission denied in /usr/home/web/wno134614/SpHelp/morgy1.php on line 8

    "Because you have yet to learn about navigating the tree, I will not explain how to call this method using a node from the tree from now."

    Robert Richards page 242.

    This will be explained in my next post, when I have written the code. If anybody is reading this thread, it will be done after I have had dinner.

    For now, if you are impatient, you can start here:
    http://www.kjellbleivik.com/Books/ProPHP/content.htm

    Click on listing7-4.php and view the result.

    Here is the code for listing7-4.php

    PHP Code:
    <html>
    <body>
    <?php
    /* BEGINNING OF USER VARIABLES */
    /* Location of PAD Specification File */
    $padspec = "http://www.padspec.org/pad_spec.xml";

    /* Location of PAD Template Generated by DOM */
    $padtemplate = "padtemplate.xml";

    /* Name of PAD File to Save Results to */
    $savefile = "padout.xml";
    /* END OF USER VARIABLES */

    /* Output field name/values for input and preview based on state of $bPreview */
    function printDisplay($sxe, $sxetemplate, $bPreview) {
       $section = "";
       /* Loop through the Field nodes of the specification */
       foreach ($sxe->Fields->Field as $field) {
          /* Get the node path used in the template */
          $arPath = explode("/", trim($field->Path));
          array_shift($arPath);
          /* Skip MASTER_PAD_VERSION_INFO nodes.
             Values for these are set by template generator */
          if ($arPath[0] != "MASTER_PAD_VERSION_INFO") {
             if ($arPath[0] != $section) {
                $section = $arPath[0];
                print "<p>".str_replace("_", " ", $section)."</p>";
             }
             $input_value = getStoredValue($sxetemplate, $arPath);
             array_shift($arPath);
             print "\n".$field->Title.': ';
             if ($bPreview) {
                print $input_value."<br>";
             } else {
                $input_name = $section;
                /* Generate the field name using name-based keys for an array */
                foreach ($arPath AS $key=>$value) {
                   $input_name .= "[$value]";
                }
                print '<input type="text" name="'.$input_name.
                      '" value="'.$input_value.'"><br>';
             }
          }
       }
    }

    /* Retrieve text content for node from working template */
    function getStoredValue($sxe, $arPath) {
       if ($sxe) {
          /* Loop through node path to find SimpleXML element from working template */
          foreach ($arPath AS $key=>$value) {
             $sxe = $sxe->$value;
          }
          return (string)$sxe;
       }
       return "";
    }

    /* Set the text content for a node from working template */
    function setValue($sxe, $field, $value) {
       if (is_array($value)) {
          /* Loop through node path to find SimpleXML element from working template */
          foreach ($value AS $fieldname=>$fieldvalue) {
             setValue($sxe->$field, $fieldname, $fieldvalue);
          }
       } else {
          /* Encode the value to ensure content will be valid XML */
          $sxe->$field = htmlentities($value);
       }
    }

    /* Validate fields in working template using the RegEx defined in specification */
    function validatePAD($spec, $template) {
       $arRet = array();
       foreach ($spec->Fields->Field as $field) {
          $arPath = explode("/", trim($field->Path));
          array_shift($arPath);
          if ($arPath[0] != "MASTER_PAD_VERSION_INFO") {
             $sxe = $template;
             $regex = "/".trim($field->RegEx)."/";
             foreach ($arPath AS $key=>$value) {
                $sxe = $sxe->$value;
                if (! $sxe) {
                   break;
                }
             }
             if ($sxe) {
                $value = (string)$sxe;
                if (! preg_match($regex, $value)) {
                   /* Capture fields failing validation for later display */
                   $arRet[] = array($field->Title, $field->RegExDocumentation);
                }
             }
          }
       }
       /* Return array containing any captured errors */
       return $arRet;
    }

    /* Initial states for application variables */
    $sxetemplate = NULL;
    $bPreview = FALSE;
    $bError = FALSE;
    $bSave = FALSE;

    /* BEGIN ACTUAL PROCESSING */
    if ($sxe = simplexml_load_file($padspec)) {
       if (isset($_POST['Save']) || isset($_POST['Preview']) || isset($_POST['Edit'])) {
          /* Working template in hidden field is Base64 encoded and must be decoded */
          $sxetemplate = new SimpleXMLElement(base64_decode($_POST['ptemplate']));
          /* Loop through $_POST vars. vars that are arrays are PAD fields to be set */
          foreach ($_POST AS $name=>$value) {
             if (is_array($value)) {
                setValue($sxetemplate, $name, $value);
             }
          }
          if (isset($_POST['Save'])) {
             /* Save finalized working template to file */
             $sxetemplate->asXML($savefile);
             $bSave = TRUE;
          } elseif (isset($_POST['Preview'])) {
             /* Validate the working template */
             $arRet = validatePAD($sxe, $sxetemplate);
             if (count($arRet) > 0) {
                $bError = TRUE;
                print "<B>ERRORS FOUND</B><br>";
                /* Print out errors returned from validatePAD() */
                foreach ($arRet AS $key=>$value) {
                   print $value[0].": ".$value[1]."<br>";
                }
             } else {
                /* Working template was validated so allow data to be previewed */
                $bPreview = TRUE;
             }
          }
       } else {
          /* Initial entry point so load the PAD template created from DOM */
          $sxetemplate = simplexml_load_file($padtemplate);
       }
       /* If in working state display the working template for editing or preview */
       if (! $bSave) {
          print '<form method="POST">';
          /* Base64-encoded working template to allow XML to be passed
             in hidden field */
          print '<input type="hidden" name="ptemplate" value="'.
                base64_encode($sxetemplate->asXML()).'">';
          printDisplay($sxe, $sxetemplate, $bPreview);
          print '<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'.
                '<input type="Submit" name="Preview" value="Preview and Validate PAD">';
          if (! $bError && isset($_POST['Preview'])) {
             /* Working template is valid and in preview mode.
                Allow additional editing or final Save */
             print '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'.
                   '<input type="Submit" name="Edit" value="Edit PAD">';
             print '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'.
                   '<input type="Submit" name="Save" value="Save PAD">';
          }
          print '</form><br><br>';
       } else {
          /* Final PAD file has been saved - Just print message */
          print "PAD File Saved as $savefile";
       }
    } else {
       /* Application unable to retrieve the specification file - Error */
       print "Unable to load PAD Specification File";
    }
    ?>
    </body>
    </html>
    Source: Robert Richards, Chapter 7.

    Note: There is an eBook version of the Book: http://www.apress.com/book/bookDisplay.html?bID=10092

    You can download the source code for the book.

    Maybe, by looking at this example, you will have solved the problem before I return with the next post.

    Since this is so important, I post a new thread:

    Accessing content and using iterated Objects in SimpleXML.
    Last edited by kgun; Jul 15, 2007 at 08:12. Reason: Adding SimpleXML example

  13. #13
    SitePoint Enthusiast
    Join Date
    Dec 2006
    Posts
    49
    EDit: nevermind...
    Last edited by LetterAfterZ; Jul 17, 2007 at 22:34.

  14. #14
    SitePoint Enthusiast morgy's Avatar
    Join Date
    Nov 2005
    Location
    Sweden
    Posts
    60
    I know it's been a long long time since I've started this thread but up to now there haven't been any solutions yet.

    The thing is that nothing has worked so far and we kind of postponed it, so now I've been "transferred" to another project.

    However this "double lines" thing is still a puzzle for me.

    My guess is that we are going to leave it as it is, since I guess only the FIRST programmer who actually coded it, knows the answer.
    :/

  15. #15
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    My answer is that you do not know the technology well enough. You have to learn XSL, XSLT etc.

    Related thread.
    The XML family of technologies will revolutionize web linking etc.

    Sometimes you have to go back to basics and learn the alphabet again. By the way, have you tried the XML forum at the Norwegian site W3Schools?

    You find the link if you scroll down the links in RedCarpetRank.com in my signature.

    Hint:
    Use (or switch to) PHP and one of its many parsers, such as SimpleXML and XMLReader.
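    As a starting point for the XMLReader route, here is a small sketch that walks a document and collects the text of every CDATA section it meets. The sample XML is invented for the example. Printing each section through json_encode() makes the line-break characters visible as escape sequences, which helps when inspecting what pressing Enter really stored.

```php
<?php
$xml = "<page><body><![CDATA[text text text\ntext text text]]></body></page>";

$reader = new XMLReader();
$reader->XML($xml);

// Walk the document node by node and keep the text of each CDATA section.
$sections = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::CDATA) {
        $sections[] = $reader->value;
    }
}
$reader->close();

foreach ($sections as $section) {
    // json_encode shows line breaks as escape sequences such as \n.
    echo json_encode($section), "\n";
}
```

    XMLReader streams the document rather than building a tree, so the same loop works on large files without loading them fully into memory.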

    I am in a hurry and cannot help you more than I already have.

