  1. #1
    masquerading Nick's Avatar

    DomDocument -> loadHTMLFile problems

    Hey everyone,

    I've been experimenting with DomDocument and playing around with files, but I've run into a problem. I'm trying to remotely load one of my webpages, but I'm getting some errors. My code is:

    PHP Code:
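    <?php
    // Read the page into a string (a local copy of the remote page for now).
    $remote = file_get_contents('remotefile.html');

    $doc = new DOMDocument();
    $file = $doc->loadHTML($remote);
    $cells = $doc->getElementsByTagName('td');

    // Print the text of every <td class="title"> cell.
    foreach ($cells as $cell) {
        if ($cell->getAttribute('class') == 'title') {
            echo $cell->nodeValue . '<br />';
        }
    }
    ?>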
    remotefile.html is not actually a remote file right now: when I started getting these errors, I downloaded the file and placed it in the same directory to see whether loading it locally took care of the errors (it didn't). Anyway, the errors I am getting are:

    Code:
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: Unexpected end tag : img in Entity, line: 37 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: htmlParseEntityRef: expecting ';' in Entity, line: 317 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: htmlParseEntityRef: expecting ';' in Entity, line: 323 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: Unexpected end tag : img in Entity, line: 603 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: htmlParseStartTag: invalid element name in Entity, line: 603 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: Opening and ending tag mismatch: a and b in Entity, line: 604 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: Unexpected end tag : img in Entity, line: 610 in test.php on line 27
    Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: htmlParseStartTag: misplaced <body> tag in Entity, line: 665 in test.php on line 27
    The PHP docs say that "HTML does not have to be well-formed to load". Are these errors saying that the page is too badly formed to load, or is there something else going on?
    Nick . all that we see or seem, is but a dream within a dream

  2. #2
    hi galen's Avatar
    Can you post the HTML file you're trying to load?

  3. #3
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    loadHTML expects valid markup, and I'm afraid most pages aren't.

    You can alter the code to suppress the markup warnings:
    PHP Code:
    $file = @$doc->loadHTML($remote); 
    SilverB.
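
    As an alternative to the @ operator, libxml's own error handling can be switched to collect parse problems instead of emitting them as PHP warnings. The following is only a minimal sketch: the sample markup and the variable names ($html, $messages) are made up for illustration, and which problems libxml reports depends on the libxml2 version PHP was built against.

    ```php
    <?php
    // Hypothetical sample markup with a stray </img> end tag and a bare "&".
    $html = '<html><body><table><tr>'
          . '<td class="title">Fish & Chips</img></td>'
          . '<td class="other">ignored</td>'
          . '</tr></table></body></html>';

    // Collect libxml parse errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);

    // Fetch whatever the parser complained about, then reset the error buffer
    // and restore the previous error-handling mode.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Each entry is a LibXMLError object with level, code, message and line.
    $messages = [];
    foreach ($errors as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    ```

    Unlike @, this keeps the messages available in $messages for logging, so real problems are not silently thrown away along with the markup noise.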

  4. #4
    hi galen's Avatar
    Originally posted by SilverBulletUK:
    loadHTML expects valid markup, and I'm afraid most pages aren't.

    You can alter the code to suppress the markup warnings:
    PHP Code:
    $file = @$doc->loadHTML($remote); 
    SilverB.
    From PHP.net:
    "Unlike loading XML, HTML does not have to be well-formed to load."
    Are you saying, from experience, that it does require valid markup?

  5. #5
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    Are you saying it does require valid markup from experience?
    Indeed, "HTML does not have to be well-formed to load", which suggests that the document will still load and only raise warnings, as opposed to the errors you would get with XML.

    I have always suppressed these warnings in the past and successfully traversed the DOM.

    Every error you have posted is a markup warning. However, if you cannot traverse the DOM, we have another issue altogether.

    SilverB.
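
    To check whether the DOM can still be traversed once the warnings are out of the way, a small test along these lines may help. It is a sketch under assumptions: the malformed markup is invented for the example (it is not the original remotefile.html), and $titles is just an illustrative name.

    ```php
    <?php
    // Hypothetical malformed markup: unclosed <b>, stray </img>, bare "&".
    $html = '<table><tr>'
          . '<td class="title">First <b>title</td>'
          . '<td class="other">skip me</td>'
          . '<td class="title">Second & last</img></td>'
          . '</tr></table>';

    // Keep parser complaints out of the output while loading.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Same traversal as in the original question: collect every
    // <td class="title"> cell's text.
    $titles = [];
    foreach ($doc->getElementsByTagName('td') as $cell) {
        if ($cell->getAttribute('class') === 'title') {
            $titles[] = trim($cell->nodeValue);
        }
    }

    echo implode("\n", $titles), "\n";
    ```

    If the two title cells are printed, the warnings were only noise from the parser repairing the markup. If nothing is printed for the real page, the problem is more likely in the lookup itself (for example, the class name) than in loadHTML.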

