  1. #1
    SitePoint Evangelist
    Join Date
    Jun 2005
    Posts
    436

    Loading external website and extracting an ID's innerHTML - DOM?

    What's the best way to go about this? I would like to use DOM, but it's spitting errors at me because I have validateOnParse set to true. If I set it to false, my code echoes a blank page. What can I do?

    Code PHP:
    $doc = new DomDocument;
    $doc->validateOnParse = true;
    $doc->Load('site.html);
    echo $doc->getElementById('id_on_site');

    Thanks,
    e39m5

  2. #2
    Raju Gautam
    Join Date
    Oct 2006
    Location
    Kathmandu, Nepal
    Posts
    4,013
    Are you sure this line:
    PHP Code:
    $doc->Load('site.html); 
    is not the line causing the error? The closing single quote is missing there. It should be like this:
    PHP Code:
    $doc->Load('site.html'); 
    Mistakes are proof that you are trying.....
    ------------------------------------------------------------------------
    PSD to HTML - SlicingArt.com | Personal Blog | ZCE - PHP 5

  3. #3
    SitePoint Evangelist
    Join Date
    Jun 2005
    Posts
    436
    I have this working properly:
    Code PHP:
    $dom = new DomDocument;
    $dom->validateOnParse = true;
    @$dom->loadHTMLFile('http://www.externalsite.com/site.html');
    $data = $dom->getElementById('div_id');
    echo $data->tagName;

    but how can I echo the content of the div instead of the tagName? I can't find documentation on it.
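
A possible alternative to silencing the loader with `@` is libxml's own error handling, which collects parser warnings so they can be inspected or thrown away. This is only a sketch: the `$html` string is made-up markup standing in for the external page, and it is loaded with `loadHTML()` instead of `loadHTMLFile()` so the example runs without network access.

```php
<?php
// Hypothetical markup standing in for the external page.
$html = '<html><body><div id="div_id">Hello <b>world</b><p>unclosed</div></body></html>';

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Keep the collected warnings for inspection, then clear the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

$data = $dom->getElementById('div_id');
if ($data !== null) {
    echo $data->tagName, "\n";
}
echo count($errors), " parser warning(s) collected\n";
```

Unlike `@`, this keeps the warnings available in `$errors`, which can help when the page really is too broken to parse.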

    Thanks,
    e39m5

  4. #4
    Raju Gautam
    Join Date
    Oct 2006
    Location
    Kathmandu, Nepal
    Posts
    4,013
    I haven't tested this, and I haven't really worked with PHP's DomDocument class, but the textContent property should return the text. So try:
    PHP Code:
    $dom = new DomDocument;
    $dom->validateOnParse = true;
    @$dom->loadHTMLFile('http://www.externalsite.com/site.html');
    $data = $dom->getElementById('div_id');
    echo $data->textContent;
    Edit:
    I found this works:
    PHP Code:
    echo $data->textContent;

  5. #5
    Cups (SitePoint Wizard)
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Quoting e39m5: "I would like to use DOM, but it's spitting errors at me because I have validateOnParse set to true. If I set it to false, my code echoes a blank page. What can I do?"
    I read recently that you should install the Tidy extension and run the input through it first, forcing it into valid XHTML markup, and only then run DOM over it.
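
A rough sketch of that Tidy-then-DOM idea, under some assumptions: the `$html` string is invented sample markup, the two Tidy options shown (`output-xhtml`, `numeric-entities`) are just one plausible configuration, and the Tidy step is skipped when the extension is not installed so the example still runs.

```php
<?php
// Hypothetical sloppy input: the <p> and <b> tags are never closed.
$html = '<div id="div_id"><p>Hello <b>world</div>';

// Repair the markup first if the Tidy extension is available.
if (extension_loaded('tidy')) {
    $config = array('output-xhtml' => true, 'numeric-entities' => true);
    $repaired = tidy_repair_string($html, $config, 'utf8');
    if ($repaired !== false) {
        $html = $repaired;
    }
}

// Then run DOM over the (hopefully cleaner) markup.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$node = $dom->getElementById('div_id');
if ($node !== null) {
    echo trim($node->textContent), "\n";
}
```

Whether the Tidy pass is needed depends on how broken the page is; the HTML loader is already fairly tolerant of bad markup on its own.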

  6. #6
    SitePoint Evangelist
    Join Date
    Jun 2005
    Posts
    436
    Ahh, that's really close, but I lose the formatting tags within the div. Any HTML tags inside the div are missing from the echoed output.

    Any Ideas?

    Thanks,
    e39m5

  7. #7
    SitePoint Evangelist
    Join Date
    Jun 2005
    Posts
    436
    Heh, I don't think it's possible. I just ran through some more DOM functions to isolate the content and echo it how I wanted. It's probably better this way.

    Thanks,
    e39m5

  8. #8
    SitePoint Evangelist
    Join Date
    Jun 2005
    Posts
    436
    OK, actually, it would still be helpful if there were a function that returned the text content including the tags. Is there anything?
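
One approach that may do this is to serialize each child node of the element and join the results. The sketch below assumes PHP 5.3.6 or later, where `saveHTML()` accepts a node argument. The `innerHTML()` helper is a hypothetical name, not a built-in, and the markup is invented for the example.

```php
<?php
// Hypothetical helper: returns the markup of everything inside $node.
function innerHTML(DOMNode $node)
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // With a node argument, saveHTML() serializes only that node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

libxml_use_internal_errors(true);
$dom = new DOMDocument;
// Made-up markup standing in for the external page.
$dom->loadHTML('<html><body><div id="div_id">Some <b>bold</b> and <i>italic</i> text</div></body></html>');
libxml_clear_errors();

$data = $dom->getElementById('div_id');
echo innerHTML($data), "\n";
```

On older PHP versions, where `saveHTML()` takes no argument, `saveXML($child)` is a possible fallback, though it emits XML-style output such as self-closed empty tags.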

    Thanks,
    e39m5

