  1. #1
    SitePoint Wizard
    Join Date
    May 2003
    Berlin, Germany

    Loading via the XML Dom Extension

    Hi all, I am building a small application to analyse the links of a given webpage and determine which are external and which are internal.

    When I try to load the page via the DOM XML extension, the encoding seems to be wrong: all I get back are garbled characters. Could you please help me?

    PHP Code:
    function bbGetPageLinks2($url) {
        // Retrieve the page via fsockopen
        $f = LL_Admin::getPageContentOverProxy($url);
        // Strip HTML comments, entities, and embedded PHP tags
        $f = preg_replace("@<!--.*?-->?@is", ' ', $f);
        $f = preg_replace('@&(.*?);@', '', $f);
        $f = preg_replace('@<(.*?)\?php@', '', $f);
        $f = preg_replace('@\?>@', '', $f);
        // Keep only the host part of the URL
        $url = parse_url($url);
        $url = $url['host'];
        $dom = new DomDocument;
        $dom->preserveWhiteSpace = false;
        //$f = mb_convert_encoding($f, 'HTML-ENTITIES', "UTF-16");
        //$f = utf8_decode($f);
        //$f = utf8_encode($f);

    strpos($url,'microsoft') !== false)


    Thanks in advance!
    Last edited by DarkAngelBGE; Jun 6, 2007 at 06:46.
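
    A possible cause, assuming the page is parsed with DOMDocument::loadHTML(): when the markup carries no charset declaration, the underlying HTML parser falls back to ISO-8859-1, so UTF-8 input comes out garbled. The sketch below is not the code from the post. classifyLinks() is a hypothetical helper that assumes PHP 7 or later and that the fetched HTML is already UTF-8. It prepends an XML encoding hint before parsing, then sorts the links by host.

```php
<?php
// Hypothetical helper (not from the original post): split the links of an
// HTML string into internal and external ones, relative to $pageUrl.
function classifyLinks(string $html, string $pageUrl): array
{
    $host = parse_url($pageUrl, PHP_URL_HOST);

    $dom = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Assumption: $html is UTF-8. The prepended hint tells the parser so;
    // without it, loadHTML() may treat the bytes as ISO-8859-1.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = ['internal' => [], 'external' => []];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href === '') {
            continue; // anchors without a target
        }
        $linkHost = parse_url($href, PHP_URL_HOST);
        // Relative links have no host component and count as internal.
        $isInternal = ($linkHost === null || $linkHost === false || $linkHost === $host);
        $links[$isInternal ? 'internal' : 'external'][] = [
            'href' => $href,
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}
```

    If the page is not UTF-8, the hint would have to name the real encoding, or the string would need converting first. Comparing hosts exactly also treats subdomains such as www as external, which may or may not be what is wanted here.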

