Thread: urls from page

  1. #1
    SitePoint Addict kunal
    Join Date
    Oct 2000
    Posts
    307

    urls from page

    well... I'm stumped. I've been trying to write the correct regex to get all the URLs in an HTML page but haven't had any luck. Basically, I'm trying to parse all the content of a page and get all the URLs into a neat table...

    anyone got any ideas?

    kunal
    i dunno...

  2. #2
    Dumb PHP codin' cat
    Join Date
    Aug 2000
    Location
    San Diego, CA
    Posts
    5,460
    Here is a function I just wrote to get the links. I am sure you can adapt it to print them into a nice table. Since I only just wrote it, if there are bugs please post them here or PM me and I'll try to straighten them out.

    PHP Code:
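    <?php

    // Fetch the front page of $url over plain HTTP and return the
    // preg_match_all() match array; the link targets are in index 2.
    function getLinks($url) {
        if (!$url) {
            return "Error opening $url";
        }

        // Strip a leading scheme so only the host name reaches fsockopen().
        $host = preg_replace('#^http://#i', '', $url);

        // fsockopen() fills in the error number first, then the error string.
        $fp = fsockopen($host, 80, $errno, $errstr, 30);
        if (!$fp) {
            die("could not connect: $errstr ($errno)");
        }

        // HTTP/1.0 request for the root document; the Host header is needed
        // by servers that host more than one site on the same address.
        fputs($fp, "GET / HTTP/1.0\r\nHost: $host\r\n\r\n");

        $data = '';
        while (!feof($fp)) {
            $data .= fgets($fp, 4096);
        }
        fclose($fp);

        // Group 1 is the optional opening quote, group 2 is the href value.
        preg_match_all('/<a\s[^>]*?href\s*=\s*("|\')?([^"\'>]+)/i', $data, $matches);
        return $matches;
    }

    $links = getLinks("www.barnesandnoble.com");
    // getLinks() returns an error string instead of an array for an empty URL.
    if (is_array($links)) {
        foreach ($links[2] as $key => $link) {
            print "$link<br>";
        }
    }

    ?>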
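
    For the "neat table" part of the question, here is a rough sketch of one way to print the links. linksToTable() is a made-up helper name, not part of the function above, and the sample array just stands in for $links[2]. It drops duplicate URLs and escapes each one before printing, which is an assumption about what you want in the table.

```php
<?php
// Hypothetical helper: renders a list of URLs as a one-column HTML table.
function linksToTable(array $links) {
    $rows = '';
    // array_unique() drops repeated URLs; htmlspecialchars() escapes them for output.
    foreach (array_unique($links) as $link) {
        $safe = htmlspecialchars($link, ENT_QUOTES);
        $rows .= "<tr><td><a href=\"$safe\">$safe</a></td></tr>\n";
    }
    return "<table>\n$rows</table>\n";
}

// Sample input standing in for $links[2] from getLinks().
print linksToTable(array('/index.html', 'http://example.com/?a=1&b=2', '/index.html'));
```

    Relative links such as /index.html are printed as they appear in the page; turning them into absolute URLs would need the host name as well.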
    Please don't PM me with questions.
    Use the forums, that is what they are here for.

  3. #3
    SitePoint Addict kunal
    Join Date
    Oct 2000
    Posts
    307
    hmmm.. how would I get the description of the links too? Basically, get everything starting from <a> to </a>?
    i dunno...
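
    One possible way to capture the link text as well is to extend the pattern so it keeps matching up to the closing </a>. This is only a sketch: getLinksWithText() and the sample HTML are made up for illustration, and a regex like this will miss unusual markup (for example an href value containing spaces, or a > inside an attribute).

```php
<?php
// Hypothetical helper: returns an array of array('href' => ..., 'text' => ...) pairs.
function getLinksWithText($html) {
    // Group 2 is the href value; group 3 is everything up to the closing </a>.
    // The s modifier lets . match newlines; .*? stops at the first </a>.
    $pattern = '/<a\s[^>]*?href\s*=\s*("|\')?([^"\'>\s]+)[^>]*>(.*?)<\/a>/is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = array();
    foreach ($matches as $m) {
        // strip_tags() removes nested markup such as <b> or <img> from the link text.
        $links[] = array('href' => $m[2], 'text' => trim(strip_tags($m[3])));
    }
    return $links;
}

// Made-up sample markup with one quoted and one differently quoted href.
$sample = '<p><a href="/books">Browse <b>books</b></a> or <a class=x href=\'/music\'>music</a></p>';
foreach (getLinksWithText($sample) as $link) {
    print $link['href'] . ' => ' . $link['text'] . "\n";
}
```

    To use it on a live page, pass it the $data string that the earlier function reads from the socket instead of $sample.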

