  1. #1
    SitePoint Zealot txt3rob's Avatar
    Join Date
    Jul 2013
    Location
    Liverpool UK
    Posts
    199

    scrape for online or offline status

    Hi,

    I would like to scrape one piece of information from a page: whether its status is online or offline.

    The page contains the following code
    Code:
    <span title="Online
    that I would like to check for.

    If "Online" is found I want to show a page, and if the page is offline I want to show a message.

    What would be the best way to do this?

  2. #2
    SitePoint Zealot txt3rob's Avatar
    I have found something now:
    PHP Code:
    <?php

    // Basic cURL helper: downloads $url and returns the response body
    // (curl_exec() returns false if the request fails)
    function curl($url) {
        // Assigning cURL options to an array
        $options = array(
            CURLOPT_RETURNTRANSFER => true,  // Return the webpage data instead of printing it
            CURLOPT_FOLLOWLOCATION => true,  // Follow 'Location' HTTP headers
            CURLOPT_AUTOREFERER    => true,  // Automatically set the referer when following 'Location' headers
            CURLOPT_CONNECTTIMEOUT => 120,   // Maximum time (in seconds) to wait while connecting
            CURLOPT_TIMEOUT        => 120,   // Maximum time (in seconds) for the whole request
            CURLOPT_MAXREDIRS      => 10,    // Maximum number of redirections to follow
            CURLOPT_USERAGENT      => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",  // Setting the useragent
            CURLOPT_URL            => $url,  // Setting cURL's URL option with the $url passed into the function
        );

        $ch = curl_init();                 // Initialising cURL
        curl_setopt_array($ch, $options);  // Setting cURL's options from the $options array
        $data = curl_exec($ch);            // Executing the request and storing the returned data in $data
        curl_close($ch);                   // Closing cURL
        return $data;                      // Returning the data from the function
    }

    // Basic scraping function: returns the text found between $start and $end
    function scrape_between($data, $start, $end) {
        $data = stristr($data, $start);         // Stripping all data from before $start
        if ($data === false) {
            return '';                          // $start was not found (or the download failed)
        }
        $data = substr($data, strlen($start));  // Stripping $start
        $stop = stripos($data, $end);           // Getting the position of $end in the remaining data
        if ($stop === false) {
            return '';                          // $end was not found
        }
        return substr($data, 0, $stop);         // Stripping everything from $end onwards
    }

    $scraped_page = curl("http://thewebpage");  // Downloading the page into $scraped_page
    $scraped_data = scrape_between($scraped_page, "<title>", "</title>");  // Scraping the content between the <title> and </title> tags

    echo $scraped_data;  // Echoing the scraped data, which should be the page title
    I just need to sort the tags out properly.
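
    For the online/offline check itself, a plain substring test on the downloaded HTML may be enough. The following is only a minimal sketch: `is_online()` is a hypothetical helper name, the sample markup is made up, and it assumes that an offline page carries something like `title="Offline` instead of the `<span title="Online` marker quoted in the first post.

```php
<?php

// Hypothetical helper (not from the thread): reports whether the HTML
// contains the status marker quoted in the first post.
function is_online(string $html): bool
{
    // Case-insensitive search; stripos() returns false when the marker is absent.
    return stripos($html, '<span title="Online') !== false;
}

// Made-up sample markup standing in for the downloaded page.
$online_page  = '<div><span title="Online">Status</span></div>';
$offline_page = '<div><span title="Offline">Status</span></div>';

foreach ([$online_page, $offline_page] as $html) {
    if (is_online($html)) {
        echo "Online: show the page here\n";
    } else {
        echo "Sorry, the service is offline right now.\n";
    }
}
```

    In real use the HTML would come from the download step rather than from sample strings. A failed download is probably best treated the same as offline, since there is then no marker to find.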

