  1. #1
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    257

    file_get_contents speed

    Hello everyone!

    I was wondering if there is any way to speed up the code below. Basically, it connects to a MySQL database and loads around 2400 filenames into the $search_list array. Then it loops through the array and checks whether each filename exists under my base_url. The problem is that it takes about 24 minutes to loop through and check 2400 filenames. If you have any input on how to speed up this process, it would be greatly appreciated.


    PHP Code:

    //this function checks url to see if it exists
    function load_file($address)
    {
        // downloads the whole response body just to test for existence
        $contents = @file_get_contents($address);
        if ($contents) return true;
        else return false;
    }

    //array that stores everything to be searched
    $search_list = array();
    $select_results = count($search_list);

    //loops thru the array
    for ($row = 0; $row < $select_results; $row++)
    {
        if (load_file($base_url.$search_list[$row]))
        {
            echo $base_url.$search_list[$row]."<br>";
            flush();
        }
    }


  2. #2
    SitePoint Wizard Ren's Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    1,060
    If you're just testing for 404s, you don't need to get the contents; just use fopen().

    PHP Code:
    function load_address($url)
    {
        // for http:// URLs, fopen() fails (returns false) on a 404
        $handle = fopen($url, 'r');
        if (!is_resource($handle))
            return false;
        @fclose($handle);
        return true;
    }
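
    Building on the fopen() idea above, here is a minimal sketch that also sets a timeout through a stream context, so an unresponsive host is abandoned after a few seconds instead of waiting for PHP's default_socket_timeout. The function name can_open() and the 5-second default are assumptions chosen for illustration, not something from this thread.

    ```php
    <?php
    // Sketch only: can_open() and the 5-second default are illustrative choices.
    function can_open($url, $timeout = 5)
    {
        // The 'http' options apply to http:// and https:// URLs;
        // other stream wrappers simply ignore them.
        $context = stream_context_create(array(
            'http' => array('timeout' => $timeout),
        ));

        // @ silences the warning fopen() raises when the open fails.
        $handle = @fopen($url, 'r', false, $context);
        if (!is_resource($handle)) {
            return false;
        }
        fclose($handle);
        return true;
    }
    ```

    It would be called as can_open($base_url.$search_list[$row]) inside the loop from the first post.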


  3. #3
    SitePoint Wizard silver trophy
    Join Date
    Mar 2006
    Posts
    6,132
    Since it looks like you're checking URLs, not local filenames, this operation is going to be slow no matter what, but you can speed it up.

    Instead of using file_get_contents(), which downloads the entire server response, you can just open a connection to get the server's HTTP response headers, then drop the connection to save time.


    PHP Code:
    $handle = @fopen($base_url.$search_list[$row], 'r');
    if ($handle) fclose($handle);
    // $http_response_header is a variable which magically
    // appears once you open a file pointer to a url
    print_r($http_response_header);

    One of the elements of the $http_response_header array will contain the status code you need. Just parse that array and look for 404, 200 OK, etc. to decide whether the file exists or not.
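
    As a rough sketch of the parsing step described above: the helper below scans an array shaped like $http_response_header and returns the numeric status code. The name status_code_from_headers() is hypothetical. When PHP follows redirects, the array holds one status line per hop, so this version keeps the last one it sees.

    ```php
    <?php
    // Sketch only: status_code_from_headers() is a hypothetical helper name.
    // Input has the same shape as $http_response_header: a list of header lines.
    function status_code_from_headers(array $headers)
    {
        $code = null;
        foreach ($headers as $line) {
            // Status lines look like "HTTP/1.1 404 Not Found".
            if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
                $code = (int) $m[1]; // keep the last one seen (after redirects)
            }
        }
        return $code; // null when no status line was found
    }
    ```

    Right after the fopen() call you would pass $http_response_header to it and compare the result against 200 or 404.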


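    If you're using PHP 5 you can also look at file_exists(); it didn't support URL wrappers until PHP 5. Be aware that it only works with wrappers that implement stat(), and the http:// wrapper does not, so for HTTP URLs the header check above is the one to rely on.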
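
    Since each check spends most of its time waiting on the network, a different technique from the ones suggested in this thread is to run several checks at once with the curl extension's multi interface. The sketch below assumes ext-curl is installed; check_urls_parallel(), the batch size of 20 and the 5-second timeout are all illustrative choices.

    ```php
    <?php
    // Sketch only: check_urls_parallel() and its defaults are hypothetical.
    // Requires the curl extension. Returns array(url => HTTP status code).
    function check_urls_parallel(array $urls, $batch_size = 20, $timeout = 5)
    {
        $statuses = array();
        // De-duplicate so each URL maps to exactly one handle.
        foreach (array_chunk(array_unique($urls), $batch_size) as $batch) {
            $multi = curl_multi_init();
            $handles = array();
            foreach ($batch as $url) {
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo output
                curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
                curl_multi_add_handle($multi, $ch);
                $handles[$url] = $ch;
            }

            // Drive all transfers in this batch until none are still running.
            do {
                $state = curl_multi_exec($multi, $running);
                if ($running > 0) {
                    curl_multi_select($multi, 1.0); // wait for activity, max 1s
                }
            } while ($running > 0 && $state === CURLM_OK);

            foreach ($handles as $url => $ch) {
                // 0 means no HTTP response arrived (DNS failure, timeout, ...).
                $statuses[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
                curl_multi_remove_handle($multi, $ch);
                curl_close($ch);
            }
            curl_multi_close($multi);
        }
        return $statuses;
    }
    ```

    Redirects are not followed here, so a moved page reports its 3xx code rather than the status of the target.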

  4. #4
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    257
    I will try both of your suggestions and see which one ends up faster. Thank you for the quick response. Will both of these work even if the server does not return a 404 for a missing file, for example if it redirects or displays a custom page instead?

  5. #5
    SitePoint Wizard silver trophy
    Join Date
    Mar 2006
    Posts
    6,132
    A custom error page will usually still send a 404 status in its headers, but not always. A 404 status also doesn't guarantee that the page is an error page: I've seen tutorials on the web showing how to use 404 errors to rewrite your URLs... terrible, and I've seen quite a few websites using it.

    If the page still exists but has moved, you could technically follow the new URI given in the response headers and check for a 200 there, but you have no idea whether it's directing you to the proper page. It could be sending you to some custom error page without a 404 header, etc.

    I would recommend checking for the 200 response code. If you find it, there's a good chance that's the page you want. If you get any other code, I would just treat it as a 404.
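
    The "200 or nothing" rule above could be sketched like this. It sends a HEAD request with redirects turned off, so a moved page or a redirect to a custom error page counts as missing. The names first_status_is_200() and url_exists_strict() are hypothetical, and passing a context to get_headers() assumes PHP 7.1 or later.

    ```php
    <?php
    // Sketch only: both function names are hypothetical.
    // True only when the first status line reports 200.
    function first_status_is_200($headers)
    {
        if (!is_array($headers) || !isset($headers[0])) {
            return false; // request failed or no status line
        }
        return preg_match('#^HTTP/\S+\s+200\b#', $headers[0]) === 1;
    }

    function url_exists_strict($url, $timeout = 5)
    {
        $context = stream_context_create(array('http' => array(
            'method'          => 'HEAD',   // ask for headers only
            'follow_location' => 0,        // do not chase redirects
            'timeout'         => $timeout, // seconds
        )));
        // get_headers() returns false when the request itself fails.
        $headers = @get_headers($url, false, $context);
        return first_status_is_200($headers);
    }
    ```

    Some servers answer HEAD differently from GET, so treat the result as a good guess rather than proof.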

    You're going to have to write a lot of code if you want to be more accurate than that.

  6. #6
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    257
    Ah, I see, thank you.

    I just want a quick and easy way to see if that URL exists.

    thank you for your code!

