  1. #1
    SitePoint Enthusiast
    Join Date
    Jun 2009
    Posts
    98
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Count lines in a file

    If I have a 10 row file and I only want rows 2-5, how would I be able to do that?

    I have some ideas from Google searches such as loading the lines into an array and counting the array etc but is that the best way?

    Thanks in advance.

  2. #2
    From Italy with love
    guido2004
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    PHP Code:
    $lines = file($filename);
    echo $lines[1]; // row 2
    echo $lines[2]; // row 3
    echo $lines[3]; // row 4
    echo $lines[4]; // row 5

  3. #3
    SitePoint Enthusiast
    Join Date
    Jun 2009
    Posts
    98
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guido2004 View Post
    PHP Code:
    $lines = file($filename);
    echo $lines[1]; // row 2
    echo $lines[2]; // row 3
    echo $lines[3]; // row 4
    echo $lines[4]; // row 5
    If file() is used, the entire file is loaded into memory as an array, correct? Lines 2-5 in a 10-line file is just an example. I am trying to work with a 3GB file with 30 million records. If I am right about the loading-into-memory part, that would be a killer.

    I was thinking something like:

    PHP Code:
    $i = 0;
    $fd = fopen("myfile.txt", "r");
    while (!feof($fd)) {
        $i++;
        if (($i > 1) && ($i < 6)) {
            // read file using fread()
            // insert into db
        }
    }
    fclose($fd);
    But the above would read through extra lines. I am trying to avoid that if possible.

  4. #4
    From Italy with love
    guido2004
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by Rash516 View Post
    If file() is used, the entire file is loaded into memory as an array, correct? Lines 2-5 in a 10-line file is just an example. I am trying to work with a 3GB file with 30 million records. If I am right about the loading-into-memory part, that would be a killer.
    Of course. But you were talking about a 10-line file. No need to complicate things for that.

    How about http://www.php.net/manual/en/function.fgets.php ?

  5. #5
    SitePoint Enthusiast
    Join Date
    Jun 2009
    Posts
    98
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guido2004 View Post
    Of course. But you were talking about a 10-line file. No need to complicate things for that.

    How about http://www.php.net/manual/en/function.fgets.php ?
    Sorry, I should have stated this from the beginning. I will be working with 30-million-line files that are about 3GB each. The 10-liner was an example.

    Another idea, would it be easier if I broke down the file into files with 50K lines each? Just brainstorming here, let me know if I hit something good.
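
For reference, splitting by line count could be sketched roughly as below. This is only an illustration: `splitByLines()`, the `$prefix` argument and the `.csv` part names are hypothetical, and it assumes one record per physical line (a CSV with quoted fields containing newlines would be cut in the wrong place).

```php
<?php
// Hypothetical helper: copies $path into part files of at most
// $linesPerFile lines each and returns the list of part file names.
// Assumption: every record sits on exactly one physical line.
function splitByLines(string $path, int $linesPerFile, string $prefix): array
{
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $parts = [];
    $out = null;
    $count = 0;

    while (($line = fgets($in)) !== false) {
        // Start a new part file every $linesPerFile lines.
        if ($count % $linesPerFile === 0) {
            if ($out !== null) {
                fclose($out);
            }
            $name = sprintf('%s%05d.csv', $prefix, count($parts) + 1);
            $out = fopen($name, 'w');
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot create $name");
            }
            $parts[] = $name;
        }
        fwrite($out, $line);
        $count++;
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $parts;
}
```

Only one line is held in memory at a time, but the whole source file still has to be read once to produce the parts.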

  6. #6
    From Italy with love
    guido2004
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by Rash516 View Post
    Another idea, would it be easier if I broke down the file into files with 50K lines each? Just brainstorming here, let me know if I hit something good.
    How would you do that? Do you know the length of the lines? If you do, maybe you could use http://www.php.net/manual/en/functio...t-contents.php : calculate the starting point and the total length of the rows you want to read, and read only those.
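
The offset calculation suggested here might look like the sketch below. It only works under a strong assumption: every row occupies exactly the same number of bytes, line ending included, which a typical CSV export does not guarantee. `readFixedRows()` and `$recordLength` are hypothetical names; the offset and length arguments of `file_get_contents()` are the real fourth and fifth parameters.

```php
<?php
// Hypothetical helper: returns rows $firstRow..$lastRow (1-based, inclusive)
// as one raw string, without reading anything before $firstRow.
// Assumption: every row is exactly $recordLength bytes, including "\n".
function readFixedRows(string $path, int $recordLength, int $firstRow, int $lastRow): string
{
    $offset = ($firstRow - 1) * $recordLength;             // bytes to skip
    $length = ($lastRow - $firstRow + 1) * $recordLength;  // bytes to read

    $data = file_get_contents($path, false, null, $offset, $length);
    if ($data === false) {
        throw new RuntimeException("Cannot read $path");
    }

    return $data;
}
```

With 6-byte rows, `readFixedRows($file, 6, 2, 5)` would skip 6 bytes and read the next 24. For files over 2GB this presumably also needs a 64-bit PHP build so the offsets fit in an integer.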

  7. #7
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Rash516 View Post
    Sorry, should have stated this from the beginning. I will be working with 30 million line files that are about 3GB each. The 10 liner was an example.
    How much? Is it the lines that are this size, or the files: if the former, what kind of media is this data stored on?
    And if those are your data volumes, why aren't you using a database?

  8. #8
    SitePoint Enthusiast
    Join Date
    Jun 2009
    Posts
    98
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It is the files that are 3GB, not the lines. They are stored as CSV. This is an export from another piece of software that we need to import into our database.

  9. #9
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You're just going to have to keep reading each line until you get to the line you want. You might want to see if fread() (and counting the lines yourself) is faster than fgets(). Might be, might not be... since you're writing this in PHP and not C.
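
A rough sketch of the fread() variant, counting the newlines yourself: it reads fixed-size chunks and returns the byte offset where a given line starts. `findLineOffset()` and the 64KB default chunk size are assumptions for illustration, and it assumes lines end in "\n". Whether this beats fgets() would need measuring on the real file.

```php
<?php
// Hypothetical helper: byte offset at which line $lineNo (1-based) starts,
// or null if the file contains fewer than $lineNo - 1 newlines.
function findLineOffset(string $path, int $lineNo, int $chunkSize = 65536): ?int
{
    if ($lineNo <= 1) {
        return 0; // the first line always starts at byte 0
    }

    $fd = fopen($path, 'rb');
    if ($fd === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $toSkip = $lineNo - 1; // newlines that must be passed
    $base = 0;             // byte offset of the current chunk
    $result = null;

    while (!feof($fd)) {
        $chunk = fread($fd, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }

        $count = substr_count($chunk, "\n");
        if ($count < $toSkip) {
            // Target is not in this chunk: account for it and move on.
            $toSkip -= $count;
            $base += strlen($chunk);
            continue;
        }

        // The last newline to skip is inside this chunk: walk to it.
        $pos = -1;
        for ($n = 0; $n < $toSkip; $n++) {
            $pos = strpos($chunk, "\n", $pos + 1);
        }
        $result = $base + $pos + 1;
        break;
    }

    fclose($fd);

    return $result;
}
```

Once the offset is known, fseek() to it and read the wanted rows with fgets() from there.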

  10. #10
    SitePoint Enthusiast
    Join Date
    Jun 2009
    Posts
    98
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I used the example I provided above. From researching, it looks like it is a good way. Please post if someone has a better way. Thanks for everyone's input!

  11. #11
    Grumpy Minimalist
    Join Date
    Jul 2006
    Location
    Ontario, Canada
    Posts
    424
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Apparently, on newer versions of PHP 5, stream_get_line() is much faster than fgets().
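
If you want to try it, a loop built on stream_get_line() could look something like this. `readLineRangeStream()` and the 8192-byte `$maxLineLength` are hypothetical; the sketch assumes "\n" line endings and that no line is longer than `$maxLineLength` (a longer line would come back in pieces and throw the count off). It makes no claim about speed; that would have to be benchmarked.

```php
<?php
// Hypothetical helper: returns lines $from..$to (1-based, inclusive)
// using stream_get_line(), which strips the "\n" delimiter itself.
// Assumption: no line exceeds $maxLineLength bytes.
function readLineRangeStream(string $path, int $from, int $to, int $maxLineLength = 8192): array
{
    $fd = fopen($path, 'r');
    if ($fd === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    $i = 0;

    while (($line = stream_get_line($fd, $maxLineLength, "\n")) !== false) {
        $i++;
        if ($i >= $from) {
            $rows[] = $line;
        }
        if ($i >= $to) {
            break; // stop as soon as the last wanted line is read
        }
    }

    fclose($fd);

    return $rows;
}
```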

  12. #12
    From Italy with love
    guido2004
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Once you've read the lines you want to read, you can break out of the loop.
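
Putting the fgets() suggestion and the early exit together, a minimal sketch might be the following. `readLineRange()` is a hypothetical helper name. Lines before the range still have to be read to find the line breaks, but nothing after the range is touched.

```php
<?php
// Hypothetical helper: returns lines $from..$to (1-based, inclusive),
// reading one line at a time and stopping after line $to.
function readLineRange(string $path, int $from, int $to): array
{
    if ($from < 1 || $to < $from) {
        return []; // empty or invalid range
    }

    $fd = fopen($path, 'r');
    if ($fd === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    $i = 0;

    while (($line = fgets($fd)) !== false) {
        $i++;
        if ($i < $from) {
            continue; // read to advance the pointer, but not kept
        }
        $rows[] = rtrim($line, "\r\n");
        if ($i >= $to) {
            break; // nothing after this line is needed
        }
    }

    fclose($fd);

    return $rows;
}
```

For the 10-row example, `readLineRange($file, 2, 5)` reads five lines in total and never looks at rows 6-10. Each returned row could then be parsed (for instance with str_getcsv()) and inserted into the database.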

