  1. #1
    SitePoint Addict
    Join Date
    Dec 2001
    Location
    Market Harborough, UK
    Posts
    206

    File comparisons

    Hi,

    I'd like to be able to compare two files in PHP (running on Apache under Windows, if that makes a difference) to see whether they are the same. How would I go about doing this?

    I'm trying to create a document management system where users can upload files. What I need to do is to reject (or at least warn) a user if the file they are uploading already exists in the system.

    Obviously, I can start with file sizes to narrow down the number of comparisons required, but where do I go from there?

    Anyone got any ideas?

    Thanks,

    Paul
    Paul Simpson, BSc, MCNI, MCNE

  2. #2
    Prolific Blogger silver trophy Technosailor's Avatar
    Join Date
    Jun 2001
    Location
    Before These Crowded Streets
    Posts
    9,446
    That is a pretty intensive comparison, I imagine - one that I dare not try on my machine for fear it will crash the system...
    Aaron Brazell
    Technosailor



  3. #3
    SitePoint Wizard Mincer's Avatar
    Join Date
    Mar 2001
    Location
    London | UK
    Posts
    1,140
    Well, it depends on how big the files may be. If they're only small, just read them in using file_get_contents() (which is binary safe) and compare the resulting strings. If the files are big, I'm not entirely sure how you could do this without bringing PHP to a grinding halt.

    Matt.
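
    One way around the memory concern for big files is to read both files in small chunks and stop at the first difference. The following is only a minimal sketch under that assumption; files_are_identical() and the 8192-byte chunk size are hypothetical choices, not anything from the thread.

```php
<?php
// Sketch: compare two files chunk by chunk so neither is fully loaded into memory.
// files_are_identical() is a hypothetical helper name, not a PHP built-in.
function files_are_identical(string $pathA, string $pathB, int $chunkSize = 8192): bool
{
    // Files of different sizes can never match, so skip reading altogether.
    if (filesize($pathA) !== filesize($pathB)) {
        return false;
    }

    // 'rb' opens the files in binary mode, which matters on Windows.
    $a = fopen($pathA, 'rb');
    $b = fopen($pathB, 'rb');
    if ($a === false || $b === false) {
        if ($a !== false) { fclose($a); }
        if ($b !== false) { fclose($b); }
        throw new RuntimeException('Could not open one of the files.');
    }

    $same = true;
    // The sizes are equal, so reaching the end of one file means the end of both.
    while (!feof($a)) {
        if (fread($a, $chunkSize) !== fread($b, $chunkSize)) {
            $same = false;
            break;
        }
    }

    fclose($a);
    fclose($b);
    return $same;
}
```

    Memory use stays at roughly two chunks regardless of file size, and the loop exits early as soon as a mismatch is found.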

  4. #4
    "Of" != "Have" bronze trophy Jeff Lange's Avatar
    Join Date
    Jan 2003
    Location
    Calgary, Canada
    Posts
    2,063
    If you want to know whether they are EXACTLY the same, you could use md5_file().
    Who walks the stairs without a care
    It shoots so high in the sky.
    Bounce up and down just like a clown.
    Everyone knows its Slinky.
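
    As a rough illustration of the md5_file() suggestion, combined with the file-size shortcut from the original question, a check might look like the sketch below. same_by_md5() is a hypothetical helper name. Note that a matching MD5 means the contents are almost certainly identical, not provably so, since collisions can be constructed deliberately.

```php
<?php
// Hypothetical helper: true when two files have the same size
// and the same MD5 hash of their contents.
function same_by_md5(string $pathA, string $pathB): bool
{
    // Cheap check first: files of different sizes cannot be identical.
    if (filesize($pathA) !== filesize($pathB)) {
        return false;
    }

    $hashA = md5_file($pathA);
    $hashB = md5_file($pathB);

    // md5_file() returns false on failure; treat that as "not the same".
    return $hashA !== false && $hashA === $hashB;
}
```

    If deliberate collisions are a concern, hash_file('sha256', $path) can be swapped in for md5_file($path), or a hash match can be confirmed with a byte-by-byte comparison.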

  5. #5
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    If you find a match, why not just append a number (or similar) to the end of the filename of the file that already exists, and upload the new file in its place?

    Or you could move the existing file into a separate folder...

    Anything is better than the trouble of comparing two files which could have the same filename and extension but different data formats.

    For example, how are you going to compare a text document against an image?

    You're looking for trouble, IMO.

  6. #6
    SitePoint Addict
    Join Date
    Dec 2001
    Location
    Market Harborough, UK
    Posts
    206
    Thanks for the various responses!

    The problem is that it is highly likely that users will upload the same file multiple times, so I want to ensure that the duplicate uploads are thrown away. The files may or may not have the same name.

    What I'm really looking for is an equivalent to DOS's FC utility.

    The MD5 solution looks interesting. Does it compute the hash on the file itself, or on the filename (which the docs I have imply)?

    Thanks
    Paul Simpson, BSc, MCNI, MCNE

  7. #7
    Prolific Blogger silver trophy Technosailor's Avatar
    Join Date
    Jun 2001
    Location
    Before These Crowded Streets
    Posts
    9,446
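    Well, why don't you simply do this:
    PHP Code:
    <?php
    // Do all your stuff to submit the file
    // $fileold = path of the file already on the server
    // $filenew = path of the newly uploaded file
    // If a copy is already stored on the server, delete it before saving the new one.
    if (file_exists($fileold)) {
        unlink($fileold);
    }
    // do upload
    ?>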
    Aaron Brazell
    Technosailor



  8. #8
    "Of" != "Have" bronze trophy Jeff Lange's Avatar
    Join Date
    Jan 2003
    Location
    Calgary, Canada
    Posts
    2,063
    md5_file() works on the contents of a file.
    Who walks the stairs without a care
    It shoots so high in the sky.
    Bounce up and down just like a clown.
    Everyone knows its Slinky.
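
    To make the contents-versus-filename distinction concrete, here is a small sketch. It assumes a writable temp directory; the file prefixes and contents are made up for illustration.

```php
<?php
// Two temporary files with different names but identical contents.
$first  = tempnam(sys_get_temp_dir(), 'aaa');
$second = tempnam(sys_get_temp_dir(), 'bbb');
file_put_contents($first, "same bytes\n");
file_put_contents($second, "same bytes\n");

// md5_file() opens the file at the given path and hashes its contents.
$contentsMatch = md5_file($first) === md5_file($second);

// md5() hashes the string it is given, which here is the path itself.
$namesMatch = md5($first) === md5($second);

var_dump($contentsMatch); // bool(true)
var_dump($namesMatch);    // bool(false)

unlink($first);
unlink($second);
```

    The confusion in the docs probably comes from the fact that md5_file() takes a filename as its argument, but what gets hashed is the data stored in that file.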

  9. #9
    Dumb PHP codin' cat
    Join Date
    Aug 2000
    Location
    San Diego, CA
    Posts
    5,460
    cyborg has the right idea here. Comparing the hashes of the two will tell you if they are the same.

  10. #10
    Idea Developer
    Join Date
    Sep 2000
    Location
    Bethlehem, PA
    Posts
    521
    Compute the MD5 hash of every file on the system and keep the hashes in a DB. Then just hash the uploaded file and compare it against the stored hashes.
    Professional PHP programming / Hosting
    aim: downtoi3iz icq: 74637813
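
    A minimal sketch of that idea follows. A plain PHP array stands in for the database table here, purely as an assumption to keep the example self-contained; build_hash_index() and find_duplicate() are hypothetical names.

```php
<?php
// Build a map of content hash => stored path for every file in a directory.
// In a real system this map would live in a database table; the array is a stand-in.
function build_hash_index(string $dir): array
{
    $index = [];
    foreach (scandir($dir) ?: [] as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        if (!is_file($path)) {
            continue; // skips '.', '..' and subdirectories
        }
        $hash = md5_file($path);
        if ($hash !== false) {
            $index[$hash] = $path;
        }
    }
    return $index;
}

// Return the path of an already-stored file with the same contents, or null if there is none.
function find_duplicate(array $index, string $uploadedPath): ?string
{
    $hash = md5_file($uploadedPath);
    if ($hash === false) {
        return null;
    }
    return $index[$hash] ?? null;
}
```

    With this approach each stored file is hashed only once, when it is added, so checking a new upload costs one hash plus one lookup rather than a comparison against every file. In an upload handler, the path passed to find_duplicate() would typically be the temporary file PHP reports in $_FILES.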

