  1. #1
    DoubleDee

    Keeping Images Unique

    I am building a "User Profile" feature on my website, including allowing Users to upload a Profile Picture.

    The problem is that two separate Users could have a Picture with the same name (e.g. "me.jpg").

    What is the best way to handle this in my Upload Script?

    One person I know suggested having a "User Folder" for every User. But what happens if my website grows to 20,000 Users? (That has got to be enough to make even a Linux Server choke?!)

    I could append the "UserID" to each Image, but since it starts at "1" currently, that would look weird. Plus, you would want it fixed-width like "000001".

    I could append the "Email", but that isn't very reliable. (I am wondering if I should have made people create a "Username" too...)

    What do you think I should do?

    Thanks,


    Debbie

  2. #2
    AnthonySterling
    As users aren't generally bothered about how you store/name profile pictures, I'd SHA1 hash the current time along with their user id and use that.
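
    A minimal sketch of that idea in PHP, assuming a numeric user id and a fixed "jpg" extension. The function name profile_picture_name() is made up for illustration. It uses microtime() rather than time() so that two uploads by the same user within the same second are unlikely to get the same name.

    ```php
    <?php
    // Hypothetical helper: builds a file name from the user id and the current time.
    function profile_picture_name(int $userId, string $extension = 'jpg'): string
    {
        // microtime(true) returns the current Unix time as a float, including fractions of a second.
        $seed = $userId . ':' . microtime(true);

        // sha1() returns a 40-character hexadecimal string.
        return sha1($seed) . '.' . $extension;
    }

    echo profile_picture_name(123), "\n";
    ```

    The resulting name says nothing about the user, so the database record has to hold the link between user and file.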
    @AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.

  3. #3
    DoubleDee
    Originally posted by AnthonySterling:
        As users aren't generally bothered about how you store/name profile pictures, I'd SHA1 hash the current time along with their user id and use that.
    Why Hash the Filename?

    Doesn't it make sense to have the Filename be "self-identifying" (e.g. "doubledee_01.jpg", "anthonysterling_02.jpg")?

    Otherwise what happens if the Pictures ever got mixed up?!


    Debbie

  4. #4
    AnthonySterling
    Maybe, but now you're mixing things up a little. Which is good, but you need to define this behaviour first so you know *what* to code.

    Is this what you want to happen? Do you want the username/id/original-name/time in there?

  5. #5
    SitePoint Enthusiast
    What if you put the user id at the end of the image filename when they are saved? So your example with me.jpg would be me-123.jpg and another one me-124.jpg.

  6. #6
    EastCoast
    Apart from name collisions, you shouldn't allow a user-designated file name for an uploaded file. If there were any security weaknesses in your upload processing, this would make it easier for an attacker to access a malicious file.

  7. #7
    DoubleDee
    Originally posted by EastCoast:
        Apart from name collisions, you shouldn't allow a user-designated file name for an uploaded file. If there were any security weaknesses in your upload processing, this would make it easier for an attacker to access a malicious file.
    So just create some random, yet unique, "Photo ID" and leave it at that?

    These Photo IDs will be stored in the Member Record, but like I said, I just always worry about what would happen if things ever got mixed up, and then you'd have 10,000 files with names that wouldn't let you easily sort things out.

    I guess my system just needs to not mess up?!


    Debbie

  8. #8
    logic_earth
    If the image in question has its name stored in the DB then it doesn't matter. The DB links it to the user, so the file name of the image should be irrelevant. I personally use "sha1_file", so when duplicate files have the same hash, only one is saved.
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  9. #9
    Non-Member
    Why not just have the system create a new directory for each user? That way it will be user1/me.jpg and user2/me.jpg. They are the same file names but different directories which will differentiate the addresses.

  10. #10
    DoubleDee
    Originally posted by logic_earth:
        If the image in question has its name stored in the DB then it doesn't matter. The DB links it to the user, so the file name of the image should be irrelevant. I personally use "sha1_file", so when duplicate files have the same hash, only one is saved.
    Except if you have a collision, then someone loses their photo?!

    If you upload your Photo, and my system assigns the hash 6510723, and then later I come along, and by pure coincidence, my system also assigns me the hash 6510723 then your Photo goes *poof*!!

    Even if I hashed by e-mail or something like that, there could be a collision, so how can I avoid that?!


    Debbie

  11. #11
    logic_earth
    Huh? How does their photo go poof? I don't understand...

    If a file that is run through "sha1_file" returns the same value as another file, they are identical byte for byte. So why save two? Just reference the same file in the DB for both users. * To be clear, I use the return value of "sha1_file" as the name of the file. This makes sure the file name is unique but also avoids duplicate files.
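
    A rough sketch of this approach, assuming the upload has already been validated and that all images live in one directory. The function name store_by_content_hash() and its parameters are hypothetical. The sketch uses copy() so it can run outside a web request; in a real upload handler move_uploaded_file() would be the usual call.

    ```php
    <?php
    // Hypothetical helper: stores an image under the SHA-1 of its contents.
    // $tmpPath is the validated upload; $dir is the (assumed) storage directory.
    function store_by_content_hash(string $tmpPath, string $dir, string $extension = 'jpg'): string
    {
        // sha1_file() hashes the file contents and returns false if the file cannot be read.
        $hash = sha1_file($tmpPath);
        if ($hash === false) {
            throw new RuntimeException("Could not read $tmpPath");
        }

        $name = $hash . '.' . $extension;
        $target = $dir . '/' . $name;

        // Identical bytes give the same name, so a file that already exists is reused.
        if (!file_exists($target) && !copy($tmpPath, $target)) {
            throw new RuntimeException("Could not write $target");
        }

        // Save this name in the user's row; several users may share one file.
        return $name;
    }
    ```

    Two users uploading the same picture get the same name back, and only one copy is kept on disk.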


  12. #12
    DoubleDee
    Originally posted by logic_earth:
        Huh? How does their photo go poof? I don't understand...

        If a file that is run through "sha1_file" returns the same value as another file, they are identical byte for byte. So why save two? Just reference the same file in the DB for both users.
        * Make sure it is clear I use the return value of "sha1_file" as the name of the file.
    *Debbie needs Food and Sleep*


    I went to all of this trouble to make sure the File Some-User uploaded is a valid image, and it is. So the last thing I need to do is come up with a Unique Name for Some-User's Image.

    I thought you were saying to SHA1 something (??) and get a random Filename, right?

    And I said, "What happens if you have a collision between two hashed whatevers and you get 12345 and I get 12345?!"

    I can't have two separate User Pictures called "12345.jpeg"

    So 100,000 Users from now, how do I ensure that NEVER happens??

    Follow me?


    Debbie

  13. #13
    wonshikee
    logic_earth's solution is good, but do understand that if you have any kind of cleanup, you need to be careful not to delete any files that still have references in the database.

  14. #14
    wonshikee
    Originally posted by DoubleDee:
        *Debbie needs Food and Sleep*

        I went to all of this trouble to make sure the File Some-User uploaded is a valid image, and it is. So the last thing I need to do is come up with a Unique Name for Some-User's Image.

        I thought you were saying to SHA1 something (??) and get a random Filename, right?

        And I said, "What happens if you have a collision between two hashed whatevers and you get 12345 and I get 12345?!"

        I can't have two separate User Pictures called "12345.jpeg"

        So 100,000 Users from now, how do I ensure that NEVER happens??

        Follow me?

        Debbie
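    The chance of two different files colliding by accident is about 1/(2^160), because SHA-1 produces a 160-bit hash. Even with 100,000 images that is effectively never.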

    If you are THAT worried, then simply do a file_exists() check, and if it does exist, append/prepend something.
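
    A sketch of that check, assuming all images sit in one directory and every name has an extension. The function name unused_name() and the "-1", "-2" suffix scheme are made up for illustration.

    ```php
    <?php
    // Hypothetical helper: returns $name if it is free in $dir, otherwise the first
    // free variant with a counter appended, e.g. "abc.jpg" -> "abc-1.jpg".
    function unused_name(string $dir, string $name): string
    {
        $base = pathinfo($name, PATHINFO_FILENAME);
        $ext = pathinfo($name, PATHINFO_EXTENSION);

        $candidate = $name;
        // Keep trying new suffixes until file_exists() reports a free name.
        for ($i = 1; file_exists($dir . '/' . $candidate); $i++) {
            $candidate = $base . '-' . $i . '.' . $ext;
        }

        return $candidate;
    }
    ```

    Note that there is a small gap between the check and the actual write, so two simultaneous uploads could in theory still pick the same name.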

  15. #15
    logic_earth
    Originally posted by DoubleDee:
        And I said, "What happens if you have a collision between two hashed whatevers and you get 12345 and I get 12345?!"

        I can't have two separate User Pictures called "12345.jpeg"
    If the two users upload images that return the same hash, then yes, you can.
    If the hash is the same, it means they are identical down to the last bit.
    Saving both would be a waste of storage space.

    sha1_file reads the file and hashes the contents of the file; it's not random.
    The same file returns the same hash until it is altered.

    The DB is what connects a user to an image; the file name is irrelevant.
    Having two users use the same image is perfectly acceptable.
    For example, on SitePoint a lot of users have the same avatar.


  16. #16
    DoubleDee
    My understanding is that while very unlikely, you could hash "DoubleDee" twice and get the same resulting hash of "6666666", right?

    Since a collision in this case would overwrite another User's Photo, that can't ever happen.

    After thinking about it, having file names be the same as something personally identifiable like e-mail or username seems like a bad idea!

    So I am okay if a User's Photo is given a Filename that is some random string of letters and/or numbers, but again, it must be unique!


    Debbie
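
    One possible way to get a random name that cannot clash with an existing file, sketched in PHP 7 or later. The function name reserve_random_name(), the 16-byte length and the retry limit of 10 are all assumptions for illustration. Instead of checking first and writing later, it asks the filesystem to create the file only if the name is still free.

    ```php
    <?php
    // Hypothetical helper: reserves a random, unused file name in $dir and returns it.
    function reserve_random_name(string $dir, string $extension = 'jpg'): string
    {
        for ($attempt = 0; $attempt < 10; $attempt++) {
            // 16 random bytes become 32 hexadecimal characters.
            $name = bin2hex(random_bytes(16)) . '.' . $extension;

            // Mode 'x' creates the file and fails if it already exists, so the same
            // name is never handed out twice. The @ hides the warning on failure.
            $handle = @fopen($dir . '/' . $name, 'x');
            if ($handle !== false) {
                fclose($handle);
                return $name;
            }
        }

        throw new RuntimeException('Could not reserve a unique file name');
    }
    ```

    The returned name points at an empty placeholder file. The upload would then be written over it and the name saved in the member record.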

