IIRC MySql handles this scenario the worst compared to your other options due to the way it handles connections but I don’t do that much with mysql so YMMV. That said, I would look at storing them in the database with some caveats. First keep in mind that you can pretty easily use the database as the backing store and cache the data off to disk as it is served, eliminating one bottleneck. Second, make sure to keep the blobs segmented out to a separate table if possible and be careful with select * over there.
I can’t think of the specifics, but I think I know what you are talking about.
Unless some MySQL guru cares to step in and tell me “Why storing Avatars in MySQL is no big deal”, conventional wisdom stills tells me, “Not a good idea.”
[ot]
FWIW, this topic started as follows…
I let people upload an Image for their Profile Pic/Avatar, but my code only allows one occurrence of any given Image File. (The only thing worse than one person using Miley Cyrus as their Avatar, is 500 people doing it. Yuck!!)
To stop everyone from doing that, my script uses the function hash_file() to take “Debbie At Beach.jpeg” and convert it to “98543890918.jpeg”, then store the renamed file to the “uploads” directory while also storing the new name into the Member’s record in the “member” table.
The problem is that if I upload a new photo, “Debbie At Mall.jpeg” then the original file is getting left behind?!
There is a PHP function called unlink() which deletes files, but I see that as a major security risk - especially since it would be in my “upload.php” script which is outward facing.
So it was suggested that if I store the Images in the database, then I have extra security, and less fears of some hacker deleting every Image on my website!!
Likely a true point, BUT is it worth the extra coding and Database strain??
From what I have heard over the years, this wouldn’t be a first choice…[/ot]
I have heard of large sites where the images were not stored in the database and the business staff forgot to ask to have anything other than the database backed up. When the hard drive crashed the support staff recovered the database but all the images were gone.
If you have transaction processing on your database and it crashes part way through then the database will get rolled back. Any image that is being added or deleted as a part of the transaction will only get correctly handled if it is stored in the database.
From what I have seen, the bigger the system the greater the benefit to having all the images inside of the database. For most small systems it is more efficient to keep the images separate.
So if you are talking about a system involving ten million images then those images should probably be stored in the database. If you only have a couple of thousand images then they probably ought to be kept separate. There are other factors that affect which is the better choice but system size is often a good indicator.
flickr is one of the biggest photo storage sites that I can think of and according to that article the actual images are stored in the file system.
Photos are stored on the filer. Upon upload, it processes the photos, gives you different sizes, then its complete. Metadata and points to the filers, are stored in the database.
There has been a thread in the past, i can’t remember the thread title. One thing you’ll need to keep in mind is the file space taken up, say images are bing stored in the database, with for example an average of 15MB each, and you have 15,000 images, that could end up being 200GB+ in size, which limits your options for backing up (an external hard drive would probably the only viable storage medium for a local backup of a database that size.
Also the database server might not necessarily be on the same server box as PHP, they could be at opposite ends of a data centre with an unknown number of routers and switches between them. You have to consider the bandwidth that would take up between the servers.
Another reason to store images and blobs in the database is if you have to enforce strict access control, including preventing access to images from non-web users who may have access to the filesystem on the server.
Thanks for the responses, but you guys give me too much credit!!
My website is being built entirely by me, run entirely by me, backed up entirely by me, and managed entirely by me.
Version 1 of my website averages about 30 visitors a day. (If I hit 100 visitors per day, I’ll run naked across the U.S.!!!)
Also, for now, all of this is going on a single Virtual Private Server (VPS) with my hosting company, which is currently GoDaddy.
Small potatos for now…
True.
I don’t understand you.
I use InnoDB, but I don’t use any “Database Transactions” as part of my “Upload an Image” script.
So, at a high level, how exactly I code things to take an Image submitted on an HTML Form and stick it into MySQL?!
Currently, only one image is allowed per person.
I don’t have my code in front of me, but I think I allow something like up to a 4 MB Image which then gets shrunk down to something like 75 X 75 pixels, so the actual File size is maybe 5-10 KB. (As I recall.)
How many Users?
Who knows?
I am hoping to grow maybe 10,000 unique Users. (Could be more, and certainly could be much less!)
So you should have no trouble with maintaining the site with the images stored as files.
The purpose of using transactions is that you can run a related set of database calls and if any one of them fails for any reason then all of them are reversed out. This ensures that the integrity of the database is maintained. For example a transaction that is inserting into three separate tables can use three separate SQL calls to do it and if the third insert failed because a record with that key was already there then the prior two inserts would automatically be undone as well when the ROLLBACK command runs. The changes are only made premanent when the COMMIT is run.
If your "transaction consists or inserting one record into the database and saving an image and the image is kept as a file or is being inserted into the database but without using a transaction then failing to save the image would not undo the prior database insert (or if you ran them in the opposite order having the database insert fail would not remove the image).
With the small volume you are talking about these can easily be taken care of by you if they occur (which would be very infrequent with such a low volume).
Interesting and valid reasons for enclosing the images in a database and I sympathise with the company that failed to backup the images in a directory.
Off Topic:
Website Design & Content - Reviews & Critiques
Perhaps you should submit your Version 1 website to the above forum. There are some knowledgeable members who could no doubt suggest improvements which may influence Version 2.
One more thing – having transactions available is one of the big advantages of storing the image file in the database. You don’t get transactions with filesystems.
At your scale I’d just pick whatever was the easiest to code and run with it.
Interesting, I would have thought that the opposite is true. Storing images as files will always be more efficient than in the database so if a site has low traffic then it doesn’t really matter (performance-wise) how they are stored but when a site reaches huge numbers of visitors then the database will be under heavy strain - and the web server as well, because some scripting language will be needed to serve them. So what are your advantages of using db in large systems? Is it data integrity?
Personally, I would only use db for images if the collection of the images being in sync with the rest of the db were absolutely critical for the application (like important document scans, etc.). Usually this is not the case - and in a case of a rare crash if there are a few images missing (like avatars, gallery photos, etc.) then this is not a big deal. Security, in most cases, can be achieved by putting the images outside the web directory.
Of course, backing up files is equally important, so if a company forgot to set up proper backups then this by no means indicates that we should store images in the db
Well as others have said, you would get the benefits of ACID. Either images are uploaded/moved/saved/deleted, or they aren’t. Avatars don’t sound so important but if an image is an integral part of the whole user account, then I’d imagine doing anything with a user account would benefit from complete transactions: creating or deleting it, for example.
but when a site reaches huge numbers of visitors then the database will be under heavy strain
I wonder if anyone does something like, images are first stored in a db and afterwards, frequently-requested images are copied to a temp filesystem area for quicker access, with short “expirey” dates where a cron job deletes them??
At work we use REDIS to store stuff like last-products-seen, related-products, brand-groups and sorting stuff like most-popular. It seems the image data (URL) is kept with the product information in the cache but it doesn’t seem the image itself is, makes sense since we only have so much memory (I can’t tell on the front-end though because the images are often stupidly large and I see the browser struggling to render those b*tches at a smaller size… arg, frustrating, painting is a known browser bottleneck! Even when the image is 304).
Debbie picks some impossible task (e.g. “I bet I can’t get over 100 visitors per day to my website!”), and then if SitePoint members help her accomplish it, then she has to run naked some great distance?!
Sounds like an interesting topic to research after I get version 2.0 done…
I think this would make a good subject for a test if anyone has got the right setup, say 10 images, then 50, 100, 250, 1000 and perhaps then 5,000 images, each amount tested with one having the images in the file system and the other having the images in the database. Trying out things like adding say 20 images, deleting 20 images, altering 20 images, generating a page with thumbnails of all the images (all on one page - no paging allowed), to see how long it takes and the peak amount of memory used. Perhaps also another set of the same tests with MySQL (or whatever DB server is being used) being located on a separate server box.
It’ll probably make a difference as to what size the images are, ie avatars will be a lot smaller file size then photos.
Yes, certainly ACID is nice, however I haven’t really yet dealt with a situation where images were so critically important to go that route. With the db it’s also nice to have all data in one place so the db backup is a backup of everything. However, this could make db dumps very large. Each solution has it’s pros and cons so it’s not easy to say which one is better.
That could work, however you would have to make some mechanism to know which images are needed for quick access and each request for an image would also need to check if the file has not expired and needs to be re-fetched from the db. This would need to be done in php probably and also increase the server load. Or, you could create cron jobs that will scan server logs and handle it “externally” somehow. All of this sounds too complicated to me.
A workable solution, I think, would be to simply have a mirror of all images from the db in the filesystem - this way there is no overhead when serving them to users and we have the benefit of ACID (in case of problems it would be enough to run some sync script that would update the files on the web server). However, this also means doubling the space required for the images.
I think it all comes down to the question which images are critical enough to need to have them in the db. For me, user profile images aren’t. Photo gallery images aren’t.
For me, user profile images aren’t. Photo gallery images aren’t.
Only thing that comes to mind right away for me are medical images, which nowadays are digitised and are an integral part of a user’s (patient’s) account.
They’d be all compressed to hell too… an MRI with multiple views can be hundreds of finely-detailed large images.
On the other hand usually those images are built in the computer through calculations, so I wonder then if it would make sense to store the basic data in the db and the resulting images elsewhere…