I’m creating a MySQL database that needs to be able to store multiple images from each user and am unsure how I should store them. I’ve heard that you can store the image itself as binary data right into MySQL or store the images in a directory structure with references to their location in the database.
Any information on which option is better and how you would go about do this would be greatly appreciated.
There are advantages and disadvantages to both of the alternatives. Storing the actual images in the database itself means that you have as much control of the images as you do of the rest of the data and the images also get included in the database backups. Storing them separately means they take up less space but are accessible to anyone and not controlled by the database. So deleting everything in the database that the image is associated with would not necessarily delete the image.
You need to work out which of the advantages and disadvantages of the two alternatives are the more important to your situation in order to decide which way to go.
Basically neither alternative is always better than the other.
One thing you need to keep in mind is the sizes of the files, say the files being stored are 10MB in size, now just a few that size you can get away with storing in the database but if you’ve got 1,000s of them at 10MB a time, then the database is going to get huge and could make backing up the db a PITA
I’ve hears of a company that decided that because the thousands of images would take up so much space in the database they kept them outside in a separate folder. Only the database was backed up so when the system crashed they lost all their images. It wasn’t quite as bad as they thought though because only about 10% of the images were still referenced from within the database so the 90% of what they lost they could have got rid of much earlier anyway (and would have been if the images were stored in the database).
It wasn’t quite as bad as they thought though because only about 10% of the images were still referenced from within the database so the 90% of what they lost they could have got rid of much earlier anyway (and would have been if the images were stored in the database).
This is true. Of course, it isn’t that difficult to write the code in a way it deletes the images from the folder as well.
And sooner or later you’ll have to put those images in a folder anyway, don’t you? You can’t show an image that is stored in a database.
My personal prefernce would be to store images in the file system, recording their details (name, extension, size, folder etc) in the database using unlink() when deleting a file (and obviously removing the relevant entry from the db.
If the disk that a db with images stored in it fails then your going to loose the images, just as easily as if they are stored in the file system on a disk that has failed (though some of the images could potentially be recovered, there are companies about who can recover files), it is probably easier to try and recover individual images which have been stored in the file system from a failed disk then it would be to recover them when they are stored in a database.
If images are stored in a db and the db becomes corrupted beyond repair then all images are lost, where as if they are stored in the file system and the db becomes corrupted then the relevant script(s) could be run to rebuild the db. If whatever has happened to a disk has corrupted the images when they are stored in the file system then chances are that the db may also have been corrupted unless the db is stored on a separate disk.
In either case whether the images are stored in the db or in the file system, in addtion to backing up the db, the file system should also be backed up. If the images are stored in the db, once the db reaches a certain point then your options for backup methods are more limited, if the db reaches whatever the maximum size capacity is for bluray disks then your only option for backups would be to another hard drive.
Is the server that you intend to use a shared server or a dedicated server? Is it a virtual private server?
In both cases you’ll need to consider permissions (who has access and what rights).
No - you can display then straight from the database into the web page by using the appropriate code.
For example just link to the following code from an <img> tag in your web page to display an image straight from the database without saving it anywhere on the server as a file:
header('Content-type: image/jpg');
$stmt = $pdo->query("select myimage from image_table where imageid =1");
$row = $stmt->fetch();
print $row[0];
To load images into the database and retrieven them back into a file are also simple:
update image_table set myimage = LOAD_FILE('animage.jpg')
select myimage into DUMPFILE 'newimage.jpg'
Also the problem of losing the 10% of the images wasn’t incompetence - it was lack of communication between the team writing the application and the team responsible for the backups - the one team was never informed that anything needed backing up except the database and the other team was never informed that only the database was being backed up. As for the 90% of images that weren’t needed anyway - that was because the images were not stored in the database and so deleting the last of the thousands of database references to an image didn’t delete the image as well. Some were also leftovers because the database query did a ROLLBACK and the image didn’t understand that meant that it diidn’t need to save itself.
Also images outside the database can be easily stolen whereas ones in the database are as secure as everything else in the database.
There are lots of reasons for NOT storing images in the database and a number of reasons FOR storing them in the database. Anyone who automatically chooses one option or the other because that’s the way they have always done it is the one demonstrating that they are incompetent. This might just be the exception case where doing it the other way will provide more advantages and less disadvantages.
Personal preference shouldn’t come into it - both ways have advantages and disadvantages and when you weigh them all up in a given situation one way or the other will be obviously better regardless that you prefer the way you know - having not learnt how the alternative works and the situations where that alternative perhaps gets rid of 90% of your code and maybe runs 20% faster.
I assume that the OP asked the question in the first place because they wanted to consider the pros and cons of both alternatives and how they apply to their particular situation and are not holding a popularity contest.
Thanks, didn’t know that. I learn something new here everyday.
Also the problem of losing the 10% of the images wasn’t incompetence - it was lack of communication between the team writing the application and the team responsible for the backups - the one team was never informed that anything needed backing up except the database and the other team was never informed that only the database was being backed up.
Incompetence may have been a bit strong, I admit. But still, if communication had been good, this wouldn’t have happened.
As for the 90% of images that weren’t needed anyway - that was because the images were not stored in the database and so deleting the last of the thousands of database references to an image didn’t delete the image as well. Some were also leftovers because the database query did a ROLLBACK and the image didn’t understand that meant that it diidn’t need to save itself.
Yes, I agree that the effort to keep separate handling of images and data synchronized (I’m sure that’s not the correct word, but I hope you understand what I mean) depends on the complexity of the application, and is surely a factor to be considered.
[ot]
Also images outside the database can be easily stolen whereas ones in the database are as secure as everything else in the database.
Makes me wonder: wouldn’t it be possible to store them in a protected folder, and display them with some code similar to that used to get them from the database?[/ot]
That goes part way to achieving the security that placing the images in the database would give but you’d need to add code to it to test if the current person is allowed to access the image. When extracting from the database that would already have been taken care of as that criteria would be part of the WHERE clause on the select.
Again its a trade off between the data taking up more space and the code being a lot longer.
Which way you go with this will mostly depend on the size of the application you are developing. For tiny applications keeping the images separate usually has advantages but with bigger applications the benefits of placing the images in the database usually far exceed the alternative and can save millions of lines of code.
In the book “SQL Antipatterns” by Bill Karwin placing images outside of the database is one of the 24 practices that he identifies as usually being a bad idea. There is a whole chapter of that book that covers all the different sorts of things that can go wrong if you keep the images separate as well as lists of the pros and cons of both alternatives so that you can decide whether your application is one where the benefits of placing the images separately outweigh all the disadvantages that you get from doing so.
From my experience the main problem people have in this area is that keeping the images separate is usually a good idea for tiny applications but the advantages of doing that disappear as the application gets bigger while those used to working on the tiniest of applications don’t realise the problems they are creating for themselves when they continue to keep the images separate when their application has grown big enough to make it better to put them in the database - even if the only benefit you will get from that is automatic cleanup of images on a ROLLBACK it might be worth it.