Hi there,
What is the general consensus these days when it comes to storage of secure documents and files on web? (PHP and MySQL in mind of course)
Years ago, and in theory only using workgroup level DBs, I was always advised to go with file system as storage and not to allow database to blow up in size. (store only file references, document names for searching …the tidbits of search-friendly text).
Keep in mind I have used MySQL for simple reporting queries, backup/dump CRUD actions etc but generally database design was left out to DBAs or creators of CMS packages…in other words I had little say and choice when it came to ‘CREATE table…’.
At this stage I am looking to pickup a php framework and start gaining more control over some of the processes and with that come questions as such. From what I see most php applications (blogs, shopping carts, booking systems, member generated content etc.) seem to go with MyISAM tables while file system storage seems to be the mechanism of choice for handling documents & files.
On the other hand, documents tend to be uploaded into public folders and after the path/URI is revealed the mechanism of ‘protecting’ documents is really gone - even in those apps with ‘member authentication’ systems in place.
So, what do most people do in order to protect/store the files while not blowing up database in size?
I know it is a somewhat general question so lets put in some perspective:
Say I wanted to create a shopping cart app that would allow for member/shopper to purchase and download the file within 24hr of transaction.
It would be rather easy part to capture:
- path to file
- timestamp start of file download
- Y/N file download complete updated with a query
- CustomerID of performing the action
but the fact remains that user will be revealed a file path of http5://site.com/downloads/file.zip even with some file path obscuring…
So if we are storing/selling DVD movies we could potentially deal with 5-6GB of a single file…I can only imagine what it will do to MySQL table.
So, respected gurus…what is the general advise with respect to longevity of the application?
Thanks for your time!
Put the documents in private Amazon S3 buckets and provide time-expiring unique URLs to those documents (which S3 has a mechanism for you to create). The benefits are infinite storage capacity, automatic durability through geographic replication, and granular access control.
http://aws.amazon.com/s3/
Thank you Dan,
For the sake of discussion…lets take Amazon out the picture as this potential app should/could be ran on shared or dedicated hosting which should provide more than enough processing power.
I am hoping that my own setup of MySQL and a table or two would be sufficient enough to build member ACLs and what users can access/purchase etc.
Or even to leave Amazon in…how are they doing it then?After all web is web…browser is browser and URIs sit in public realm. Many shopping cart apps are already out there.(again just using a shopping cart and digital product as an example)
Thanks again for your thoughts, much obliged.
Then you don’t link people directly to the file but to a script. Your script checks that the request is valid (is from a paid user, is within the right time window, whatever your check may be) then sends the appropriate headers and file data. The files don’t have to be anywhere publicly accessible, just somewhere the script can access them.
To extend your kindness…any chance you can drop a link to this method or google friendly term for such a PHP script ? …just starting with php sorry for the newbie questions.
Thanks again!
I don’t know of a name for it. I’ll give you an example instead.
Let’s say that once someone purchases an ebook from you that they will download as a PDF file. You insert a row in your ORDERS table in the database for the customer, and you generate a unique non-guessable key for the purchase, like an MD5 hash of their e-mail address combined with a salt.
You send them an e-mail at purchase time built with that key:
http://www.example.com/download.php?key=d41d8cd98f00b204e9800998ecf8427e
This is what download.php does:
<?php
mysql_connect('dbhost', 'username', 'password');
mysql_select_db('dbname');
$sql = "SELECT email, file_purchased FROM orders WHERE md5(concat('salt', email)) = '" . mysql_real_escape_string($_GET['key']) . "' AND purchase_date >= DATE_SUB(CURRENT_TIMESTAMP, INTERAL 1 DAY)";
$result = mysql_query($sql);
if (mysql_num_rows($result) == 0) {
die("You are not authorized to access this file.");
}
$row = mysql_fetch_assoc($result);
header('Content-type: application/pdf');
header('Content-Disposition: attachment; filename="' . $row['file_purchased'] . '"');
readfile('/path/to/' . $row['file_purchased']);
Thank you very much!
I would have struggled to even search for terms for this. I am familiar with md5 for user/pass resets (through other scripts) but have not really thought of using it for this. Brilliant and simple.
Thank you again for taking your time!