How To Handle File Uploads With PHP

A common challenge faced by PHP programmers is how to accept files uploaded by visitors to their sites. In this bonus excerpt from Chapter 12 of the recently published SitePoint book Build Your Own Database Driven Web Site Using PHP & MySQL (4th Edition) by Kevin Yank, you’ll learn how to accept file uploads from your web site visitors securely and store them.

The first four chapters of this book are also available on sitepoint.com. If you’d rather read them offline, you can download the chapters in PDF format.

We’ll start with the basics: let’s write an HTML form that allows users to upload files. HTML makes this quite easy with its <input type="file"/> tag. By default, however, only the name of the file selected by the user is sent. To have the file itself submitted with the form data, we need to add enctype="multipart/form-data" to the <form> tag:

<form action="index.php" method="post" 
   enctype="multipart/form-data">
 <div><label id="upload">Select file to upload:
   <input type="file" id="upload" name="upload"/></label></div>
 <div>
   <input type="hidden" name="action" value="upload"/>
   <input type="submit" value="Submit"/>
 </div>
</form>

As we can see, a PHP script (index.php, in this case) will handle the data submitted with the form above. Information about uploaded files appears in an array called $_FILES that’s automatically created by PHP. As you’d expect, an entry in this array called $_FILES['upload'] (from the name attribute of the <input/> tag) will contain information about the file uploaded in this example. However, instead of storing the contents of the uploaded file, $_FILES['upload'] contains yet another array. We therefore use a second set of square brackets to select the information we want:

$_FILES['upload']['tmp_name']

Provides the name of the file stored on the web server’s hard disk in the system temporary file directory, unless another directory has been specified using the upload_tmp_dir setting in your php.ini file. This file is only kept as long as the PHP script responsible for handling the form submission is running. So, if you want to use the uploaded file later on (for example, store it for display on the site), you need to make a copy of it elsewhere. To do this, use the copy function described in the previous section.

$_FILES['upload']['name']

Provides the name of the file on the client machine before it was submitted. If you make a permanent copy of the temporary file, you might want to give it its original name instead of the automatically-generated temporary filename that's described above.

$_FILES['upload']['size']

Provides the size (in bytes) of the file.

$_FILES['upload']['type']

Provides the MIME type of the file as reported by the browser. The MIME type (sometimes referred to as the file type or content type) is an identifier that describes the file format, such as text/plain or image/gif.

Remember, 'upload' is just the name attribute of the <input/> tag that submitted the file, so the actual array index will depend on that attribute.
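To see these entries side by side, here's a minimal sketch of a helper that summarises one $_FILES entry. The function name describe_upload and the sample values are my own inventions for illustration, and the sketch assumes PHP 7 or later. It also looks at one more entry that PHP sets, $_FILES['upload']['error'], which holds UPLOAD_ERR_OK when the upload succeeded.

```php
<?php
// Hypothetical helper (not part of the book's code): summarises one
// entry of the $_FILES array.
function describe_upload(array $file): string
{
    // 'error' is set by PHP alongside the other keys; UPLOAD_ERR_OK (0)
    // means the file arrived intact.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return 'No file was uploaded successfully.';
    }

    return sprintf(
        '%s (%d bytes, reported type %s) is stored temporarily at %s',
        $file['name'],
        $file['size'],
        $file['type'],
        $file['tmp_name']
    );
}

// In a real form handler you would pass $_FILES['upload'] instead.
// A made-up sample array stands in for it so the sketch runs on its own.
$sample = [
    'name'     => 'photo.jpg',
    'type'     => 'image/jpeg',
    'tmp_name' => '/tmp/phpA1b2C3',
    'error'    => UPLOAD_ERR_OK,
    'size'     => 20480,
];

echo describe_upload($sample), "\n";
```

Because the name and type values come from the visitor, pass the result through htmlspecialchars before echoing it into a web page.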

You can use these variables to decide whether to accept or reject an uploaded file. For example, in a photo gallery we would only really be interested in JPEG and possibly GIF and PNG files. These files have MIME types of image/jpeg, image/gif, and image/png respectively, but to cater to differences between browsers, you should use regular expressions to validate the uploaded file's type:

if (preg_match('/^image\/p?jpeg$/i', $_FILES['upload']['type']) or 
   preg_match('/^image\/gif$/i', $_FILES['upload']['type']) or
   preg_match('/^image\/(x-)?png$/i', $_FILES['upload']['type']))
{
 // Handle the file...
}
else
{
 $error = 'Please submit a JPEG, GIF, or PNG image file.';
 include $_SERVER['DOCUMENT_ROOT'] . '/includes/error.html.php';
 exit();
}

The exact MIME type depends on the browser in use. Internet Explorer uses image/pjpeg for JPEG images and image/x-png for PNG images, while Firefox and other browsers use image/jpeg and image/png respectively. See Chapter 8, Content Formatting with Regular Expressions for help with regular expression syntax.
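If you need the same check in more than one place, the three patterns can be folded into a single function. The following is only a sketch: the name is_accepted_image_type is hypothetical, and I've used # as the pattern delimiter so the slash in the MIME type needs no escaping. Keep in mind that the type is whatever the browser reports, so treat it as a hint rather than proof of the file's contents.

```php
<?php
// Hypothetical helper: returns true for the JPEG, GIF, and PNG MIME types
// discussed above, including the Internet Explorer variants.
function is_accepted_image_type(string $type): bool
{
    // '#' delimiters avoid having to escape the '/' in "image/...".
    return preg_match('#^image/(p?jpeg|gif|(x-)?png)$#i', $type) === 1;
}

// Sample values stand in for $_FILES['upload']['type'].
foreach (['image/jpeg', 'image/pjpeg', 'image/x-png', 'text/plain'] as $type) {
    echo $type, ': ', is_accepted_image_type($type) ? 'accepted' : 'rejected', "\n";
}
```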

While you can use a similar technique to disallow files that are too large (by checking the $_FILES['upload']['size'] variable), I'd advise against it. Before this value can be checked, the file is already uploaded and saved in the temporary directory. If you try to reject files because you have limited disk space and/or bandwidth, the fact that large files can still be uploaded, even though they're deleted almost immediately, may be a problem for you.

Instead, you can tell PHP in advance the maximum file size you wish to accept. There are two ways to do this. The first is to adjust the upload_max_filesize setting in your php.ini file. The default value is 2MB, so if you want to accept uploads larger than that, you'll immediately need to change that value. A second restriction, affecting the total size of form submissions, is enforced by the post_max_size setting in php.ini. Its default value is 8MB, so if you want to accept really big uploads, you'll need to modify that setting, too.
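To find out which limits are in force on a given server, you can read both settings with ini_get. The values come back in php.ini shorthand (for example, 2M), so the sketch below converts them to bytes with a small helper. The helper name ini_size_to_bytes is my own, and it assumes the common K, M, and G suffixes.

```php
<?php
// Hypothetical helper: converts php.ini shorthand such as "2M" or "512K"
// into a number of bytes. A value without a suffix is taken as bytes.
function ini_size_to_bytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }

    $number = (int) $value;                      // "8M" becomes 8
    $suffix = strtoupper(substr($value, -1));    // "M"

    switch ($suffix) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

// ini_get returns the raw setting as a string (or false if it is unknown).
$uploadLimit = ini_size_to_bytes((string) ini_get('upload_max_filesize'));
$postLimit   = ini_size_to_bytes((string) ini_get('post_max_size'));

echo "upload_max_filesize: $uploadLimit bytes\n";
echo "post_max_size: $postLimit bytes\n";
```

With the default settings described above, ini_size_to_bytes('2M') gives 2097152 and ini_size_to_bytes('8M') gives 8388608.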

The second method is to include a hidden <input/> field in your form with the name MAX_FILE_SIZE, and the maximum file size you want to accept with this form as its value. For security reasons, this value can't exceed the upload_max_filesize setting in your php.ini, but it does provide a way for you to accept different maximum sizes on different pages. The following form, for example, will allow uploads of up to 1 kilobyte (1024 bytes):

<form action="upload.php" method="post" 
   enctype="multipart/form-data">
 <p><label for="upload">Select file to upload:
 <input type="hidden" name="MAX_FILE_SIZE" value="1024"/>
   <input type="file" id="upload" name="upload"/></label></p>
 <p>
   <input type="hidden" name="action" value="upload"/>
   <input type="submit" value="Submit"/>
 </p>
</form>

Note that the hidden MAX_FILE_SIZE field must come before any <input type="file"/> tags in the form, so that PHP is apprised of this restriction before it receives any submitted files. Note also that this restriction can easily be circumvented by a malicious user who simply writes his or her own form without the MAX_FILE_SIZE field. For fail-safe security against large file uploads, use the upload_max_filesize setting in php.ini.
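When either limit is exceeded, PHP records the reason in $_FILES['upload']['error'] using its built-in UPLOAD_ERR_* constants. Here's a rough sketch of how a script might turn that code into a message for the visitor. The function name upload_error_message and the wording of the messages are assumptions of mine, not the book's code.

```php
<?php
// Hypothetical helper: maps PHP's upload error codes to readable messages.
// An empty string means the upload succeeded.
function upload_error_message(int $code): string
{
    switch ($code) {
        case UPLOAD_ERR_OK:
            return '';
        case UPLOAD_ERR_INI_SIZE:
            return 'The file exceeds the upload_max_filesize limit in php.ini.';
        case UPLOAD_ERR_FORM_SIZE:
            return 'The file exceeds the MAX_FILE_SIZE limit set in the form.';
        case UPLOAD_ERR_NO_FILE:
            return 'No file was selected.';
        default:
            return 'The upload failed (error code ' . $code . ').';
    }
}

// In the form handler you would pass $_FILES['upload']['error'].
// A constant stands in for it so the sketch runs on its own.
echo upload_error_message(UPLOAD_ERR_FORM_SIZE), "\n";
```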

Assigning Unique Filenames

As I explained above, to keep an uploaded file, you need to copy it to another directory. And while you have access to the name of each uploaded file with its $_FILES['upload']['name'] variable, you have no guarantee that two files with the same name will not be uploaded. In such a case, storage of the file with its original name may result in newer uploads overwriting older ones.

For this reason, you'll usually want to adopt a scheme that allows you to assign a unique filename to every uploaded file. Using the system time (which you can access using the PHP time function), you can easily produce a name based on the number of seconds since January 1, 1970. But what if two files happen to be uploaded within one second of each other? To help guard against this possibility, we'll also use the client's IP address (automatically stored in $_SERVER['REMOTE_ADDR'] by PHP) in the filename. Since you're unlikely to receive two files from the same IP address within one second of each other, this is an acceptable solution for most purposes:

// Pick a file extension 
if (preg_match('/^image\/p?jpeg$/i', $_FILES['upload']['type']))
{
 $ext = '.jpg';
}
else if (preg_match('/^image\/gif$/i', $_FILES['upload']['type']))
{
 $ext = '.gif';
}
else if (preg_match('/^image\/(x-)?png$/i',
   $_FILES['upload']['type']))
{
 $ext = '.png';
}
else
{
 $ext = '.unknown';
}

// The complete path/filename
$filename = 'C:/uploads/' . time() . $_SERVER['REMOTE_ADDR'] . $ext;

// Copy the file (if it is deemed safe)
if (!is_uploaded_file($_FILES['upload']['tmp_name']) or
   !copy($_FILES['upload']['tmp_name'], $filename))
{
 $error = "Could not save file as $filename!";
 include $_SERVER['DOCUMENT_ROOT'] . '/includes/error.html.php';
 exit();
}

Important to note in the above code is the use of the is_uploaded_file function to check if the file is "safe." All this function does is return TRUE if the filename it's passed as a parameter ($_FILES['upload']['tmp_name'] in this case) was in fact uploaded as part of a form submission. If a malicious user loaded this script and manually specified a filename such as /etc/passwd (the system password store on Linux servers), and you had failed to use is_uploaded_file to check that $_FILES['upload']['tmp_name'] really referred to an uploaded file, your script might be used to copy sensitive files on your server into a directory from which they would become publicly accessible over the Web! Thus, before you ever trust a PHP variable that you expect to contain the filename of an uploaded file, be sure to use is_uploaded_file to check it.

A second trick I have used in the above code is to combine is_uploaded_file and copy together as the condition of an if statement. If the result of is_uploaded_file($_FILES['upload']['tmp_name']) is FALSE (making !is_uploaded_file($_FILES['upload']['tmp_name']) TRUE), PHP will know immediately that the entire condition will be TRUE when it sees the or operator separating the two function calls. To save time, it will refrain from bothering to run copy, so the file won't be copied when is_uploaded_file returns FALSE. On the other hand, if is_uploaded_file returns TRUE, PHP goes ahead and copies the file. The result of copy then determines whether or not an error message is displayed. Similarly, if we'd used the and operator instead of or, a FALSE result in the first part of the condition would cause PHP to skip evaluating the second part. This characteristic of if statements is known as short-circuit evaluation, and works in other conditional structures such as while and for loops, too.
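If you'd like to watch short-circuit evaluation happen, the following small sketch records which parts of a condition actually run. The check function and the two arrays are invented purely for this demonstration.

```php
<?php
// Hypothetical helper: notes that it was called, then returns $result.
function check(string $label, bool $result, array &$calls): bool
{
    $calls[] = $label;
    return $result;
}

$orCalls = [];
// The first operand is TRUE, so "or" already knows the outcome and the
// second call is skipped.
if (check('first', true, $orCalls) or check('second', false, $orCalls)) {
    // Condition is TRUE.
}

$andCalls = [];
// The first operand is FALSE, so "and" already knows the outcome and the
// second call is skipped.
if (check('first', false, $andCalls) and check('second', true, $andCalls)) {
    // Never reached.
}

echo implode(', ', $orCalls), "\n";   // only the calls that actually ran
echo implode(', ', $andCalls), "\n";
```

In both cases only the first call is recorded, which is the same reason copy never runs when !is_uploaded_file(...) is TRUE.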

Finally, note that in the upload script above I've used UNIX-style forward slashes (/) in the path, despite it being a Windows path. If I'd used backslashes, I'd have had to write each one as a double backslash (\\) to prevent PHP from interpreting it as the start of an escape sequence. However, PHP is smart enough to convert forward slashes in a file path to backslashes when it's running on a Windows system. Since we can also use single slashes (/) as usual on non-Windows systems, adopting forward slashes in general for file paths in PHP will make your scripts more portable.
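The naming scheme from this section can also be wrapped in a function so the path is assembled in one place. This is a sketch under my own assumptions: the function name unique_upload_path is hypothetical, and the replacement of colons is an addition of mine, because IPv6 addresses contain colons and Windows doesn't allow that character in filenames.

```php
<?php
// Hypothetical helper: builds a time-plus-IP filename inside $dir,
// using forward slashes throughout.
function unique_upload_path(string $dir, int $time, string $ip, string $ext): string
{
    // Assumption beyond the original script: swap the colons of an IPv6
    // address for hyphens so the name is valid on Windows too.
    $safeIp = str_replace(':', '-', $ip);

    // Trim any trailing slash or backslash so exactly one '/' joins the parts.
    return rtrim($dir, '/\\') . '/' . $time . $safeIp . $ext;
}

// In the upload script you would pass time() and $_SERVER['REMOTE_ADDR'].
// Fixed sample values stand in for them so the sketch runs on its own.
echo unique_upload_path('C:/uploads/', 1300000000, '192.168.0.1', '.jpg'), "\n";
```

With the sample values shown, the result is C:/uploads/1300000000192.168.0.1.jpg.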
