How To Handle File Uploads With PHP

Notice: This is a discussion thread for comments about the SitePoint article, How To Handle File Uploads With PHP.


This needs to be said. If you are uploading images. Do not under any circumstance use $_FILES [*]['type'] this data is fully user controlled. Use either exif_imagetype or [URL=“http://us.php.net/manual/en/function.getimagesize.php”]getimagesize both of which can return a constant determining the actual image type. I am sad to see, that even SitePoint is continuing to use this insecure method of checking the file type for there newcomer articles.

Just to mention, have bought last two versions of the book in order to find out how to upload files. Both editions show how to do so as long as the files are stored IN THE DATABASE itself, and go on to say that this is not an ideal situation, but that information on how to store files in server director (with just file names and paths stored in the database) is common knowledge and can be found elsewhere in the book, but stops short of saying where.

If there is a future edition, please could you put the COMPLETE process in the same chapter.

I had to look elsewhere for a solution, so am sorted now, but it was frustrating at the time and buying the later edition in the hope of finding the complete solution was a disappointment (and costly!).

you can surely strimline all those preg_math’es into one regex?

perhaps have a seperate string for allowed image-types which plugs insite an or regex;;

$types = “jpeg|png|gif”;

/image\/($types){1}/i

Also, why the article uses copy() instead of the move_uploaded_file(), which is more correct thing to do for uploaded files. I mean - you don’t really want to copy the uploaded file (you don’t want to leave a copy of it still hanging around in a temp directory)

move_uploaded_file() also includes the same check as is_uploaded_file(), so using move_uploaded_file() is a much better way then to combine is_uploaded_file() && copy()

Also, the article does not mention any type of input filters.

Overall, I would recommend people to avoid doing everything written in that article and instead do things the right way - use getimagesize() to check image and use move_uploaded_file() to move the file. Also, use the move_uploaded_file as soon as possible - php may destroy the temporary uploaded file at the end of the session, so reference to uploaded file is only really good for one session - if you really want to see your uploaded file again then use move_uploaded_file() to move your file to a permanent location.

And if you want to check not only for images but for some other type of files, then use php’s fileinfo extension to get the mime type of uploaded file.

It might also be worth mentioning that you can just use the PHP function:

getimagesize($path_to_file)

which will return an array of information including the type. The type is either IMAGETYPE_JPEG,IMAGETYPE_GIF, or IMAGETYPE_PNG (at least for the sake of discussion). It is very rare you are just going to accept an image with its uploaded size, etc. While you may keep one for archive, you will likely resize, crop, and even make multiple versions, e.g. thumbnails, for your website. So using the PHP image functions at that point would just be a natural flow of things.

Also, as mentioned by Kevin, IE and FF send different mime-types. Using imagegetsize(), gets rid of potential problems arising. With the number of browsers being used and how they are constantly changing, I do not feel safe relying on $_FILES[*][type].

Apologies if this steps on anyones’ toes but the article feels more like a marketing gimmick (to sell the newest edition of the associated book) compared to anything actually practical and in the spirit of passing along very basic best practices that Sitepoint should be doing.

Personally, I find it vexing that technical writers can still push out this awful code—or is it simply a relic of previous editions with little-to-no revision since earlier publication?

That’s a strange marketing approach - write an article riddled with bad php practices to sell a book on php programming. I actually bought his first book about 6 or 7 years ago. When I was a php beginner, the book seemed very helpful and author very knowledgeable. But not after many years of programming php, I can spot a less than knowledgeable author from a mile away. If this article is a sample chapter, then I would stay far away from the book. A total beginner may adopt these techniques and be using them not knowing that this is a wrong way to handle uploads.

The opening paragraph of the article states:

Hi folks,

Thanks for the feedback on the techniques I present in this section of the book. It would be fair to say that this section has had relatively few changes made to it in the 4th edition compared to the rest of the book. That said, I wouldn’t agree that the code presented is “awful”.

Yes, the three regexps could have been combined into one, but I made the choice to keep them separate to make the code as clear as possible, and to illustrate how one might accept several unrelated file types.

Yes, one might use move_uploaded_file instead of combining is_uploaded_file and copy, but neither option is more secure or correct than the other, and this approach gave me the opportunity to demonstrate the basic copy function, which was introduced along with the other basic file manipulation functions in the previous section of the book, in action.

My thanks to logic_earth for pointing out the problem with relying on the $_FILES[*][‘type’] field to restrict the uploaded file type. Although this would rarely be a security issue (what does an attacker achieve by uploading a malicious code file, for example, if the application is just going treat it as an image?), using something like get exif_imagetype would indeed be a tighter, more reliable solution. I’ll get the article (and the book, in its next printing) updated accordingly.

copy() won’t work if the upload temporary directory is outside open_basedir, which happens to be the case on many shared hosts. You have to use move_uploaded_file().

Also, I’m not entirely sure why we really need to do is_uploaded_file(). It’s not as if the temporary file name is given by the user. It’s generated entirely by PHP. The user cannot actually specify “/etc/password” as the temp. name. I still do the check, but the documentation is fairly useless about this.

And on the topic of security, it is more important to make sure that you force download on image files (or any uploaded files) or you risk XSS attacks in several versions of IE. I wouldn’t trust the browser sending a mime type not so much for security reasons, but because the browser can be flat out wrong about it.

Come on guys, whilst I appreciate security is in no way a trivial matter, this book doesn’t claim to be a guide to building secure PHP applications.

Is it just a re-hash of code for marketing purposes? Probably.

Does it provide what it claims to offer? Sure.

Are there some security issues? Of course.

But hey, show me an application in any language which has never had any. Let’s take it for what it is, an entry level book and a great start for entry level PHP’ers.

:slight_smile:

Good point! I’ll be sure to insert a note to this effect as well.

And on the topic of security, it is more important to make sure that you force download on image files (or any uploaded files) or you risk XSS attacks in several versions of IE.

Can you be more specific? I know some old versions of IE are a bit funky and will second-guess stuff like the MIME-type of a URL ending in .html (treating it as an HTML document even if the MIME type says otherwise), but I’ve never seen anything that would affect a document served with an image MIME type from a .php script.

I don’t think checking the file type is necessarily the best method of checking the file’s type. I’ve run into some instances where the image file type wasn’t listed as iamge/jpeg when the extension was .jpeg (application/octet-stream).

Oh IE is bad. It doesn’t matter what kind of file you tell IE it is, or what the filename it is. If there’s HTML near the beginning of the image’s data, it’s an HTML file! I don’t remember if they’ve patched it in 7 or 8, but it sure is exploitable in IE 6. Just craft an image, throw some HTML near the beginning, and serve with any MIME type. It will be an HTML file according to IE’s interpretation.

You may have noticed some sites (namely Blogger) force download of uploaded images. It’s also annoying as heck though, but hey, you can thank Microsoft…

This is a great post. I have been looking for something just like this for a couple weeks now.

Does anyone know I how would add this into a typical SENDMAIL script and how to ensure a custom “Thank you” page displays after the user submits a contact form with file upload option? Any ideas or good tutorials? I’m frustratingly new to this and stuck behind the 8-ball.

Thanks for the great post!

Craig

Nice article, but there is an awful lot of preg-matching. :wink:

The unknown file extension also seems redundant, since you have argued that only gifs, jpegs and pngs should be allowed at in the first place.

How about:

$ext_is_ok = preg_match('/^image\\/(gif|p?jpeg|(x-)?png)$/i', $ext);
if ( $ext_is_ok ) {
    $ext = $ext[1]; // Get extension, 1st parenthesis
    switch ( $ext ) {
        case 'png':
        case 'x-png':
            $ext = '.png';
            break;
        case 'jpeg':
        case 'pjpeg':
            $ext = '.jpg';
            break;
        case 'gif':
            $ext = '.gif';
            break;
    }
    // Continue to handle the upload
} else {
    // error handling
}

Better yet forget using preg_match anything…


$d = getimagesize( $img ); # or exif_imagetype
$n = 'new_img_name' . image_type_to_extension( $d[2] );

Perhaps, but you will be allowing more types than gifs, pngs and jpegs.

I’m sure you can find a way to limit what types you let though with that method.

Yes I could. I was just pointing out the obvious fact, since I teach PHP to students that have a tendency to miss obvious facts.