Checking if file exists prior to upload with a single HTTP request

I’m using the File API to make an XHR upload script. Basically, the user uploads files to the server without even leaving the page and without using iframes. It’s quite nice and I can even show them thumbnails of images before they’re uploaded. Furthermore, the File API allows multiple files to be selected at once and therefore uploaded at the same time too.

I’ve hit a little hitch. If the file exists already, I need to provide options to either rename or abort. So I was thinking about how HTTP requests work. My limited knowledge of them is that the server receives some headers first, then the body of the request.

Can PHP receive the headers and then stop there or does it have to receive everything all in one go? I suspect it’s the latter and that the server software controls the separation of headers and request body, but I’d like to make sure.

If the answer is “no, PHP has to receive everything”, my next question is, “what would you do?” Do an Ajax request first to check if the file is allowed in or just upload the whole file and take it from there? The likelihood is that 99% of the time the file will be completely new to the server, and that it will be over 1MB in size.

I would either

  1. send a synchronous ajax request to the server to check if the file already exists

  2. upload the file to the server, but before moving it to its final directory, check whether the filename already exists in that directory.

My personal preference is 1, but you would need a plan B such as option 2 for those who have JavaScript turned off, in which case Ajax is not an option.

With either GET or POST, PHP will receive the whole shebang: HTTP headers and body.

You can, however, use a HEAD request. That gets you just the headers and not the body; perfect when you want to know whether a file exists but don’t care what’s in it.

So instead of [xhrobject].open("GET", …) or [xhrobject].open("POST", …) use [xhrobject].open("HEAD", …) and the usual check in onreadystatechange (readyState 4 and status 200 for existing files, status 404 for non-existing files).
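As a rough sketch of that HEAD check, assuming the uploaded files are directly reachable at some URL on the server (the thread does not say where they live): headExists and its optional XhrCtor argument are made-up names for illustration. The extra argument is only there so the function can be tried outside a browser.

```javascript
// Ask the server whether a resource exists by sending a HEAD request.
// callback(err, exists): exists is true on 200 and false on 404.
// XhrCtor is a hypothetical optional parameter; it defaults to the
// browser's XMLHttpRequest.
function headExists(url, callback, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("HEAD", url, true); // third argument true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return; // 4 = request finished
    if (xhr.status === 200) {
      callback(null, true);  // file is already on the server
    } else if (xhr.status === 404) {
      callback(null, false); // no such file yet
    } else {
      callback(new Error("Unexpected status " + xhr.status));
    }
  };
  xhr.send(null);
}
```

Any status other than 200 or 404 is reported as an error here, on the assumption that the caller should not guess in that case.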

Kalon, why a synchronous request? Surely that should make no difference at all to the server. In fact, surely the server can’t even tell if an XHR is being done synchronously or asynchronously. Please tell me if this is wrong!

Scallio, that’s an interesting suggestion. I’ve been reading about HEAD and it seems to me it wouldn’t help, as it’s supposed to be identical to a GET request, except that the server doesn’t send the body (only the headers). This wouldn’t affect the actual upload, though, and it only affects what the server is sending back.

But couldn’t you do a HEAD request for the image first through AJAX to see if the file exists, and if it doesn’t, upload the file, but if it does, ask the user whether to overwrite, etc.? That’s also why the AJAX request should be synchronous, as Kalon suggests: you need to block everything else until you know whether the file already exists or not.

Off Topic:

How do you let the user select multiple files, btw? Are you using Flash for that?

I’m not too au fait with uploads, but if you have access to the filename prior to the (full) upload then a HEAD request should suffice.

Are you using some sort of JS or Flash library to assist your upload?

I’m failing to see the complication in the process, unless of course you don’t have access to the filename in question.

Is this a business decision, do all files need to retain their original filename? Maybe you’re approaching this wrong…

Ah I see about the HEAD request. It still involves making an extra request, but it looks like I’ll have to do this. I don’t see why it has to be synchronous, though. The actual upload XHR would only take place once the first request that checks if the file exists has completed and the user has decided whether to go ahead with it or not.
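The flow described here can be sketched without any synchronous request: check first, prompt only on a name clash, and start the upload from the callback. This is only an illustration; checkThenUpload and the deps callbacks (exists, ask, upload, fail) are hypothetical names, not part of any library.

```javascript
// Check whether the name is taken, ask the user only on a clash,
// then upload. Everything runs from callbacks, so nothing blocks.
// deps is a hypothetical object supplied by the caller:
//   deps.exists(name, cb)  -> cb(err, found)
//   deps.ask(name, cb)     -> cb(newName), or cb(null) to abort
//   deps.upload(file, name)
//   deps.fail(err)
function checkThenUpload(file, name, deps) {
  deps.exists(name, function (err, found) {
    if (err) { deps.fail(err); return; }
    if (!found) { deps.upload(file, name); return; } // the common case
    deps.ask(name, function (newName) {
      if (newName === null) return; // user chose to abort
      // The new name could clash as well, so check it again.
      checkThenUpload(file, newName, deps);
    });
  });
}
```

Because the upload only starts inside the callback of the check, the ordering is guaranteed even though both requests are asynchronous.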

I do have access to the filenames, on the server. It’s not a business decision and the files don’t need to have their original filenames, but I don’t want the user to have duplicates, or for files to be overwritten. So I need to check first, then prompt the user to check that the files are not the same. It’s preferable if they do retain their original names though.

“Are you using some sort of JS or Flash library to assist your upload?”
I’m writing my own. No Flash, just JavaScript (see below).

<input type="file" multiple="multiple">

Multiple file input, pretty cool. About bloody time as well, but Opera still doesn’t implement it and it looks like IE9 isn’t going to either. :frowning: In Chrome and FF you can then also get some info about the file on the client side and even show a thumbnail. And the user can drag and drop files into the page without having to click a file input button. Check it out: http://hacks.mozilla.org/2009/12/multiple-file-input-in-firefox-3-6/

And the best thing is I can just upload files directly in the AJAX request. For IE and Opera of course I still have to use an ugly iframe to handle the uploads.
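One possible way to choose between the two paths is feature detection rather than browser names. The sketch below is an assumption about how that choice might be made, not what the script in this thread actually does; pickUploadMethod and its env argument are invented for illustration.

```javascript
// Decide whether files can be sent straight through XMLHttpRequest
// or whether the iframe fallback is needed.
// env is a hypothetical optional parameter standing in for the global
// object; it defaults to window in a browser.
function pickUploadMethod(env) {
  var g = env || window;
  // File API objects are needed to read the selected files in script.
  var hasFileApi = typeof g.File !== "undefined" &&
                   typeof g.FileList !== "undefined";
  // The upload property is present on newer XMLHttpRequest objects.
  var hasXhrUpload = typeof g.XMLHttpRequest !== "undefined" &&
                     "upload" in new g.XMLHttpRequest();
  return hasFileApi && hasXhrUpload ? "xhr" : "iframe";
}
```

Called with no argument in a browser, it returns either "xhr" or "iframe", which the upload script could use to pick its code path.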

It wouldn’t have to be synchronous. It all depends on whether you need to get the ajax response that says whether the file exists or not before you actually upload anything.

E.g., if you send a synchronous Ajax request, your JavaScript in the client pauses until a response from the server is received. If the response says the file doesn’t exist, you can continue with the upload. If the response says the file exists, you display an appropriate message in the user’s browser and do not proceed with the upload.

edit: I am assuming that XHR = XMLHttpRequest object.

Synchronous requests are generally a bad thing because if the server takes a very long time to process the request, the browser is frozen until the response arrives. Since in this case an asynchronous request is perfectly suitable, I’d avoid using a synchronous one.

Anyway, thanks for the ideas, I can move forward now. :slight_smile:

I wouldn’t say it’s a bad thing, although the concept behind ajax is asynchronous communications with the server, because there can be situations where synchronous requests are required.

If the server is slow in processing the ajax request for some reason, then it’s likely the rest of the website would be running slowly as well indicating other problems on the server and/or network not related to ajax.

Off Topic:

Just a thought: isn’t synchronous AJAX actually called SJAX? I mean, I’ve never seen any reference to SJAX, but to me it seems it makes more sense than synchronous AJAX. Write it out full and you get Synchronous Asynchronous Javascript And XML, which is a bit weird …

Off Topic:

I think we’re splitting hairs now.

AJAX = Asynchronous Javascript And XML

If you want to send an asynchronous Ajax request, you set the async parameter of the XMLHttpRequest object’s open method to true. If you want to send a synchronous request, you set it to false.

Off Topic:

Yes, I know what AJAX stands for (see my previous post) and I know the difference between asynchronous and synchronous and how to use them thank you very much.

All I was saying is that it doesn’t make sense to call a synchronous XMLHttpRequest (or any equivalent) AJAX, seeing as it is not asynchronous at all.

I am splitting hairs here, I know, which is why I put my wondering in an off-topic block …

Off Topic:

yep splitting hairs :slight_smile: but it makes perfect sense to me.

think of it as saying a negative positive number is a negative number :idea: which is valid.

Off Topic:

Yeah I think it should be termed SJAX too. It’s a bit like when people talk about “PIN numbers”. You know what they mean, but the pedant inside you winces.

Off Topic:

maybe I’m misunderstanding something here, but I don’t see why you need SJAX and maybe that is why afaik it isn’t called that anywhere.

A synchronous asynchronous request is clearly synchronous - well to me it is

just like

  1. a negative positive number is negative: -1 * A = -A

  2. a positive negative number is negative: 1 * -A = -A

  3. a negative negative number is a positive number: -1 * -A = A

or is there something wrong with my logic? :weyes:

Off Topic:

Exactly! :slight_smile:

Off Topic:

nope, disagree :disagree:

Off Topic:

Kalon, I don’t think the analogy works very well. I see what you mean about the numbers, but in this case (async vs sync) it’s more like saying “backwards forwards” is the same as “backwards”. Or a “left right” is the same as a “left”. It just sounds stupid.

And, we all know two wrongs don’t make a right!

Off Topic:

Yup.
There are no such things as “synchronous asynchronous” or “asynchronous synchronous” and “synchronous synchronous” is just synchronous, and “asynchronous asynchronous” would be asynchronous I suppose (not sure though, it makes my head hurt a bit to think about it …)
Hence, the rules for positive/negative do not apply to synchronous/asynchronous :slight_smile: