A simple question I’ve been unable to find an answer to.
I want to create temporary files by using tempnam(), and I’m putting them in the directory defined by sys_get_temp_dir(). Is there a danger that the temporary files directory will start to get very large? Or does the filesystem periodically flush this directory clean? I don’t want to start doing maintenance cleaning on a temporary files directory.
PHP deletes the temporary files made by the engine (such as those of the $_FILES array) as part of its end-of-request cleanup (database connections are also formally closed at this point). This cleanup occurs even if a fatal error occurs. The only thing that can stop it is an engine-level error, and those are almost impossible to trigger from a script.
I’m familiar with that cleanup, but it didn’t occur to me that it would also apply to manually created files (like with tempnam). So essentially, as part of the cleanup, the temp_dir is emptied?
The thing is, I want to create temporary files that exist after the end of the script, because I want the user to confirm whether the uploaded file should be kept or not. I know I can easily have my own temporary directory, but I thought I’d try the default system temporary directory first.
Files created with tempnam are not cleaned up by PHP (check the docs, especially the changes in 4.0.6 section).
If there is no tmpwatch on the tmp folder, it may fill up unexpectedly (as I once found out after it filled up with “temporarily” stored images for a mailer, which had been sitting in the tmp folder for more than a year).
Thanks Immerse, I’ll do a bit of reading on tmpwatch. Cron jobs are probably no good (I’ve never actually used one), as this is for something that’ll be released “into the wild” for people to put on their own servers.
What about a little piece of PHP that is in the script that, say, checks the time and then runs the cleanup on the same day every month or week. I can see this might be bad practice because the time is checked every time, but considering what some PHP scripts do just to output some HTML, this might not be too bad. What do you reckon?
One big problem I can foresee: how do you know which files you can delete? Maybe there are files in there belonging to other processes, which you shouldn’t throw away. Perhaps if you name the files yourself with a specific prefix, you can identify them more easily (e.g. /tmp/raffles_[timestamp].tmp or something).
Kalon’s suggestion is a good one: create a directory somewhere where you store your files. Only downside is that users installing it in the wild will have to change the directory permissions at setup.
How long do you need to store the files? Can you not throw them away at the end of the request?
Edit>>
Ah, I see tempnam() also takes a prefix as its second argument. Awesome.
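For anyone following along, here is a minimal sketch of the prefix idea. The prefix 'raf' is just a made-up example; it is kept to three characters because Windows only uses the first three characters of the prefix. Note that nothing in this snippet deletes the file, so it is still there after the script ends.

```php
<?php
// Hypothetical prefix for illustration; use something unique to your application.
$prefix = 'raf';

// tempnam() creates an empty file with a unique name and returns its path.
$path = tempnam(sys_get_temp_dir(), $prefix);
if ($path === false) {
    throw new RuntimeException('Could not create a temporary file');
}

// Store whatever needs to outlive this request.
file_put_contents($path, 'uploaded data awaiting confirmation');

// PHP will not remove this file for you. Once the user has confirmed
// or rejected the upload, delete it yourself with unlink($path).
```

The returned path would have to be remembered somewhere (the session, for instance) so that a later request can find the file again.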
Well, if it’s a folder that I’ve created solely for this purpose, then I can just empty the whole thing without any worries. Also, if the folder is created by the script and the script is the only one to read, write and delete files from it, could that cause any problems? I would have thought anything created by the script would by default be fully accessible by the script.
Yup, if the script creates it, then it has full access.
However, the script (or more accurately, the server-user that runs the script) has to have write permissions on the folder to create the files. That’s why, when installing just about any piece of PHP open source software, you’ll always need to change the permissions on at least 1 folder.
Although… if you can create a folder in the tmp dir (which is globally writable) this may not be necessary. I wouldn’t count on it though, as different server setups will have different rules.
But yeah, use a folder specifically for the script, and just empty it out every so often.
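A rough sketch of the "folder just for the script" approach, assuming the system temp dir is writable on the target server (which, as said, you can't count on everywhere). The function name app_temp_dir and the directory name 'myapp_uploads' are made up for the example.

```php
<?php
// Hypothetical helper: returns a private directory inside the system
// temp dir, creating it on first use.
function app_temp_dir(string $name = 'myapp_uploads'): string
{
    $dir = rtrim(sys_get_temp_dir(), '/\\') . DIRECTORY_SEPARATOR . $name;

    // 0700 keeps other system users out. is_dir() is checked again after
    // mkdir() in case a parallel request created the directory first.
    if (!is_dir($dir) && !mkdir($dir, 0700, true) && !is_dir($dir)) {
        throw new RuntimeException("Could not create temp directory: $dir");
    }
    if (!is_writable($dir)) {
        throw new RuntimeException("Temp directory is not writable: $dir");
    }
    return $dir;
}

$dir  = app_temp_dir();
$file = tempnam($dir, 'up');  // temp file inside our own directory
```

If mkdir() fails here, falling back to a directory inside the application (with the permissions set at install time) is the obvious alternative.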
In my opinion that’s okay. However, I try to avoid file or db access on every request just for this purpose. A quick way to decide whether to run your garbage collector is something like this:
// runs the cleanup on roughly 1 request in 100
if (mt_rand(1, 100) === 1) {
    // do cleanup
}
Or you could base the check on microtime() (possibly faster). Either way, you set the probability of invoking the cleanup procedure, and the majority of requests are not affected.
You could use filemtime() to delete only old files. Personally, I think it’s a good idea to take extra precautions and not delete every file, because there can be weird setups with paths, user errors, etc. For example, append a .tmp extension to every temp file and then purge only the files that end with .tmp. I remember one accident where a new developer wrote a method to delete temporary images after uploading, and because of a small mistake in the path he managed to wipe out the whole site down to the last file. And yes, the method worked recursively! Fortunately, this was a development server. He could have deleted all the other projects on it as well, but luckily that didn’t happen.
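Putting those precautions together, a cautious purge might look something like the sketch below. The function name purge_old_temp_files, the directory name and the one-day age limit are all made up for the example. It deliberately does not recurse, skips symlinks, and only touches files with the expected suffix.

```php
<?php
// Hypothetical helper: deletes files directly inside $dir that end with
// $suffix and were last modified more than $maxAgeSeconds ago.
// Returns the number of files deleted.
function purge_old_temp_files(string $dir, int $maxAgeSeconds, string $suffix = '.tmp'): int
{
    if (!is_dir($dir)) {
        return 0; // wrong path: do nothing rather than guess
    }
    $cutoff  = time() - $maxAgeSeconds;
    $deleted = 0;

    // scandir() lists direct children only, so there is no recursion.
    foreach (scandir($dir) ?: [] as $name) {
        if (substr($name, -strlen($suffix)) !== $suffix) {
            continue; // not one of our temp files
        }
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        if (!is_file($path) || is_link($path)) {
            continue; // skip directories and symlinks
        }
        $mtime = filemtime($path);
        if ($mtime !== false && $mtime < $cutoff && unlink($path)) {
            $deleted++;
        }
    }
    return $deleted;
}

// Run on roughly 1 request in 100; directory name and age are assumptions.
if (mt_rand(1, 100) === 1) {
    purge_old_temp_files(sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'myapp_uploads', 86400);
}
```

Since tempnam() doesn't let you choose the extension, the .tmp files would have to be named by the script itself (or renamed after creation); matching on a prefix instead would work just as well.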