Thread: Unbuffered Queries
Dec 30, 2012, 20:42 #1
This is a new area for me within PHP. I'm gearing up for a new role in which PHP will be at my disposal to automate reporting on some very large databases. I'll be dumping out CSVs where the queries needed to build and fully export each file will most likely run for 20+ minutes.

I'm a complete beginner when it comes to understanding how PHP handles output while it's still running. When using mysql_unbuffered_query(), is the script going to time out at the set time limit? If so, what are my options? Should I be jacking the PHP timeout up to 2 hours? I'll also want to be able to retrieve task progress while the job is running; what options might I have for this?

What might some error handling look like for this?

Kyle Wolfe
Dec 31, 2012, 03:12 #2
By default, a PHP script is aborted when the user closes the browser connection, but only at the moment the script tries to send output (unless ignore_user_abort() is set). This may work differently if you have zlib compression enabled or some other form of output buffering. Therefore, if you want the script to run for a long time, it's best not to output anything at all until it finishes, or to start output buffering and flush the buffer at the end of the script. For such a long-running script I would write any output to a log file; that way I can monitor the progress independently of the script by reading the log. I would certainly not rely on the browser connection staying open for 20+ minutes.
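To make this concrete, here is a minimal sketch of the log-file approach. The log path, chunk size, and row count are placeholders for the example; only ignore_user_abort(), set_time_limit(), and the file functions are real PHP APIs.

```php
<?php
// Sketch: a long-running job that survives a client disconnect and
// reports progress through a log file instead of browser output.
ignore_user_abort(true);   // keep running if the browser goes away
set_time_limit(0);         // lift the execution time limit for this job

$logPath = sys_get_temp_dir() . '/report_progress.log';
$log = fopen($logPath, 'a');

$total = 100000;           // hypothetical number of rows to process
for ($done = 0; $done < $total; $done += 10000) {
    // ... process the next chunk of rows here ...
    fwrite($log, date('c') . " processed $done/$total rows\n");
}
fwrite($log, date('c') . " finished\n");
fclose($log);
```

You can then watch progress from another terminal with `tail -f` on the log file while the script runs.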
This article should answer your question, especially the second point. IMHO, it would be best not to rely on unbuffered queries but instead to split the queries with LIMIT x, y so that they return smaller result sets in chunks. But YMMV; I think the reliability of long unbuffered queries will vary greatly depending on the server setup.
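A rough sketch of the chunked LIMIT approach, using the mysqli extension (the connection details, table, and column names are made up for the example):

```php
<?php
// Sketch: fetch a large result set in LIMIT offset, count chunks
// instead of one huge unbuffered query, streaming each chunk to CSV.
$db  = new mysqli('localhost', 'user', 'pass', 'reports');
$out = fopen('report.csv', 'w');

$chunk = 10000;
for ($offset = 0; ; $offset += $chunk) {
    // A stable ORDER BY is required so the pages don't overlap or skip rows.
    $res = $db->query("SELECT id, amount, created_at FROM sales
                       ORDER BY id LIMIT $offset, $chunk");
    if ($res->num_rows === 0) {
        break;                       // no more rows
    }
    while ($row = $res->fetch_assoc()) {
        fputcsv($out, $row);
    }
    $res->free();                    // release the buffered chunk
}
fclose($out);
```

Note that large OFFSET values get slower as the scan restarts each time; keyset pagination (`WHERE id > $lastId ORDER BY id LIMIT $chunk`) avoids that if the table has a suitable key.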
Use set_error_handler() to log any errors to a log file.
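A minimal sketch of such an error handler; the log path is just an example:

```php
<?php
// Sketch: route PHP errors to a log file so a long-running job can
// be diagnosed after the fact.
$errLog = sys_get_temp_dir() . '/report_errors.log';

set_error_handler(function ($severity, $message, $file, $line) use ($errLog) {
    $entry = sprintf("[%s] severity %d: %s in %s:%d\n",
                     date('c'), $severity, $message, $file, $line);
    file_put_contents($errLog, $entry, FILE_APPEND);
    return true; // suppress PHP's default error handler
});

trigger_error('example warning', E_USER_WARNING); // lands in the log file
```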
Jan 1, 2013, 19:37 #3
In addition, I would recommend basing most of the actual generation of the information on stored procedures, if it is not already set up this way: the stored procedures generate the actual report into a work table created specifically for that report. Please note this is not a real TEMPORARY table, but an ordinary table you create for the report and delete after everything is completed.

Then all the client language needs to do is compile the CSV files from the work table and apply any magic if something needs to be altered or combined.
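A sketch of that division of labour, with a hypothetical procedure and work table name (only the mysqli calls, fputcsv(), and the MYSQLI_USE_RESULT flag are real APIs):

```php
<?php
// Sketch: let a stored procedure build the report into a work table,
// then stream that table out as CSV and clean up afterwards.
$db = new mysqli('localhost', 'user', 'pass', 'reports');

$db->query('CALL build_sales_report()');   // fills the report_work table

$out = fopen('report.csv', 'w');
// MYSQLI_USE_RESULT streams rows unbuffered instead of loading them
// all into PHP memory at once.
$res = $db->query('SELECT * FROM report_work', MYSQLI_USE_RESULT);
while ($row = $res->fetch_assoc()) {
    fputcsv($out, $row);                   // write one CSV row per result row
}
$res->free();
fclose($out);

$db->query('DROP TABLE report_work');      // delete the work table when done
```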
Remember that it is usually much faster to run several smaller queries than one large query, so make sure you test this (the larger the data set you work with, the truer this statement is).
If you decide to stick with PHP, write this as a CLI script.
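For example, a small guard (the script name is hypothetical) keeps the exporter from being triggered through the web server at all:

```php
<?php
// Sketch: ensure the report generator only runs from the command line,
// where web-server timeouts and output handling don't apply.
if (PHP_SAPI !== 'cli') {
    fwrite(STDERR, "Run this script from the command line.\n");
    exit(1);
}
// ... report generation continues here ...
```

You would then run it as `php export_report.php`, e.g. from cron, instead of via a browser request.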
On another note, a good question is whether you will be able to change these timeouts at all, and if you are, whether it is a good idea to do so, i.e. is the data accessed by other people, etc.? If you increase these timeouts, it also becomes much easier to mount a successful denial-of-service attack with limited resources, i.e. by capturing all of the connections and keeping them alive.