Thoughts on large / long script

Pseudo-coding here; I have something like this:


// curl_req.php
class curl_req {
    public function getResults($arr) {
        // boring logic: make a cURL request per item,
        // push each result onto an array, then return the array
    }
}

// index.php
set_time_limit(0);
require_once 'inc/curl_req.php';

$arr = array(
    "1",
    "2"
);

$ver = new curl_req();
$results = $ver->getResults($arr);
// print $results as a table

Works like a charm: I print the results to a table in the web browser, and I set the time limit to 0 for the larger jobs. My problem:

These files have to be run from a remote Linux box. I want to be able to do an AJAX request or similar to check the status. Is this possible? How?

I think you mean how to store the status of the last fetch? Either on the file system, in a DB, or even tweet it, something like that?

If it's the file system, how about index.php writes a file, call it status.php, containing JSON equivalent to the likes of this:


$results['time'] = 1112223334456;
$results['job'][0] = "Done";
$results['job'][1] = "Done";
$results['job'][2] = "Problem";

Then call index.php?status=yes via AJAX and have it return the JSON?

Would that work?
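
A rough sketch of that idea (untested; the file name status.json, the status=yes flag, and the array keys are all just placeholders):


// index.php (sketch)

// serve the status file when called as index.php?status=yes
if (isset($_GET['status'])) {
    header('Content-Type: application/json');
    readfile('status.json');
    exit;
}

// ...and inside the long-running job, after each item finishes:
$results['time'] = time();
$results['job'][0] = "Done";
file_put_contents('status.json', json_encode($results));

The AJAX call then just polls index.php?status=yes and parses the JSON it gets back.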

getResults() has a foreach() within it, and this is what takes the longest. I want to know how far along this iteration is. It doesn't matter to me where the data is stored, but I don't think I should be writing a SQL statement on every iteration just to show a status update.

I’ve never actually used it before, but why don’t you look into something like Gearman? http://gearman.org/

I think you're able to report on the progress of a task and things like that, and as far as I'm aware you could trigger it via an AJAX call and update when the task is finished. As I say, I've not used Gearman myself, but I think it may do what you need?

*Edit: http://php.net/manual/en/gearmanclient.jobstatus.php

"Gets the status for a background job given a job handle. The status information will specify whether the job is known, whether the job is currently running, and the percentage completion. "

An SQL query on every iteration might slow things down too much, but why not execute it, say, every 5 seconds? Would that be accurate enough for you? An untested code sample; it shouldn't be slow:


$start = time();

foreach ($data as $item) {
  if (time() - $start >= 5) {
    $start = time();

    // update status (SQL, file, etc.)
    // ...
  }

  // do your stuff...
}

Or, update status every 1000th iteration:


// suppose that $data is numerically indexed
foreach ($data as $i => $item) {
  if ($i % 1000 == 0) {
    // update status (SQL, file, etc.)
    // ...
  }

  // do your stuff...
}
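
Either way, the "update status" part could just be the file write suggested earlier in the thread, e.g. (untested; status.json and the keys are illustrative, and in the time-based version you'd keep your own counter instead of $i):


file_put_contents('status.json', json_encode(array(
    'time'  => time(),
    'done'  => $i,
    'total' => count($data),
)));

Then the AJAX endpoint only has to read that file back.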

I like it, thanks!