Help with Parallel loop in PHP

My code is working perfectly fine, but to make it faster I want to run it in parallel.
I'll briefly explain the part which needs to be executed in parallel…

After processing the input [I'll skip the initial part, as I have already figured it out up to here],
I need to do the following in parallel:


> I'm trying to fetch data from an HTML form on another webpage [for this part I have used the cURL functions]
[I have already generated the input to this form in the initial part]

> After getting the result from the HTML form, the result is stored in a PHP variable
> I'm processing the data in the PHP variable using HTML DOM [Thanks @fretburner for this part]
> Some MySQL queries are done here, based on the data from the previous step [SELECT & INSERT]

I guess we could use cURL for the parallel part in the initial step, but I have heard that using cURL is sometimes not so good…

What is the best way to perform it in parallel? If possible, please explain so that I can code it…
Thanks

I don’t quite understand what is meant by parallel, considering each step seems to depend on the previous one. Unless there is something I’m missing, the HTML from a web page can’t be parsed until that HTML has been retrieved using cURL. Unless you mean this same process is running multiple times in a single script execution. In that case I would recommend looking into the cURL multi functions. In particular, the first one to look at would be curl_multi_init(). Other than that, search the web for “php multithreading”. It seems there are a few libraries and whatnot for that.

Yes, you're correct, and I'm already looking into curl_multi_init(), but I'm getting confused as there is also a form here…

Let's take it one by one. For the first step:

For a single execution, I have the input for the HTML form stored in $u.
Here is the working code:


    $location = 'abc';

    $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_VERBOSE, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_URL, $location );


    $post_array = array(
        "rid" => $u,
        "submit" => "submit"
    );
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array); 
    $response = curl_exec($ch);  

Now I want the same in parallel…
// 3 inputs will be generated at one time for the HTML form [will increase to 5 or 6 later]
// inputs are stored in PHP variables $u[1], $u[2], $u[3]

I have this code, not yet executed, but I am unable to work out where the output is stored [in the previous case it's stored in $response]


    $location = 'abc';

    $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';

    $ch = curl_init();
    $ch1 = curl_init();
    $ch2 = curl_init();


    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_VERBOSE, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_URL, $location );

    curl_setopt($ch1, CURLOPT_HEADER, 0);
    curl_setopt($ch1, CURLOPT_VERBOSE, 0);
    curl_setopt($ch1, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch1, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch1, CURLOPT_POST, 1);
    curl_setopt($ch1, CURLOPT_URL, $location );

    curl_setopt($ch2, CURLOPT_HEADER, 0);
    curl_setopt($ch2, CURLOPT_VERBOSE, 0);
    curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch2, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch2, CURLOPT_POST, 1);
    curl_setopt($ch2, CURLOPT_URL, $location );



    $post_array1 = array(
        "rid" => $u[1],
        "submit" => "submit"
    );

    $post_array2 = array(
        "rid" => $u[2],
        "submit" => "submit"
    );

    $post_array3 = array(
        "rid" => $u[3],
        "submit" => "submit"
    );


    curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array1); 

    curl_setopt($ch1, CURLOPT_POSTFIELDS, $post_array2); 

    curl_setopt($ch2, CURLOPT_POSTFIELDS, $post_array3); 

$mh = curl_multi_init();

curl_multi_add_handle($mh,$ch);
curl_multi_add_handle($mh,$ch1);
curl_multi_add_handle($mh,$ch2);

$active = null;

//execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    // curl_multi_select() can return -1; pause briefly instead of looping without progress
    if (curl_multi_select($mh) == -1) {
        usleep(100);
    }
    do {
        $mrc = curl_multi_exec($mh, $active);
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);
}

//close the handles
curl_multi_remove_handle($mh, $ch);
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);


This answer from Stack Overflow might help: http://stackoverflow.com/questions/9308779/php-parallel-curl-requests
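
In case the linked answer is hard to adapt, here is a minimal sketch of the same idea applied to the form POST from earlier in this thread. It assumes the inputs are in an array such as $u[1] to $u[3] and reuses $location, $userAgent and the rid/submit fields from the code above. build_handles() and fetch_all() are names made up for this sketch, not part of cURL. The main point: with CURLOPT_RETURNTRANSFER set, curl_multi_getcontent($ch) plays the role that $response = curl_exec($ch) plays for a single request.

```php
<?php
// Build one POST handle per input, keyed like the input array.
// (Hypothetical helper; the field names mirror the code earlier in the thread.)
function build_handles(string $location, array $inputs, string $userAgent): array
{
    $handles = [];
    foreach ($inputs as $key => $rid) {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL            => $location,
            CURLOPT_RETURNTRANSFER => true, // keep the body instead of printing it
            CURLOPT_USERAGENT      => $userAgent,
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => ['rid' => $rid, 'submit' => 'submit'],
        ]);
        $handles[$key] = $ch;
    }
    return $handles;
}

// Run all handles together and return [key => response body].
function fetch_all(array $handles): array
{
    $mh = curl_multi_init();
    foreach ($handles as $ch) {
        curl_multi_add_handle($mh, $ch);
    }

    // Drive the transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($active && $status === CURLM_OK);

    $responses = [];
    foreach ($handles as $key => $ch) {
        // Equivalent of $response = curl_exec($ch) in the single-request code.
        $responses[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $responses;
}
```

Called as $responses = fetch_all(build_handles($location, $u, $userAgent)); the body for $u[1] would then be in $responses[1], and so on. Going from 3 inputs to 5 or 6 only changes the size of $u. A failed transfer shows up here as an empty body, so it is worth checking for that before parsing.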

Sorry, I'm getting confused… I'm not able to modify it to add the form input [as shown in the code in my previous post],
and the URL from which we get the data is always constant…

Finally got the contents… I will post if I run into any problems…
Anyway, thanks for the replies…

I am not sure I would recommend “hacking” threading this way, especially when you need the work completed in parallel, as you cannot guarantee that the requests complete in order, which introduces a race condition problem.

So in the end it depends on whether the processes you want to run “threaded” are prone to race conditions.

If so, you can try using Gearman or Amazon SQS (if you're using the Amazon cloud).
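
To illustrate the ordering point: completion order is not guaranteed, but each result can still be matched back to its input. A rough sketch, assuming each handle is tagged with its input key through CURLOPT_PRIVATE before it is added to the multi handle, and that collect_finished() is called once the curl_multi_exec() loop has ended. tag_handle() and collect_finished() are hypothetical names for this example. It only covers matching results to inputs; whether the later MySQL steps can safely overlap is a separate question.

```php
<?php
// Attach the input key to the handle so it can be recovered after completion.
function tag_handle($ch, $key): void
{
    curl_setopt($ch, CURLOPT_PRIVATE, (string) $key);
}

// Read every finished transfer off the multi handle and file it under its tag.
function collect_finished($mh): array
{
    $results = [];
    while ($info = curl_multi_info_read($mh)) {
        $ch  = $info['handle'];
        $key = curl_getinfo($ch, CURLINFO_PRIVATE);
        $results[$key] = [
            'ok'    => $info['result'] === CURLE_OK, // cURL-level success only
            'error' => curl_error($ch),
            'body'  => curl_multi_getcontent($ch),
        ];
        curl_multi_remove_handle($mh, $ch);
    }
    ksort($results); // restore input order before any dependent step runs
    return $results;
}
```

Processing $results in a plain foreach afterwards keeps the parse and database steps sequential, so only the network wait is overlapped.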

I was able to get through multi cURL, but what's happening is that it skips a few URL requests in between and, without any errors, continues with the others in the list. I'm not sure if I faced this kind of problem with plain cURL, but I guess the success rate was a little higher; again, I'm saying that blindly, as I only observed a few…

Hey, I didn't quite understand you. I did check out Gearman, but didn't understand it…
Right now I have the full logic, but as I said above it skipped a few URL requests, so I just want a method with which the success rate will be high…

P.S. I actually have only one URL, but I pass a set of inputs [already available to me] to the HTML form on the target page…
I just want the target page contents in, preferably, a PHP variable… I'm guessing the rest would be similar…
I'm running out of time, so please help…

Thanks
Coolguy
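
One possible way to raise the success rate when a few requests get skipped, sketched under assumptions: treat an empty or missing body as a failure and send only those inputs again, a limited number of times. fetch_with_retries() is a made-up helper, and $fetchBatch stands for whatever function runs one parallel batch and returns [key => body], with false or '' for a request that failed. The pause lengths are arbitrary. This does not explain why the target server drops requests; sending fewer at once may also help.

```php
<?php
/**
 * Fetch all inputs, re-sending only the ones that came back empty.
 * Returns ['done' => [key => body], 'failed' => [keys still failing]].
 */
function fetch_with_retries(array $inputs, callable $fetchBatch, int $maxAttempts = 3): array
{
    $done    = [];
    $pending = $inputs;

    for ($attempt = 1; $attempt <= $maxAttempts && $pending; $attempt++) {
        $bodies = $fetchBatch($pending);

        foreach ($pending as $key => $input) {
            $body = $bodies[$key] ?? false;
            if (is_string($body) && $body !== '') {
                $done[$key] = $body;   // success: keep it
                unset($pending[$key]); // and stop retrying this input
            }
        }

        if ($pending && $attempt < $maxAttempts) {
            usleep(200000 * $attempt); // short, growing pause before the next try
        }
    }

    return ['done' => $done, 'failed' => array_keys($pending)];
}
```

Anything left in 'failed' after the last attempt can be logged with its input, so a skipped request no longer goes unnoticed.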