PHP cURL API Request Over Multiple Pages

Hi,

I’m getting a list of products via an API. The results are split over multiple pages, so I have to loop over the pages until there are no more and save the data in a variable. Here is the basic cURL request, which works fine for a single page:

$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => "https://myurl/products?page_no=1",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => "",
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => "GET",
  CURLOPT_POSTFIELDS => json_encode(array("email" => $user, "password" => $pw)),
  CURLOPT_HTTPHEADER => array(
    "Authorization: Bearer $tokenout",
    "Content-Type: application/json"
  ),
));

$rsp = curl_exec($curl);

curl_close($curl);

So does anyone have any advice as to the best approach? My idea was to wrap the request in a function and, instead of hard-coding the page number, increment it in a variable. It would be something like this:

$page = 1;

function multipleCurlRequests()
{
    $curl = curl_init();
    curl_setopt_array($curl, array(
        CURLOPT_URL => "https://myurl/products?page_no=" . $page,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_ENCODING => "",
        CURLOPT_MAXREDIRS => 10,
        CURLOPT_TIMEOUT => 0,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
        CURLOPT_CUSTOMREQUEST => "GET",
        CURLOPT_POSTFIELDS => json_encode(array("email" => $user, "password" => $pw)),
        CURLOPT_HTTPHEADER => array(
            "Authorization: Bearer $token",
            "Content-Type: application/json"
        ),
    ));

    if (curl_errno($curl)) {
        echo 'Error:' . curl_error($curl);

        return;
    } else {
        $page++;
        multipleCurlRequests();
        $rsp = curl_exec($curl);
        curl_close($curl);
    }
}
// Call first
multipleCurlRequests();

This seems like a really hashed-together approach though, and I’m not sure what will happen in terms of performance when retrieving the data over multiple pages like this.

There seems to be little information on how specifically to handle this.

Thanks in advance.

I’m not sure I’d call it recursively, but it might work. What happens when you try it? Apart from the function not knowing anything about $page because it’s defined in the main scope and not passed in anywhere.

I haven’t used cURL in real life, but I think you need to call curl_exec() before you check for errors, and you don’t seem to store the results anywhere.
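
For what it’s worth, a minimal sketch of that ordering, reusing the $curl handle from the snippet above (the json_decode step is just one way to store the result):

$rsp = curl_exec($curl); // run the request first; curl_errno() only reports on the last execution

if (curl_errno($curl)) {
    echo 'Error: ' . curl_error($curl);
} else {
    $products = json_decode($rsp, true); // keep the decoded result somewhere
}

curl_close($curl);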

… this… sounds like the antithesis of an API. It sounds more like “Please help me steal some other website’s products.”
What API are you trying to access? Does the API not allow for a number of records to be retrieved?

Not sure why you’d think that? The API results are split over multiple pages, and you have to iterate over the pages to get all the products:

Developer can continue to pass page_no incrementally until all products information retrieved.

It seems odd to me too. Hope that helps.

Chris

Their API developer needs to reevaluate their idea of an API, then…

In any case, droop’s mechanically correct/technically incorrect.

“You should curl_exec first, and then check to see if there was an error.”

You don’t need to call curl_exec() first; you’d just get a 0 back from curl_errno() the first time through (because curl hasn’t produced a result yet).

The proper flow should be as follows (pseudo-BASIC code; a PHP sketch follows the outline):

Instantiate your curl object
Set Global options
-Label: S1-
    Set URL
    Execute
    Check for Error
    On Success:
        Process Result
        Change Page value
        Goto S1
    On Fail:
        Report Error
Close Curl Object
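
To make that concrete, here’s a minimal PHP sketch of the same flow. Two assumptions: the bearer token alone authenticates each page request (if the API really wants the email/password body on every call, keep that CURLOPT_POSTFIELDS line in the global options), and an empty page marks the end - adjust the stop test to whatever the API actually returns past the last page.

$curl = curl_init();

// Global options: set once, reused for every page
curl_setopt_array($curl, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT => 0,
    CURLOPT_HTTPHEADER => array(
        "Authorization: Bearer $token",
        "Content-Type: application/json"
    ),
));

$page = 1;
$allProducts = array();

do {
    // Only the URL changes between iterations (S1 in the outline above)
    curl_setopt($curl, CURLOPT_URL, "https://myurl/products?page_no=" . $page);

    $rsp = curl_exec($curl); // Execute, then check for an error
    if (curl_errno($curl)) {
        echo 'Error: ' . curl_error($curl); // Report Error
        break;
    }

    $products = json_decode($rsp, true);
    if (!empty($products)) {
        $allProducts = array_merge($allProducts, $products); // Process Result
        $page++; // Change Page value, then loop back
    }
} while (!empty($products));

curl_close($curl); // Close Curl Object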

Thanks, that’s great - I’m starting to worry I misunderstood the way their accounts work. I haven’t come across it before, but I think the client has to select their products first in their account, since when I try to retrieve the products it’s only returning 39. In other words, whatever they call an API doesn’t give you access to all the products. So I need to check with the company first.

Chris

I’ve done some stuff recently with the Xero accounting API, and in some areas that returns results in pages if you ask for enough information. For example, if you ask for all contacts, it will send them in pages of, I think, 100 contacts. You keep requesting the next page until it either returns fewer than 100 contacts (or whatever the number is) or returns none at all.

Note that this behavior is different from returning a curl error. Receiving a valid return of nothing is not going to generate an error, so…
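
In code terms that stop test is just a count check rather than an error check - something like this, assuming a decoded $products array and a known page size of 100:

if (count($products) < 100) {
    // A short or empty page is a normal response, not a curl error:
    // it just means we've reached the last page
    $morePages = false;
}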


There is another issue here - even if I had the looping through the pages fixed, what is going to happen in terms of memory? I have to run a server cron job to call this function every morning.


To do… what?
Or I guess more to the point: does what you need to do require all of the information at once, or could you process each page of data and discard the result after?

No, it’s the whole product range in WooCommerce and it has to be updated. I’ve got another part of the application syncing the CSV file and checking against an SKU code to see if it’s changed, but I need to download the whole product list to do this. I can’t connect the plugin directly to the API; I can only feed it a CSV file. If I had to develop the whole application to import the products too, it would be a huge job.

… Why?

So you’ve got a part of the application that goes line by line and checks against a single SKU to see if the product has changed.

Let’s draw a couple of rational conclusions:

  1. A product is uniquely identified by the SKU.
  2. There is one CSV line per SKU.
  3. There is one SKU per CSV line.

In other words, the products in the API and the CSV file have a 1-1 relationship.

Why not just take a page of data, run through the SKUs on that page, and update the CSV?

Then you can discard the information on the page, and get a new page. The only thing in memory at that point is the CSV.
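
A rough sketch of that page-at-a-time approach, assuming a $csv array keyed by SKU and hypothetical 'sku'/'price' fields in the API response (neither is confirmed here):

function processPage(array $products, array &$csv)
{
    foreach ($products as $product) {
        $sku = $product['sku']; // assumed field name
        // Update the CSV row only when this SKU's value has changed
        if (isset($csv[$sku]) && $csv[$sku]['price'] !== $product['price']) {
            $csv[$sku]['price'] = $product['price'];
        }
    }
    // $products goes out of scope on return, so each page is freed
    // before the next one is fetched - only the CSV stays in memory
}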
