Using cURL for Remote Requests

If you’re a Linux user then you’ve probably used cURL. It’s a powerful tool used for everything from sending email to downloading the latest My Little Pony subtitles. In this article I’ll explain how to use the cURL extension in PHP. The extension offers us the same functionality as the console utility, in the comfortable world of PHP. I’ll discuss sending GET and POST requests, handling login cookies, and FTP functionality.

Before we begin, make sure you have the extension (and the libcurl library) installed. It’s not installed by default. In most cases it can be installed using your system’s package manager, but failing that you can find instructions in the PHP manual.

How Does it Work?

All cURL requests follow the same basic pattern:

  1. First we initialize the cURL resource (often abbreviated as ch for “cURL handle”) by calling the curl_init() function.
  2. Next we set various options, such as the URL, request method, payload data, etc. Options can be set individually with curl_setopt(), or we can pass an array of options to curl_setopt_array().
  3. Then we execute the request by calling curl_exec().
  4. Finally, we free the resource by calling curl_close() to clear out memory.

So, the boilerplate code for making a request looks something like this:

<?php
// init the resource
$ch = curl_init();

// set a single option...
curl_setopt($ch, OPTION, $value);
// ... or an array of options
curl_setopt_array($ch, array( 
    OPTION1 => $value1, 
    OPTION2 => $value2
));

// execute
$output = curl_exec($ch);

// free
curl_close($ch);

The only thing that changes for the request is what options are set, which of course depends on what you’re doing with cURL.
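
As a rough sketch, the four steps can be wrapped in a small helper so that only the options vary from call to call. The function name curlRequest() and the choice to throw an exception on failure are my own illustrative assumptions, not something the extension provides.

```php
<?php
/**
 * Hypothetical helper: run one cURL request with the given options
 * and return the response body as a string.
 */
function curlRequest(array $options)
{
    // 1. init the resource
    $ch = curl_init();

    // 2. set the options; default to returning the body instead of printing it
    curl_setopt_array($ch, $options + array(CURLOPT_RETURNTRANSFER => true));

    // 3. execute, remembering the error message if the request failed
    $output = curl_exec($ch);
    $error  = ($output === false) ? curl_error($ch) : null;

    // 4. free (curl_close() has no effect as of PHP 8.0)
    if (PHP_VERSION_ID < 80000) {
        curl_close($ch);
    }

    if ($error !== null) {
        throw new RuntimeException('cURL request failed: ' . $error);
    }

    return $output;
}
```

With this in place, a call such as curlRequest(array(CURLOPT_URL => 'http://example.com/')) would either return the page body or throw.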

Retrieve a Web Page

The most basic example of using cURL that I can think of is simply fetching the contents of a web page. So, let’s fetch the homepage of the BBC as an example.

<?php
// init the resource
$ch = curl_init();

curl_setopt_array($ch, array(
    CURLOPT_URL => 'http://www.bbc.co.uk/',
    CURLOPT_RETURNTRANSFER => true
));

$output = curl_exec($ch);
echo $output;

curl_close($ch);

Check the output in your browser and you should see the BBC website displayed. We’re lucky here: the site displays correctly only because it links to its stylesheets and images with absolute URLs.

The options we just used were:

  • CURLOPT_URL – specifies the URL for the request
  • CURLOPT_RETURNTRANSFER – when set to false, curl_exec() prints the response and returns true or false depending on the success of the request. When set to true, curl_exec() returns the contents of the response (or false on failure).

Log in to a Website

cURL executed a GET request to retrieve the BBC page, but cURL can also use other methods, such as POST and PUT. For this example, let’s simulate logging into a WordPress-powered website. Logging in is done by sending a POST request to http://example.com/wp-login.php with the following details:

  • log – the username
  • pwd – the password
  • redirect_to – the URL we want to go to after logging in
  • testcookie – should be set to 1 (this is just for WordPress)

Of course these parameters are specific to each site. You should always check the input names for yourself, something that can easily be done by viewing the source of an HTML page in your browser.

<?php
$postData = array(
    'log' => 'acogneau',
    'pwd' => 'secretpassword',
    'redirect_to' => 'http://example.com',
    'testcookie' => '1'
);

$ch = curl_init();

curl_setopt_array($ch, array(
    CURLOPT_URL => 'http://example.com/wp-login.php',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $postData,
    CURLOPT_FOLLOWLOCATION => true
));

$output = curl_exec($ch);
echo $output;

The new options are:

  • CURLOPT_POST – set this to true if you want to send a POST request
  • CURLOPT_POSTFIELDS – the data that will be sent in the body of the request
  • CURLOPT_FOLLOWLOCATION – if set to true, cURL will follow redirects
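
One detail of CURLOPT_POSTFIELDS is worth a quick illustration: when it is given an array, cURL sends the body as multipart/form-data, whereas a URL-encoded string is sent as application/x-www-form-urlencoded, which is what a plain HTML form submits. The sketch below builds such a string with http_build_query(); the field names are made up for the example.

```php
<?php
// Hypothetical form fields, for illustration only
$fields = array(
    'user'    => 'acogneau',
    'comment' => 'hello world',
);

// Encode the array as a URL-encoded string ("&" given explicitly
// so the result does not depend on the arg_separator.output setting)
$body = http_build_query($fields, '', '&');

// Passing the string (not the array) makes cURL send
// Content-Type: application/x-www-form-urlencoded
$options = array(
    CURLOPT_POST       => true,
    CURLOPT_POSTFIELDS => $body,
);

echo $body; // user=acogneau&comment=hello+world
```

If a server rejects or ignores an array payload, switching to the encoded string is a reasonable first thing to try.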

Uh oh! If you test the above, however, you’ll see an error message: “ERROR: Cookies are blocked or not supported by your browser. You must enable cookies to use WordPress.” This is normal, because we need to have cookies enabled for sessions to work. We do this by adding two more options.

<?php
curl_setopt_array($ch, array(
    CURLOPT_URL => 'http://example.com/wp-login.php',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $postData,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_COOKIESESSION => true,
    CURLOPT_COOKIEJAR => 'cookie.txt'
));

The new options are:

  • CURLOPT_COOKIESESSION – if set to true, cURL will start a new cookie session and ignore any previous cookies
  • CURLOPT_COOKIEJAR – this is the name of the file where cURL should save cookie information. Make sure you have the correct permissions to write to the file!

Now that we’re logged in, we only need to reference the cookie file for subsequent requests.
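
A sketch of what such a follow-up request might look like, assuming the login succeeded and its cookies were written to cookie.txt. CURLOPT_COOKIEFILE names the file cURL should read cookies from; the helper name and the wp-admin URL are only assumed examples.

```php
<?php
/**
 * Hypothetical helper: options for a request that reuses the
 * cookies stored by an earlier login request.
 */
function cookieOptions($url, $cookieFile)
{
    return array(
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        // read the cookies saved by the login request...
        CURLOPT_COOKIEFILE => $cookieFile,
        // ...and write any updated cookies back to the same file
        CURLOPT_COOKIEJAR => $cookieFile
    );
}

$ch = curl_init();
// assumed example of a page that requires a login
curl_setopt_array($ch, cookieOptions('http://example.com/wp-admin/', 'cookie.txt'));
```

Calling curl_exec($ch) would then send the stored cookies along with the request. Keep in mind that cURL writes the cookie jar when a handle is closed, so close the login handle before reading the file from a different handle.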

Working with FTP

Using cURL to download and upload files via FTP is easy as well. Let’s look at downloading a file:

<?php
curl_setopt_array($ch, array(
    CURLOPT_URL => 'ftp://ftp.example.com/test.txt',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERPWD => 'username:password'
));

$output = curl_exec($ch);
echo $output;

Note that, for security reasons, there aren’t many public FTP servers that allow anonymous uploads and downloads, so the URL and credentials above are just placeholders.

This is almost the same as sending an HTTP request, with only a couple of minor differences:

  • CURLOPT_URL – the URL of the file, note the use of “ftp://” instead of “http://”
  • CURLOPT_USERPWD – the login credentials for the FTP server
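
For larger files it may be preferable to write the download straight to disk rather than holding it in a string. Here is a minimal sketch using CURLOPT_FILE, which takes a writable stream; the helper name downloadToFile() and the local filename in the usage note are illustrative assumptions.

```php
<?php
/**
 * Hypothetical helper: stream a remote file to $destination.
 * Returns true on success, false on failure.
 */
function downloadToFile($url, $destination, $userpwd = null)
{
    $fp = fopen($destination, 'w');
    if ($fp === false) {
        return false;
    }

    $options = array(
        CURLOPT_URL  => $url,
        // write the response body to the open file instead of returning it
        CURLOPT_FILE => $fp
    );
    if ($userpwd !== null) {
        $options[CURLOPT_USERPWD] = $userpwd;
    }

    $ch = curl_init();
    curl_setopt_array($ch, $options);

    // without CURLOPT_RETURNTRANSFER, curl_exec() returns true or false
    $ok = curl_exec($ch);

    if (PHP_VERSION_ID < 80000) {
        curl_close($ch);
    }
    fclose($fp);

    return $ok === true;
}
```

A call like downloadToFile('ftp://ftp.example.com/test.txt', 'local-copy.txt', 'username:password') would then save the remote file locally.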

Uploading a file via FTP is slightly more complex, but still manageable. It looks like this:

<?php
// open the local file we want to upload
$fp = fopen('test.txt', 'r');

$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL => 'ftp://ftp.example.com/test.txt',
    CURLOPT_USERPWD => 'username:password',
    CURLOPT_UPLOAD => true,
    CURLOPT_INFILE => $fp,
    CURLOPT_INFILESIZE => filesize('test.txt')
));
curl_exec($ch);

fclose($fp);
curl_close($ch);

The important options here are:

  • CURLOPT_UPLOAD – set this to true to tell cURL that we’re uploading a file
  • CURLOPT_INFILE – a readable stream for the file we want to upload
  • CURLOPT_INFILESIZE – the size of the file we want to upload in bytes

Sending Multiple Requests

Imagine we have to perform five requests to retrieve all of the necessary data. Keep in mind that some things will be beyond our control, such as network latency and the response speed of the target servers. It should be obvious then that any delays when issuing five consecutive calls can really add up! One way to mitigate this problem is to issue the requests asynchronously.

Asynchronous techniques are more common in the JavaScript and Node.js communities, but the idea is simple: instead of waiting for a time-consuming task to complete, we assign the task to a different thread or process and continue to do other things in the meantime. When the task is complete we come back for its result. The important thing is that we haven’t wasted time waiting for a result; we spent it executing other code independently.

The approach for performing multiple asynchronous cURL requests is a bit different from before. We start out the same – we initiate each channel and then set the options – but then we initiate a multihandler using curl_multi_init() and add our channels to it with curl_multi_add_handle(). We execute the handlers by looping through them and checking their status. In the end we get a response’s content with curl_multi_getcontent().

<?php
// URLs we want to retrieve
$urls = array(
    'http://www.google.com', 
    'http://www.bing.com', 
    'http://www.yahoo.com',
    'http://www.twitter.com',
    'http://www.facebook.com'
);

// initialize the multihandler
$mh = curl_multi_init();

$channels = array();
foreach ($urls as $key => $url) {
    // initiate individual channel
    $channels[$key] = curl_init();
    curl_setopt_array($channels[$key], array(
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true
    ));

    // add channel to multihandler
    curl_multi_add_handle($mh, $channels[$key]);
}

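// execute - if there is an active connection then keep looping
$active = null;
do {
    $status = curl_multi_exec($mh, $active);
    // wait for activity on a connection instead of spinning the CPU
    if ($active && curl_multi_select($mh) === -1) {
        // select is unavailable, so back off briefly before retrying
        usleep(1000);
    }
}
while ($active && $status == CURLM_OK);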

// echo the content, remove the handlers, then close them
foreach ($channels as $chan) {
    echo curl_multi_getcontent($chan);
    curl_multi_remove_handle($mh, $chan);
    curl_close($chan);
}

// close the multihandler
curl_multi_close($mh);

The above code took around 1,100 ms to execute on my laptop. Performing the same requests sequentially, without the multi interface, took around 2,000 ms. Imagine what your gain will be if you are sending hundreds of requests!
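
If you need the responses individually rather than echoed one after another, the same loop can be wrapped in a function. This is only a sketch: fetchAll() is a name I made up, and it uses curl_multi_info_read() to find out which transfers actually succeeded.

```php
<?php
/**
 * Hypothetical helper: fetch several URLs concurrently.
 * Returns the response bodies keyed like $urls; a transfer
 * that failed yields false for its key.
 */
function fetchAll(array $urls)
{
    $mh = curl_multi_init();
    $channels = array();

    foreach ($urls as $key => $url) {
        $channels[$key] = curl_init();
        curl_setopt_array($channels[$key], array(
            CURLOPT_URL => $url,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true
        ));
        curl_multi_add_handle($mh, $channels[$key]);
    }

    // drive all transfers, sleeping in curl_multi_select() between rounds
    $active = null;
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select unavailable: back off briefly
        }
    } while ($active && $status == CURLM_OK);

    // assume failure until a transfer reports CURLE_OK
    $results = array_fill_keys(array_keys($urls), false);
    while ($info = curl_multi_info_read($mh)) {
        $key = array_search($info['handle'], $channels, true);
        if ($key !== false && $info['result'] === CURLE_OK) {
            $results[$key] = curl_multi_getcontent($info['handle']);
        }
    }

    foreach ($channels as $chan) {
        curl_multi_remove_handle($mh, $chan);
    }
    curl_multi_close($mh);

    return $results;
}
```

Calling fetchAll($urls) with the array from the example above would return five entries, each holding either a page body or false.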

Multiple projects exist that abstract and wrap the multi interface. Discussing them is beyond the scope of the article, but if you’re planning to issue multiple requests asynchronously then I recommend you take a look at them.

Troubleshooting

If you’re using cURL then you are probably performing your requests to third-party servers. You can’t control them and much can go wrong: servers can go offline, directory structures can change, etc. We need an efficient way to find out what’s wrong when something doesn’t work, and luckily cURL offers two functions for this: curl_getinfo() and curl_error().

curl_getinfo() returns an array with all of the information regarding the channel, so if you want to check if everything is all right you can use:

<?php
var_dump(curl_getinfo($ch));
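
Often you only care about one value, such as the HTTP status code. Passing a CURLINFO_* constant as the second argument returns just that value. This matters because curl_exec() does not fail on a 404 or 500 response unless CURLOPT_FAILONERROR is set; as far as cURL is concerned, the transfer itself worked. A small sketch follows, in which both helper names are my own.

```php
<?php
// Return just the HTTP status code of the last request made with $ch
function httpStatus($ch)
{
    // passing a CURLINFO_* constant returns that single value
    return curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Hypothetical helper: treat any 2xx status code as success
function isHttpSuccess($statusCode)
{
    return $statusCode >= 200 && $statusCode < 300;
}
```

After curl_exec($ch) you could then write if (!isHttpSuccess(httpStatus($ch))) { ... } to catch error pages that cURL itself considers a successful transfer.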

If an error pops up, you can check it out with curl_error():

<?php
if (curl_exec($ch) === false) {
    // curl_exec() returned false and thus failed; the strict comparison
    // avoids treating an empty response body as an error
    echo 'An error has occurred: ' . curl_error($ch);
}
else {
    echo 'everything was successful';
}
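
Since slow or unreachable servers are a common cause of trouble, it can also help to set timeouts and to record the numeric error code alongside the message. The following is a sketch under my own assumptions: the helper name tryRequest(), the five-second connect timeout, and the shape of the returned array are all arbitrary choices.

```php
<?php
/**
 * Hypothetical helper: perform a GET request with timeouts and
 * report the outcome instead of echoing it.
 */
function tryRequest($url, $timeoutSeconds = 10)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        // give up if connecting takes longer than 5 seconds...
        CURLOPT_CONNECTTIMEOUT => 5,
        // ...or if the whole request takes longer than $timeoutSeconds
        CURLOPT_TIMEOUT => $timeoutSeconds
    ));

    $body = curl_exec($ch);

    return array(
        'ok'    => $body !== false,
        'body'  => $body,
        // 0 and an empty string when no error occurred
        'errno' => curl_errno($ch),
        'error' => curl_error($ch)
    );
}
```

The numeric code from curl_errno() is handy for logging or for deciding whether a retry makes sense, while curl_error() gives the human-readable message.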

Conclusion

cURL offers a powerful and efficient way to make remote calls, so if you’re ever in need of a crawler or something to access an external API, cURL is a great tool for the job. It provides us with a nice interface and a relatively easy way to execute requests. For more information, check out the PHP Manual and the cURL website. See you next time!

  • http://machoman.com Machoman

    Very nice article dude, i like the little pony in the beginning :p.

    grtz & beatz Machoman

  • boen_robot

    So… how does cURL compare to using PHP’s streams?

    (For some reason, this is ALWAYS missing in cURL related articles, manuals, etc.)

  • Samuel

    I tried your code multi curl code But I am not getting any output? I am missing anything.[ I am using PHP 5.3.2. ]

  • Michael

    Interesting… can this be used to redirect a page to another, passing on variables in the URI like a GET request?

    In my scripts, whenever I use something like header(“Location: http://someserver.com?email=email@server.com“), the “@” character gets chewed up (the resulting URI only has “emailserver.com” without the “@”). I’m wondering if this can be used to circumvent that problem…

  • Dan

    Yeah … so are people still using the awful curl_* API? I can’t imagine why you would in a world with tools like http://rdlowrey.github.io/Artax/ and https://github.com/guzzle/guzzle

  • Johan

    Awesome article.. has all the answers to questions I have at the moment (wanting to move some curl commands from another language to PHP)! Thanks for the examples!

  • Ben

    Wow…what a great, informative, and concise article. I’ve successfully used cURL in several instances, but always by copying examples and then ‘trial and erroring’ it until I got it to work. Now I actually understand it and can write my own cURL requests from scratch. But the really powerful piece of information you revealed was how to send multiple requests asynchronously…that is REALLY helpful. Thanks again for the great article.

  • newexception.com

    great text