cURL Experiments

Ladies & Gentlemen!

Oops! Let me try again:
Gentle Ladies & Hard Men (after all, it’s the ladies who are gentle compared to men, and the men who are hard, rough & tough compared to the ladies)!

Get used to me addressing you as “hard men” because that’s how I’m gonna do it frequently. :rofl:

Anyway, this thread is about cURL.
I will try building a unique script based on cURL. But first, let’s get rolling and learn!
I have some viral traffic & viral money-earning ideas, and cURL will implement them. Stick around and see how deep the rabbit hole is and what comes out of it! (Not joking!)

cURL Sample 1:

Why do you reckon the following code sample does not load any page? I see a completely blank page on my XAMPP.

<?php

//This code was found on: http://www.binarytides.com/php-curl-tutorial-beginners/
//gets the data from a URL
function get_url($url) 
{
    $ch = curl_init();
     
    if($ch === false)
    {
        die('Failed to create curl object');
    }
     
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
 
echo get_url('http://www.apple.com/');
?>

cURL Sample 2:

Is it true that the following short code is just as good as the one in my previous post?


<?php 

//This code was found on: http://www.binarytides.com/php-curl-tutorial-beginners/
//2nd Example
//The above GET request to a url can be done in a much simpler way like this:

//Make a HTTP GET request and print it (requires allow_url_fopen to be enabled)
echo file_get_contents('http://www.apple.com/');

?>

cURL Sample 3:

Can you figure out, or at least guess, why the following code is better than the previous 2?
What benefits do you see in it over the other 2 code samples?
No, I’m not testing you but trying to learn from you. :slight_smile:


<?php 

//3rd Option
//I see a blank page!!! 
//This code was found on: http://www.binarytides.com/php-curl-tutorial-beginners/
//Calling the curl_setopt function again and again to set the options is a bit tedious. There is a useful function called curl_setopt_array that takes an array of options and sets them all at once. Here is a quick example:
//Make a HTTP GET request and print it (requires allow_url_fopen to be enabled)
echo file_get_contents('http://www.apple.com/');
curl_setopt_array($ch, array(
    CURLOPT_URL => $url ,
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_CONNECTTIMEOUT => $timeout ,
));

?>

And, what is meant by the following:
“//Calling the curl_setopt function again and again to set the options is a bit tedious. There is a useful function called curl_setopt_array that takes an array of options and sets them all at once.”.

Can you elaborate on it? Because if I understand it, then I’ll understand the benefits of this code over the previous 2.

Thanks!

  1. curl_getinfo()
  2. no
  3. none

There is a very small overhead every time you call a function. In general (because I don’t know the specifics of how PHP works internally) various items are placed on a stack while execution is transferred to the function, and when the function completes those items are retrieved from the stack so that the calling code can continue execution as if it never left off. Thus, a small amount of time is added each time you call a function. So if you can call the function once instead of ten times, that’s “better”, though you might not be able to measure the difference. And you’ll also take up some time creating and populating the array that might be the same as the overhead for repeated function calls, leaving you no better off.

I suspect the person you quoted might also be talking about the look of the code with all those separate calls, and if you call cURL a lot you might use the same options (other than the URL) and hence be able to create the array once and just change certain parts of it as required.
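To illustrate the "create the array once and just change certain parts" idea, here is a minimal sketch. The helper name `build_curl_options` and the default values are assumptions made for this example, not something from the tutorial.

```php
<?php
// Hypothetical helper: shared defaults plus per-request overrides.
function build_curl_options(string $url, array $overrides = []): array
{
    $defaults = [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,
    ];

    // array_replace() keeps the integer CURLOPT_* keys as they are;
    // array_merge() would renumber integer keys and scramble the options.
    return array_replace($defaults, $overrides, [CURLOPT_URL => $url]);
}

// Same defaults, different URL; the second call also overrides the timeout.
$a = build_curl_options('http://www.apple.com/');
$b = build_curl_options('http://www.google.com/', [CURLOPT_CONNECTTIMEOUT => 10]);

// Either array can then be handed to cURL in one call:
// curl_setopt_array($ch, $a);
```

Whether this is faster than separate curl_setopt() calls is unlikely to be measurable; the gain is mostly that the shared options live in one place.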

Thanks for your input, but I didn’t understand what you meant by:
curl_getinfo()
Do you mean that part is missing from one of my samples? If so, which one, and on which line?
In fact, what should my code have looked like?

I don’t see that mentioned in any of the code samples in the tutorial I was getting the code samples from:

Guys,

On this cURL tutorial:

Under the section “Make GET requests - fetch a url”, you will see 3 blocks of code.
I’m referring to the 3rd one that looks like this:


    curl_setopt_array($ch, array(
        CURLOPT_URL => $url ,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_CONNECTTIMEOUT => $timeout ,
    ));

How come no url is mentioned in that code ?
I can now see why my code sample 3 is not working. See my 4th post above.
What do you think that code should look like ? Care to show an example, how you’d do things ?

cURL makes HTTP requests. Like HTTP requests, HTTP responses consist (at least) of headers and a body. curl_exec() returns the body; curl_getinfo() reports details about the transfer, such as the HTTP status code. As long as you are not asking for this information, you’re plainly ignoring it, and any error it would reveal.

  1. http://php.net/manual/de/function.curl-getinfo.php

  2. https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#3xx_Redirection
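A rough sketch of what "asking for this information" could look like. The function name `fetch_with_diagnostics` and the shape of the returned array are made up for this example; the cURL calls themselves are standard. It also turns on CURLOPT_FOLLOWLOCATION, on the assumption that the blank page comes from a 3xx redirect whose body is empty.

```php
<?php
// Hypothetical helper: fetch a URL and report what happened, instead of
// silently echoing an empty string.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true, // follow 3xx redirects
        CURLOPT_MAXREDIRS      => 5,    // but not forever
        CURLOPT_CONNECTTIMEOUT => 5,
    ]);

    $body = curl_exec($ch); // false on failure

    // The handle is released when $ch goes out of scope.
    return [
        'body'   => $body === false ? null : $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response arrived
        'error'  => curl_error($ch),                       // '' when nothing went wrong
    ];
}

// Example use (needs network access, so it is left commented out):
// $r = fetch_with_diagnostics('http://www.apple.com/');
// echo $r['body'] === null ? 'cURL error: ' . $r['error'] : 'HTTP ' . $r['status'];
```

If the status comes back as 301 or 302 with FOLLOWLOCATION switched off, that would explain an empty page: the server only sent a redirect, not the content.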

Erm…

    curl_setopt_array($ch, array(
        CURLOPT_URL => $url ,  // *** MIGHT THIS BE THE URL?
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_CONNECTTIMEOUT => $timeout ,
    ));

Ah! But it doesn’t say the website address.
Look at code sample 1 on this page:

Look under the heading:
Make GET requests - fetch a url

That code block looks like this and it has the url: apple.com


//gets the data from a URL
function get_url($url) 
{
    $ch = curl_init();
     
    if($ch === false)
    {
        die('Failed to create curl object');
    }
     
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
 
echo get_url('http://www.apple.com/');

Therefore, we know which address $url points to.

Now, code block sample 3 (which I’m assuming is separate code from the 1st mentioned above) does not state the value of $url, and that is why I’m guessing cURL is not fetching any page but showing a completely blank page.
That 3rd block of code only has 5 lines and looks like this:


curl_setopt_array($ch, array(
    CURLOPT_URL => $url ,
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_CONNECTTIMEOUT => $timeout ,
));

NOTE: I’m taking into account that these 2 code blocks are separate from each other and have no connection. Or, am I wrong ?

Fellow Programmers,

I did test both code blocks separately and they did redirect me to apple.com.

Sample 1:

//gets the data from a URL
function get_url($url) 
{
    $ch = curl_init();
     
    if($ch === false)
    {
        die('Failed to create curl object');
    }
     
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
 
echo get_url('http://www.apple.com/');

Sample 2:

// Make a HTTP GET request and print it (requires allow_url_fopen to be enabled)
echo file_get_contents('http://www.apple.com/');

I was just curious to learn why anyone would bother using the long version if the short version can do the same job.
I got my answer from someone that the short version is risky. This is what I learnt:

file_get_contents() using a URL is not guaranteed to work in all situations, as it depends on a configuration setting to allow it to use HTTP (which is sometimes disabled for security reasons). cURL, on the other hand, should normally work, as long as the PHP cURL extension is installed.
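A quick way to see which of the two approaches a given server allows is to ask PHP directly. This is only a sketch; the function name `available_fetchers` is invented for the example.

```php
<?php
// Hypothetical helper: report which URL-fetching approaches this PHP
// installation permits.
function available_fetchers(): array
{
    return [
        // file_get_contents() can only open http:// URLs when the
        // allow_url_fopen ini setting is switched on.
        'file_get_contents' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        // The curl_* functions only exist when the cURL extension is loaded.
        'curl' => extension_loaded('curl'),
    ];
}

var_dump(available_fetchers());
```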

I’ve also learnt now that writing the following for each and every URL to fetch can be tedious:

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

And so, it is best to use curl_setopt_array, which takes an array of options and sets them all at once, as in one programmer’s contributed code sample below:


<?php
function get_url($url) 
{
    $ch = curl_init();
    if($ch === false)
    {
        die('Failed to create curl object');
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
// get_url2 is the same as get_url except curl options are contained in an
// array and set using curl_setopt_array
function get_url2($url) 
{
    $ch = curl_init();
    if($ch === false)
    {
        die('Failed to create curl object');
    }
    $curlOptions = array(CURLOPT_URL => $url, CURLOPT_RETURNTRANSFER => 1,
                         CURLOPT_CONNECTTIMEOUT => 5);
    curl_setopt_array($ch, $curlOptions);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
// Call get_url or get_url2, but not both.  Comment and uncomment as needed to experiment.
// Change $myUrl variable for different websites, see what comes back as a blank page. Note that
// there is no guarantee that you'll always get the same blank and non-blank pages as I do or
// the same results every time.  Lots of factors.
$myUrl = 'http://www.google.com'; // non-blank, but incomplete page (no Google logo)
//$myUrl = 'http://www.daniweb.com'; // blank page
//$myUrl = 'http://www.apple.com'; // blank page
// echo get_url($myUrl);
echo get_url2($myUrl);
?>

Thanks for everyone’s inputs in this forum and others. :slight_smile:
I hope I remember them all!!! Don’t blame me if I don’t! Lol!

Because the “long” version IS cURL and the “short” version is an entirely different function that does almost exactly the same thing, but fails at times?

I suspect whoever wrote that sample would expect the reader to infer that in order for it to work, you must set the value of $url to whatever URL you want to use. When writing tutorials and sample code, I imagine there is a point where you expect a certain amount of knowledge or intuition from your readers.

In the case of that specific article, I read it that the third code block is intended to show how you can replace the separate calls in the original code (sample block 1) with a single call to set all the options, and therefore it is obvious that it would be using the same variables. After all, they also don’t define $timeout individually in that third code block either.

3 posts were split to a new topic: Forum ‘unresponsive script’ issue

Final code that ran into the above-mentioned problem:

<?php
$url = "http://google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "$url");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$result = curl_exec($ch);
curl_close($ch);
$result = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)([\"'>]+)#",'$1http://$url/$2$3', $result);
echo $result;
?>

Folks,

I’m afraid the following preg_replace is not doing the job. Can you think of a better one?
Actually, if you can come up with one that does the following, then I’d appreciate it:

  1. Replace ‘https://’ with: ‘http://mydomain.com’.
  2. Replace ‘http://’ with: ‘http://mydomain.com’.
  3. Replace ‘www.’ with: ‘http://mydomain.com’.
  4. Replace all subdomains and sub sub domains etc. with: ‘http://mydomain.com’.
    eg1. Replace ‘**mail.**domain.com’ with ‘http://mydomain.com’.
    eg2. Replace ‘**ny.mail.**domain.com’ with ‘http://mydomain.com’.
    eg3. Replace ‘**europe.spain.mail.**domain.com’ with ‘http://mydomain.com’.
    eg4. Replace ‘**west.europe.deutchland.mail.**domain.com’ with ‘http://mydomain.com’.
    And so on. You get the picture.
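One possible reading of the four rules above is "whatever the scheme and host are, swap them for http://mydomain.com and keep the rest of the address". Under that assumption, parsing the URL tends to be less fragile than chaining string replacements. The function name `rewrite_host` is made up for this sketch.

```php
<?php
// Hypothetical helper: replace scheme + host (including any subdomains)
// with $base, keeping the path, query string and fragment.
function rewrite_host(string $url, string $base = 'http://mydomain.com'): string
{
    // parse_url() needs a scheme to find the host, so add one to inputs
    // such as "www.domain.com" or "mail.domain.com/inbox".
    $candidate = preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) ? $url : 'http://' . $url;

    $parts = parse_url($candidate);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // leave anything unparseable untouched
    }

    $rewritten = rtrim($base, '/') . ($parts['path'] ?? '/');
    if (isset($parts['query'])) {
        $rewritten .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $rewritten .= '#' . $parts['fragment'];
    }
    return $rewritten;
}

echo rewrite_host('https://ny.mail.domain.com/inbox?id=1'), "\n"; // http://mydomain.com/inbox?id=1
echo rewrite_host('www.domain.com'), "\n";                        // http://mydomain.com/
```

Because the host is removed as a whole, it makes no difference how many subdomain levels it has.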

I tried the following code but it’s not working. Instead of replacing things with ‘http://mydomain.com’ it is replacing with: ‘%24url//’. And that is a big No! No!


<?php
$url = "http://google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "$url");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$result = curl_exec($ch);
curl_close($ch);
$result = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)([\"'>]+)#",'$1http://$url/$2$3', $result);
echo $result
?>

Cheers!
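A likely cause of the '%24url' output: the replacement string is in single quotes, so PHP never substitutes $url, and the literal text "$url" ends up in the link, where the browser shows the "$" as "%24". Even with double quotes, $url already starts with "http://", so the result would begin "http://http://". Below is a sketch of one way around both issues; the function name `absolutize_links` is invented, and it deliberately does not handle protocol-relative links such as "//cdn.example.com/x".

```php
<?php
// Single quotes keep "$url" as literal text; double quotes interpolate it.
$url     = 'http://google.com';
$literal = 'http://$url/';   // http://$url/
$interp  = "http://$url/";   // http://http://google.com/

// Hypothetical helper: prefix relative <a href="..."> values with $base.
function absolutize_links(string $html, string $base): string
{
    // Group 1: everything up to the quote, group 2: the quote character,
    // group 3: the href value, skipped if it already starts with http(s)://.
    $pattern = '#(<a\s[^>]*?href\s*=\s*)(["\'])(?!https?://)(.*?)\2#i';

    return preg_replace_callback($pattern, function (array $m) use ($base) {
        return $m[1] . $m[2] . rtrim($base, '/') . '/' . ltrim($m[3], '/') . $m[2];
    }, $html);
}

echo absolutize_links('<a href="/global/fashion">Fashion</a>', 'http://mydomain.com'), "\n";
// <a href="http://mydomain.com/global/fashion">Fashion</a>
```

Using a callback rather than a replacement string also avoids any confusion between PHP variables and the $1, $2 backreferences.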

Php Buddies,

Why do you reckon the following script is unable to replace the ‘https://’ or the ‘http://’ words with ‘http://mymydomain.com’ ?
It is able to replace the words ‘www.’ with ‘http://mymydomain.com’, though.

Open 2 tabs in your browser where one opens to the page where you are running the following script in your wamp/xampp and the other tab should open direct to http://ebay.com for your experiment.

Running the script in your wamp/xampp and hovering your mouse over “fashion” link on ebay would show you ‘http//mymydomain.com/tracker.php?ebay.com/global/fashion’ and that is proof ‘www.’ got replaced.
You may confirm this cross checking both tabs by hovering your mouse over the mentioned links.
Now, in the (non xampp/wamp) tab, hover your mouse over the links “register”, “sign in”, etc. and you will see they start with ‘https://’. Then finally, in the other tab (xampp/wamp), hover your mouse over these same links and you’d see the ‘https://’ have not been replaced with ‘http://mymydomain.com’.
Why is that ?
Why is the str_replace failing on these 2 occasions to replace the ‘http://’ and the ‘https://’ to ‘http://mymydomain.com’ ?

I’ve moved the post “Why str_replace Not Working Properly With cURL?” to this topic, because it really is a continuation of what you were discussing previously, and you had not included a script with it in the new thread, which meant the new thread was impossible to answer.
