Hi,
Can you please tell me why, when I try the following,
I get an error about browser compatibility?
$html = file_get_contents("http://www.facebook.com/");
var_dump($html);
whereas when I do the same thing with google.com I get the expected result?
$html = file_get_contents("http://www.google.com/");
var_dump($html);
What security mechanism does the first site use to prevent the use of file_get_contents in this case?
And why is the page saved correctly to a file when I use “Save As” in the browser?
Do “Save As” and file_get_contents behave the same way in this context?
Many thanks.
A typical HTTP request looks like this:
GET /forums/showthread.php?t=662499 HTTP/1.1
Host: www.sitepoint.com
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.89 Safari/532.5
As you can see, the browser identifies itself to the web server through the User-Agent header. Facebook is looking at this header to determine if your web browser is one the site is designed to support.
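To make the idea concrete, here is a minimal sketch of how a server-side script could branch on the User-Agent header. This is not Facebook's actual code: the function name is_supported_browser() and the list of browser tokens are assumptions made up for illustration.

```php
<?php
// Hypothetical check: the tokens below are illustrative assumptions,
// not the rules any real site uses.
function is_supported_browser(?string $userAgent): bool
{
    if ($userAgent === null || $userAgent === '') {
        return false; // the client sent no User-Agent header at all
    }
    foreach (['Chrome/', 'Firefox/', 'Safari/'] as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true; // looks like a browser on the list
        }
    }
    return false;
}

// PHP exposes the request's User-Agent header here when one was sent.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? null;

echo is_supported_browser($userAgent)
    ? "Serving the site\n"
    : "Sorry, your browser is not supported\n";
```

A request with no User-Agent header falls into the second branch, which is the kind of response described in the question.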
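When you make a request in PHP using only file_get_contents(), no User-Agent header is sent by default (unless the user_agent setting in php.ini is set). Facebook is programmed to respond to such requests with its unsupported-browser message instead of serving the site. That is also why “Save As” works: the page was fetched by your browser, which did identify itself.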
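For sites you are allowed to fetch, the explanation above suggests the general fix: send a User-Agent header yourself. The sketch below does that through an HTTP stream context. The helper name build_context_with_user_agent(), the agent string ExampleFetcher/1.0, and the 10-second timeout are placeholders chosen for illustration; whether a given site accepts the request still depends on that site.

```php
<?php
// Hypothetical helper: returns a stream context whose HTTP requests
// carry the given User-Agent string.
function build_context_with_user_agent(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent, // sent as the User-Agent header
            'timeout'    => 10,         // seconds; an arbitrary choice
        ],
    ]);
}

// Run from the command line as: php fetch.php http://www.example.com/
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $context = build_context_with_user_agent('ExampleFetcher/1.0');
    $html = file_get_contents($argv[1], false, $context);
    var_dump($html === false ? 'request failed' : strlen($html));
}
```

An alternative with the same effect is calling ini_set('user_agent', '...') before file_get_contents(), which sets the default for every HTTP stream in the script.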
If you were just playing around, I hope that explanation is useful. If you were really attempting to scrape something from Facebook through PHP, stop now, because it is not allowed, even once you figure out the technical side of doing so. You may access Facebook programmatically only through the APIs it exposes for that purpose.