I've used libcurl for screen scraping before, but now I need to use libcurl and a regex to match specific links on a page, and then scrape content from the pages those links point to. Hope that makes sense. Is it possible to embed one libcurl call within another? If anyone can point me to an example of this, that would be much appreciated.
So my scraping PHP file generally starts with:
$ch = curl_init("http://www.thewebsite.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
But I won't know in advance the addresses that the links on this website point to. So I'd have to scrape those first and then feed each one into another libcurl request (and there could be any number of links). Thanks in advance!
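For what it's worth, the two-step idea above doesn't need any nesting: each fetch can use its own curl handle, one after the other. Here is a minimal sketch, not a definitive implementation. The function names (fetch_page, extract_links, scrape_linked_pages) are hypothetical helpers invented for illustration, and it assumes the hrefs on the page are absolute URLs; relative ones would first have to be resolved against the page address. The regex is deliberately simple and will miss unusual markup.

```php
<?php
// Fetch one URL and return its body as a string, or null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, false);          // body only, no headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow redirects
    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}

// Return the unique href values found in $html that match $pattern.
// Assumption: links are written as <a ... href="..."> with quoted values.
function extract_links(string $html, string $pattern): array
{
    preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $m);
    $links = array_filter($m[1], fn($href) => preg_match($pattern, $href) === 1);
    return array_values(array_unique($links));
}

// Fetch the start page, then fetch every matching link found on it.
// Returns an array mapping each link to the content of the page it points to.
function scrape_linked_pages(string $startUrl, string $pattern): array
{
    $pages = [];
    $html = fetch_page($startUrl);
    if ($html === null) {
        return $pages;
    }
    foreach (extract_links($html, $pattern) as $link) {
        $content = fetch_page($link);
        if ($content !== null) {
            $pages[$link] = $content;
        }
    }
    return $pages;
}
```

Calling scrape_linked_pages("http://www.thewebsite.com/", '#/articles/#') would then fetch the start page and every link on it whose URL contains /articles/ (that pattern is just a placeholder). The loop handles any number of links, since it simply runs once per match.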





