Thread: cURL question

  1. #1
    SitePoint Addict Latox's Avatar
    Join Date
    Dec 2008
    Location
    Australia
    Posts
    389

    cURL question

    I have coded a script to look through a URL for:

    Code:
    <a href="http://www.mysiteurl.com/" title="mysitename">mysitename</a>
    The code:

    PHP Code:
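    $mylink = '<a href="http://www.mysiteurl.com/" title="mysitename">mysitename</a>';
    // $linkUrl must already hold the URL of the page to check.
    $ch = curl_init($linkUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $content = curl_exec($ch); // false if the request failed
    curl_close($ch);
    if ($content === false || strpos($content, $mylink) === false) {
        echo "not found";
    } else {
        echo "found";
    }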
    I'm new to cURL. How would I write it so that it checks for either

    Code:
    <a href="http://www.mysiteurl.com/" title="mysitename">mysitename</a>
    OR

    Code:
    <a href="http://www.mysiteurl.com/" title="mysitename" target="_blank">mysitename</a>
    OR!!!

    How would I code it to check for href = www.mysite.com and anchor text = mysitename? That way, if people added a link class, target, etc., it would still find my link.

    Would it be easier to do a simple page search for "http://www.mysite.com" and "mysitename"?

    Also, is it possible to spider all pages on X website with cURL, instead of just the main page? How?

    Thanks.
    :-)

  2. #2
    SitePoint Addict Latox's Avatar
    I have changed it to simply search X site for http://www.mysite.com. This works fine, and I suppose I can handle this.

    My next question:

    Is it possible to spider all pages on X website with cURL, instead of just the main page? How?
    :-)

  3. #3
    Web Professional
    Join Date
    Oct 2008
    Location
    London
    Posts
    862

  4. #4
    SitePoint Addict Latox's Avatar
    Do you ever actually answer with helpful responses or are you just on commission for php.net?
    :-)

  5. #5
    Web Professional
    I do answer questions when someone asks a specific question. Check out my previous posts. If you have a specific problem then ask a question, if not then do some research first. Don’t expect someone else to do the job for you.

    I gave you a hint: use DOMDocument to find anchor elements. If you get stuck again then continue the thread—I’m more than happy to help, but I’m not going to come up with a full solution in the first post.
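
    A minimal sketch of that hint, assuming the page body has already been fetched into a string (for example with `curl_exec`). The function name `pageHasLink` and the trailing-slash normalisation are illustrative assumptions, not something from this thread. It compares only the `href` and the visible text, so extra attributes such as `class` or `target` do not matter:

    ```php
    <?php
    // Hypothetical helper: true if $html contains an <a> element whose href and
    // visible text both match, regardless of any other attributes on the tag.
    function pageHasLink(string $html, string $href, string $anchorText): bool
    {
        if ($html === '') {
            return false; // DOMDocument::loadHTML() rejects an empty string
        }

        $doc = new DOMDocument();
        // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        foreach ($doc->getElementsByTagName('a') as $a) {
            // Assumption: a trailing slash should not make two URLs differ.
            $sameHref = rtrim($a->getAttribute('href'), '/') === rtrim($href, '/');
            $sameText = trim($a->textContent) === $anchorText;
            if ($sameHref && $sameText) {
                return true;
            }
        }
        return false;
    }

    $html = '<p><a class="x" target="_blank" href="http://www.mysiteurl.com/" title="mysitename">mysitename</a></p>';
    echo pageHasLink($html, 'http://www.mysiteurl.com/', 'mysitename') ? "found" : "not found";
    ```

    The comparison is deliberately strict (exact URL, exact text). Loosening it, for example to ignore case or a missing `www.`, would be a separate decision.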

  6. #6
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    Join Date
    Apr 2008
    Location
    North-East, UK.
    Posts
    6,111
    I believe we've already discussed a solution, no?

    It's a little harsh to be demanding solutions, or even answers, from forum participants just like yourself.
    @AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.

  7. #7
    SitePoint Addict Latox's Avatar
    My question was can it be done with cURL
    :-)

  8. #8
    Web Professional
    Quote Originally Posted by Latox
    My question was can it be done with cURL
    Yes, it can. But not cURL alone. Using DOMDocument, you will need to recursively find links for each page fetched with cURL.

    My answer, however, was to your first post. It’s only a coincidence that you managed to write another one seconds before I replied (thus couldn’t see it).
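
    For the spidering part, recursively finding links in each page fetched with cURL could look roughly like the sketch below. All function names (`fetchPage`, `extractHrefs`, `crawl`) and the depth limit are assumptions made for illustration. Only absolute links that start with the site's base address are followed; relative links are skipped to keep it short. The fetcher is passed in as a callable so the crawl logic can be tried without touching the network:

    ```php
    <?php
    // Fetch one page with cURL; returns the body, or null on failure.
    function fetchPage(string $url): ?string
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        $body = curl_exec($ch);
        return is_string($body) ? $body : null;
    }

    // Collect the href of every <a> element in the page.
    function extractHrefs(string $html): array
    {
        if ($html === '') {
            return [];
        }
        $doc = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $hrefs = [];
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = trim($a->getAttribute('href'));
            if ($href !== '') {
                $hrefs[] = $href;
            }
        }
        return $hrefs;
    }

    // Recursively visit pages under $base, at most $maxDepth link hops from $url.
    // Returns the list of URLs that were visited.
    function crawl(string $url, string $base, callable $fetch, int $maxDepth, array &$visited = []): array
    {
        if ($maxDepth < 0 || isset($visited[$url])) {
            return array_keys($visited);
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if ($html === null) {
            return array_keys($visited);
        }
        foreach (extractHrefs($html) as $href) {
            if (strpos($href, $base) === 0) { // same site only
                crawl($href, $base, $fetch, $maxDepth - 1, $visited);
            }
        }
        return array_keys($visited);
    }

    // Example call (performs real requests, so it is left commented out):
    // $pages = crawl('http://www.example.com/', 'http://www.example.com/', 'fetchPage', 2);
    ```

    The `$visited` array is what stops the recursion from looping forever when two pages link to each other.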

  9. #9
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    Quote Originally Posted by silverbulletuk
    Use cURL to obtain the website homepage, parse it using DOMDocument and XPath to extract all the links, then loop through these links to see if your site is one of them.

    Depending on whether or not you want to go into sub-pages, cycle through the links to find all those that contain the correct base address, then start the process again for each of those pages.

    This could quickly become quite an intensive and lengthy process; as such, you should probably implement some kind of queuing system for a script (launched via cron) to cycle through.
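
    The steps in that quote could be sketched as follows. This is only an illustration under stated assumptions: the names `linksViaXPath`, `findBacklinks` and `$curlFetch`, the in-memory `SplQueue`, and the `$maxPages` cap are all invented here. A cron-driven setup would more likely keep the queue in a database or file. Relative links are ignored for brevity:

    ```php
    <?php
    // Return the href of every <a> element in $html, using XPath.
    function linksViaXPath(string $html): array
    {
        if ($html === '') {
            return [];
        }
        $doc = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($doc);
        $links = [];
        foreach ($xpath->query('//a/@href') as $attr) {
            $links[] = trim($attr->nodeValue);
        }
        return $links;
    }

    // Work through a FIFO queue of pages on $base and report which of them
    // contain a link starting with $myUrl. Stops after $maxPages pages.
    function findBacklinks(string $start, string $base, string $myUrl, callable $fetch, int $maxPages = 50): array
    {
        $queue = new SplQueue();
        $queue->enqueue($start);
        $seen = [$start => true];
        $found = [];
        $processed = 0;

        while (!$queue->isEmpty() && $processed < $maxPages) {
            $url = $queue->dequeue();
            $processed++;
            $html = $fetch($url);
            if (!is_string($html)) {
                continue; // fetch failed, move on to the next page
            }
            foreach (linksViaXPath($html) as $href) {
                if (strpos($href, $myUrl) === 0) {
                    $found[$url] = true;   // this page links to my site
                } elseif (strpos($href, $base) === 0 && !isset($seen[$href])) {
                    $seen[$href] = true;   // same-site page not queued before
                    $queue->enqueue($href);
                }
            }
        }
        return array_keys($found);
    }

    // A cURL-based fetcher that can be passed as $fetch (not called here).
    $curlFetch = function (string $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        return curl_exec($ch); // string on success, false on failure
    };
    ```

    Using a queue instead of recursion makes it easy to cap the work per run and pick up where the last run stopped, which is the point of the cron suggestion.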

