  1. #1
    SitePoint Member
    Join Date
    Jun 2013
    0 Post(s)
    0 Thread(s)

    Extracting a specific image from a webpage

    I am using PHP cURL to complete a question-and-answer task. To save my work I need to enter a captcha, so at the moment I have to log in to the site manually to enter the captcha and save my work.

    Instead, I want my script to display the captcha image once the task is complete. After I enter the captcha value, the script should send that value and submit my work.

    The final page has many images, with src values such as `images/logo.png`, `images/refer.jpg` and `Captcha.jsp`.

    I want to display the image whose src is `Captcha.jsp`; the full tag is `<img width="120" src="Captcha.jsp"></img>`.

    How do I extract that specific image? By extracting, I don't mean its alt, src, dimensions, etc.; I want the actual image to be displayed. I think these options are needed:
    `curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);`

    Please help me sort this out: how do I extract that specific image?

  2. #2
    Keeper of the SFL StarLion's Avatar
    Join Date
    Feb 2006
    Atlanta, GA, USA
    73 Post(s)
    0 Thread(s)
    A .jsp isn't an image in itself. JSP stands for JavaServer Pages; here it is presumably a server-side page that generates the image on request.
    The idea of a captcha is that you AREN'T able to capture the image and you CAN'T automate it with a script. You're meant to do it manually.

    That said, to capture the result of the transfer, yes, you should set CURLOPT_RETURNTRANSFER and assign the result of the exec call to a variable (`$return = curl_exec($ch);`). Then filter $return by whatever means you prefer (DOMDocument is probably a good idea here).
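
A minimal sketch of that filtering step, assuming the page markup is already in a string. The helper name `findCaptchaSrc` and the sample markup are made up for illustration; in the real script `$return` would come from `curl_exec($ch)`.

```php
<?php
/**
 * Return the src of the first <img> whose src contains $needle, or null.
 * Hypothetical helper for illustration, not part of any library.
 */
function findCaptchaSrc(string $html, string $needle = 'Captcha.jsp'): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (stripos($src, $needle) !== false) {
            return $src;
        }
    }
    return null;
}

// Stand-in for the string returned by curl_exec($ch).
$return = '<html><body><img src="images/logo.png">'
        . '<img width="120" src="Captcha.jsp"></img></body></html>';
echo findCaptchaSrc($return), "\n";
```

The src found this way is relative, so it still has to be resolved against the page URL before it can be requested.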

    As always when helping with cURL questions, the following disclaimer applies: make sure you have permission to use cURL to scrape this site; most sites prohibit such activity.
    Never grow up. The instant you do, you lose all ability to imagine great things, for fear of reality crashing in.

  3. #3
    SitePoint Member
    Join Date
    Jun 2010
    0 Post(s)
    0 Thread(s)
    You just have to display the captcha's image tag to the user, which in this case is

    <img width="120" src="Captcha.jsp"></img>
    But you also have to present the user with a text input box for the captcha text.
    Your curl script will need a submit option in case a captcha is present.

    So the flow is something like this:

    1. Curl loads the remote page.
    2. If curl finds a captcha, the script serves an HTML page with the captcha image URL and a text box.
    3. The user submits the captcha text, and the curl script proceeds from where it left off.
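
The flow above could be sketched roughly as follows. One assumption worth stating loudly: a captcha image is usually tied to the session that requested the page, so pointing the user's browser straight at `Captcha.jsp` may show a different image than the one the curl session is expected to solve. This sketch therefore fetches the image through the same cookie jar and inlines it in the form. Both function names, the `captcha` field name, and the default MIME type are placeholders, not anything the target site defines.

```php
<?php
/**
 * Fetch a URL using the same cookie file as the rest of the curl session.
 * Returns the raw response body, or null on failure.
 */
function fetchWithSession(string $url, string $cookieFile): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send cookies stored earlier
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // keep any new cookies
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

/**
 * Step 2 of the flow: a form with the captcha inlined as a data URI and a text box.
 * The MIME type is a guess; adjust it to whatever the server actually returns.
 */
function buildCaptchaForm(string $imageBytes, string $mime = 'image/jpeg'): string
{
    $dataUri = 'data:' . $mime . ';base64,' . base64_encode($imageBytes);
    return '<form method="post">'
        . '<img width="120" src="' . htmlspecialchars($dataUri) . '" alt="captcha">'
        . '<input type="text" name="captcha">'
        . '<input type="submit" value="Submit">'
        . '</form>';
}
```

On a GET request the script would echo `buildCaptchaForm()` with the bytes returned by `fetchWithSession()`; on the following POST it would read `$_POST['captcha']` and include it in the final curl submission, reusing the same cookie file.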
