Logging into an external site with PHP cURL


#1

Hello,

I'm a fairly new developer. I graduated a month ago from a bootcamp that taught React, Node, and MySQL, and was recently hired by a company to learn PHP. My first project (to learn) is to recreate an existing project they have. I've struggled for the past two weeks, not having known any PHP, and now I have to use cURL.

I am having a handful of issues, one of which is getting around a CAPTCHA that shows up sometimes (supposedly the Distil keys they have given me will solve this issue). The other is that I can't ever seem to get the page to actually log in. I've googled, YouTubed, and looked through books, but cURL often seems to be brushed over quickly.

Any advice or sources of reference that you would recommend? I'm struggling hard.


#2

If you want to use cURL to access file(s) that have a CAPTCHA controlling access, you will need to set up an API, typically with an API username and API hash key. The cURL request sends the API credentials and the server script uses them to authenticate the request.
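
As a rough illustration of that flow, here is a minimal sketch, assuming the site exposes an HTTP API that accepts the credentials as request headers. The header names (X-Api-User, X-Api-Key) and the function names buildApiRequestOptions and fetchFromApi are hypothetical; a real API documents its own scheme.

```php
<?php
// Hypothetical helper: collects the cURL options for one authenticated API call.
// The header names below are illustrative assumptions, not a real API's scheme.
function buildApiRequestOptions(string $url, string $apiUser, string $apiKey): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_HTTPHEADER     => [
            'X-Api-User: ' . $apiUser,
            'X-Api-Key: ' . $apiKey,
        ],
        CURLOPT_TIMEOUT        => 30,   // seconds
    ];
}

// Hypothetical helper: performs the request and returns the response body.
function fetchFromApi(string $url, string $apiUser, string $apiKey): string
{
    $ch = curl_init();
    curl_setopt_array($ch, buildApiRequestOptions($url, $apiUser, $apiKey));

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $body;
}
```

On the server side, the script would read those two headers and check them against its stored credentials before serving the file.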


#3

The whole point of a CAPTCHA is to only allow access from humans. So if the CAPTCHA is at all effective, the cURL request won't get past it.
You would need a back-door to the resource via an API as @Mittineague describes.


#4

Makes sense to me...

Unfortunately, this is an existing project they are having me recreate. So somehow, between generating 5 cookies per proxy, they are making it work. I just can't get it to work...

Awful first project lol


#5

It sounds like your project manager needs to be brought up to date - tactfully, of course.

The reason it "worked" before was because it exploited security weaknesses that have since been remedied.

If the external site is under your control you will need to put together an API for you (and possibly others) to access it. If the site is not under your control, you will need to contact the site and ask them what ways are available for you to access it.


#6

That's what I was kind of thinking. I asked them if the existing project was really working and didn't get a clear answer. I think that's also why they may not be helping me: they think it should work. I think what may happen is that time will go by and they will decide that this process can no longer be automated without using a click farm to get around the CAPTCHA.


#7

So far this has kinda been about my project, but I do have some cURL related questions.

After cURL is initialized, can you execute that handle repeatedly? I'm trying to understand this, and all I can come up with is that it's because it's synchronous. Which brings me to my next question: what is the reason for setting the same curl_setopt option repeatedly if you aren't changing the value?

Thanks for any help,

For example I have been looking at this code:

```
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_PROXY, $proxy_account_2018);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxy_pw_2018);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$htmlContent = curl_exec($ch);
```

**But then, instead of curl_close($ch):**

```
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36');
curl_setopt($ch, CURLOPT_URL, $login_url);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiepath.'bids_'.$user.'.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiepath.'bids_'.$user.'.txt');
$postResult = curl_exec($ch);
```
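
On the handle-reuse question, a minimal sketch may help. It reuses one handle for two transfers and sets CURLOPT_RETURNTRANSFER only once: options set on a handle stay in effect until they are changed or the handle is reset with curl_reset(), so repeating an unchanged option is redundant. To stay runnable without a network, the sketch reads two temporary files through file:// URLs; that assumes Unix-like absolute paths, a libcurl build with file:// support, and no open_basedir restriction. The variable names are illustrative.

```php
<?php
// Two temporary files stand in for two pages on a remote site.
$pageA = tempnam(sys_get_temp_dir(), 'curl');
$pageB = tempnam(sys_get_temp_dir(), 'curl');
file_put_contents($pageA, 'first page');
file_put_contents($pageB, 'second page');

$ch = curl_init('file://' . $pageA);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // set once, before the first transfer

$first = curl_exec($ch);

// Change only the URL. CURLOPT_RETURNTRANSFER is not set again,
// yet it still applies because it persists on the handle.
curl_setopt($ch, CURLOPT_URL, 'file://' . $pageB);
$second = curl_exec($ch);

// Clean up the temporary files.
unlink($pageA);
unlink($pageB);
```

Under those assumptions both calls should return the file contents as strings rather than printing them, which suggests the second CURLOPT_RETURNTRANSFER call in the code above is harmless but unnecessary.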