Curl to Access a Page That Requires a Login from a Different Page

The web site likely uses cookies to store your session information. When you run

curl --user user:pass https://xyz.example/a  #works ok
curl https://xyz.example/b #doesn't work

curl runs twice, as two separate sessions. So when the second command runs, the cookies set by the first command are not available; it's just as if you logged in to page a in one browser session and then tried to access page b in a different one.

What you need to do is save the cookies created by the first command:

curl --user user:pass --cookie-jar ./somefile https://xyz.example/a

and then read them back in when running the second:

curl --cookie ./somefile https://xyz.example/b

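Alternatively, you can try requesting both pages in a single curl command. Once curl's cookie engine is enabled (for example by passing --cookie-jar), it keeps cookies in memory for the length of that invocation, so cookies set by the first response should be sent along with the second request.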
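The save-then-reuse steps above can be wrapped in a small shell function. This is only a minimal sketch: the function name login_then_fetch and the CURL override variable are made up for illustration, and it assumes the site accepts credentials via --user exactly as in the commands above.

```shell
#!/bin/sh
# Sketch only: login_then_fetch and the CURL variable are illustrative names,
# not part of curl. CURL can be overridden to substitute another binary.
CURL="${CURL:-curl}"

# Usage: login_then_fetch LOGIN_URL TARGET_URL USER:PASS
login_then_fetch() {
    login_url=$1
    target_url=$2
    credentials=$3

    # Temporary cookie jar shared by both requests.
    jar=$(mktemp) || return 1

    # First request: authenticate and save any cookies the server sets.
    if ! "$CURL" --silent --fail --user "$credentials" \
            --cookie-jar "$jar" --output /dev/null "$login_url"; then
        rm -f "$jar"
        return 1
    fi

    # Second request: send the saved cookies back with the request.
    "$CURL" --silent --fail --cookie "$jar" "$target_url"
    status=$?
    rm -f "$jar"
    return "$status"
}
```

It would be called as login_then_fetch https://xyz.example/a https://xyz.example/b user:pass, and the body of page b goes to standard output. The temporary jar is deleted afterwards, so nothing session-related is left on disk.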

Login with curl and move to another page

If you want to go to /buy after you log in, just reuse the same cURL handle and issue another request for that page. cURL retains the cookies for the lifetime of the handle (and across later runs as well, since you are saving them to a file and reading them back via the cookie jar).

For example:



$user_agent = "Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20140319 Firefox/24.0 Iceweasel/24.4.0";
$curl_crack = curl_init();

curl_setopt($curl_crack, CURLOPT_URL, "https://www.vininspect.com/en/account/login");
curl_setopt($curl_crack, CURLOPT_USERAGENT, $user_agent);
curl_setopt($curl_crack, CURLOPT_PROXY, "183.78.169.60:37899");
curl_setopt($curl_crack, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
curl_setopt($curl_crack, CURLOPT_POST, true);
curl_setopt($curl_crack, CURLOPT_POSTFIELDS, "LoginForm[email]=naceriwalid%40hotmail.com&LoginForm[password]=passwordhere&toploginform[rememberme]=0&yt1=&toploginform[rememberme]=0");
curl_setopt($curl_crack, CURLOPT_RETURNTRANSFER, true); // return the response body instead of printing it
curl_setopt($curl_crack, CURLOPT_FOLLOWLOCATION, true); // follow redirects after the login POST
curl_setopt($curl_crack, CURLOPT_COOKIEFILE, "cookie.txt"); // file cookies are read from (use the full path); also enables cookie handling
curl_setopt($curl_crack, CURLOPT_COOKIEJAR, "cookie.txt"); // file cookies are written to when the handle is closed (use the full path)
curl_setopt($curl_crack, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curl_crack, CURLOPT_TIMEOUT, 30);

$exec = curl_exec($curl_crack);
// look for any of these phrases anywhere in the response body
if ($exec !== false && preg_match("/you are logged|logout|successfully logged/i", $exec))
{
    $post = array('search' => 'keyword', 'abc' => 'xyz');

    curl_setopt($curl_crack, CURLOPT_POST, 1); // the next request is also a POST
    curl_setopt($curl_crack, CURLOPT_POSTFIELDS, http_build_query($post)); // set post data
    curl_setopt($curl_crack, CURLOPT_URL, 'http://example.com/buy'); // set url for next request

    $exec = curl_exec($curl_crack); // make request to buy on the same handle with the current login session
}
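The success check in the PHP example (does the response contain a phrase like "logout"?) can also be done from the command line. A minimal sketch, assuming the same marker phrases; the helper name looks_logged_in, the URLs, and the example email in the usage comment are placeholders, not anything a particular site defines.

```shell
#!/bin/sh
# Sketch only: looks_logged_in is an illustrative helper name.
# Reads an HTML response on stdin and succeeds (exit status 0) if it contains
# one of the marker phrases from the PHP example, matched case-insensitively.
looks_logged_in() {
    grep -Eiq 'you are logged|logout|successfully logged'
}

# Possible usage (URLs and form values are placeholders):
#   curl --silent --location --cookie-jar cookie.txt \
#        --data-urlencode 'LoginForm[email]=user@example.com' \
#        --data-urlencode 'LoginForm[password]=passwordhere' \
#        https://xyz.example/login | looks_logged_in &&
#   curl --silent --cookie cookie.txt \
#        --data 'search=keyword&abc=xyz' https://xyz.example/buy
```

The second request only runs if the login response looks successful, which mirrors the if block in the PHP code.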

Here are some other examples of using PHP & cURL to make multiple requests:

How to login in with Curl and SSL and cookies (links to multiple other examples)

Grabbing data from a website with cURL after logging in?

Pinterest login with PHP and cURL not working

Login to Google with PHP and Curl, Cookie turned off?

PHP Curl - Cookies problem

How can I log in to Stack Exchange using curl?

Unfortunately, the login protocol is much more complex than that, and it is not a scheme built in to curl. This is not a job for curl alone but for a scripting language (like PHP or Python), though libcurl is a great help for managing the HTTP protocol, cookies, and the like, and libxml2 helps with parsing out the login CSRF key, which is hidden in the HTML. The site may also require a Referer header, and may even check that the Referer is real, not faked (I don't know, but it wouldn't surprise me).

First, make a plain HTTP GET request to https://sustainability.stackexchange.com/users/login, and make sure to save the cookies and the HTML response. Next, extract the POST URL and the input elements of the form with id login-form; these include the CSRF token, username, password, and a bunch of others. Then make an application/x-www-form-urlencoded POST request to https://sustainability.stackexchange.com/users/login, sending the cookies received from the first GET request and the POST data of all the <input> elements you extracted, and remember to fill in the "email" and "password" inputs.

Now you should get the logged-in HTML. To keep getting the logged-in version of the page, send the same session cookie with every subsequent HTTP request (it's this session cookie that makes the website remember you as the user who logged in to that account).
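For simple cases, the three steps above can be approximated with the curl command-line tool plus sed. This is a rough sketch, not a tested login for any real site: the names extract_input_value and se_login are made up, the sed pattern assumes the name attribute comes before the value attribute on one line, and only three fields are posted even though the real form may need the others as well. A proper HTML parser is more robust.

```shell
#!/bin/sh
# Sketch only: extract_input_value and se_login are illustrative names.
CURL="${CURL:-curl}"

# Print the value attribute of the first <input> whose name attribute is $1.
# Assumes name="..." precedes value="..." on the same line, in double quotes.
extract_input_value() {
    sed -n "s/.*name=\"$1\"[^>]*value=\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Usage: se_login LOGIN_URL EMAIL PASSWORD COOKIE_JAR
se_login() {
    login_url=$1
    email=$2
    password=$3
    jar=$4

    # Step 1: GET the login page, saving the cookies and keeping the HTML.
    page=$("$CURL" --silent --cookie-jar "$jar" "$login_url") || return 1

    # Step 2: pull the CSRF token (the fkey input) out of the HTML.
    fkey=$(printf '%s\n' "$page" | extract_input_value fkey)
    if [ -z "$fkey" ]; then
        echo 'failed to extract the CSRF token' >&2
        return 1
    fi

    # Step 3: POST the form back, urlencoded, with the cookies from step 1.
    # Assumption: the form posts back to the login URL itself.
    "$CURL" --silent --location --cookie "$jar" --cookie-jar "$jar" \
        --referer "$login_url" \
        --data-urlencode "fkey=$fkey" \
        --data-urlencode "email=$email" \
        --data-urlencode "password=$password" \
        "$login_url"
}
```

After a successful login, later requests can reuse the session by passing the same file with --cookie.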

Here's an example in PHP, using libcurl and libxml2. It uses PHP's DOMDocument as a convenience wrapper around libxml2, and hhb_curl from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php as a convenience wrapper around libcurl, which takes care of cookies, referers, and libcurl error handling (it turns silent libcurl errors into exceptions, and more). At the end, it dumps the logged-in HTML, proving that it's logged in. (The email/password provided belong to a dummy account for testing; there's no problem in it being compromised, which obviously happens when I post the credentials here.):

<?php
declare(strict_types = 1);
require_once ('hhb_.inc.php');
$hc = new hhb_curl ( 'https://sustainability.stackexchange.com/users/login', true );
// getting a cookie session, CSRF token, and a referer:
$hc->exec ();
// hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () );
$domd = new DOMDocument ();
@$domd->loadHTML ( $hc->getResponseBody () ); // @ silences warnings about malformed HTML
$inputs = array ();
$form = $domd->getElementById ( "login-form" );
$url = $form->getAttribute ( "action" );
if (! parse_url ( $url, PHP_URL_HOST )) {
    // the form action is relative: prepend the scheme and host of the login page
    $url = 'https://' . rtrim ( parse_url ( $hc->getinfo ( CURLINFO_EFFECTIVE_URL ), PHP_URL_HOST ), '/' ) . '/' . ltrim ( $url, '/' );
}
// hhb_var_dump ( $url, $hc->getStdErr (), $hc->getStdOut () ) & die ();

foreach ( $form->getElementsByTagName ( "input" ) as $input ) {
    if (false !== stripos ( $input->getAttribute ( "type" ), 'button' ) || false !== stripos ( $input->getAttribute ( "type" ), 'submit' )) {
        // not sure why, but buttons, even ones with names and values, are ignored by the browser when logging in,
        // guess it's safest to follow suit.
        continue;
    }
    // var_dump ( $input->getAttribute ( "type" ) );
    $inputs [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
}
assert ( ! empty ( $inputs ['fkey'] ), 'failed to extract the csrf token!' );
$inputs ['email'] = 'vs5jkqyx4hw3seqr@my10minutemail.com';
$inputs ['password'] = 'TestingAccount123';
$hc->setopt_array ( array (
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => http_build_query ( $inputs ),
    CURLOPT_URL => $url
) );
$hc->exec ();

hhb_var_dump ( $inputs, $hc->getStdErr (), $hc->getStdOut () );

An interesting note: PHP's curl extension sends the POST body as multipart/form-data when CURLOPT_POSTFIELDS is given an array, but this site (and most sites, really) expects application/x-www-form-urlencoded on POST requests. Here I used PHP's http_build_query() to encode the POST data as an application/x-www-form-urlencoded string.


