How can I log in to Stack Exchange using curl?

送分小仙女 submitted on 2021-02-07 23:52:19

Question


I would like to log in to a remote website from Terminal. The site requires a username and password to log in.

So I first tried to log in to one of the Stack Exchange sites. According to this answer, you use -u username:password to add your credentials.

So I tried the following:

USERNAME="mine@gmail.com"
PASSWORD="myPassword"

URL="https://sustainability.stackexchange.com/"
curl $URL -u $USERNAME:$PASSWORD

But the page that comes back is not the one a logged-in user sees. It is the page shown to an unauthenticated visitor, with a Sign-up button.

I suspect that -u works only where the site asks for your credentials in a pop-up dialog when you try to access it.

So how can I log in in these cases from within Terminal?


Answer 1:


Unfortunately, the login protocol is much more complex than that, and it is not a scheme built into curl. This is not a job for the curl command alone but for a scripting language (like PHP or Python), though libcurl would be a great help for managing the HTTP protocol, cookies, and the like, and libxml2 would help parse out the login CSRF key, which is hidden in the HTML. They may also require a Referer header, and they may even check that the Referer is real, not faked (I don't know, but it wouldn't surprise me).
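
To see why -u on its own gets you nowhere here: by default, curl's -u option only adds an HTTP Basic Authorization header to the request, and a site with a form-based login has no reason to look at it. A minimal Python sketch of what that header contains (the helper name basic_auth_header and the user/pass values are made up for illustration):

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value used by HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# The credentials are only base64-encoded; no form, cookie, or CSRF token is involved.
print(basic_auth_header("user", "pass"))
```

That header is what the browser sends after you fill in a credentials pop-up, which is a different mechanism from posting a login form.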

First, make a plain, normal HTTP GET request to https://sustainability.stackexchange.com/users/login, and make sure to save the cookies and the HTML response. Now extract the POST URL and the input elements of the form with id login-form; this includes the CSRF token, username, password, and a bunch of others. Then make an application/x-www-form-urlencoded POST request to https://sustainability.stackexchange.com/users/login, with the cookies received from the first GET request and the POST data of all the <input> elements you extracted, and remember to fill out the "email" and "password" inputs.

Now you should get the logged-in HTML. To keep getting the logged-in version of the page, make sure to send the same session cookie with the following HTTP requests (it's this session cookie that makes the website remember you as the guy who logged in to that account).
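
The steps above could look roughly like this in Python, using only the standard library. Treat it as a sketch under assumptions: the names LoginFormParser, parse_login_form, and login are invented for this example, the form id login-form and the email/password field names are taken from the description above, and the login() function has not been run against the live site (which may add captchas or other checks).

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser

LOGIN_URL = "https://sustainability.stackexchange.com/users/login"


class LoginFormParser(HTMLParser):
    """Collect the action URL and <input> name/value pairs of form#login-form."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.action = None
        self.inputs = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "login-form":
            self.in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Skip button-like inputs; only data fields are posted back.
            if (attrs.get("type") or "").lower() not in ("button", "submit"):
                self.inputs[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def parse_login_form(html):
    """Return (action, inputs) for the login form found in the given HTML."""
    parser = LoginFormParser()
    parser.feed(html)
    return parser.action, parser.inputs


def login(email, password):
    """GET the form, then POST it back through the same cookie jar (untested sketch)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    with opener.open(LOGIN_URL) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    action, inputs = parse_login_form(page)
    inputs["email"] = email
    inputs["password"] = password
    # Passing bytes as data makes urllib send an urlencoded POST.
    body = urllib.parse.urlencode(inputs).encode("ascii")
    post_url = urllib.parse.urljoin(LOGIN_URL, action or "")
    request = urllib.request.Request(post_url, data=body, headers={"Referer": LOGIN_URL})
    with opener.open(request) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # Reuse the returned opener for later requests so the session cookie is sent.
    return html, opener
```

Calling login("you@example.com", "secret") would perform both requests; nothing is sent just by defining the functions.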

Here's an example in PHP using libcurl and libxml2. It uses PHP's DOMDocument as a convenience wrapper around libxml2, and hhb_curl from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php as a convenience wrapper around libcurl that takes care of cookies, referers, and libcurl error handling (it turns silent libcurl errors into exceptions, and more). At the end, it dumps the logged-in HTML, proving that it's logged in. The email/password provided belong to a dummy account for testing, so there's no problem with it being compromised, which obviously happens when I post the credentials here:

<?php
declare(strict_types = 1);
require_once ('hhb_.inc.php');
$hc = new hhb_curl ( 'https://sustainability.stackexchange.com/users/login', true );
// getting a cookie session, CSRF token, and a referer:
$hc->exec ();
// hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () );
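$domd = new DOMDocument ();
// loadHTML() is an instance method (calling it statically fails on PHP 8);
// @ suppresses the warnings libxml2 emits for real-world, not-quite-valid HTML.
@$domd->loadHTML ( $hc->getResponseBody () );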
$inputs = array ();
$form = $domd->getElementById ( "login-form" );
$url = $form->getAttribute ( "action" );
if (! parse_url ( $url, PHP_URL_HOST )) {
    $url = 'https://' . rtrim ( parse_url ( $hc->getinfo ( CURLINFO_EFFECTIVE_URL ), PHP_URL_HOST ), '/' ) . '/' . ltrim ( $url, '/' );
}
// hhb_var_dump ( $url, $hc->getStdErr (), $hc->getStdOut () ) & die ();

foreach ( $form->getElementsByTagName ( "input" ) as $input ) {
    if (false !== stripos ( $input->getAttribute ( "type" ), 'button' ) || false !== stripos ( $input->getAttribute ( "type" ), 'submit' )) {
        // not sure why, but buttons, even ones with names and values, are ignored by the browser when logging in,
        // guess it's safest to follow suit.
        continue;
    }
    // var_dump ( $input->getAttribute ( "type" ) );
    $inputs [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
}
assert ( ! empty ( $inputs ['fkey'] ), 'failed to extract the csrf token!' );
$inputs ['email'] = 'vs5jkqyx4hw3seqr@my10minutemail.com';
$inputs ['password'] = 'TestingAccount123';
$hc->setopt_array ( array (
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => http_build_query ( $inputs ),
        CURLOPT_URL => $url 
) );
$hc->exec ();

hhb_var_dump ( $inputs, $hc->getStdErr (), $hc->getStdOut () );

Interesting note: in PHP, passing an array to CURLOPT_POSTFIELDS makes libcurl send the POST data as multipart/form-data, but this site (and most sites, really) uses application/x-www-form-urlencoded on POST requests. Here I used PHP's http_build_query() to encode the POST data in application/x-www-form-urlencoded format, which is what gets sent when CURLOPT_POSTFIELDS is a string.
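
To make the encoding concrete, here is a small Python sketch of what an application/x-www-form-urlencoded body looks like. The field values are invented (a real fkey comes from the login form); urllib.parse.urlencode plays the role that http_build_query() plays in the PHP script.

```python
import urllib.parse

# Hypothetical field values, for illustration only.
fields = {"fkey": "abc123", "email": "me@example.com", "password": "p&ss word"}

# Special characters are percent-encoded and spaces become "+",
# so "&" and "=" inside values cannot be confused with separators.
body = urllib.parse.urlencode(fields)
print(body)
```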




Answer 2:


You can do it with the browser's developer tools. Log in with Chrome, open the developer tools (View > Developer > JavaScript Console), switch to the Network tab, right-click the request, and choose Copy > "Copy as cURL". This copies the request with its cookies and all headers:

Normally we log in with curl this way, saving the cookies with -c and sending them back with -b:

curl -c cookie.txt -d "LoginName=username" -d "password=changepassword" https://examplesite/a
curl -b cookie.txt https://examplesite/b

The command copied via right click will be very long (of course, I changed the values to prevent myself getting hacked):

curl 'https://meta.stackoverflow.com/' -H 'pragma: no-cache' -H 'accept-encoding: gzip, deflate, sdch, br' -H 'accept-language: en-US,en;q=0.8' -H 'upgrade-insecure-requests: 1' -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'cache-control: no-cache' -H 'authority: meta.stackoverflow.com' -H 'cookie: prov=xxxxxxxxxxx; __qca=P0-xxxxxxx-xxxxxx; acct=t=xxxxxxxxxxxx; _ga=GA1.2.xxxxxxxx; _gid=GA1.2.xxxxxxx; _ga=GA1.3.xxxxxxx; _gid=xxxxxxxxx9' -H 'referer: https://meta.stackoverflow.com/' --compressed
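
The copied command works because it carries the browser's cookie header along. If you would rather replay that from a script, a minimal Python sketch could look like this; the function name request_with_browser_cookies is made up, and the cookie string is the same kind of masked placeholder as above.

```python
import urllib.request


def request_with_browser_cookies(url, cookie_header):
    """Build a GET request that carries a Cookie header copied from the browser."""
    return urllib.request.Request(
        url,
        headers={
            "Cookie": cookie_header,
            "User-Agent": "Mozilla/5.0",
        },
    )


req = request_with_browser_cookies(
    "https://meta.stackoverflow.com/",
    "prov=xxxxxxxxxxx; acct=t=xxxxxxxxxxxx",
)
# urllib.request.urlopen(req) would actually send it; that call is left out here.
```

Keep in mind that such a cookie string is as sensitive as your password for as long as the session stays valid.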



Answer 3:


The URL for login is not https://sustainability.stackexchange.com/; it is https://sustainability.stackexchange.com/users/login

and the answer you refer to says

curl -u username:password $URL

not

curl $URL -u username:password

Try

USERNAME="mine@gmail.com"
PASSWORD="myPassword"

URL="https://sustainability.stackexchange.com/users/login"
curl -u $USERNAME:$PASSWORD $URL

Update

Stack Exchange requires an additional key for login, called fkey. If you inspect the login form in the browser, you can see a hidden input field named fkey whose value is a hash. It is required to identify the session and to prevent forged login attempts.




Answer 4:


If you (using Chrome) look at the login form on the login page (right click, inspect, look at the html), you learn that the form is posting email and password fields to https://sustainability.stackexchange.com/users/login.

The way to do that with curl is:

curl https://sustainability.stackexchange.com/users/login -d "email=test@test.com&password=monkey"

If you dig through the html that is returned you can see that this is an invalid login.

The problem is that if you want to use a logged-in session in a subsequent call, you need to store the session cookie that you get from the site in order to make that subsequent call. Looking at the curl man page, you can see there's a -c <cookie_jar_file> option. If you pass that in with a filename, it should save the cookies from the login call, and you should be able to make subsequent calls using the session you've established, and you should be in business.

EDIT: the other answers and comments here have pointed out a couple of things missing from this answer. It's necessary to get and subsequently post the CSRF key, and to use the correct MIME type for the POST. It's certainly possible to do this on the command line, but it would be a lot easier using a more complete language (per the accepted answer). I did find this question that has suggestions for a tool that could potentially be used to chop out the HTML/XML snippets that would be needed to make it work: https://superuser.com/questions/528709/command-line-css-selector-tool/528728
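
The same idea of saving cookies to a file and reusing them later can be sketched in Python with the standard library's MozillaCookieJar, which stores cookies in the Netscape cookies.txt format. The helper names make_session and session_cookie are invented for this example, the cookie is built by hand only so the round trip can be shown without a network call, and whether the resulting file is fully interchangeable with one written by curl is an assumption that has not been checked here.

```python
import http.cookiejar
import os
import tempfile
import urllib.request


def make_session(cookie_file):
    """Return (opener, jar); the jar is loaded from cookie_file if it exists."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # ignore_discard=True keeps session cookies, which have no expiry date.
        jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def session_cookie(name, value, domain):
    """Build a session cookie by hand (normally a Set-Cookie response does this)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value, port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True, secure=False, expires=None,
        discard=True, comment=None, comment_url=None, rest={},
    )


cookie_file = os.path.join(tempfile.mkdtemp(), "cookie.txt")

opener, jar = make_session(cookie_file)      # first run: the jar starts empty
jar.set_cookie(session_cookie("sid", "abc123", "example.com"))
jar.save(ignore_discard=True)                # roughly what -c cookie.txt does

opener2, jar2 = make_session(cookie_file)    # later run: cookies are read back
names = [c.name for c in jar2]
print(names)
```

Requests made through opener2 would then send the stored cookies to matching hosts, much like passing the saved file back to curl for the next call.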



Source: https://stackoverflow.com/questions/44154326/how-can-i-log-in-to-stack-exchange-using-curl
