Scrape a password-protected ASP page

Submitted by 大憨熊 on 2021-01-28 02:54:44


I would like to develop an automatic scraper for a password-protected ASP web page. I have a login and password for this page.

First of all, I looked at the Firebug log while logging in via Firefox. This is what I found:

  1. When I open the login page (http://mysite), I get a cookie named "__RequestVerificationToken".
  2. When I press the Login button, Firefox makes a POST request to http://mysite/Account/Login with the parameters UserName, Password and __RequestVerificationToken; it also sends the cookie saved in step 1.
  3. On successful authorisation I get another cookie, .ASPXAUTH, and am redirected to http://mysite/Account/Index (the page I want to scrape).

My code:

//1. Get __RequestVerificationToken cookie

    $urlLogin = "http://mysite";
    $cookieFile = "cookie.txt";

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $urlLogin);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_VERBOSE, TRUE);
    curl_setopt($ch, CURLOPT_STDERR,$f = fopen("answer.txt", "w+"));
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0' );
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile); 


//2. Parse token value for the post request

preg_match_all('/=(.*); p/i',$hash, $regs);

//3. Make a post request

    $postData = '__RequestVerificationToken='.$regs[1][0].'&UserName=someLogin'.'&Password=somePassword';
    $urlSecuredPage = "http://mysite/Account/Login";
    curl_setopt($ch, CURLOPT_URL, $urlSecuredPage); 
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); 

    $data = curl_exec($ch);

At step 3, the cookie saved in step 1 is overwritten with a new __RequestVerificationToken value. I don't understand why this happens. As a result, I cannot log in because the __RequestVerificationToken is wrong, and I get an HTTP 500 error.

Where am I going wrong?


There should be two parts to __RequestVerificationToken: one in a hidden input value and one in the cookie. The hidden input value is sent with each request, gets a new value on each request, and depends on the cookie value.

So you need to save both the input value and the cookie, and send them back together. If you don't send the value from the hidden input, ASP.NET MVC treats the request as an attack and generates a new cookie. A new cookie is generated only if validation fails or the cookie itself doesn't exist. If you already have that cookie and always send the __RequestVerificationToken input value with the POST request, it shouldn't generate a new cookie.
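
The "save the input value" step can be sketched as follows. This is only an illustration, assuming the login page renders a standard hidden input whose name attribute is __RequestVerificationToken; the helper name extractVerificationToken and the sample HTML are made up for this example and do not come from the original question.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the value of
// the hidden __RequestVerificationToken input, or null if the page has none.
function extractVerificationToken(string $html): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely strict.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select the value attribute of the input with the expected name.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//input[@name="__RequestVerificationToken"]/@value');

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

// Made-up login form, for illustration only.
$sampleHtml = '<form action="/Account/Login" method="post">'
    . '<input name="__RequestVerificationToken" type="hidden" value="abc123" />'
    . '<input name="UserName" type="text" /></form>';

$token = extractVerificationToken($sampleHtml);
echo $token === null ? "no token found\n" : "token: $token\n";
```

Looking the input up by name rather than by attribute order is a design choice for this sketch: it keeps working if the page contains other hidden inputs or the attributes are rendered in a different order.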

If the cookie is still regenerated, then the __RequestVerificationToken you are sending from the hidden input is incorrect. Try sending the same request from Fiddler or Charles and check whether it returns a successful result.

These tokens are used to prevent CSRF attacks.


Big thanks to Sergey Litvinov and hindmost.

The corrected code is below:

$urlLogin = "http://mysite";
$cookieFile = "/Volumes/Media/WebServer/aszh/cookie.txt";

$ch = curl_init();

//Make GET request and get __RequestVerificationToken cookie
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); //return the page instead of printing it
curl_setopt($ch, CURLOPT_VERBOSE, TRUE);
curl_setopt($ch, CURLOPT_STDERR,$f = fopen("/Volumes/Media/WebServer/aszh/answer.txt", "w+"));
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0' );
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);

//Fetch the login page; its HTML contains the hidden token input
$data = curl_exec($ch);


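//Parse answer and get __RequestVerificationToken hidden input value
//([^"]*) stops at the closing quote, so the match cannot run past the value
preg_match_all('/type="hidden" value="([^"]*)" /i', $data, $regs);
$token = $regs[1][0];

//Credentials are the placeholders used in the question
$postData = array('__RequestVerificationToken'=>$token,
                  'UserName'=>'someLogin',
                  'Password'=>'somePassword');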

//Make POST request and get .ASPXAUTH cookie
$urlSecuredPage = "http://mysite/Account/Login";
curl_setopt($ch, CURLOPT_URL, $urlSecuredPage); 
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postData));
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); 

$data = curl_exec($ch);
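
One way to check that the POST actually produced the .ASPXAUTH cookie is to look for it in the cookie jar. The sketch below assumes the jar is in the Netscape cookie-file format that curl writes for CURLOPT_COOKIEJAR; the helper name cookieJarHasCookie and the sample jar contents are hypothetical. Note that libcurl normally writes the jar only when the handle is closed, so the file should be read after that point.

```php
<?php
// Hypothetical helper: looks a cookie up by name in the text of a
// Netscape-format cookie jar.
function cookieJarHasCookie(string $jarContents, string $name): bool
{
    foreach (preg_split('/\R/', $jarContents) as $line) {
        // HttpOnly cookies are stored with a "#HttpOnly_" prefix; strip it so
        // the line is not mistaken for a comment.
        if (strncmp($line, '#HttpOnly_', 10) === 0) {
            $line = substr($line, 10);
        }
        if ($line === '' || $line[0] === '#') {
            continue; // blank line or genuine comment
        }
        // Fields: domain, subdomain flag, path, secure flag, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) >= 7 && $fields[5] === $name) {
            return true;
        }
    }
    return false;
}

// Made-up jar contents, for illustration only.
$sampleJar = "# Netscape HTTP Cookie File\n"
    . "mysite\tFALSE\t/\tFALSE\t0\t__RequestVerificationToken\tabc123\n"
    . "#HttpOnly_mysite\tFALSE\t/\tFALSE\t0\t.ASPXAUTH\tdeadbeef\n";

echo cookieJarHasCookie($sampleJar, '.ASPXAUTH')
    ? "logged in: .ASPXAUTH cookie present\n"
    : "login failed: no .ASPXAUTH cookie\n";
```

With a real jar, the contents would come from something like file_get_contents($cookieFile) once the curl handle has been closed.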

