Sending an ASP.net POST with Python's Requests

野性不改 2020-12-29 13:51

I'm scraping an old ASP.net website using Python's requests module.

I've spent 5+ hours trying to figure out how to simulate this POST request to no avail. Doing

1 Answer
  • 2020-12-29 14:18

    You have too many request parameters, and should not set the content-type, content-length, host, origin, or connection headers; leave those to requests to set.
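One way to see which headers requests fills in by itself is to prepare a request without sending it. This is only a sketch: the form fields are placeholders picked from the payload further down, and only Content-Type and Content-Length are inspected here.

```python
import requests

# Placeholder form fields, for illustration only.
form_fields = {"__CALLBACKPARAM": "startvr", "keywords": "Search our site"}

request = requests.Request(
    "POST",
    "http://www.example.com/EN/items/Pages/yourrates.aspx",
    data=form_fields,
)
prepared = request.prepare()  # builds the request; nothing is sent

# Both values were derived from the form data, not set by hand.
content_type = prepared.headers["Content-Type"]
content_length = prepared.headers["Content-Length"]
print(content_type, content_length)
print(prepared.body)
```

Because the length is computed from the encoded body, a hand-written Content-Length copied from a browser capture can easily disagree with what is actually sent.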

    You are also doubling up the URL parameters; either add the vr parameter to the URL manually or use params, not both.
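The doubling is easy to reproduce offline by preparing the request and looking at the final URL. In this sketch the value 1 for vr is a made-up placeholder.

```python
import requests

base_url = "http://www.example.com/EN/items/Pages/yourrates.aspx"

# vr is in the URL *and* in params, so it ends up in the query string twice.
doubled = requests.Request("GET", base_url + "?vr=1", params={"vr": 1}).prepare().url

# vr is passed through params only.
single = requests.Request("GET", base_url, params={"vr": 1}).prepare().url

print(doubled)
print(single)
```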

    Some of the parameters in the POST body may well be generated by the ASP application and tied to a session. I'd use a Session object to send a GET request to validation_url first, then parse the form in that page to extract the __CALLBACKID parameter. The requests Session stores any cookies the server sets and reuses them on later requests:

    import requests
    from bs4 import BeautifulSoup

    item_request_headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
        "Accept": "*/*",
        "Accept-Encoding": "gzip,deflate,sdch",
        "Accept-Language": "en-US,en;q=0.8"
    }
    payload = {"vr": int(item_number[0])}
    
    # Session() takes no arguments; set the default headers on the instance
    session = requests.Session()
    session.headers.update(item_request_headers)
    
    # Get form page
    form_response = session.get(validation_url, params=payload) 
    
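    # Parse the form page; BeautifulSoup is one way to do this
    soup = BeautifulSoup(form_response.content, 'html.parser')
    callbackid = soup.select('input[name=__CALLBACKID]')[0]['value']
    # Assumption: __VIEWSTATE is a hidden input on the same form page
    viewstate = soup.select('input[name=__VIEWSTATE]')[0]['value']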
    
    item_request_body = {
        "__SPSCEditMenu": "true",
        "MSOWebPartPage_PostbackSource": "",
        "MSOTlPn_SelectedWpId": "",
        "MSOTlPn_View": 0,
        "MSOTlPn_ShowSettings": "False",
        "MSOGallery_SelectedLibrary": "",
        "MSOGallery_FilterString": "",
        "MSOTlPn_Button": "none",
        "__EVENTTARGET": "",
        "__EVENTARGUMENT": "",
        "MSOAuthoringConsole_FormContext": "",
        "MSOAC_EditDuringWorkflow": "",
        "MSOSPWebPartManager_DisplayModeName": "Browse",
        "MSOWebPartPage_Shared": "",
        "MSOLayout_LayoutChanges": "",
        "MSOLayout_InDesignMode": "",
        "MSOSPWebPartManager_OldDisplayModeName": "Browse",
        "MSOSPWebPartManager_StartWebPartEditingName": "false",
        "__VIEWSTATE": viewstate,
        "keywords": "Search our site",
        "__CALLBACKID": callbackid,
        "__CALLBACKPARAM": "startvr"
    }
    
    item_url = 'http://www.example.com/EN/items/Pages/yourrates.aspx'
    
    response = session.post(url=item_url, params=payload, data=item_request_body,
                            headers={'Referer': form_response.url})
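
Rather than hard-coding every hidden field, the form page could be scanned for all hidden inputs and the result used as the starting point for the POST body. The sketch below swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages. The HiddenInputCollector class and the sample HTML are invented for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing value attribute is treated as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


# Invented sample markup; a real page would come from form_response.text.
sample_html = """
<form method="post" action="yourrates.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="text" name="keywords" value="Search our site" />
</form>
"""

collector = HiddenInputCollector()
collector.feed(sample_html)
hidden_fields = collector.fields
print(hidden_fields)
```

The collected dictionary can then be updated with the few fields that need specific values, such as __CALLBACKPARAM, before it is passed as data.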
    

    The session supplies the shared headers (the User-Agent and the Accept headers) on every request; the Referer header is added only on the POST.
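How session-level and per-request headers combine can be checked without touching the network by asking the session to prepare a request. This is a sketch only: the User-Agent string and the Referer URL are made-up placeholders.

```python
import requests

session = requests.Session()
# Session-level defaults, sent with every request made through this session.
session.headers.update({"User-Agent": "example-agent/1.0", "Accept": "*/*"})

request = requests.Request(
    "POST",
    "http://www.example.com/EN/items/Pages/yourrates.aspx",
    data={"__CALLBACKPARAM": "startvr"},
    # Per-request header: applies to this POST only.
    headers={"Referer": "http://www.example.com/form-page"},
)
prepared = session.prepare_request(request)  # merges session and request headers

print(prepared.headers["User-Agent"])
print(prepared.headers["Referer"])
print("Referer" in session.headers)
```

The per-request Referer is merged into this one request and does not leak back into the session defaults.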
