Returning 403 Forbidden from simple get but loads okay in browser

执笔经年 2020-12-11 23:14

I'm trying to get some data from a page, but the request returns the error [403 Forbidden].

I thought it was the user agent, but I tri

2 Answers
  • 2020-12-11 23:34

    These are all the headers I can see that the browser includes in a generic GET request:

    Host: <URL>
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate, br
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    

    Try including these in your request incrementally (one by one) to identify which of them are required for a successful request.
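The incremental approach could be sketched roughly as below. This is a minimal sketch under stated assumptions, not a definitive implementation: `fetch_status` and `add_headers_incrementally` are hypothetical names, the server is assumed to signal rejection with status 403, and the third-party `requests` package is assumed to be installed. `Host` is left out because `requests` derives it from the URL, and `br` is dropped from `Accept-Encoding` on the assumption that Brotli decoding may not be available.

```python
from typing import Callable, Dict, Optional

# Header values taken from the browser request listed above.
BROWSER_HEADERS: Dict[str, str] = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate",  # assumption: "br" omitted, see lead-in
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
}


def fetch_status(url: str, headers: Dict[str, str]) -> int:
    """Send one GET with the given headers and return the HTTP status code."""
    import requests  # third-party; imported lazily so the helper below works without it

    return requests.get(url, headers=headers, timeout=10).status_code


def add_headers_incrementally(
    url: str,
    candidates: Dict[str, str],
    get_status: Callable[[str, Dict[str, str]], int] = fetch_status,
) -> Optional[Dict[str, str]]:
    """Add headers one at a time; return the first set that is not rejected with 403."""
    sent: Dict[str, str] = {}
    for name, value in candidates.items():
        sent[name] = value
        if get_status(url, dict(sent)) != 403:
            return dict(sent)
    return None  # still 403 with every header: check cookies or other factors
```

Calling `add_headers_incrementally(url, BROWSER_HEADERS)` with your own `url` returns the first accumulated header set that gets through, or `None`. Note that `requests` also sends its own default values for headers you have not set yet.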

    Also take a look at the Cookies and/or Security tabs, available in your browser console / developer tools under the Network option.
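If the Cookies tab shows cookies being sent, one way to replay them is to copy the `Cookie` header value and convert it to a dict. The following is only a sketch: `cookies_from_header` is a hypothetical helper, and the cookie names and values in the usage note are placeholders, not values from the site in question.

```python
from http.cookies import SimpleCookie
from typing import Dict


def cookies_from_header(cookie_header: str) -> Dict[str, str]:
    """Turn a copied 'Cookie:' header value such as 'a=1; b=2' into a dict."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}
```

The resulting dict can then be passed along with the headers, for example `requests.get(url, headers=headers, cookies=cookies_from_header("sessionid=abc123; theme=dark"))`.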

  • 2020-12-11 23:44

    The site could be using anything in the request to trigger the rejection.

    So, copy all the headers from the request that your browser makes. Then delete them one by one [1] to find out which are essential.

    As per the question "Python requests. 403 Forbidden", to add custom headers to the request, do:

    import requests
    result = requests.get(url, headers={'header': 'value'})  # add further header pairs as needed
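To avoid retyping the copied headers by hand, the raw text from the browser's developer tools could be parsed into the dict that `headers=` expects. This is a rough sketch; `parse_copied_headers` is a hypothetical helper, and it assumes each header was copied as one `Name: value` line.

```python
from typing import Dict


def parse_copied_headers(raw: str) -> Dict[str, str]:
    """Parse 'Name: value' lines, as copied from the browser, into a dict."""
    headers: Dict[str, str] = {}
    for line in raw.splitlines():
        # Split on the first colon only, since values may contain colons too.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            headers[name.strip()] = value.strip()
    return headers
```

Blank lines and lines without a header name are skipped, so a block pasted with surrounding whitespace still parses.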
    

    [1] A faster way would be to delete half of them each time instead, but that's more complicated since there are probably multiple essential headers.
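The one-by-one elimination could look roughly like this. It is a sketch under assumptions, not the answerer's code: `find_essential_headers` is a hypothetical name, the full header set is assumed to succeed, and rejection is assumed to show up as status 403. The request itself is passed in as `get_status`, so the loop can be tried without touching the network.

```python
from typing import Callable, Dict


def find_essential_headers(
    url: str,
    headers: Dict[str, str],
    get_status: Callable[[str, Dict[str, str]], int],
) -> Dict[str, str]:
    """Drop headers one at a time, keeping only those whose removal causes a 403."""
    essential = dict(headers)
    for name in list(headers):
        trial = {k: v for k, v in essential.items() if k != name}
        if get_status(url, trial) != 403:
            essential = trial  # still accepted, so this header is not needed
    return essential
```

With `requests`, `get_status` could be `lambda url, h: requests.get(url, headers=h, timeout=10).status_code`. This makes one request per header, so it handles several essential headers without the bookkeeping that halving would need.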
