How to access a site via a headless driver without being denied permission

粉色の甜心 2020-12-18 09:00

I am trying to retrieve the HTML of a site using a headless Chrome driver. However, I get a "permission denied" message. If I use a "regular" (non-headless) driver, it all works fine.

2 answers
  • 2020-12-18 09:51

    Adding in the following code snippet got the page to return for me:

    user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36'    
    chrome_options.add_argument('user-agent={0}'.format(user_agent))
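
    For context, here is a minimal sketch of how that snippet might fit into a complete script. It assumes Selenium 4 with a matching chromedriver available; the helper names (`build_chrome_arguments`, `make_headless_driver`, `fetch_page_source`) are made up for illustration and are not part of Selenium.

```python
def build_chrome_arguments(user_agent):
    """Return Chrome command-line arguments for a headless run with a custom user agent."""
    return ["--headless", "user-agent={0}".format(user_agent)]


def make_headless_driver(user_agent):
    """Create a headless Chrome driver that reports the given user agent."""
    # Imported lazily so build_chrome_arguments works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    for argument in build_chrome_arguments(user_agent):
        chrome_options.add_argument(argument)
    return webdriver.Chrome(options=chrome_options)


def fetch_page_source(url, user_agent):
    """Load url in a headless driver and return the page HTML."""
    driver = make_headless_driver(user_agent)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always release the browser process, even if the page load fails.
        driver.quit()
```

    Calling `fetch_page_source(url, user_agent)` with the user agent string above should then return the HTML, provided the site only looks at the user agent.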
    

    The site appears to detect headless browsers, most likely from the user agent string, and deny them access. Here's an article on avoiding detection: Making Chrome Headless Undetectable

    To get the user agent being used by the driver you can run the following command:

    driver.execute_script("return navigator.userAgent")
    

    Chrome's headless user agent looks something like this:

    u'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/71.0.3578.98 Safari/537.36'
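
    The only difference from a regular Chrome user agent is the `HeadlessChrome` token. As a rough sketch, one way to avoid hard-coding a user agent is to read the driver's own value and strip that token. The function name `strip_headless_marker` is hypothetical, and sites that use other detection signals will not be fooled by this alone.

```python
def strip_headless_marker(user_agent):
    """Replace the 'HeadlessChrome' product token with plain 'Chrome'."""
    return user_agent.replace("HeadlessChrome", "Chrome")


headless_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/71.0.3578.98 Safari/537.36"
)
regular_ua = strip_headless_marker(headless_ua)
# regular_ua now contains "Chrome/71.0.3578.98" instead of "HeadlessChrome/71.0.3578.98"
```

    In practice the input would come from `driver.execute_script("return navigator.userAgent")`, and the result would be passed to `chrome_options.add_argument` when creating the next driver.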

  • 2020-12-18 10:00

    You have to change the user agent in code.

    If you send a lot of requests, you should change the user-agent value on every request. There are many libraries in Python and other languages that help with this. See the link below for how to do it:

    Way to change Google Chrome user agent in Selenium?
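
    A minimal sketch of rotating the user agent per request, without any extra library. It assumes a Chrome driver from Selenium 4, which exposes `execute_cdp_cmd`, and uses the DevTools `Network.setUserAgentOverride` command; the `USER_AGENTS` pool and the function name are illustrative only.

```python
import itertools

# Hypothetical pool built from the strings quoted in this thread; use your own values.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
]


def fetch_with_rotating_user_agent(driver, urls, user_agents=USER_AGENTS):
    """Load each URL with the next user agent from the pool and return the page sources."""
    agents = itertools.cycle(user_agents)  # wraps around when the pool is exhausted
    pages = []
    for url in urls:
        # Override the user agent for subsequent requests made by this driver.
        driver.execute_cdp_cmd(
            "Network.setUserAgentOverride", {"userAgent": next(agents)}
        )
        driver.get(url)
        pages.append(driver.page_source)
    return pages
```

    The `driver` argument is expected to be an already created Chrome driver; the caller remains responsible for calling `driver.quit()` afterwards.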
