How to get the HTTP status code using Selenium (Python)

深忆病人 2020-11-30 01:01

I am writing a Selenium script in Python, but I can't find any information on this:

How do I get the HTTP status code from Selenium Python code?

12 answers
  • 2020-11-30 01:47

    I will refer you to a question I asked earlier: How to detect when Selenium loads a browser's error page

    The short of it is that unless you want to set up something more elaborate, like a Squid proxy or BrowserMob Proxy, you have to settle for a dirty workaround like the one below.

    Replace

    driver.get( "http://google.com" )
    

    with

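    from selenium.webdriver.common.by import By

    def goTo( url ):
        # Load the page first, then inspect what the browser actually shows
        driver.get( url )
        ids = [ elem.get_attribute("id") for elem in driver.find_elements(By.CSS_SELECTOR, "body > div") ]
        # "errorPageContainer" is the id used on Firefox's own error page
        if "errorPageContainer" in ids:
            raise Exception( "this page is an error" )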
    

    You can get creative and derive the error code from the text displayed in the actual browser. This has to be customized per browser; the check above works for Firefox.

    The only case where this becomes problematic is 404s (page not found), since many sites serve their own error pages and you have to customize the check for each one.

  • 2020-11-30 01:48

    I used the following trick: use requests to make sure that the server is responding first, then load the page with the driver:

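    import requests
    from bs4 import BeautifulSoup

    # Poll with requests until the server answers with 200
    resp = requests.get(link)
    while resp.status_code != 200:
        resp = requests.get(link)

    # Then load the page in the browser and parse its source
    driver.get(link)
    html = driver.page_source

    soup = BeautifulSoup(html, "html.parser")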
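
The loop above retries forever if the server never returns 200. A minimal sketch of a bounded variant follows; the helper name `wait_for_status` and its parameters are hypothetical, not part of requests or Selenium, and the status check is passed in as a callable so the helper does not depend on any particular HTTP library.

```python
import time

def wait_for_status(fetch_status, expected=200, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch_status() until it returns `expected` or attempts run out.

    fetch_status is any zero-argument callable returning a status code.
    Returns the last status code seen, so the caller can decide what to do.
    """
    status = None
    for attempt in range(max_attempts):
        status = fetch_status()
        if status == expected:
            break
        # Pause between attempts, but not after the final one
        if attempt < max_attempts - 1:
            sleep(delay)
    return status
```

With requests this could be called as `wait_for_status(lambda: requests.get(link).status_code)`, and the driver is only used if the returned value is 200.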
    
  • 2020-11-30 01:50

    You can also inspect the last message in the browser log for an error status code: print(browser.get_log('browser')[-1]['message'])
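
To turn such a log message into a number, a small regular expression may be enough. The sketch below assumes the message contains wording like "the server responded with a status of 404 (Not Found)", which is what Chrome typically writes to its console for failed resources; the exact text is an assumption and may differ between browsers and versions. The function name `status_from_console_message` is hypothetical.

```python
import re

# Assumed wording: "... responded with a status of 404 (Not Found)"
_STATUS_RE = re.compile(r"status of (\d{3})")

def status_from_console_message(message):
    """Return the three-digit status found in a console message, or None."""
    match = _STATUS_RE.search(message)
    return int(match.group(1)) if match else None
```

Note that the browser log only records failures, so a missing match does not prove the page returned 200.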

  • 2020-11-30 01:53

    I've been searching the web for about 3 hours and did not find a single way to do that with WebDriver. I have never worked with Selenium directly. The only suggestion that comes to mind is to use the "requests" module, like this:

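    import requests
    from selenium import webdriver
    
    driver = webdriver.Firefox()  # any browser driver works here
    driver.get("url")
    # Fetch the same URL separately to read its status code
    r = requests.get("url")
    print(r.status_code)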
    

    A complete tutorial on using requests is here, and you can install the module with the command pip install requests.

    There is one caveat, though it may not always matter: the driver's request and the requests call are separate, so their responses are not guaranteed to be the same. You only get the status code of the requests response, and if the URL does not respond consistently, this can give wrong results.
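
One way to narrow that gap is to send the browser's cookies and user agent along with the requests call, so the server sees a more similar request. This is only a sketch: the helper name `request_kwargs_from_browser` is hypothetical, and it assumes the cookie list has the shape returned by `driver.get_cookies()` (dictionaries with "name" and "value" keys).

```python
def request_kwargs_from_browser(cookies, user_agent):
    """Build keyword arguments for requests.get() from browser state.

    cookies: list of dicts with "name" and "value" keys
    user_agent: the browser's User-Agent string
    """
    return {
        "cookies": {c["name"]: c["value"] for c in cookies},
        "headers": {"User-Agent": user_agent},
    }

# Possible usage with a live driver (not executed here):
# kwargs = request_kwargs_from_browser(
#     driver.get_cookies(),
#     driver.execute_script("return navigator.userAgent"),
# )
# r = requests.get(url, **kwargs)
```

Even with shared cookies the two requests remain independent, so the status code is still that of the requests call and not of the page the browser loaded.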

  • 2020-11-30 01:55

    I do not have much experience with Python. I have a more detailed Java example here:

    https://stackoverflow.com/a/39979509/5703420

    The idea is to enable performance logging, which triggers "Network.enable" on chromedriver. Then get the performance log entries and parse them for the "Network.responseReceived" message.

        from selenium import webdriver
        from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

        # enable performance logging
        # (newer ChromeDriver versions expect the key 'goog:loggingPrefs' instead)
        d = DesiredCapabilities.CHROME.copy()
        d['loggingPrefs'] = { 'performance':'ALL' }
    
        driver = webdriver.Chrome(executable_path="c:\\windows\\chromedriver.exe", service_args=["--verbose", "--log-path=D:\\temp3\\chromedriverxx.log"], desired_capabilities=d)
    
        driver.get('https://api.ipify.org/?format=text')
    
        print(driver.title)
    
        print(driver.page_source)
    
        # get_log() empties the log buffer, so fetch it once and reuse the list
        performance_log = driver.get_log('performance')
        print(str(performance_log).strip('[]'))
    
        for entry in performance_log:
            print(entry)
    

    The output will contain "Network.responseReceived" entries for your URL, for other requests made during the page load, and for redirect URLs. All you have to do is parse the log entries.

    '{"message":{"method":"Network.responseReceived","params":{"frameId":"9488.1","loaderId":"9488.1","requestId":"9488.1","response":{"connectionId":14,"connectionReused":false,"encodedDataLength":-1,"fromDiskCache":false,"fromServiceWorker":false,"headers":{"Connection":"keep-alive","Content-Length":"13","Content-Type":"text/plain","Date":"Wed, 12 Oct 2016 06:15:47 GMT","Server":"Cowboy","Via":"1.1 vegur"},"headersText":"HTTP/1.1 200 OK\\r\\nServer: Cowboy\\r\\nConnection: keep-alive\\r\\nContent-Type: text/plain\\r\\nDate: Wed, 12 Oct 2016 06:15:47 GMT\\r\\nContent-Length:13\\r\\nVia:1.1vegur\\r\\n\\r\\n","mimeType":"text/plain","protocol":"http/1.1","remoteIPAddress":"54.197.246.207","remotePort":443,"requestHeaders":{"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8","Accept-Encoding":"gzip, deflate, sdch, br","Accept-Language":"en-GB,en-US;q=0.8,en;q=0.6","Connection":"keep-alive","Host":"api.ipify.org","Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"},"requestHeadersText":"GET /?format=text HTTP/1.1\\r\\nHost: api.ipify.org\\r\\nConnection: keep-alive\\r\\nUpgrade-Insecure-Requests: 1\\r\\nUser-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36\\r\\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\\r\\nAccept-Encoding: gzip, deflate, sdch, br\\r\\nAccept-Language: en-GB,en-US;q=0.8,en;q=0.6\\r\\n\\r\\n","securityDetails":{"certificateId":1,"certificateValidationDetails":{"numInvalidScts":0,"numUnknownScts":0,"numValidScts":0},"cipher":"AES_128_GCM","keyExchange":"ECDHE_RSA","protocol":"TLS 1.2","signedCertificateTimestampList":[]},"securityState":"secure","status":200,"statusText":"OK","timing":{"connectEnd":320.508999997401,"connectStart":3.08100000256673,"dnsEnd":3.08100000256673,"dnsStart":0,"proxyEnd":-1,"proxyStart":-1,"pushEnd":0,"pushStart":0,"receiveHeadersEnd":465.725000001839,"requestTime":78246.775045,"sendEnd":320.995999994921,"sendStart":320.825999995577,"sslEnd":320.435000001453,"sslStart":141.675999999279,"workerReady":-1,"workerStart":-1},"url":"https://api.ipify.org/?format=text"},"timestamp":78247.242716,"type":"Document"}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948094, 'level': 'INFO', 'message': '{"message":{"method":"Network.dataReceived","params":{"dataLength":13,"encodedDataLength":171,"requestId":"9488.1","timestamp":78247.243137}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948094, 'level': 'INFO', 'message': '{"message":{"method":"Page.frameNavigated","params":{"frame":{"id":"9488.1","loaderId":"9488.1","mimeType":"text/plain","securityOrigin":"https://api.ipify.org","url":"https://api.ipify.org/?format=text"}}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948095, 'level': 'INFO', 'message': '{"message":{"method":"Network.loadingFinished","params":{"encodedDataLength":171,"requestId":"9488.1","timestamp":78247.242066}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948115, 'level': 'INFO', 'message': '{"message":{"method":"Page.loadEventFired","params":{"timestamp":78247.264169}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948115, 'level': 'INFO', 'message': '{"message":{"method":"Page.frameStoppedLoading","params":{"frameId":"9488.1"}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 147625298116, 'level': 'INFO', 'message': '{"message":{"method":"Page.domContentEventFired","params":{"timestamp":78247.276475}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'},
    {'timestamp': 1476252948122, 'level': 'INFO', 'message': '{"message":{"method":"Network.requestWillBeSent","params":{"documentURL":"https://api.ipify.org/?format=text","frameId":"9488.1","initiator":{"type":"other"},"loaderId":"9488.1","request":{"headers":{"Referer":"https://api.ipify.org/?format=text","User-Agent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"},"initialPriority":"High","method":"GET","mixedContentType":"none","url":"https://api.ipify.org/favicon.ico"},"requestId":"9488.2","timestamp":78247.280131,"type":"Other","wallTime":1476252948.11805}},"webview":"6e8a3b1d-e5aa-40fb-a695-280cbb0ee420"}'}
    

    Then read "status":200 from the JSON message. You can also parse the response "headers".
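
The parsing step can be sketched as follows. Each entry returned by `driver.get_log('performance')` is a dictionary whose "message" value is a JSON string, as in the output above. The function name `status_for_url` is hypothetical, and the sketch assumes the entries have the same nesting as that sample.

```python
import json

def status_for_url(performance_log, url):
    """Return the HTTP status of the first Network.responseReceived
    entry whose response URL equals `url`, or None if there is none."""
    for entry in performance_log:
        # The payload is a JSON string nested under the "message" key
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.responseReceived":
            continue
        response = message["params"]["response"]
        if response.get("url") == url:
            return response.get("status")
    return None
```

For the sample run this would be called as `status_for_url(performance_log, 'https://api.ipify.org/?format=text')`. If the page redirects, the URL in the log may differ from the one passed to `driver.get`, so matching against `driver.current_url` may be the better choice in that case.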

  • 2020-11-30 02:01

    Corey Goldberg wrote a great implementation of a profiler using Selenium and Python that outputs a formatted result. Here is the link:

    http://coreygoldberg.blogspot.com/2009/10/automated-webhttp-profiler-with.html
