I am writing a Selenium script in Python, but I can't find any information about:
How to get the HTTP status code from Selenium Python code.
I'm using Java here, as I don't have much experience with Python. Also, I don't know how to get only the HTTP status codes. The following will give you the entire network traffic, and you can extract the status codes from it.
First, start your server with:
selenium.start("captureNetworkTraffic=true");
Then capture your traffic with:
String traffic = selenium.captureNetworkTraffic("xml");
You can get the output as JSON as well.
import json
from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
chromedriver_path = "YOUR/PATH/TO/chromedriver.exe"
url = "https://selenium-python.readthedocs.io/api.html"
capabilities = DesiredCapabilities.CHROME.copy()
capabilities['goog:loggingPrefs'] = {'performance': 'ALL'}
browser = WebDriver(chromedriver_path, desired_capabilities=capabilities)
browser.get(url)
logs = browser.get_log('performance')
Option 1: if you just want to return the status code, under the assumption that the page you want the status code for exists in the log with a 'text/html' content type
def get_status(logs):
    for log in logs:
        if log['message']:
            d = json.loads(log['message'])
            try:
                content_type = 'text/html' in d['message']['params']['response']['headers']['content-type']
                response_received = d['message']['method'] == 'Network.responseReceived'
                if content_type and response_received:
                    return d['message']['params']['response']['status']
            except KeyError:
                # Entry has no response, headers, or content-type key; skip it.
                pass
Usage:
>>> get_status(logs)
200
Option 2: if you wanted to see all status codes in the relevant logs
def get_status_codes(logs):
    statuses = []
    for log in logs:
        if log['message']:
            d = json.loads(log['message'])
            if d['message'].get('method') == "Network.responseReceived":
                statuses.append(d['message']['params']['response']['status'])
    return statuses
Usage:
>>> get_status_codes(logs)
[200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
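A flat list of codes does not say which request each one belongs to. The sketch below pairs every status with its URL, assuming each `Network.responseReceived` entry carries `url` and `status` under `params.response`. The names `get_status_by_url` and `sample_logs` are illustrative, and the sample entries are hand-written rather than captured from a real browser.

```python
import json


def get_status_by_url(logs):
    """Map each response URL found in the performance log to its status code."""
    statuses = {}
    for log in logs:
        if not log.get('message'):
            continue
        message = json.loads(log['message'])['message']
        if message.get('method') != 'Network.responseReceived':
            continue
        response = message['params']['response']
        # If a URL shows up more than once, the last response seen wins.
        statuses[response['url']] = response['status']
    return statuses


# Hand-written entries shaped like browser.get_log('performance') output
# (an assumption for illustration, not real captured data).
sample_logs = [
    {'message': json.dumps({'message': {
        'method': 'Network.responseReceived',
        'params': {'response': {'url': 'https://example.com/', 'status': 200}}}})},
    {'message': json.dumps({'message': {
        'method': 'Network.requestWillBeSent',
        'params': {'request': {'url': 'https://example.com/missing.css'}}}})},
    {'message': json.dumps({'message': {
        'method': 'Network.responseReceived',
        'params': {'response': {'url': 'https://example.com/missing.css', 'status': 404}}}})},
]

print(get_status_by_url(sample_logs))
```

With real logs you would call `get_status_by_url(browser.get_log('performance'))` and look up the URL you navigated to.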
Note 1: much of this is based on @Stefan Matei's answer; however, a few things have changed between Chrome versions, and I provide an idea of how to parse the logs.
Note 2: the ['content-type'] key is not fully reliable. Its casing can change, so inspect the logs for your use case.
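One way to cope with the casing problem is to look the header up case-insensitively. This is only a sketch: `find_header`, `get_html_status`, and the hand-written `sample_logs` are hypothetical names and data, and the entry layout is assumed to match the performance log parsed above.

```python
import json


def find_header(headers, name):
    """Return the value of header `name`, matching the key case-insensitively."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def get_html_status(logs):
    """Return the status of the first text/html response in the log, or None."""
    for log in logs:
        if not log.get('message'):
            continue
        message = json.loads(log['message'])['message']
        if message.get('method') != 'Network.responseReceived':
            continue
        response = message['params']['response']
        content_type = find_header(response.get('headers', {}), 'content-type') or ''
        if 'text/html' in content_type:
            return response['status']
    return None


# Hand-written sample entries (assumed shape, not real captured data).
sample_logs = [
    {'message': json.dumps({'message': {
        'method': 'Network.responseReceived',
        'params': {'response': {
            'status': 200,
            'headers': {'Content-Type': 'text/css'}}}}})},
    {'message': json.dumps({'message': {
        'method': 'Network.responseReceived',
        'params': {'response': {
            'status': 404,
            'headers': {'Content-Type': 'text/html; charset=utf-8'}}}}})},
]

print(get_html_status(sample_logs))
```

The function returns `None` instead of raising when no HTML response is present, so the caller can decide how to handle a missing entry.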
Unfortunately, Selenium does not provide this information by design. There is a very lengthy discussion about this, but the short of it is that:
We're left with hacks like:
In order to get the status code of a URL using Selenium, you can use JavaScript and an XMLHttpRequest object. The WebDriver class has an execute_async_script() method, which you can call to execute JavaScript code within the browser:
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r"C:\ChromeDriver\chromedriver.exe")  # raw string keeps the backslashes literal
driver.get('https://stackoverflow.com/')
js = '''
let callback = arguments[0];
let xhr = new XMLHttpRequest();
xhr.open('GET', 'https://stackoverflow.com/', true);
xhr.onload = function () {
    if (this.readyState === 4) {
        callback(this.status);
    }
};
xhr.onerror = function () {
    callback('error');
};
xhr.send(null);
'''
status_code = driver.execute_async_script(js)
print(status_code) # 200
driver.close()
More information about execute_async_script method.
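To avoid hard-coding the URL in the script, the same idea can be wrapped in a helper that passes the URL as a script argument. Selenium appends its completion callback as the last element of `arguments`, so the URL arrives as `arguments[0]`. This is a sketch: `fetch_status`, `STATUS_JS`, and the 10-second default timeout are illustrative choices, and requests to another origin may report `'error'` because of the browser's same-origin policy.

```python
# JavaScript run inside the page. Selenium appends its completion callback
# as the last argument, after any arguments passed from Python.
STATUS_JS = '''
let url = arguments[0];
let callback = arguments[arguments.length - 1];
let xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.onload = function () {
    callback(this.status);
};
xhr.onerror = function () {
    callback('error');
};
xhr.send(null);
'''


def fetch_status(driver, url, timeout=10):
    """Ask the browser behind `driver` to GET `url` and return the HTTP status.

    Returns the string 'error' if the request fails at the network level,
    for example when it is blocked by the same-origin policy.
    """
    # Give the asynchronous script up to `timeout` seconds to call back.
    driver.set_script_timeout(timeout)
    return driver.execute_async_script(STATUS_JS, url)
```

A typical call would be `fetch_status(driver, 'https://stackoverflow.com/')` after `driver.get()` has loaded a page on the same origin.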
It seems to be possible to get the response status code from the log via the API.
from selenium import webdriver
import json
browser = webdriver.PhantomJS()
browser.get('http://www.google.fr')
har = json.loads(browser.get_log('har')[0]['message'])
har['log']['entries'][0]['response']['status']
har['log']['entries'][0]['response']['statusText']
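The first entry is not always the one you care about, for example after a redirect. A small sketch that lists every entry in the parsed HAR dictionary follows. It relies only on the standard HAR layout (`log.entries[].request.url` and `log.entries[].response.status`); `har_statuses` and `sample_har` are hypothetical names, and the sample data is hand-written.

```python
def har_statuses(har):
    """Return (url, status, statusText) for every entry in a HAR dictionary."""
    results = []
    for entry in har.get('log', {}).get('entries', []):
        response = entry.get('response', {})
        results.append((
            entry.get('request', {}).get('url'),
            response.get('status'),
            response.get('statusText'),
        ))
    return results


# Hand-written HAR fragment for illustration (not real captured data).
sample_har = {
    'log': {
        'entries': [
            {'request': {'url': 'http://example.com/'},
             'response': {'status': 301, 'statusText': 'Moved Permanently'}},
            {'request': {'url': 'https://example.com/'},
             'response': {'status': 200, 'statusText': 'OK'}},
        ]
    }
}

for url, status, text in har_statuses(sample_har):
    print(url, status, text)
```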
You can get the status code from the title
For example, a 403 Forbidden response from nginx:
<html>
<head>
<title>403 Forbidden</title>
</head>
<body></body>
</html>
Selenium code:
# driver.title returns the <title> text; .text on the <title> element is
# empty because the element is not rendered on the page.
text = driver.title
if '403 Forbidden' in text:
    print('[INFO] status code is 403')
Of course, this solution does not cover all cases.
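The same check can be generalised a little by parsing any leading three-digit code out of the title instead of matching one fixed string. This is a sketch under the assumption that the server's error page puts "<code> <reason>" in the title, as the nginx example above does; `status_from_title` and `TITLE_STATUS` are illustrative names, and an ordinary page whose title happens to start with a three-digit number would be misread.

```python
import re

# Three digits in the 100-599 range, whitespace, then a reason phrase,
# e.g. "403 Forbidden".
TITLE_STATUS = re.compile(r'^\s*([1-5]\d{2})\s+(\S.*?)\s*$')


def status_from_title(title):
    """Return the status code a page title starts with, or None if there is none."""
    match = TITLE_STATUS.match(title or '')
    if match is None:
        return None
    return int(match.group(1))


print(status_from_title('403 Forbidden'))
print(status_from_title('Stack Overflow - Where Developers Learn'))
```

In a Selenium script the call would be `status_from_title(driver.title)`; a `None` result only means the title carried no code, not that the request succeeded.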