How can I get the HTML source in a variable using the Selenium module with Python?
I wanted to do something like this:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get(raw_input("Enter URL: "))
if "whatever" in html_source:
# Do something
else:
# Do something else
How can I do this? I don't know how to access the HTML source.
You need to call the page_source
property. See below.
from selenium import webdriver
browser = webdriver.Firefox()
browser.get(raw_input("Enter URL: "))
html_source = browser.page_source
if "whatever" in html_source:
# do something
else:
# do something else
With Selenium2Library you can use get_source()
import Selenium2Library
s = Selenium2Library.Selenium2Library()
s.open_browser("localhost:7080", "firefox")
source = s.get_source()
driver.page_source will help you get the page source code. You can check if the text is present in the page source or not.
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("some url")
if "your text here" in driver.page_source:
print('Found it!')
else:
print('Did not find it.')
If you want to store the page source in a variable, add below line after driver.get:
var_pgsource=driver.page_source
and change the if condition to:
if "your text here" in var_pgsource:
By using the page source you will get the whole HTML code.
So first decide the block of code or tag in which you require to retrieve the data or to click the element..
options = driver.find_elements_by_name_("XXX")
for option in options:
if option.text == "XXXXXX":
print(option.text)
option.click()
You can find the elements by name, XPath, id, link and CSS path.
To answer your question about getting the URL to use for urllib, just execute this JavaScript code:
url = browser.execute_script("return window.location;")
I'd recommend getting the source with urllib and, if you're going to parse, use something like Beautiful Soup.
import urllib
url = urllib.urlopen("http://example.com") # Open the URL.
content = url.readlines() # Read the source and save it to a variable.
来源:https://stackoverflow.com/questions/7861775/python-selenium-accessing-html-source