Python - Same XPath in Selenium and lxml gives different results

Question

I have this site, http://www.google-proxy.net/, and I need to get the first proxy's ip:port. With Selenium I do:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

br = webdriver.Firefox()
br.get("http://www.google-proxy.net/")
ip = br.find_element(By.XPATH, "//tr[@class='odd']/td[1]").text; time.sleep(1)
port = br.find_element(By.XPATH, "//tr[@class='odd']/td[2]").text; time.sleep(1)

This works fine. But now I want to do the same with lxml:

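import lxml.html
import requests

proxy_server = "http://www.google-proxy.net/"  # assumed: same URL as in the Selenium code
page = requests.get(proxy_server)
root = lxml.html.fromstring(page.text)
ip = root.xpath("//tr[@class='odd']/td[1]/text()")
port = root.xpath("//tr[@class='odd']/td[2]/text()")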

Here I get empty lists. Why is that?

Answer 1:

It looks like the 'odd' classes are added by JavaScript on this site.

Selenium drives a real browser, which executes the JavaScript, so the rows have the expected class.

The requests library does not execute JavaScript, so the HTML it receives has no 'odd' class.
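One way around this is to stop relying on the JavaScript-added class and select rows by position instead. The following is only a minimal sketch: it assumes the server sends a plain table with the IP in the first cell and the port in the second, and both SAMPLE_HTML and the first_proxy helper are hypothetical stand-ins, not the site's actual markup.

```python
import lxml.html

# Hypothetical stand-in for the HTML that requests.get() would return:
# the rows carry no class attributes because no JavaScript has run.
SAMPLE_HTML = """
<table>
  <thead><tr><th>IP Address</th><th>Port</th></tr></thead>
  <tbody>
    <tr><td>10.0.0.1</td><td>8080</td></tr>
    <tr><td>10.0.0.2</td><td>3128</td></tr>
  </tbody>
</table>
"""


def first_proxy(html):
    """Return 'ip:port' for the first data row, or None if there is none."""
    root = lxml.html.fromstring(html)
    # tr[td] keeps only rows that contain <td> cells, which skips the header row.
    rows = root.xpath("//tr[td]")
    if not rows:
        return None
    cells = rows[0].xpath("./td/text()")
    if len(cells) < 2:
        return None
    return "{}:{}".format(cells[0].strip(), cells[1].strip())


root = lxml.html.fromstring(SAMPLE_HTML)
print(root.xpath("//tr[@class='odd']/td[1]/text()"))  # original query: []
print(first_proxy(SAMPLE_HTML))                       # 10.0.0.1:8080
```

Against the real page you would pass page.text instead of SAMPLE_HTML, and the XPath may need adjusting to the table the server actually sends.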

Answer 2:

When you use Selenium to open http://www.google-proxy.net, JavaScript is enabled, and a script on the page adds the classes odd and even to the tr elements.

The requests.get method only downloads the HTML from http://www.google-proxy.net; it does not run JavaScript. So the classes odd and even are never added to the tr elements, and your XPath query in lxml selects nothing. To reproduce this behaviour in a browser, you can use a JavaScript switcher plugin, e.g. a Chrome extension, which lets you easily load web pages with JavaScript disabled.
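The same check can also be done from Python, without a browser plugin, by listing which class names actually appear on the tr elements of a given piece of HTML. This is a rough sketch: row_classes is a hypothetical helper, and RAW and RENDERED are made-up snippets standing in for the server response and the DOM after the script has run.

```python
import lxml.html


def row_classes(html):
    """Collect every class name that appears on a <tr> in the given HTML."""
    root = lxml.html.fromstring(html)
    found = set()
    for value in root.xpath("//tr/@class"):
        found.update(value.split())
    return found


# Hypothetical markup: what the server might send, and what the DOM
# might look like once the page's script has added the classes.
RAW = "<table><tr><td>10.0.0.1</td><td>8080</td></tr></table>"
RENDERED = "<table><tr class='odd'><td>10.0.0.1</td><td>8080</td></tr></table>"

print("odd" in row_classes(RAW))       # False
print("odd" in row_classes(RENDERED))  # True
```

With the real site, passing requests.get(...).text to row_classes shows what the server sends, while Selenium's br.page_source typically reflects the DOM after scripts have run, so comparing the two should make the difference visible.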

Source: https://stackoverflow.com/questions/34705159/python-same-xpath-in-selenium-and-lxml-different-results

Tags

python

selenium

lxml