Python - Same xpath in selenium and lxml different results

Submitted by 你说的曾经没有我的故事 on 2019-12-11 14:02:22

Question


I have this site, http://www.google-proxy.net/, and I need to get the first proxy's ip:port.

from selenium import webdriver
from selenium.webdriver.common.by import By

br = webdriver.Firefox()
br.get("http://www.google-proxy.net/")
# find_element(By.XPATH, ...) replaces find_element_by_xpath, which newer Selenium 4 releases removed
ip = br.find_element(By.XPATH, "//tr[@class='odd']/td[1]").text
port = br.find_element(By.XPATH, "//tr[@class='odd']/td[2]").text

and it works fine. But now I want to do the same with lxml:

import lxml.html
import requests

proxy_server = "http://www.google-proxy.net/"
page = requests.get(proxy_server)
root = lxml.html.fromstring(page.text)
ip = root.xpath("//tr[@class='odd']/td[1]/text()")
port = root.xpath("//tr[@class='odd']/td[2]/text()")

and I get empty lists. Why is that?


Answer 1:


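It looks like the 'odd' classes are added by JavaScript on this site.

Selenium drives a real browser, which executes the JavaScript, so the rows already have the expected class when you query them.

The requests library does not execute JavaScript; it only downloads the HTML the server sends, so there is no 'odd' class for the XPath to match.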
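The effect can be reproduced offline. The sketch below is only an illustration: the two HTML strings are made up (they are not the site's real markup), `first_odd_cell` is a hypothetical helper, and the standard library's `xml.etree.ElementTree` stands in for lxml so that the snippet runs without extra packages. The class-based XPath matches nothing in the markup as served and matches once the rows carry the class.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup as the server might send it: rows have no class.
RAW = (
    "<table><tbody>"
    "<tr><td>1.2.3.4</td><td>8080</td></tr>"
    "<tr><td>5.6.7.8</td><td>3128</td></tr>"
    "</tbody></table>"
)

# Hypothetical DOM after a script has tagged the rows, as a browser would see it.
RENDERED = (
    "<table><tbody>"
    "<tr class='odd'><td>1.2.3.4</td><td>8080</td></tr>"
    "<tr class='even'><td>5.6.7.8</td><td>3128</td></tr>"
    "</tbody></table>"
)


def first_odd_cell(markup, column):
    """Return the text of the given cell in the first 'odd' row, or None."""
    root = ET.fromstring(markup)
    cell = root.find(f".//tr[@class='odd']/td[{column}]")
    return None if cell is None else cell.text


print(first_odd_cell(RAW, 1))       # None: no row has class 'odd'
print(first_odd_cell(RENDERED, 1))  # 1.2.3.4
```

Real pages are rarely well-formed XML, so for actual scraping an HTML parser such as lxml.html remains the appropriate tool; ElementTree is used here only to keep the demonstration dependency-free.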




Answer 2:


When you use Selenium to open http://www.google-proxy.net, JavaScript is enabled. In this case, JavaScript adds the classes odd and even to the tr elements.

The requests.get method loads the HTML from http://www.google-proxy.net without executing any JavaScript. So the classes odd and even are never added to the tr elements, and your XPath query in lxml doesn't select anything. To reproduce this behaviour in a browser, you can use a JavaScript switcher plugin (for example, one for Chrome), which lets you load web pages with JavaScript disabled.
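One way to sidestep the problem is to select rows by position instead of by a class that only exists after scripts have run. The following is a minimal sketch, not a tested solution for this particular site: `first_proxy` is a hypothetical helper, and it assumes the proxies sit in a table whose data rows hold the IP in the first cell and the port in the second, as the XPath in the question suggests.

```python
import lxml.html


def first_proxy(html_text):
    """Return (ip, port) from the first table row that has <td> cells, or None."""
    root = lxml.html.fromstring(html_text)
    # tr[td] keeps only rows with data cells, which skips header rows made of <th>.
    rows = root.xpath("//table//tr[td]")
    if not rows:
        return None
    cells = rows[0].xpath("./td")
    if len(cells) < 2:
        return None
    # Assumption: column 1 is the IP and column 2 is the port.
    return cells[0].text_content().strip(), cells[1].text_content().strip()


# Made-up sample markup without any 'odd'/'even' classes.
SAMPLE = """
<html><body><table>
  <thead><tr><th>IP</th><th>Port</th></tr></thead>
  <tbody>
    <tr><td>1.2.3.4</td><td>8080</td></tr>
    <tr><td>5.6.7.8</td><td>3128</td></tr>
  </tbody>
</table></body></html>
"""

print(first_proxy(SAMPLE))  # ('1.2.3.4', '8080')
```

The same function could be fed `requests.get(proxy_server).text`, or, if the table itself is built by JavaScript, the rendered markup from Selenium's `br.page_source`, which keeps the original class-based XPath usable in lxml as well.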



Source: https://stackoverflow.com/questions/34705159/python-same-xpath-in-selenium-and-lxml-different-results
