Can I parse xpath using python, selenium and lxml?

人走茶凉 提交于 2019-12-08 07:56:32

问题


So I have been trying to figure our how to use BeautifulSoup and did a quick search and found lxml can parse the xpath of an html page. I would LOVE if I could do that but the tutorial isnt that intuitive.

I know how to use Firebug to grab the xpath and was curious if anyone has use lxml and can explain how I can use it to parse specific xpath's, and print them.. say 5 per line..or if it's even possible?!

Selenium is using Chrome and loads the page properly, just need help moving forward.

Thanks!


回答1:


lxml's ElementTree has a .xpath() method (note that the ElementTree in the xml package in the Python distribution dosent have that!)

e.g.

# see http://lxml.de/xpathxslt.html

from lxml import etree

# root = etree.parse('/tmp/stack-overflow-questions.xml')
root = etree.XML('''
        <answers>
            <answer author="dlam" question-id="13965403">AAA</answer>
        </answers>
''')

all_answers = root.xpath('.//answer')

for i, answer in enumerate(all_answers):
    who_answered = answer.attrib['author']
    question_id = answer.attrib['question-id']
    answer_text = answer.text
    print 'Answer #{0} by {1}: {2}'.format(i, who_answered, answer_text)



回答2:


I prefer to use lxml. Because the efficiency of lxml is more higher than selenium for large elements extraction. You can use selenium to get source of webpages and parse the source with lxml's xpath instead of the native find_elements_with_xpath in selenium.



来源:https://stackoverflow.com/questions/13965403/can-i-parse-xpath-using-python-selenium-and-lxml

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!