How to use lxml to find an element by text?

北荒 2020-12-25 13:05

Assume we have the following html:


    
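    <html>
        <body>
            <a href="/1234.html">TEXT A</a>
            <a href="/3243.html">TEXT B</a>
            <a href="/7445.html">TEXT C</a>
        </body>
    </html>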


        
2 Answers
  • 2020-12-25 13:11

    Another way that looks more straightforward to me:

    import lxml.html

    results = []
    root = lxml.html.fromstring(the_html_above)
    for tag in root.iter():
        # tag.text is None when an element has no leading text
        if tag.text and "TEXT A" in tag.text:
            results.append(tag)
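
    Worth knowing when using this loop: tag.text holds only the text before an element's first child. Text that follows a child is stored on that child's .tail, so the loop never sees it. A small sketch with a made-up fragment (the markup here is an assumption, not the question's HTML):

```python
import lxml.html

# Hypothetical fragment: a <p> with text on both sides of a child element
root = lxml.html.fromstring('<div><p>TEXT A <b>bold</b> trailing</p></div>')
p = root.find('p')
b = p.find('b')

print(repr(p.text))              # text before the first child of <p>
print(repr(b.text))              # text inside <b>
print(repr(b.tail))              # text after </b>, stored on <b>, not on <p>
print(repr(p.text_content()))    # all descendant text joined together
```

    If the text you are after may sit behind a child element, checking text_content() instead of .text is one option, though it also matches every ancestor of the element that holds the text.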
    
  • 2020-12-25 13:13

    You are very close. Use text()= rather than @text (which indicates an attribute).

    e = root.xpath('.//a[text()="TEXT A"]')
    

    Or, if you know only that the text contains "TEXT A",

    e = root.xpath('.//a[contains(text(),"TEXT A")]')
    

    Or, if you know only that the text starts with "TEXT A",

    e = root.xpath('.//a[starts-with(text(),"TEXT A")]')
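
    One caveat on the forms above: in XPath 1.0, a function such as contains() or starts-with() converts text() to a string by taking only the first text node, and text()= compares exactly, whitespace included. The sketch below uses made-up markup and a hypothetical hrefs() helper to show how the variants differ, plus two alternatives (normalize-space() and contains(., ...)) that may fit messier pages better:

```python
import lxml.html

# Hypothetical markup: clean text, padded text, and text after a child element
html = '''<div>
  <a href="/1.html">TEXT A</a>
  <a href="/2.html">  TEXT A
  </a>
  <a href="/3.html">see <b>also</b> TEXT A</a>
</div>'''
root = lxml.html.fromstring(html)

def hrefs(expr):
    """Return the href of every <a> matched by the XPath expression."""
    return [a.get("href") for a in root.xpath(expr)]

# Exact match against each direct text node
print(hrefs('.//a[text()="TEXT A"]'))             # ['/1.html']
# Only the first text node of each <a> is examined
print(hrefs('.//a[contains(text(), "TEXT A")]'))  # ['/1.html', '/2.html']
# Whole string value with whitespace collapsed and trimmed
print(hrefs('.//a[normalize-space()="TEXT A"]'))  # ['/1.html', '/2.html']
# Whole string value, including text inside and after child elements
print(hrefs('.//a[contains(., "TEXT A")]'))       # ['/1.html', '/2.html', '/3.html']
```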
    

    See the XPath documentation for more on the available string functions.


    For example,

    import lxml.html as LH
    
    text = '''\
    <html>
        <body>
            <a href="/1234.html">TEXT A</a>
            <a href="/3243.html">TEXT B</a>
            <a href="/7445.html">TEXT C</a>
        </body>
    </html>'''
    
    root = LH.fromstring(text)
    e = root.xpath('.//a[text()="TEXT A"]')
    print(e)
    

    yields

    [<Element a at 0xb746d2cc>]
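
    If the text to search for comes from a variable, one option is to pass it as an XPath variable rather than formatting it into the expression, which avoids quoting problems. A minimal sketch, assuming a hypothetical helper named links_with_text and an extra made-up link whose text contains both quote characters:

```python
import lxml.html

html = '''<html><body>
<a href="/1234.html">TEXT A</a>
<a href="/3243.html">TEXT B</a>
<a href="/9.html">Bob's "page"</a>
</body></html>'''
root = lxml.html.fromstring(html)

def links_with_text(root, label):
    """Return <a> elements whose text equals label exactly."""
    # $label is bound to the keyword argument of the same name
    return root.xpath('.//a[text()=$label]', label=label)

for label in ("TEXT A", 'Bob\'s "page"'):
    for a in links_with_text(root, label):
        # Each match is an element; attributes are read with .get()
        print(a.get("href"), a.text)
```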
    