How can I get text out of a <dt> tag with a nested tag inside?

心在旅途 2020-12-04 01:02

I'm trying to extract the text from inside a <dt> tag that has another tag nested in it, on www.uszip.com:

Here is an example of what I'm t

1 Answer
  • 2020-12-04 01:29

    Unfortunately, you cannot match a tag that contains both text and nested tags based on its text alone.

    You'd have to loop over all the <dt> tags without a text filter and test each one yourself:

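    # Visit each <dt> and test the combined text of the tag ourselves.
    for dt in soup.find_all('dt', text=False):
        if 'Land area' in dt.text:
            # contents[0] is the first child: the text before the nested tag
            print(dt.contents[0])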
    

    This sounds counter-intuitive, but the .string attribute of such a tag is None, and .string is what BeautifulSoup matches a text filter against. The .text attribute contains all strings in all nested tags combined, and it is not matched against.
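
    The difference between .string and .text can be checked on a small document. This is only a sketch: the markup below is made up for illustration (the real uszip.com page is not shown here), and it uses string=, which is the name newer Beautiful Soup 4 releases give to the text= argument.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; not taken from uszip.com.
html = (
    "<dl>"
    "<dt>Land area<span>(sq. miles)</span></dt><dd>10.5</dd>"
    "<dt>Population</dt><dd>1,234</dd>"
    "</dl>"
)
soup = BeautifulSoup(html, "html.parser")
mixed, plain = soup.find_all("dt")

# A tag with more than one child has no single string, so .string is None.
print(mixed.string)      # None
# A tag whose only child is text exposes that text through .string.
print(plain.string)      # Population
# get_text() (what .text returns) joins every nested string.
print(mixed.get_text())  # Land area(sq. miles)

# The text filter is compared with .string, so the mixed tag is not found.
by_string = soup.find_all("dt", string="Land area")
print(len(by_string))    # 0
```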

    You could also use a custom function to do the search:

    soup.find_all(lambda t: t.name == 'dt' and 'Land area' in t.text)
    

    which essentially does the same search with the filter encapsulated in a lambda function.
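
    As a rough end-to-end sketch of the function-based search, the snippet below uses a named function in place of the lambda and then reads the leading text of each match. The markup and the name is_land_area_dt are hypothetical, chosen only for this illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; not taken from uszip.com.
html = """
<dl>
  <dt>Land area <span>(sq. miles)</span></dt><dd>10.5</dd>
  <dt>Water area <span>(sq. miles)</span></dt><dd>0.3</dd>
</dl>
"""
soup = BeautifulSoup(html, "html.parser")


def is_land_area_dt(tag):
    """Return True for <dt> tags whose combined text mentions 'Land area'."""
    return tag.name == "dt" and "Land area" in tag.get_text()


# find_all() calls the function once per tag and keeps the tags it accepts.
matches = soup.find_all(is_land_area_dt)

# contents[0] is the text node that precedes the nested <span>.
labels = [dt.contents[0].strip() for dt in matches]
print(labels)
```

    A named function behaves the same as the lambda; it is just easier to reuse and to extend with further conditions.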
