How can I get text out of a <dt> tag with a nested tag inside?


I'm trying to extract the text from inside a <dt> tag that has a nested tag inside, on www.uszip.com:

Here is an example of what I'm t


1 answer

慢半拍i

2020-12-04 01:29
Unfortunately, you cannot match tags with both text and nested tags, based on the contained text alone.

You'd have to loop over all <dt> without text:
```
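# `soup` is an already-parsed BeautifulSoup document.
for dt in soup.find_all('dt', text=False):
    # .text joins the strings of all nested tags, so the label is found here.
    if 'Land area' in dt.text:
        # The first child is the text node that precedes the nested tag.
        print(dt.contents[0])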
```
This sounds counter-intuitive, but the .string attribute of such a tag is None because the tag has more than one child, and .string is what BeautifulSoup matches the text filter against. .text combines all strings in all nested tags, and it is not matched against.
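
To make the difference between .string and .text concrete, here is a minimal sketch. The HTML snippet is made up for illustration (it is not copied from www.uszip.com), and the nested `<span>` is only an assumed example of a nested tag:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <dt> with text plus a nested tag, one with text only.
html = (
    "<dl>"
    "<dt>Land area<span> (sq. miles)</span></dt><dd>10.5</dd>"
    "<dt>Population</dt><dd>1234</dd>"
    "</dl>"
)
soup = BeautifulSoup(html, "html.parser")

mixed, plain = soup.find_all("dt")

string_of_mixed = mixed.string    # None: the tag has two children
text_of_mixed = mixed.get_text()  # strings of all descendants, joined
string_of_plain = plain.string    # the single text child
first_child = mixed.contents[0]   # the text node before the nested tag

# A string filter is compared against .string, so the mixed tag is not found.
by_string = soup.find_all("dt", string="Land area")
```

Here `string_of_mixed` is None while `text_of_mixed` still contains "Land area", which is why filtering on the text argument alone misses the tag.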

You could also use a custom function to do the search:
```
soup.find_all(lambda t: t.name == 'dt' and 'Land area' in t.text)
```
which essentially does the same search with the filter encapsulated in a lambda function.
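
For reference, a self-contained sketch of the function-based search. The markup and the helper name `is_land_area_dt` are assumptions made for this example, not part of the original page or of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = (
    "<dl>"
    "<dt>Land area<span> (sq. miles)</span></dt><dd>10.5</dd>"
    "<dt>Population</dt><dd>1234</dd>"
    "</dl>"
)
soup = BeautifulSoup(html, "html.parser")


def is_land_area_dt(tag):
    """Return True for <dt> tags whose combined text mentions 'Land area'."""
    return tag.name == "dt" and "Land area" in tag.get_text()


# find_all calls the function once per tag and keeps the tags it accepts.
matches = soup.find_all(is_land_area_dt)

# The label itself is the first child, i.e. the text before the nested tag.
label = str(matches[0].contents[0]).strip()
```

A named function behaves the same as the lambda; it is just easier to reuse and to document.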