Finding next occurring tag and its enclosed text with Beautiful Soup

太阳男子 2021-01-08 00:54

I'm trying to parse the text between <blockquote> tags. When I type soup.blockquote.get_text(), I get the result I want for the fir

1 answer

温柔的废话 (OP)

2021-01-08 01:40

Use find_next_sibling. (If the next blockquote is not a sibling of the first one, use find_next instead.)

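>>> # The tags of this sample were lost on this page. <blockquote> is implied by
>>> # the code below; the other tag names are assumed placeholders.
>>> html = '''
... <div>
... <h3>header</h3>
... <blockquote>blah blah
... </blockquote>
... <p>eiaoiefj</p>
... <blockquote>capture this next
... </blockquote>
... <p>don'tcapturethis</p>
... <blockquote>
... capture this too but separately after "capture this next"
... </blockquote>
... </div>
... '''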

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html, 'html.parser')
>>> quote1 = soup.blockquote
>>> quote1.text
'blah blah\n'
>>> quote2 = quote1.find_next_sibling('blockquote')
>>> quote2.text
'capture this next\n'
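
The difference between find_next_sibling and find_next only shows up when the next blockquote is nested somewhere else in the tree. Here is a minimal sketch of that case, assuming a made-up snippet in which the second blockquote sits inside a <div> (the markup and variable names are hypothetical, not taken from the question):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second blockquote is nested in a <div>,
# so it is not a sibling of the first one.
html = """
<h3>header</h3>
<blockquote>blah blah
</blockquote>
<div>
<blockquote>capture this next
</blockquote>
</div>
<blockquote>capture this too</blockquote>
"""

soup = BeautifulSoup(html, "html.parser")
first = soup.blockquote

# find_next_sibling only considers elements that share first's parent,
# so it skips the nested blockquote and returns the last one.
sibling = first.find_next_sibling("blockquote")

# find_next walks forward in document order, regardless of nesting.
following = first.find_next("blockquote")

# find_all_next returns every later match, one Tag per blockquote,
# so each quote can be handled separately.
rest = [q.get_text(strip=True) for q in first.find_all_next("blockquote")]
```

If every blockquote is needed separately and nesting does not matter, soup.find_all('blockquote') is probably the simplest option, since it returns all of them in document order.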
