BeautifulSoup `find_all` generator

情书的邮戳 2021-02-01 11:17

Is there any way to turn find_all into a more memory-efficient generator? For example:

Given:

soup = BeautifulSoup(content, "html.parser")


        
3 answers
  •  醉话见心
    2021-02-01 11:48

    The simplest method is to use find_next:

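    from bs4 import BeautifulSoup

    # `content` is the HTML markup from the question.
    soup = BeautifulSoup(content, "html.parser")
    
    def find_iter(tagname):
        # Start at the first match, then walk forward one match at a time.
        tag = soup.find(tagname)
        while tag is not None:
            yield tag
            # find_next searches forward in document order from `tag`.
            tag = tag.find_next(tagname)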
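
To show how such a generator might be consumed, here is a minimal sketch. It assumes a variant of `find_iter` that takes the soup as a parameter instead of reading a global, and the sample markup is hypothetical. Note that the whole document is still parsed into memory; the generator only avoids building the full result list up front.

```python
from itertools import islice

from bs4 import BeautifulSoup


def find_iter(soup, tagname):
    """Yield matching tags one at a time, in document order."""
    tag = soup.find(tagname)
    while tag is not None:
        yield tag
        tag = tag.find_next(tagname)


# Hypothetical sample markup, for illustration only.
content = "<div><p>one</p><p>two <b>x</b></p></div><p>three</p>"
soup = BeautifulSoup(content, "html.parser")

# Stop early: only the first two matches are ever looked up.
first_two = [p.get_text() for p in islice(find_iter(soup, "p"), 2)]

# Consuming the generator fully visits the same tags as find_all.
lazy_texts = [p.get_text() for p in find_iter(soup, "p")]
eager_texts = [p.get_text() for p in soup.find_all("p")]
```

Pairing the generator with `itertools.islice` or a `break` is where it pays off, since the search stops as soon as the caller stops asking for more tags.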
    
