Extract all <script> tags in an HTML page and append to the bottom of the document

無奈伤痛 2020-12-20 04:37

Could someone tell me how I can extract and remove all the <script> tags in an HTML page and append them to the bottom of the document?

1 answer
  •  心在旅途
    2020-12-20 05:22

    This answer is simple and may miss some nuances. However, it should give you an idea of how to go about it, and you should be able to improve on it quickly with the help of the documentation.

    Reference doc: http://www.crummy.com/software/BeautifulSoup/documentation.html

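    from bs4 import BeautifulSoup

    # Sample page with a <script> in the head; the script body is only illustrative.
    doc = ['<html><head><title>Page title</title><script>alert("hello")</script></head>',
           '<body><p>This is paragraph <b>one</b>.</p>',
           '<p>This is paragraph <b>two</b>.</p>',
           '</body></html>']
    soup = BeautifulSoup(''.join(doc), 'html.parser')

    for tag in soup.find_all('script'):
        # Use extract() to remove the tag from the tree
        tag.extract()
        # Use a simple insert to re-attach it as the last child of <body>
        soup.body.insert(len(soup.body.contents), tag)

    print(soup.prettify())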

    Output:

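    <html>
     <head>
      <title>
       Page title
      </title>
     </head>
     <body>
      <p>
       This is paragraph
       <b>
        one
       </b>
       .
      </p>
      <p>
       This is paragraph
       <b>
        two
       </b>
       .
      </p>
      <script>
       alert("hello")
      </script>
     </body>
    </html>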
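
The same idea can be wrapped in a small reusable function. This is only a minimal sketch, not a definitive implementation: the function name `move_scripts_to_body_end` and the sample markup are made up for illustration, and it assumes `bs4` is installed and that the built-in `html.parser` is good enough for your pages. It uses `append` instead of the positional `insert`, which has the same effect when the goal is the end of `<body>`.

```python
from bs4 import BeautifulSoup


def move_scripts_to_body_end(html: str) -> str:
    """Return html with every <script> tag moved to the end of <body>.

    Hypothetical helper for illustration; the name is not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    if soup.body is None:
        # No <body> to append to, so hand the markup back unchanged.
        return html
    # find_all returns a list, so detaching tags while looping is safe.
    for tag in soup.find_all("script"):
        # extract() detaches the tag from the tree and returns it;
        # append() re-attaches it as the last child of <body>.
        soup.body.append(tag.extract())
    return str(soup)


if __name__ == "__main__":
    # Illustrative input: one script in <head>, one already inside <body>.
    sample = (
        "<html><head><title>Page title</title>"
        "<script>var a = 1;</script></head>"
        "<body><script>var b = 2;</script><p>Hello</p></body></html>"
    )
    print(move_scripts_to_body_end(sample))
```

Scripts keep their original relative order, because they are moved one by one in document order. Note that this also moves scripts that were deliberately placed in `<head>`, which may not be what you want for scripts that must run before the page renders.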
