Python - BeautifulSoup: apply to every text file in a folder and produce a new text file

暖寄归人 2021-01-21 18:15

I am using the following Python BeautifulSoup code to remove HTML elements from a text file:

from bs4 import BeautifulSoup

with open("textFileWithHtml.txt")


        
2 Answers
  •  轮回少年
    2021-01-21 18:44

    The glob module lets you list all the files in a directory:

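    import glob
    from bs4 import BeautifulSoup

    for path in glob.glob('*.txt'):
        # Parse with the built-in HTML parser (assumes the input files are UTF-8).
        with open(path, encoding='utf-8') as markup:
            soup = BeautifulSoup(markup.read(), 'html.parser')

        # Write the tag-free text to a new file next to the original.
        # In Python 3, write the str directly; encoding to bytes first raises TypeError.
        with open("strip_" + path, "w", encoding='utf-8') as f:
            f.write(soup.get_text())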
    

    If you also want to do that recursively for every subfolder, check out os.walk.
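
A rough sketch of the recursive variant with os.walk might look like the following. The function name `strip_html_tree` and the `prefix` parameter are made up for illustration, and the code assumes the input files are UTF-8 and that the built-in `html.parser` is good enough for your markup.

```python
import os

from bs4 import BeautifulSoup


def strip_html_tree(root, prefix="strip_"):
    """Walk root recursively and write a tag-free copy of every .txt file.

    Returns the list of output paths that were written.
    """
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip non-text files and copies produced by an earlier run.
            if not name.endswith(".txt") or name.startswith(prefix):
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, prefix + name)
            with open(src, encoding="utf-8") as markup:
                soup = BeautifulSoup(markup.read(), "html.parser")
            with open(dst, "w", encoding="utf-8") as out:
                out.write(soup.get_text())
            written.append(dst)
    return written
```

Calling `strip_html_tree(".")` would process the current folder and everything below it. Skipping names that already start with the prefix keeps a second run from producing files like `strip_strip_a.txt`.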
