Extracting text from HTML file using Python

一生所求 2020-11-22 04:05

I'd like to extract the text from an HTML file using Python. I want essentially the same output I would get if I copied the text from a browser and pasted it into Notepad.

30 answers
  •  傲寒 (original poster)
    2020-11-22 04:33

    You can also use the html2text function from the stripogram library.

    from stripogram import html2text
    text = html2text(your_html_string)
    

    To install stripogram, run: sudo easy_install stripogram
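
If stripogram is not available (it is an older package and may not install on current Python versions), a similar result can be approximated with the standard library alone. The following is a minimal sketch, not stripogram's implementation: the names `TextExtractor` and `html_to_text` are hypothetical, the list of block-level tags is an assumption and far from complete, and the output will not match a browser's copy-and-paste exactly.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}
    # Assumed (incomplete) set of tags that should start a new line.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr",
                  "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def get_text(self):
        text = "".join(self._chunks)
        # Collapse whitespace within each line and drop empty lines.
        lines = (" ".join(line.split()) for line in text.splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html_string):
    """Return the text content of html_string, one block per line."""
    parser = TextExtractor()
    parser.feed(html_string)
    parser.close()
    return parser.get_text()
```

Calling `html_to_text(your_html_string)` returns a plain string. Layout that depends on CSS or on tags outside the assumed block list is not reproduced, so results on real pages will differ from what a browser shows.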
