Remove a substring using Python

时光说笑 2020-12-01 01:05

I have already extracted some information from a forum. This is the raw string I have now:

string = \'i think mabe 124 + 

        
3 answers
  • 2020-12-01 01:43
    >>> import re
    >>> st = " i think mabe 124 + <font color=\"black\"><font face=\"Times New Roman\">but I don't have a big experience it just how I see it in my eyes <font color=\"green\"><font face=\"Arial\">fun stuff"
    >>> re.sub("<.*?>","",st)
    " i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"
    >>> 
    
  • 2020-12-01 01:46
    from bs4 import BeautifulSoup
    BeautifulSoup(text, features="html.parser").text
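
If installing a third-party package is not an option, the same parser-based idea can be sketched with the standard library's html.parser module. This is a swapped-in alternative, not what the answer above uses; the names TextExtractor and strip_tags are hypothetical, and the sample string is a shortened version of the one in the question.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps text nodes and ignores tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text found between tags.
        self.parts.append(data)


def strip_tags(markup):
    """Return the text content of markup with all tags dropped."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


sample = (
    'i think mabe 124 + <font color="black">'
    '<font face="Times New Roman">but I don\'t have a big experience'
)
print(strip_tags(sample))
```

Unlike a regular expression, a parser works from the markup's structure, which tends to be more forgiving when the input is irregular.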
    
  • 2020-12-01 01:50
    import re
    re.sub('<.*?>', '', string)
    "i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"
    

    The re.sub function takes a regular expression and replaces every match in the string with its second argument. Here the pattern '<.*?>' matches each tag, and the replacement '' removes it.

    The ? after .* makes the match non-greedy: it stops at the first > it finds instead of the last one, so each tag is matched on its own.
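
The difference between the greedy and non-greedy forms is easiest to see side by side. A minimal sketch, using a made-up sample string rather than the one from the question:

```python
import re

sample = "a <b>bold</b> word"

# Greedy: .* runs to the LAST '>' in the string, so a single match
# swallows both tags and the text between them.
greedy = re.sub(r"<.*>", "", sample)

# Non-greedy: .*? stops at the FIRST '>', so each tag is matched
# and removed separately, leaving the text between them.
lazy = re.sub(r"<.*?>", "", sample)

print(repr(greedy))  # 'a  word'
print(repr(lazy))    # 'a bold word'
```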

    See the re module documentation for more.
