Python, remove all non-alphabet chars from string

时光说笑 2020-11-30 21:08

I am writing a Python MapReduce word count program. The problem is that there are many non-alphabet chars strewn about in the data. I have found this post Stripping everything b

6 Answers
  •  春和景丽
    2020-11-30 21:42

    You can use the re.sub() function to remove these characters:

    >>> import re
    >>> re.sub("[^a-zA-Z]+", "", "ABC12abc345def")
    'ABCabcdef'
    

    re.sub(MATCH PATTERN, REPLACE STRING, STRING TO SEARCH)

    • "[^a-zA-Z]+" - match any run of one or more characters that are NOT in a-z or A-Z.
    • "" - Replace the matched characters with ""
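
    For the word count case in the question, the same pattern could be applied to each token before counting. This is only a rough sketch: the names NON_LETTERS, clean_word and count_words are made up for illustration, and lowercasing plus whitespace splitting are assumptions about what the mapper should do, not something stated in the question. Note that the pattern keeps ASCII letters only, so accented letters are stripped too.

```python
import re
from collections import Counter

# Same pattern as in the answer, compiled once so it can be reused per token.
NON_LETTERS = re.compile(r"[^a-zA-Z]+")


def clean_word(token):
    """Remove every character that is not an ASCII letter, then lowercase.

    Lowercasing is an assumption made for this sketch.
    """
    return NON_LETTERS.sub("", token).lower()


def count_words(line):
    """Split a line on whitespace, clean each token, and count what is left."""
    cleaned = (clean_word(token) for token in line.split())
    # Tokens made up only of digits or punctuation become "" and are skipped.
    return Counter(word for word in cleaned if word)


if __name__ == "__main__":
    print(count_words("Hello, world! hello... 123"))
```

    Cleaning each token after splitting (rather than the whole line at once) keeps the spaces between words intact, since the pattern would otherwise remove them as well.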
