Python remove anything that is not a letter or number

甜味超标 2020-12-24 01:40

I'm having a little trouble with Python regular expressions.

What is a good way to remove all characters in a string that are not letters or numbers?

Thanks

7 Answers
  •  天涯浪人
    2020-12-24 02:10

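    '\W' matches any character that is not a word character. In ASCII mode (the re.ASCII flag) that is the same as [^A-Za-z0-9_]. With Python 3 str patterns it is Unicode-aware by default, so accented letters and letters or digits from other scripts are kept as well. Note that the underscore counts as a word character, so it is not removed.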

    >>> import re
    >>> re.sub(r'\W', '', 'text 1, 2, 3...')
    'text123'
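
If the underscore should go too, or if only ASCII letters and digits should survive, an explicit character class may be a better fit than '\W'. The following is a minimal sketch, not a definitive implementation; the function names and the ascii_only parameter are made up for illustration.

```python
import re

def strip_non_alnum(text, ascii_only=True):
    """Remove every character that is not a letter or a digit.

    Hypothetical helper: `ascii_only` is an illustrative parameter,
    not part of the original answer.
    """
    if ascii_only:
        # Keep only ASCII letters and digits; the underscore is removed too.
        return re.sub(r'[^A-Za-z0-9]', '', text)
    # [\W_] matches every non-word character plus the underscore,
    # so Unicode letters and digits are kept.
    return re.sub(r'[\W_]', '', text)

def strip_non_alnum_no_regex(text):
    """Regex-free variant based on str.isalnum()."""
    return ''.join(ch for ch in text if ch.isalnum())

print(strip_non_alnum('text_1, 2, 3...'))
print(strip_non_alnum('café_1', ascii_only=False))
```

The regex-free variant is often enough when no other pattern matching is needed, though str.isalnum() and '\w' do not agree on every Unicode character.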
    

    Maybe you want to keep the spaces or have all the words (and numbers):

    >>> re.findall(r'\w+', 'my. text, --without-- (punctuation) 123')
    ['my', 'text', 'without', 'punctuation', '123']
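
To get a cleaned string back rather than a list, the words can be joined again, or the spaces can be allowed in the character class. This is a rough sketch under the assumption that only ASCII letters, digits and plain spaces matter in the second helper; both function names are hypothetical.

```python
import re

def keep_words(text):
    """Join the word runs found by \\w+ with single spaces."""
    return ' '.join(re.findall(r'\w+', text))

def keep_alnum_and_spaces(text):
    """Drop everything except ASCII letters, digits and spaces.

    Existing runs of spaces are preserved rather than collapsed.
    """
    return re.sub(r'[^A-Za-z0-9 ]', '', text)

sample = 'my. text, --without-- (punctuation) 123'
print(keep_words(sample))
print(keep_alnum_and_spaces(sample))
```

The two differ on repeated whitespace: keep_words always produces single spaces, while keep_alnum_and_spaces leaves the original spacing untouched.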
    
