Python: Strip everything but spaces and alphanumeric

时光说笑 2020-12-14 06:37

I have a large string containing brackets, commas, and other punctuation. I want to strip all of those characters but keep the whitespace. How can I do this? As of now I am using

s         


        
4 Answers
  • 2020-12-14 06:51

    A bit faster implementation:

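    import re
    
    # Compile once and reuse; the "|_" branch also strips underscores, which \w would otherwise keep.
    pattern = re.compile(r'([^\s\w]|_)+')
    strippedList = pattern.sub('', value)  # value is the input string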
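
As a minimal sketch of how the compiled pattern above might be reused, the snippet below wraps it in a helper. The names PATTERN and strip_punctuation and the sample string are illustrative assumptions, not part of the original answer.

```python
import re

# Matches runs of characters that are neither whitespace nor word
# characters, plus underscores (which \w would otherwise keep).
PATTERN = re.compile(r'([^\s\w]|_)+')


def strip_punctuation(text):
    """Hypothetical helper: keep only letters, digits, and whitespace."""
    return PATTERN.sub('', text)


sample = "Hello, [world]! foo_bar 42"
print(strip_punctuation(sample))  # Hello world foobar 42
```

Compiling at module level means the pattern is built once, which is where any speed advantage over calling re.sub with a string pattern would come from.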
    
  • 2020-12-14 07:08

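    Demonstrating which characters remain in the result (note that _ is kept, because \w already includes it):

    >>> import re
    >>> s = ''.join(chr(i) for i in range(256))  # code points 0-255
    >>> re.sub(r'[^\s\w_]+', '', s, flags=re.ASCII)  # what remains; re.ASCII gives the byte-string behavior on Python 3
    '\t\n\x0b\x0c\r 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz'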
    

    Docs: re.sub, Regex HOWTO: Matching Characters, Regex HOWTO: Repeating Things
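
On Python 3, \w and \s are Unicode-aware for str patterns, so the set of surviving characters depends on whether re.ASCII is passed. The following is a small sketch of that difference; the pattern names and the sample text are assumptions chosen for illustration.

```python
import re

# Same character class, compiled with and without the ASCII flag.
UNICODE_PATTERN = re.compile(r'[^\s\w]+')
ASCII_PATTERN = re.compile(r'[^\s\w]+', re.ASCII)

text = "caf\u00e9, na\u00efve! x_1"  # "café, naïve! x_1"

unicode_result = UNICODE_PATTERN.sub('', text)
ascii_result = ASCII_PATTERN.sub('', text)

print(unicode_result)  # accented letters count as word characters and are kept
print(ascii_result)    # accented letters are removed along with the punctuation
```

In both modes the underscore survives, since it is part of \w.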

  • 2020-12-14 07:10

    The regular-expression-based versions might be faster (especially if you switch to using a compiled expression), but I like this for clarity:

    import string
    "".join([c for c in origList if c in string.ascii_letters or c in string.digits or c in string.whitespace])
    

    It's a bit weird with the join() call, but I think that is pretty idiomatic Python for converting a list of characters into a string.
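
A variation on the same idea, sketched under the assumption that only ASCII letters, digits, and whitespace should survive: precomputing the allowed characters as a set keeps each membership test cheap. The names ALLOWED and keep_allowed are hypothetical.

```python
import string

# Characters to keep: ASCII letters, digits, and whitespace.
# Underscore is not in any of these constants, so it is dropped.
ALLOWED = set(string.ascii_letters + string.digits + string.whitespace)


def keep_allowed(text):
    """Hypothetical helper: filter a string character by character."""
    return "".join(c for c in text if c in ALLOWED)


print(keep_allowed("a-b, c_d [1]!"))  # ab cd 1
```

Unlike the Unicode-aware regex, this version also removes accented letters, because string.ascii_letters covers only A-Z and a-z.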

  • 2020-12-14 07:13
    re.sub(r'([^\s\w]|_)+', '', origList)
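
To check the speed claims made in this thread, one could time the module-level call against a precompiled pattern. This is only a rough sketch: the function names, the sample text, and the repetition counts are assumptions, and the results will vary by machine and input.

```python
import re
import timeit

RAW = r'([^\s\w]|_)+'
COMPILED = re.compile(RAW)
SAMPLE = "Some [text], with (punctuation) and_underscores! " * 20


def with_module_call(text):
    # re.sub looks the pattern up in the module's internal cache on each call.
    return re.sub(RAW, '', text)


def with_compiled(text):
    # Uses the pattern object compiled once above.
    return COMPILED.sub('', text)


# Both spellings should produce the same string.
assert with_module_call(SAMPLE) == with_compiled(SAMPLE)

for func in (with_module_call, with_compiled):
    seconds = timeit.timeit(lambda: func(SAMPLE), number=200)
    print(f"{func.__name__}: {seconds:.4f}s for 200 runs")
```

Because the re module caches recently compiled patterns, the gap between the two is often small.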
    