Removing all non-numeric characters from a string in Python

遥遥无期 2020-11-29 17:42

How do we remove all non-numeric characters from a string in Python?

7 answers
  •  野趣味 (OP)
    2020-11-29 17:55

    The fastest approach, if you need to perform more than one or two such removals (or even just one, but on a very long string), is to rely on the translate method of strings, even though it needs some prep. The session below is Python 2 (xrange, string.maketrans, and the two-argument form of str.translate do not exist in Python 3):

    >>> import string
    >>> allchars = ''.join(chr(i) for i in xrange(256))
    >>> identity = string.maketrans('', '')
    >>> nondigits = allchars.translate(identity, string.digits)
    >>> s = 'abc123def456'
    >>> s.translate(identity, nondigits)
    '123456'
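
For readers on Python 3, here is a minimal sketch of the same idea applied to byte strings. It assumes the input is bytes and that only the ASCII digits 0-9 should be kept; the names DIGIT_BYTES, NONDIGIT_BYTES, and digits_only_bytes are made up for illustration and are not from the original answer.

```python
import string

# The ten ASCII digit bytes, and every other byte value (the ones to delete).
DIGIT_BYTES = string.digits.encode("ascii")
NONDIGIT_BYTES = bytes(b for b in range(256) if b not in DIGIT_BYTES)


def digits_only_bytes(data: bytes) -> bytes:
    """Return data with every byte that is not an ASCII digit removed."""
    # Passing None as the table means "no remapping, only delete".
    return data.translate(None, NONDIGIT_BYTES)


print(digits_only_bytes(b"abc123def456"))  # b'123456'
```

As in the original, the prep (building the set of bytes to delete) is done once and then reused for every call.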
    

    The translate method works differently, and is perhaps a tad simpler to use, on Unicode strings than on byte strings, by the way:

    >>> unondig = dict.fromkeys(xrange(65536))
    >>> for x in string.digits: del unondig[ord(x)]
    ... 
    >>> s = u'abc123def456'
    >>> s.translate(unondig)
    u'123456'
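
A rough Python 3 counterpart of the dict-based table, as a sketch only. In Python 3 every str is Unicode and code points run up to sys.maxunicode, so a table that stops at 65536 would leave higher characters (emoji, for instance) untouched, because str.translate keeps any character whose code point is missing from the table. The names DELETE_NONDIGITS and digits_only are hypothetical.

```python
import string
import sys

# Map every code point except the ASCII digits to None (None means "delete").
# This builds a table with over a million entries, so it is created only once.
DELETE_NONDIGITS = dict.fromkeys(
    cp for cp in range(sys.maxunicode + 1) if chr(cp) not in string.digits
)


def digits_only(text: str) -> str:
    """Return text with everything except the ASCII digits 0-9 removed."""
    return text.translate(DELETE_NONDIGITS)


print(digits_only("abc123def456"))  # 123456
```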
    

    You might want to use a mapping class rather than an actual dict, especially if your Unicode string may contain characters with very high ord values (which would make the dict excessively large). For example:

    >>> class keeponly(object):
    ...   def __init__(self, keep): 
    ...     self.keep = set(ord(c) for c in keep)
    ...   def __getitem__(self, key):
    ...     if key in self.keep:
    ...       return key
    ...     return None
    ... 
    >>> s.translate(keeponly(string.digits))
    u'123456'
    >>> 
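
The mapping-class idea carries over to Python 3 with very little change. The sketch below assumes the same goal (keep only the ASCII digits); the class name KeepOnly and the sample strings are illustrative. It relies on str.translate looking characters up through `__getitem__` and deleting a character when the lookup returns None.

```python
import string


class KeepOnly:
    """Translation table that keeps the listed characters and deletes the rest."""

    def __init__(self, keep: str) -> None:
        self.keep = {ord(c) for c in keep}

    def __getitem__(self, codepoint: int):
        # Returning the code point keeps the character; None deletes it.
        return codepoint if codepoint in self.keep else None


digits_table = KeepOnly(string.digits)
print("abc123def456".translate(digits_table))  # 123456
print("tel: +1 (555) 010-9999 \U0001F4DE".translate(digits_table))  # 15550109999
```

Because the table is computed on demand, it stays small no matter how high the code points in the input go.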
    
