Strip all non-numeric characters (except for “.”) from a string in Python

死守一世寂寞 2020-12-07 22:26

I've got a pretty good working snippet of code, but I was wondering if anyone has any better suggestions on how to do this:

val = ''.join([c for c in val


        
6 Answers
  •  忘掉有多难
    2020-12-07 22:54

    Here's some sample code:

    $ cat a.py
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        ''.join([c for c in a if c in '1234567890.'])
    

    $ cat b.py
    import re
    
    non_decimal = re.compile(r'[^\d.]+')
    
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        non_decimal.sub('', a)
    

    $ cat c.py
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        ''.join([c for c in a if c.isdigit() or c == '.'])
    

    $ cat d.py
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        b = []
        for c in a:
            # Skip anything that is not a digit or '.', so only numeric characters are kept
            if not (c.isdigit() or c == '.'): continue
            b.append(c)
    
        ''.join(b)
    

    And the timing results:


    $ time python a.py
    real    0m24.735s
    user    0m21.049s
    sys     0m0.456s
    
    $ time python b.py
    real    0m10.775s
    user    0m9.817s
    sys     0m0.236s
    
    $ time python c.py
    real    0m38.255s
    user    0m32.718s
    sys     0m0.724s
    
    $ time python d.py
    real    0m46.040s
    user    0m41.515s
    sys     0m0.832s
    

    Looks like the regex is the winner so far.
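For reuse, the regex approach from b.py could be wrapped in a small function. This is only a minimal sketch: the name `strip_non_numeric` is hypothetical and not from the original post, and the pattern is the same one compiled in b.py.

```python
import re

# Same pattern as in b.py: match runs of characters that are neither digits nor '.'
NON_DECIMAL = re.compile(r'[^\d.]+')


def strip_non_numeric(text):
    """Return text with everything except digits and '.' removed.

    Hypothetical helper name, used here for illustration only.
    """
    return NON_DECIMAL.sub('', text)


print(strip_non_numeric('$1,234.50 USD'))  # prints 1234.50
```

Compiling the pattern once at module level means each call only pays for the substitution itself.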

    Personally, I find the regex just as readable as the list comprehension. If you only need to do this a few times, the one-time cost of compiling the regex will probably outweigh what you save per call. Do what jibes with your code and coding style.
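To check which approach wins for your own interpreter and call counts, the comparison could be rerun with the standard `timeit` module instead of `time python`. The sketch below is an assumption about how one might set that up, not the harness used for the numbers above; the function names `with_regex` and `with_comprehension` are made up for illustration.

```python
import re
import timeit

SAMPLE = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
NON_DECIMAL = re.compile(r'[^\d.]+')


def with_regex(text=SAMPLE):
    # Approach from b.py: delete every run of non-digit, non-'.' characters.
    # Note: for str patterns, \d also matches non-ASCII Unicode digits.
    return NON_DECIMAL.sub('', text)


def with_comprehension(text=SAMPLE):
    # Approach from a.py: keep only characters found in an ASCII whitelist.
    return ''.join([c for c in text if c in '1234567890.'])


if __name__ == '__main__':
    # Lower or raise `number` to match how often your code actually runs this.
    for func in (with_regex, with_comprehension):
        seconds = timeit.timeit(func, number=100000)
        print(func.__name__, round(seconds, 3))
```

No timings are quoted here because they depend on the Python version and the machine; run it with a `number` close to your real workload.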
