efficiently checking that string consists of one character in Python

后端未结
关注
 8  1309
太阳男子 2020-11-29 01:03
What is an efficient way to check that a string s in Python consists of just one character, say \'A\'? Something like all_equal(s, \'A\')

      
      
        
          8条回答        

        
                    
            
            
                         
                
              
              
                
                   独厮守ぢ
                                             
                
                
                (楼主)
            
              
              
                2020-11-29 01:53
              

            
            
                        
>>> s = 'AAAAAAAAAAAAAAAAAAA'
>>> s.count(s[0]) == len(s)
True


This doesn't short circuit.  A version which does short-circuit would be:

>>> all(x == s[0] for x in s)
True


However, I have a feeling that due the the optimized C implementation, the non-short circuiting version will probably perform better on some strings (depending on size, etc)



Here's a simple timeit script to test some of the other options posted:

import timeit
import re

def test_regex(s,regex=re.compile(r'^(.)\1*$')):
    return bool(regex.match(s))

def test_all(s):
    return all(x == s[0] for x in s)

def test_count(s):
    return s.count(s[0]) == len(s)

def test_set(s):
    return len(set(s)) == 1

def test_replace(s):
    return not s.replace(s[0],'')

def test_translate(s):
    return not s.translate(None,s[0])

def test_strmul(s):
    return s == s[0]*len(s)

tests = ('test_all','test_count','test_set','test_replace','test_translate','test_strmul','test_regex')

print "WITH ALL EQUAL"
for test in tests:
    print test, timeit.timeit('%s(s)'%test,'from __main__ import %s; s="AAAAAAAAAAAAAAAAA"'%test)
    if globals()[test]("AAAAAAAAAAAAAAAAA") != True:
        print globals()[test]("AAAAAAAAAAAAAAAAA")
        raise AssertionError

print
print "WITH FIRST NON-EQUAL"
for test in tests:
    print test, timeit.timeit('%s(s)'%test,'from __main__ import %s; s="FAAAAAAAAAAAAAAAA"'%test)
    if globals()[test]("FAAAAAAAAAAAAAAAA") != False:
        print globals()[test]("FAAAAAAAAAAAAAAAA")
        raise AssertionError


On my machine (OS-X 10.5.8, core2duo, python2.7.3) with these contrived (short) strings, str.count smokes set and all, and beats str.replace by a little, but is edged out by str.translate and strmul is currently in the lead by a good margin:

WITH ALL EQUAL
test_all 5.83863711357
test_count 0.947771072388
test_set 2.01028490067
test_replace 1.24682998657
test_translate 0.941282987595
test_strmul 0.629556179047
test_regex 2.52913498878

WITH FIRST NON-EQUAL
test_all 2.41147494316
test_count 0.942595005035
test_set 2.00480484962
test_replace 0.960338115692
test_translate 0.924381017685
test_strmul 0.622269153595
test_regex 1.36632800102


The timings could be slightly (or even significantly?) different between different systems and with different strings, so that would be worth looking into with an actual string you're planning on passing.

Eventually, if you hit the best case for all enough, and your strings are long enough, you might want to consider that one. It's a better algorithm ... I would avoid the set solution though as I don't see any case where it could possibly beat out the count solution.

If memory could be an issue, you'll need to avoid str.translate, str.replace and strmul as those create a second string, but this isn't usually a concern these days.
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它8个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复