Deleting consonants from a string in Python

情深已故 2020-12-01 20:01

Here is my code. I'm not exactly sure if I need a counter for this to work. The answer should be 'iiii'.

def eliminate_consonants(x):

3 answers

抹茶落季 (original poster) 2020-12-01 21:00
                        
Correcting your code

The line if char == vowels: is wrong; it has to be if char in vowels:. The == comparison tests whether the character equals the entire list, which is never true, whereas in checks whether the character is one of the list's elements. Apart from that, you need print(char, end='') (in Python 3) so that the output iiii is printed on one line.

The final program looks like this:

def eliminate_consonants(x):
    vowels = ['a', 'e', 'i', 'o', 'u']
    for char in x:
        # Print only the vowels, all on one line.
        if char in vowels:
            print(char, end="")

eliminate_consonants('mississippi')


And the output will be

iiii
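To see why the equality test never matched, it may help to compare the two conditions side by side. The snippet below is only a small illustration; the variable names equality_result and membership_result are made up for this example.

```python
vowels = ['a', 'e', 'i', 'o', 'u']
char = 'i'

# Comparing a one-character string with a whole list is always False,
# so the original condition never let any character through.
equality_result = (char == vowels)

# The membership test asks whether char is one of the list's elements.
membership_result = (char in vowels)

print(equality_result, membership_result)  # False True
```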




Other ways include


Using in a string

def eliminate_consonants(x):
    for char in x:
        if char in 'aeiou':
            print(char,end = "")


Simple as it looks, the condition if char in 'aeiou' checks whether char occurs in the string 'aeiou'.

A list comprehension

''.join([c for c in x if c in 'aeiou'])


The list comprehension builds a list containing only the characters of x that are in 'aeiou', and ''.join concatenates them into a string.

A generator expression

''.join(c for c in x if c in 'aeiou')


The generator expression yields only the characters that are in 'aeiou', one at a time, without building an intermediate list.

Regular Expressions

You can use re.findall to extract only the vowels from your string (remember to import re first). The code

re.findall(r'[aeiou]',"mississippi")


will return a list of the vowels found in the string, i.e. ['i', 'i', 'i', 'i']. You can then use str.join to turn the list back into a string:

''.join(re.findall(r'[aeiou]',"mississippi"))
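The same result can also be obtained by deleting everything that is not a vowel instead of collecting the vowels. The following is a minimal sketch using re.sub from the standard library, swapped in here as an alternative to re.findall; the names NOT_VOWEL and keep_vowels_re are made up for this example.

```python
import re

# Negated character class: anything that is NOT a lowercase vowel.
NOT_VOWEL = re.compile(r'[^aeiou]')

def keep_vowels_re(text):
    """Return text with every character except a, e, i, o, u removed."""
    return NOT_VOWEL.sub('', text)

print(keep_vowels_re("mississippi"))  # iiii
```

Both forms should give the same string; which one reads better is mostly a matter of taste.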

str.translate and maketrans

For this technique you need a translation table that maps each non-vowel to None. You can build it from string.ascii_lowercase (remember to import string first). The code to make the table, stored in a variable (here m for map), is

m = str.maketrans({i:None for i in string.ascii_lowercase if i not in "aeiou"})


Then translate the string with it:

"mississippi".translate(m)


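This removes every lowercase ASCII consonant from the string; characters outside string.ascii_lowercase, such as uppercase letters, digits and spaces, are left untouched.

Using dict.fromkeys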

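You can use dict.fromkeys along with sys.maxunicode to build a table that deletes every code point except the lowercase vowels. Note that this table has more than a million entries. Remember to import sys first!

m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')


Now use str.translate with this table: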

'mississippi'.translate(m)
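Putting the str.translate pieces together, a self-contained version might look like the sketch below. It uses the three-argument form of str.maketrans, whose third argument lists the characters to delete, as a shorter alternative to building the dictionary by hand. The names consonants, table and keep_vowels_translate are chosen for this example.

```python
import string

# Lowercase ASCII consonants: every lowercase letter that is not a vowel.
consonants = ''.join(c for c in string.ascii_lowercase if c not in 'aeiou')

# Third argument of str.maketrans: characters that are mapped to None (deleted).
table = str.maketrans('', '', consonants)

def keep_vowels_translate(text):
    """Delete lowercase ASCII consonants; all other characters are left alone."""
    return text.translate(table)

print(keep_vowels_translate('mississippi'))  # iiii
```

Unlike the table built over range(sys.maxunicode+1), this one leaves uppercase letters, digits and punctuation in place.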

Using bytearray

As mentioned by J.F.Sebastian in the comments, you can create a bytearray of every byte value except the lowercase vowels by using

non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))


Using this, we can translate the word:

'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)


which returns b'iiii'. This can easily be converted to str with decode, i.e. b'iiii'.decode("ascii").

Using bytes

bytes returns a bytes object, the immutable counterpart of bytearray. (It is specific to Python 3.)

non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))


Using this, we can translate the word:

'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)


which returns b'iiii'. This can easily be converted to str with decode, i.e. b'iiii'.decode("ascii").
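The encode, translate and decode steps can be wrapped in one helper. This is a sketch under the assumption that silently dropping non-ASCII characters is acceptable, which is what errors='ignore' does; the names NON_VOWELS and keep_vowels_bytes are made up for this example.

```python
# All 256 byte values except the five lowercase vowels.
NON_VOWELS = bytes(set(range(0x100)) - set(b'aeiou'))

def keep_vowels_bytes(text):
    """Return only the lowercase vowels of text, via bytes.translate."""
    # errors='ignore' silently drops non-ASCII characters before translating.
    encoded = text.encode('ascii', 'ignore')
    # With None as the table, translate only deletes the bytes in the second argument.
    return encoded.translate(None, NON_VOWELS).decode('ascii')

print(keep_vowels_bytes('mississippi'))  # iiii
```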




Timing comparison

Python 3

python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 2.88 usec per loop
python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 3.06 usec per loop
python3 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
10000 loops, best of 3: 71.3 usec per loop
python3 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
10000 loops, best of 3: 71.6 usec per loop
python3 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
10000 loops, best of 3: 60.1 usec per loop
python3 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
10000 loops, best of 3: 53.2 usec per loop
python3 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
10000 loops, best of 3: 57 usec per loop


The timings in sorted order (usec per loop, fastest first)

translate (bytes)    |  2.88
translate (bytearray)|  3.06
List Comprehension   | 53.2
Regular expressions  | 57.0
Generator exp        | 60.1
dict.fromkeys        | 71.3
translate (unicode)  | 71.6


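As you can see, the final method, translate on bytes, is the fastest here: roughly 18 times faster than the quickest str-based approach (the list comprehension).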



Python 3.5

python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 4.17 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 4.21 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
100000 loops, best of 3: 2.39 usec per loop
python3.5 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
100000 loops, best of 3: 2.33 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
10000 loops, best of 3: 97.1 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
10000 loops, best of 3: 86.6 usec per loop
python3.5 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
10000 loops, best of 3: 74.3 usec per loop


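The timings in sorted order (usec per loop, fastest first)

translate (unicode)  |  2.33
dict.fromkeys        |  2.39
translate (bytes)    |  4.17
translate (bytearray)|  4.21
Regular expressions  | 74.3
List Comprehension   | 86.6
Generator exp        | 97.1

On Python 3.5, str.translate with a dictionary table is the fastest, slightly ahead of the bytes-based variants.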
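The command-line measurements can also be reproduced from a script. The following is a rough sketch using the standard timeit module; the candidates dictionary and the repetition counts are choices made for this example, and the absolute numbers will differ by machine and Python version.

```python
import re
import timeit

text = 'mississippi' * 100
non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))
table = dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')
pattern = re.compile(r'[aeiou]')

candidates = {
    'translate (bytes)': lambda: text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii'),
    'dict.fromkeys': lambda: text.translate(table),
    'generator exp': lambda: ''.join(c for c in text if c in 'aeiou'),
    'list comprehension': lambda: ''.join([c for c in text if c in 'aeiou']),
    'regular expressions': lambda: ''.join(pattern.findall(text)),
}

# Every candidate must produce the same string before it is worth timing.
results = {name: func() for name, func in candidates.items()}
assert len(set(results.values())) == 1

timings = {}
for name, func in candidates.items():
    # Best of 3 runs of 1000 calls each, converted to microseconds per call.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    timings[name] = best / 1000 * 1e6

for name, usec in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name:<20}| {usec:8.2f} usec per loop')
```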

    
             
                                                        
            
            
              
                