Deleting consonants from a string in Python

情深已故 2020-12-01 20:01

Here is my code. I'm not exactly sure if I need a counter for this to work. The answer should be 'iiii'.

def eliminate_consonants(x):
                


        
3 answers
  •  抹茶落季
    2020-12-01 21:00

    Correcting your code

    The line if char == vowels: is wrong: it compares a single character with the whole list, which is never true. It has to be if char in vowels:, which checks whether that particular character is present in the list of vowels. Apart from that, you need print(char, end='') (in Python 3) so that the output appears as iiii on one line.

    The final program looks like this:

    def eliminate_consonants(x):
        vowels = ['a', 'e', 'i', 'o', 'u']
        for char in x:
            if char in vowels:
                print(char, end="")

    eliminate_consonants('mississippi')
    

    And the output will be

    iiii
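
To see why the original comparison never matched, the sketch below contrasts == with in and returns the vowels as a string instead of printing them. The helper name keep_vowels is hypothetical and used for illustration only.

```python
def keep_vowels(text):
    """Return only the lowercase vowels of text, in order (hypothetical helper)."""
    vowels = ['a', 'e', 'i', 'o', 'u']
    kept = []
    for char in text:
        # `char == vowels` would compare a str with a list and is always False;
        # `char in vowels` tests membership, which is what is needed here.
        if char in vowels:
            kept.append(char)
    return ''.join(kept)


print('i' == ['a', 'e', 'i', 'o', 'u'])  # False
print('i' in ['a', 'e', 'i', 'o', 'u'])  # True
print(keep_vowels('mississippi'))        # iiii
```

Returning the string rather than printing inside the loop makes the function easier to reuse and test.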
    

    Other ways include

    • Using in with a string

      def eliminate_consonants(x):
          for char in x:
              if char in 'aeiou':
                  print(char,end = "")
      

      As simple as it looks, the statement if char in 'aeiou' checks if char is present in the string aeiou.

    • A list comprehension

       ''.join([c for c in x if c in 'aeiou'])
      

      This list comprehension builds a list containing only the characters of x that are in aeiou; ''.join then concatenates them into a single string.

    • A generator expression

      ''.join(c for c in x if c in 'aeiou')
      

      This generator expression produces a generator that yields only the characters that are in aeiou.
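
The two one-liners above can be wrapped in functions so that they are callable on any string. This is a minimal sketch; the function names vowels_listcomp and vowels_genexp are hypothetical.

```python
def vowels_listcomp(text):
    # The list comprehension builds the full list, then join concatenates it.
    return ''.join([c for c in text if c in 'aeiou'])


def vowels_genexp(text):
    # The generator expression is passed straight to join.
    return ''.join(c for c in text if c in 'aeiou')


print(vowels_listcomp('mississippi'))  # iiii
print(vowels_genexp('mississippi'))    # iiii
```

Both return the same string; they differ only in how the filtered characters are handed to join.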

    • Regular Expressions

      You can use re.findall (after import re) to pick out only the vowels in your string. The code

      re.findall(r'[aeiou]',"mississippi")
      

      returns a list of the vowels found in the string, i.e. ['i', 'i', 'i', 'i']. Passing that list to str.join gives the final string:

      ''.join(re.findall(r'[aeiou]',"mississippi"))
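
A self-contained version of the regular-expression approach might look like the sketch below. The names VOWELS and vowels_regex are hypothetical. The re.sub line at the end is a different technique from the one described above (deleting non-vowels rather than collecting vowels), shown only for comparison.

```python
import re

VOWELS = re.compile(r'[aeiou]')  # compile once, reuse for every call


def vowels_regex(text):
    # findall returns every non-overlapping match as a list of strings.
    return ''.join(VOWELS.findall(text))


print(VOWELS.findall('mississippi'))  # ['i', 'i', 'i', 'i']
print(vowels_regex('mississippi'))    # iiii

# Alternative, not from the answer above: delete everything that is not a vowel.
print(re.sub(r'[^aeiou]', '', 'mississippi'))  # iiii
```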
      
    • str.translate and maketrans

      For this technique you need a translation map that maps each non-vowel to None. You can build it from string.ascii_lowercase (remember to import string first). The code to make the map is

      m = str.maketrans({i: None for i in string.ascii_lowercase if i not in "aeiou"})
      

      This returns the mapping, stored here in the variable m (for map). Pass it to str.translate:

      "mississippi".translate(m)
      

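      This removes every lowercase ASCII consonant from the string. Characters outside string.ascii_lowercase (uppercase letters, digits, punctuation) are left untouched.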
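
Put together, the str.translate approach can be sketched as follows. The variable names consonant_map and same_map are hypothetical; the three-argument form of str.maketrans is shown as an equivalent way to build the same deletion map.

```python
import string

# Map each lowercase ASCII consonant to None so translate() deletes it.
consonant_map = str.maketrans(
    {c: None for c in string.ascii_lowercase if c not in 'aeiou'}
)

print('mississippi'.translate(consonant_map))     # iiii
print('Mississippi 42'.translate(consonant_map))  # Miiii 42

# Equivalent: the third argument of maketrans lists characters to delete.
consonants = ''.join(c for c in string.ascii_lowercase if c not in 'aeiou')
same_map = str.maketrans('', '', consonants)
print(same_map == consonant_map)  # True
```

The second print shows that anything not in the map, such as the capital M, the space and the digits, passes through unchanged.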

    • Using dict.fromkeys

      You can also build the map with dict.fromkeys and sys.maxunicode, which maps every non-vowel code point to None. Remember to import sys first!

      m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')
      

      Now use str.translate:

      'mississippi'.translate(m)
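
As a runnable sketch of the dict.fromkeys variant (the name delete_all_but_vowels is hypothetical): because the map covers every code point, it also strips uppercase letters, digits and punctuation, unlike a map built from string.ascii_lowercase alone.

```python
import sys

# Every code point except the five lowercase vowels maps to None.
# The dict has over a million entries, so build it once and reuse it.
delete_all_but_vowels = dict.fromkeys(
    i for i in range(sys.maxunicode + 1) if chr(i) not in 'aeiou'
)

print('mississippi'.translate(delete_all_but_vowels))      # iiii
print('Mississippi 42!'.translate(delete_all_but_vowels))  # iiii
```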
      
    • Using bytearray

      As mentioned by J.F.Sebastian in the comments, you can create a bytearray of every byte value except the lowercase vowels by using

      non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))
      

      Using this we can translate the word:

      'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)
      

      which will return b'iiii'. This can easily be converted to str by using decode i.e. b'iiii'.decode("ascii").

    • Using bytes

      bytes returns a bytes object and is the immutable version of bytearray. (It is specific to Python 3.)

      non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))
      

      Using this we can translate the word:

      'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)
      

      which will return b'iiii'. This can easily be converted to str by using decode i.e. b'iiii'.decode("ascii").
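
The encode, translate, decode round trip can be sketched as a single function. The name vowels_bytes is hypothetical. Note that the 'ignore' error handler drops any non-ASCII character before translation, which is harmless here because such characters would be deleted anyway.

```python
# All 256 byte values except the lowercase vowels are marked for deletion.
non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))


def vowels_bytes(text):
    # errors='ignore' silently drops characters that cannot be encoded as ASCII.
    raw = text.encode('ascii', 'ignore')
    # With a table of None, bytes.translate only deletes the listed bytes.
    return raw.translate(None, non_vowels).decode('ascii')


print(vowels_bytes('mississippi'))   # iiii
print(vowels_bytes('café au lait'))  # aauai
```

The second call shows that an accented é is not treated as the vowel e: it is dropped at the encode step.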


    Timing comparison

    Python 3

    python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
    100000 loops, best of 3: 2.88 usec per loop
    python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
    100000 loops, best of 3: 3.06 usec per loop
    python3 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
    10000 loops, best of 3: 71.3 usec per loop
    python3 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
    10000 loops, best of 3: 71.6 usec per loop
    python3 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
    10000 loops, best of 3: 60.1 usec per loop
    python3 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
    10000 loops, best of 3: 53.2 usec per loop
    python3 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
    10000 loops, best of 3: 57 usec per loop
    

    The timings in sorted order (usec per loop)

    translate (bytes)    |  2.88
    translate (bytearray)|  3.06
    List Comprehension   | 53.2
    Regular expressions  | 57.0
    Generator exp        | 60.1
    dict.fromkeys        | 71.3
    translate (unicode)  | 71.6
    

    As you can see, the final method (using bytes) is the fastest in this run.


    Python 3.5

    python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
    100000 loops, best of 3: 4.17 usec per loop
    python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
    100000 loops, best of 3: 4.21 usec per loop
    python3.5 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
    100000 loops, best of 3: 2.39 usec per loop
    python3.5 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
    100000 loops, best of 3: 2.33 usec per loop
    python3.5 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
    10000 loops, best of 3: 97.1 usec per loop
    python3.5 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
    10000 loops, best of 3: 86.6 usec per loop
    python3.5 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
    10000 loops, best of 3: 74.3 usec per loop
    

    The timings in sorted order (usec per loop)

    translate (unicode)  |  2.33
    dict.fromkeys        |  2.39
    translate (bytes)    |  4.17
    translate (bytearray)|  4.21
    Regular expressions  | 74.3
    List Comprehension   | 86.6
    Generator exp        | 97.1
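
To repeat a comparison like this on your own interpreter, something along the following lines should work with the timeit module. It is a rough sketch, not the script used for the numbers above: the names candidates and results are hypothetical, the lambda wrappers add a small call overhead, and absolute timings will vary by machine and Python version.

```python
import re
import timeit

text = 'mississippi' * 100
pattern = re.compile(r'[aeiou]')
non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))
ascii_map = dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')

candidates = {
    'translate (bytes)': lambda: (
        text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')
    ),
    'dict.fromkeys': lambda: text.translate(ascii_map),
    'Generator exp': lambda: ''.join(c for c in text if c in 'aeiou'),
    'List Comprehension': lambda: ''.join([c for c in text if c in 'aeiou']),
    'Regular expressions': lambda: ''.join(pattern.findall(text)),
}

expected = 'i' * 400  # 'mississippi' has four vowels, repeated 100 times
results = {}
for name, func in candidates.items():
    assert func() == expected  # every method must agree before it is timed
    # Best of 3 repeats of 1000 calls each, converted to usec per call.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    results[name] = best / 1000 * 1e6

# Print fastest first, in the same layout as the tables above.
for name, usec in sorted(results.items(), key=lambda kv: kv[1]):
    print(f'{name:<21}| {usec:.2f}')
```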
    
