How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?

礼貌的吻别 2020-11-28 06:45

How would I go about counting the words in a sentence? I'm using Python.

For example, I might have the string:

string = "I     am having  a   very


        
8 Answers
  • 2020-11-28 07:16

    How about a simple loop that counts the spaces? This assumes the words are separated by exactly one space.

    txt = "Just an example here move along"
    count = 1  # one more word than there are separating spaces
    for i in txt:
        if i == " ":
            count += 1
    print(count)
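
    Counting spaces drifts as soon as the words are separated by more than one space, which is the situation in the question. A minimal sketch of that effect; the helper name `count_by_spaces` and both sample strings are made up for illustration:

```python
def count_by_spaces(txt):
    """Count words as (number of space characters) + 1, like the loop above."""
    count = 1
    for ch in txt:
        if ch == " ":
            count += 1
    return count

single_spaced = "Just an example here move along"
multi_spaced = "Just  an   example here move along"

# Both strings contain the same six words.
print(count_by_spaces(single_spaced))
# Every extra space is counted as if it separated another word.
print(count_by_spaces(multi_spaced))
```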

  • 2020-11-28 07:21
        def wordCount(mystring):
            tempcount = 0  # length of the current run of spaces
            count = 1

            try:
                for character in mystring:
                    if character == " ":
                        tempcount += 1
                        # Only the first space of a run starts a new word
                        if tempcount == 1:
                            count += 1
                        else:
                            tempcount += 1
                    else:
                        tempcount = 0

                return count

            except Exception:
                error = "Not a string"
                return error

        mystring = "I   am having   a    very nice 23!@$      day."

        print(wordCount(mystring))


    The output is 8.

  • 2020-11-28 07:22

    str.split() without any arguments splits on runs of whitespace characters:

    >>> s = 'I am having a very nice day.'
    >>> 
    >>> len(s.split())
    7
    

    From the linked documentation:

    If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
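
    To make the quoted behaviour concrete, here is a small sketch comparing `split()` with `split(" ")`. The sample string is invented for illustration and has leading, trailing, and repeated spaces:

```python
s = "  I   am having a very  nice day.  "

# No argument: runs of whitespace act as one separator, and no empty
# strings are produced for the leading or trailing whitespace.
words = s.split()

# Explicit single-space separator: every space splits, so runs of spaces
# and the padding at both ends produce empty strings.
pieces = s.split(" ")

print(words)
print(len(words))
print(pieces)
```

    Only the argument-free form gives a usable word count here, because the empty strings from `split(" ")` would inflate `len()`.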

  • 2020-11-28 07:26
    import string

    s = "I     am having  a   very  nice  23!@$      day. "
    sum(i.strip(string.punctuation).isalpha() for i in s.split())


    The statement above goes through each whitespace-separated chunk of text, strips punctuation from both ends of it, and then checks whether what remains consists only of letters. For this string the result is 7.
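
    As a rough illustration of what happens to each chunk, the sketch below spells out the intermediate values. The names `stripped_chunks`, `word_flags`, and `inner` are introduced here only for the example:

```python
import string

s = "I     am having  a   very  nice  23!@$      day. "

# Strip punctuation from both ends of every whitespace-separated chunk.
stripped_chunks = [chunk.strip(string.punctuation) for chunk in s.split()]
# A chunk counts as a word only if everything left is alphabetic.
word_flags = [chunk.isalpha() for chunk in stripped_chunks]

print(stripped_chunks)
print(sum(word_flags))

# strip() only trims the ends, so an apostrophe inside a word is kept
# and the chunk then fails isalpha().
inner = "don't".strip(string.punctuation)
print(inner, inner.isalpha())
```

    So contractions and hyphenated words would not be counted by this approach; whether that matters depends on the input.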

  • 2020-11-28 07:26
    import string

    sentence = "I     am having  a   very  nice  23!@$      day. "
    # Remove all punctuation
    sentence = sentence.translate(str.maketrans('', '', string.punctuation))
    # Remove all digits
    sentence = ''.join([char for char in sentence if not char.isdigit()])
    count = 0
    # A word ends wherever a non-space character is followed by whitespace
    for index in range(len(sentence) - 1):
        if sentence[index + 1].isspace() and not sentence[index].isspace():
            count += 1
    # The loop misses a final word that has no whitespace after it
    if sentence and not sentence[-1].isspace():
        count += 1
    print(count)
    
  • 2020-11-28 07:30

    OK, here is my version. I noticed that you want the output to be 7, which means you don't want to count special characters or numbers. So here is a regex pattern:

    import re
    len(re.findall("[a-zA-Z_]+", string))


    Here [a-zA-Z_]+ matches any run of one or more characters that are lowercase letters (a-z), uppercase letters (A-Z), or underscores; the number of matches is the word count.
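
    A short sketch of how the pattern above behaves, including two edge cases worth knowing about. The variable names and the second sample string are invented for the example:

```python
import re

sentence = "I     am having  a   very  nice  23!@$      day. "
matches = re.findall(r"[a-zA-Z_]+", sentence)
print(matches)
print(len(matches))

# The underscore is part of the character class, so snake_case stays whole,
# while an apostrophe is not, so "don't" is split into two matches.
edge_cases = re.findall(r"[a-zA-Z_]+", "don't use snake_case")
print(edge_cases)
```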


    About spaces: if you want to remove all the extra spaces, just do:

    string = string.rstrip().lstrip() # Remove all whitespace at the start and at the end of the string
    while "  " in string: # While there are two consecutive spaces anywhere in the string...
        string = string.replace("  ", " ") # ... replace them with a single space!
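
    For reference, the same clean-up can be wrapped in a function and compared with the split/join idiom. This is only a sketch; `collapse_spaces` and `text` are names made up for the example:

```python
def collapse_spaces(value):
    """Strip both ends, then squeeze runs of spaces, like the while loop above."""
    value = value.strip()
    while "  " in value:
        value = value.replace("  ", " ")
    return value

text = "  I     am having  a   very  nice  23!@$      day.  "

collapsed = collapse_spaces(text)
print(collapsed)

# When spaces are the only whitespace in the string, splitting on
# whitespace and re-joining with single spaces gives the same result.
rejoined = " ".join(text.split())
print(collapsed == rejoined)
```

    The two differ on tabs and newlines: `split()` treats them as separators, while the loop only collapses space characters.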
    