How to find a word - First letter will be capital & other will be lower

Submitted by 懵懂的女人 on 2019-12-24 05:56:20

Question


Problem Statement: Filter those words from the complete set of text6, having first letter in upper case and all other letters in lower case. Store the result in variable title_words. print the number of words present in title_words.

I have tried every approach I can think of, but I don't know where I am going wrong.

import nltk
from nltk.book import text6
title_words = 0
for item in set(text6):
    if item[0].isupper() and item[1:].islower():
        title_words += 1
print(title_words)

I have tried it this way as well:

title_words = 0
for item in text6:
    if item[0].isupper() and item[1:].islower():
        title_words += 1
print(title_words)

I am not sure what count is expected. Whatever count I get, it does not let me pass the challenge. Please let me know if I am doing anything wrong in this code.


Answer 1:


I think the problem is with set(text6). I suggest you iterate over text6.tokens.

Update, explanation

The code you've provided is correct.

The issue is that the text can contain the same word multiple times. Calling set(text6) removes the duplicates, so you start with an incomplete data set.

The other responses are not necessarily wrong in how they check the validity of a word, but they are iterating over the same wrong data set.
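To illustrate the difference, here is a minimal sketch using a small made-up token list. The name `tokens` and its contents are hypothetical and only stand in for text6, which is not loaded here. The count over the full list and the count over its set differ whenever a matching word repeats.

```python
# Hypothetical tokens standing in for text6; not the real corpus.
tokens = ["Arthur", "the", "King", "Arthur", "KING", "of", "Camelot", "Arthur"]

def is_title_word(word):
    # First letter upper case, all remaining letters lower case.
    return word[0].isupper() and word[1:].islower()

# Every occurrence is counted when iterating over the full list.
all_matches = [w for w in tokens if is_title_word(w)]

# Duplicates collapse when iterating over a set.
unique_matches = [w for w in set(tokens) if is_title_word(w)]

print(len(all_matches))     # 5
print(len(unique_matches))  # 3
```

Which of the two counts the challenge expects depends on how its wording is read, so it may be worth trying both.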




Answer 2:


One of the above suggestions did work for me. Sample code below.

title_words = [word for word in text6 if (len(word)==1 and word[0].isupper()) or (word[0].isupper() and word[1:].islower()) ]
print(len(title_words))



Answer 3:


In the question, "Store the result in variable title_words. print the number of words present in title_words."

The result of filtering a list of elements is a list of the same type of elements. In your case, filtering the list text6 (assuming it's a list of strings) would result in a (smaller) list of strings. Your title_words variable should be this filtered list, not the number of strings; the number of strings would just be the length of the list.

It's also ambiguous from the question whether capitalized words should be filtered out (i.e. removed from the smaller list) or filtered in (i.e. kept in the list), so try both in case you are interpreting it incorrectly.
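As a rough sketch of both readings, the example below keeps a list in each case and only takes its length at the end. The `tokens` list and the helper `is_title_word` are hypothetical stand-ins, not part of the original challenge.

```python
# Hypothetical tokens standing in for text6.
tokens = ["Arthur", "the", "King", "KING", "of", "Camelot"]

def is_title_word(word):
    # First letter upper case, all remaining letters lower case.
    return word[0].isupper() and word[1:].islower()

# Reading 1: keep the capitalized words.
title_words = [w for w in tokens if is_title_word(w)]

# Reading 2: drop the capitalized words and keep everything else.
other_words = [w for w in tokens if not is_title_word(w)]

print(title_words, len(title_words))  # ['Arthur', 'King', 'Camelot'] 3
print(other_words, len(other_words))  # ['the', 'KING', 'of'] 3
```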




Answer 4:


Give regular expressions a try:

>>> import re
>>> from nltk.book import text6
>>>
>>> text = ' '.join(set(text6))
>>> title_words = re.findall(r'([A-Z]{1}[a-z]+)', text)
>>> len(title_words)
461
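One caveat with searching the joined text: `re.findall` can match a fragment inside a longer token, so the result may include pieces that are not whole words. The sketch below contrasts that with matching each token in full using `re.fullmatch`. The `tokens` list is contrived for illustration and is not taken from text6.

```python
import re

# Hypothetical tokens; "McArthur" and "ARTHUr" are contrived to show the difference.
tokens = ["Arthur", "McArthur", "ARTHUr", "the", "I"]

pattern = re.compile(r"[A-Z][a-z]+")

# Searching the joined text can match pieces inside a longer token.
joined_matches = pattern.findall(" ".join(tokens))

# Matching each token in full accepts only whole tokens of the right shape.
token_matches = [t for t in tokens if pattern.fullmatch(t)]

print(joined_matches)  # ['Arthur', 'Mc', 'Arthur', 'Ur']
print(token_matches)   # ['Arthur']
```

Note that this pattern covers ASCII letters only and requires at least two characters, so it will not agree exactly with an isupper()/islower() check on every token.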



Answer 5:


There are 50 single-character elements in text6. However, your code would not count any of them as a match, for example 'I' or 'W'. Is that intended, or do you require words of minimum length 2?
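The reason single letters are rejected is that slicing a one-character string from index 1 gives an empty string, and `"".islower()` returns False. A minimal sketch, assuming single upper-case letters should count (the problem statement does not say either way), with `is_title_word` as a hypothetical helper name:

```python
word = "I"

print(word[1:] == "")      # True: nothing follows the first letter
print(word[1:].islower())  # False: an empty string has no cased characters

def is_title_word(word):
    # Accept a lone upper-case letter, or an upper-case letter
    # followed by a lower-case remainder.
    return word[0].isupper() and (len(word) == 1 or word[1:].islower())

print(is_title_word("I"))       # True
print(is_title_word("Arthur"))  # True
print(is_title_word("KING"))    # False
```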




Answer 6:


Just a few changes, according to what the question asks.

from nltk.book import text6
title_words = []
for item in set(text6):
    if item[0].isupper() and item[1:].islower():
        title_words.append(item)
print(len(title_words))


Source: https://stackoverflow.com/questions/55438634/how-to-find-a-word-first-letter-will-be-capital-other-will-be-lower
