Why does Python 'for word in words:' iterate on individual characters instead of words?

Submitted by 与世无争的帅哥 on 2019-12-01 17:34:08

Question


When I run the following code on a string words:

def word_feats(words):
    return dict([(word, True) for word in words])
print(word_feats("I love this sandwich."))

The resulting dict is keyed by individual characters instead of words:

{'a': True, ' ': True, 'c': True, 'e': True, 'd': True, 'I': True, 'h': True, 'l': True, 'o': True, 'n': True, 'i': True, 's': True, 't': True, 'w': True, 'v': True, '.': True}

What am I doing wrong?


Answer 1:


You need to explicitly split the string on whitespace:

def word_feats(words):
    return dict([(word, True) for word in words.split()])

This uses str.split() without arguments, which splits on runs of whitespace of any width (including tabs and newlines). Without the split, a string is just a sequence of individual characters, so iterating over it directly loops over each character.
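To make the difference concrete, here is a minimal sketch that walks the same sentence both ways. The variable names are illustrative only and are not part of the original code.

```python
sentence = "I love\tthis  sandwich."  # contains a tab and a double space

# Iterating over a str directly yields one-character strings.
chars = [ch for ch in sentence]

# split() with no arguments breaks on runs of whitespace (spaces, tabs, newlines).
words = sentence.split()

print(chars[:6])
print(words)
```

The first list holds single characters (spaces included), while the second holds whole whitespace-separated tokens.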

Splitting into words, however, is an explicit operation you have to perform yourself, because different use cases have different requirements for how to split a string into parts. Does punctuation count, for example? What about parentheses or quoting: should words grouped by those stay together? And so on.
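As one illustration of those choices, the sketch below shows two possible ways to keep punctuation out of the keys. Both function names are hypothetical, and each encodes an assumption about what counts as a "word"; neither is the one correct tokenizer.

```python
import re
import string

def word_feats_stripped(text):
    # Assumption: a "word" is a whitespace-separated token with leading and
    # trailing ASCII punctuation removed; tokens that end up empty are dropped.
    tokens = (token.strip(string.punctuation) for token in text.split())
    return {token: True for token in tokens if token}

def word_feats_regex(text):
    # Alternative assumption: a "word" is a run of word characters
    # (letters, digits, underscore) or apostrophes.
    return dict.fromkeys(re.findall(r"[\w']+", text), True)

print(word_feats_stripped("I love this sandwich."))
print(word_feats_regex("(I love this sandwich!)"))
```

With either variant the key becomes 'sandwich' rather than 'sandwich.'. Which rule is appropriate depends on the text being processed.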

If all you are doing is setting every value to True, it is simpler, and typically faster, to use dict.fromkeys() instead of building a list of tuples:

def word_feats(words):
    return dict.fromkeys(words.split(), True)

Demo:

>>> def word_feats(words):
...     return dict.fromkeys(words.split(), True)
... 
>>> print(word_feats("I love this sandwich."))
{'I': True, 'this': True, 'love': True, 'sandwich.': True}
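As a quick sanity check, the sketch below compares dict.fromkeys() with an equivalent dict comprehension and shows that repeated words collapse into a single key. The name word_feats_comprehension is made up for this comparison. Note that the key order in the demo output above comes from an older interpreter; since Python 3.7 dicts preserve insertion order.

```python
def word_feats(words):
    return dict.fromkeys(words.split(), True)

def word_feats_comprehension(words):
    # Equivalent dict comprehension, shown for comparison only.
    return {word: True for word in words.split()}

text = "I love this sandwich. I love it."
features = word_feats(text)

print(features)
print(features == word_feats_comprehension(text))
```

The repeated "I" and "love" each appear once in the result, because dict keys are unique.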



Answer 2:


You have to split the words string:

def word_feats(words):
    return dict([(word, True) for word in words.split()])
print(word_feats("I love this sandwich."))

Example

>>> words = 'I love this sandwich.'
>>> words = words.split()
>>> words
['I', 'love', 'this', 'sandwich.']

You can also split on other characters by passing a separator:

>>> s = '23/04/2014'
>>> s = s.split('/')
>>> s
['23', '04', '2014']
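One detail worth knowing when passing a separator: split() with no arguments and split(" ") behave differently on repeated spaces. The short sketch below uses illustrative variable names and also shows the optional maxsplit argument.

```python
text = "I  love this"  # note the two spaces after "I"

by_whitespace = text.split()      # runs of whitespace count as one separator
by_single_space = text.split(" ") # every single space is a separator
first_field = "23/04/2014".split("/", 1)  # maxsplit=1 splits only once

print(by_whitespace)
print(by_single_space)
print(first_field)
```

With an explicit separator, adjacent separators produce empty strings in the result, so split() without arguments is usually the safer choice for word splitting.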

Your Code

def word_feats(words):
    return dict([(word, True) for word in words.split()])
print(word_feats("I love this sandwich."))

[OUTPUT]
{'I': True, 'love': True, 'this': True, 'sandwich.': True}


Source: https://stackoverflow.com/questions/23243948/why-does-python-for-word-in-words-iterate-on-individual-characters-instead-of
