How to split a string into a list?

Backend · Unresolved · 9 answers · 2369 views

执念已碎 asked 2020-11-21 04:32

I want my Python function to split a sentence (input) and store each word in a list. My current code splits the sentence, but does not store the words as a list. How do I do that?

9 answers
  • 2020-11-21 05:05

    If you want all the chars of a word/sentence in a list, do this:

    print(list("word"))
    #  ['w', 'o', 'r', 'd']
    
    
    print(list("some sentence"))
    #  ['s', 'o', 'm', 'e', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e']
    
  • 2020-11-21 05:07

    I think you are confused because of a typo.

    Replace print(words) with print(word) inside your loop to print each word on its own line.
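
    As a minimal sketch (the question's code isn't shown, so this assumes it looked roughly like the following):

    ```python
    text = "this is a sentence"
    words = text.split()   # words is now the list ['this', 'is', 'a', 'sentence']
    for word in words:
        print(word)        # was print(words); prints each word on its own line
    ```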

  • 2020-11-21 05:14

    shlex has a .split() function. It differs from str.split() in that it does not preserve quotes and treats a quoted phrase as a single word:

    >>> import shlex
    >>> shlex.split("sudo echo 'foo && bar'")
    ['sudo', 'echo', 'foo && bar']
    
  • 2020-11-21 05:15

    How about this algorithm? Split text on whitespace, then trim punctuation. This carefully removes punctuation from the edge of words, without harming apostrophes inside words such as we're.

    >>> text
    "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"
    
    >>> text.split()
    ["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]
    
    >>> import string
    >>> [word.strip(string.punctuation) for word in text.split()]
    ['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
    
  • 2020-11-21 05:20

    Depending on what you plan to do with your sentence-as-a-list, you may want to look at the Natural Language Toolkit (NLTK). It deals heavily with text processing and evaluation. You can also use it to solve your problem:

    import nltk
    words = nltk.word_tokenize(raw_sentence)
    

    This has the added benefit of splitting out punctuation.

    Example:

    >>> import nltk
    >>> s = "The fox's foot grazed the sleeping dog, waking it."
    >>> words = nltk.word_tokenize(s)
    >>> words
    ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',', 'waking', 'it', '.']
    

    This allows you to filter out any punctuation you don't want and use only words.

    Please note that the other solutions using str.split() are better if you don't plan on doing any complex manipulation of the sentence.


  • 2020-11-21 05:23

    Split the string in text on any run of consecutive whitespace:

    words = text.split()      
    

    Split the string in text on the delimiter ",":

    words = text.split(",")   
    

    In both cases, the words variable will be a list containing the pieces of text split on the delimiter.
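
    For example, both forms return a plain list:

    ```python
    text = "red green  blue"
    print(text.split())         # ['red', 'green', 'blue'] -- runs of whitespace collapse

    csv_line = "red,green,blue"
    print(csv_line.split(","))  # ['red', 'green', 'blue']
    ```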
