'string' has incorrect type (expected str, got spacy.tokens.doc.Doc)

杀马特。学长 韩版系。学妹 submitted on 2020-03-02 05:44:04

Question


I have a dataframe with a review column:

train_review = train['review']
train_review

It looks like:

0      With all this stuff going down at the moment w...
1      \The Classic War of the Worlds\" by Timothy Hi...
2      The film starts with a manager (Nicholas Bell)...
3      It must be assumed that those who praised this...
4      Superbly trashy and wondrously unpretentious 8...

I concatenate the reviews into a single string:

train_review = train['review']
train_token = ''
for i in train['review']:
    train_token += i

What I want is to tokenize the reviews using spaCy. Here is what I tried, but I get the following error:

Argument 'string' has incorrect type (expected str, got spacy.tokens.doc.Doc)

How can I solve that? Thanks in advance!


Answer 1:


In your for loop you are taking spacy.tokens.doc.Doc objects from your dataframe and appending them to a string, so you should cast each one to str first. Like this:

train_review = train['review']
train_token = ''
for i in train['review']:
    train_token += str(i)
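
As a side note, the same conversion can be written with a single join instead of repeated += in a loop. The sketch below is only an illustration and runs without spaCy: FakeDoc is a hypothetical stand-in for spacy.tokens.doc.Doc (it only mimics the fact that str() on a Doc returns its text), and join_reviews and the sample reviews are made-up names and data, not part of the original question.

```python
class FakeDoc:
    """Hypothetical stand-in for spacy.tokens.doc.Doc; str() returns its text."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


def join_reviews(reviews, sep=" "):
    """Convert every item to str, then join once instead of using += in a loop."""
    return sep.join(str(review) for review in reviews)


# Mixed input: plain strings and Doc-like objects (illustrative data only).
reviews = ["The film starts slowly.", FakeDoc("It picks up later.")]
combined = join_reviews(reviews)
print(type(combined).__name__, "|", combined)
```

With the real column this would presumably be join_reviews(train['review']). The separator defaults to a space so that the last word of one review does not run into the first word of the next; pass sep='' to match the loop above exactly.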


Source: https://stackoverflow.com/questions/53588518/string-has-incorrect-type-expected-str-got-spacy-tokens-doc-doc
