Document-term matrix in R - bigram tokenizer not working

死守一世寂寞 2020-12-20 22:18

I am trying to make 2 document-term matrices for a corpus, one with unigrams and one with bigrams. However, the bigram matrix is currently just identical to the unigram matr

1 answer

别那么骄傲 (original poster)

2020-12-20 22:31

The tokenizer option doesn't seem to work with Corpus (SimpleCorpus). Using VCorpus instead cleared up the problem.
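
For anyone landing here, a minimal sketch of the VCorpus approach might look like the following. It assumes the corpus comes from the tm package (the question does not show its code), and the sample documents and the name BigramTokenizer are made up for illustration. The tokenizer follows the pattern from the tm FAQ, using ngrams() and words() from the NLP package that tm attaches.

```r
library(tm)  # also attaches NLP, which provides ngrams() and words()

# Hypothetical sample documents, for illustration only
docs <- c("the quick brown fox", "the lazy dog sleeps")

# Build a VCorpus explicitly; Corpus() on a VectorSource may return a
# SimpleCorpus, where the custom tokenizer appears to be ignored
corpus <- VCorpus(VectorSource(docs))

# Split a document into words, then join each adjacent pair with a space
BigramTokenizer <- function(x) {
  unlist(lapply(ngrams(words(x), 2), paste, collapse = " "), use.names = FALSE)
}

# Unigram matrix: default tokenizer
dtm_uni <- DocumentTermMatrix(corpus)

# Bigram matrix: pass the custom tokenizer through the control list
dtm_bi <- DocumentTermMatrix(corpus, control = list(tokenize = BigramTokenizer))

print(Terms(dtm_uni))
print(Terms(dtm_bi))
```

If the two Terms() listings differ (single words in the first, word pairs in the second), the tokenizer is being applied. Swapping VCorpus for Corpus in the sketch is a quick way to check whether your tm version shows the behaviour described above.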
