Document-term matrix in R - bigram tokenizer not working

死守一世寂寞 2020-12-20 22:18

I am trying to make 2 document-term matrices for a corpus, one with unigrams and one with bigrams. However, the bigram matrix is currently just identical to the unigram matr

1 Answer
  •  别那么骄傲
    2020-12-20 22:31

    The tokenizer option doesn't seem to work with Corpus (SimpleCorpus). Using VCorpus instead cleared up the problem.
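
A minimal sketch of that fix, assuming the `tm` package (which attaches `NLP`, the source of `words()` and `ngrams()`). The sample sentences and the names `docs`, `BigramTokenizer`, `dtm_unigram` and `dtm_bigram` are made up for illustration and are not taken from the question:

```r
library(tm)  # also attaches NLP, which provides words() and ngrams()

# Hypothetical sample documents, for illustration only
docs <- c("the quick brown fox", "the lazy brown dog")

# Build a VCorpus; per the answer above, a custom tokenizer
# does not seem to take effect on a SimpleCorpus made by Corpus()
corpus <- VCorpus(VectorSource(docs))

# Join each pair of adjacent words into a single "w1 w2" term
BigramTokenizer <- function(x) {
  unlist(lapply(ngrams(words(x), 2), paste, collapse = " "),
         use.names = FALSE)
}

# Unigram matrix: default tokenizer
dtm_unigram <- DocumentTermMatrix(corpus)

# Bigram matrix: custom tokenizer passed through the control list
dtm_bigram <- DocumentTermMatrix(corpus,
                                 control = list(tokenize = BigramTokenizer))

# Inspect the terms to confirm the two matrices now differ
Terms(dtm_unigram)
Terms(dtm_bigram)
```

If the bigram terms still come out as single words, checking `class(corpus)` is a quick way to see whether the corpus is a `SimpleCorpus` rather than a `VCorpus`.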
