I am trying to make two document-term matrices for a corpus, one with unigrams and one with bigrams. However, the bigram matrix currently comes out identical to the unigram matrix.
The tokenizer option doesn't seem to work with Corpus (SimpleCorpus). Using VCorpus instead cleared up the problem.
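For anyone hitting the same thing, here is a minimal sketch of the VCorpus approach. The sample documents and the variable names (docs, dtm_unigram, dtm_bigram) are made up for illustration, and the bigram tokenizer follows the pattern from the tm FAQ built on NLP's words() and ngrams(), which may differ from the tokenizer in the original code.

```r
library(tm)  # also attaches NLP, which provides words() and ngrams()

# Hypothetical sample documents, for illustration only
docs <- c("the quick brown fox", "the lazy dog sleeps")

# Build a VCorpus rather than Corpus()/SimpleCorpus so the custom
# tokenizer passed in the control list is actually applied
corpus <- VCorpus(VectorSource(docs))

# Bigram tokenizer: split a document into words, form 2-grams,
# and paste each pair back together with a space
BigramTokenizer <- function(x) {
  unlist(lapply(NLP::ngrams(NLP::words(x), 2), paste, collapse = " "),
         use.names = FALSE)
}

# Unigram matrix uses the default tokenizer
dtm_unigram <- DocumentTermMatrix(corpus)

# Bigram matrix uses the custom tokenizer via the `tokenize` control option
dtm_bigram <- DocumentTermMatrix(corpus,
                                 control = list(tokenize = BigramTokenizer))

# Inspect the terms to confirm the two matrices differ
Terms(dtm_unigram)
Terms(dtm_bigram)
```

If the terms of dtm_bigram contain spaces (two-word terms) while those of dtm_unigram do not, the tokenizer is being picked up as intended.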