Finding 2- and 3-Word Phrases Using the R tm Package
I am trying to find code that actually works for finding the most frequently used two- and three-word phrases with the R text mining package, tm (maybe there is another package for this that I do not know of). I have been trying to use the tokenizer, but have had no luck. If you have worked on a similar problem in the past, could you post code that is tested and actually works? Thank you so much!

You can pass a custom tokenizing function to tm's DocumentTermMatrix function, so if you have the tau package installed it's fairly straightforward.

library(tm); library(tau);
tokenize_ngrams <- function(x, n=3)
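
For anyone who wants something to experiment with, here is a minimal sketch of the same idea. It is not the tokenize_ngrams function from the answer: ngram_tokens is a hypothetical helper written in base R, the sample texts are made up, and the cleanup step assumes plain ASCII English text. The last part only runs if tm is installed, and it uses VCorpus because, as far as I know, recent tm versions ignore custom tokenizers on the SimpleCorpus that Corpus() creates by default.

```r
# Hypothetical helper (not part of tm or tau): split text into word n-grams.
ngram_tokens <- function(x, n = 2) {
  # Lowercase, replace anything that is not a letter, digit, apostrophe
  # or space with a space, then split on whitespace.
  text <- tolower(paste(as.character(x), collapse = " "))
  words <- strsplit(gsub("[^a-z0-9' ]+", " ", text), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  # Slide a window of n words over the text and paste each window together.
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Made-up sample documents, for illustration only.
texts <- c("the quick brown fox", "the quick brown dog", "a quick brown fox")

# Count phrases across all documents and sort by frequency.
bigrams <- unlist(lapply(texts, ngram_tokens, n = 2))
trigrams <- unlist(lapply(texts, ngram_tokens, n = 3))
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
trigram_counts <- sort(table(trigrams), decreasing = TRUE)
print(head(bigram_counts, 10))
print(head(trigram_counts, 10))

# Optional: plug the same helper into tm as a custom tokenizer.
if (requireNamespace("tm", quietly = TRUE)) {
  corpus <- tm::VCorpus(tm::VectorSource(texts))
  dtm <- tm::DocumentTermMatrix(
    corpus,
    control = list(tokenize = function(doc) ngram_tokens(doc, n = 3))
  )
  # Column sums give the total count of each three-word phrase.
  print(sort(colSums(as.matrix(dtm)), decreasing = TRUE))
}
```

Counting with table() works fine for small collections. For a large corpus the document-term matrix route is probably the better fit, since it stays sparse until you call as.matrix().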