R DocumentTermMatrix control list not working, silently ignores unknown parameters

面向向阳花 2020-12-31 18:03

I have the following two DTMs:

dtm <- DocumentTermMatrix(t)

dtmImproved <- DocumentTermMatrix(t, 
               control=list(minWordLength = 4, minDocFr         


        
2 Answers
  • 2020-12-31 18:20

    It is always a good idea to read the source code when it is available. The source of the wordcloud function on GitHub says:
    # Author: ianfellows
    .....
    if(min.freq > max(freq))
    min.freq <- 0

    So your DocumentTermMatrix returned terms whose max(freq) is below the min.freq bound you set, i.e. none of the terms appeared as often as your min.freq bound requires, and min.freq was reset to 0.

    Hope this helps. MJJ
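
A minimal base R sketch of the fallback quoted above. Only the `if` check comes from the quoted wordcloud source; the `freq` vector, the starting `min.freq` value, and the `keep` filter are hypothetical and only illustrate the effect.

```r
# Hypothetical term frequencies, made up for illustration.
freq <- c(data = 3, text = 2, mining = 1)

# A threshold that no term reaches.
min.freq <- 5

# The check quoted from the wordcloud source: if the threshold is above
# the largest frequency, it is dropped to 0 instead of raising an error.
if (min.freq > max(freq))
  min.freq <- 0

# Assumed filtering step: keep terms at or above the threshold.
keep <- freq >= min.freq
print(min.freq)
print(names(freq)[keep])
```

With the threshold reset to 0, every term passes the filter, which is why a too-high min.freq can look as if it had no effect at all.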

  • 2020-12-31 18:31
    dtmImproved <- DocumentTermMatrix(t, control=list(wordLengths=c(4, 15), 
                                       bounds = list(global = c(5,Inf))))
    

    This solves the problem! The lack of proper documentation really gets me down (:
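
To see the difference on a small scale, here is a sketch assuming the tm package is installed. The three toy documents and the variable names are hypothetical. It passes `minWordLength`, which the control list does not recognise and therefore ignores without a warning, and then the `wordLengths` and `bounds` options from the answer above.

```r
library(tm)

# Hypothetical toy corpus, only for illustration.
docs <- c("the cat sat on the mat",
          "the cat chased the mouse",
          "a mouse ate the cheese")
corpus <- VCorpus(VectorSource(docs))

# Unrecognised option name: no error is raised and the defaults stay
# in effect, so short words such as "cat" are still present.
dtm_old <- DocumentTermMatrix(corpus,
                              control = list(minWordLength = 5))

# Recognised option names: keep terms of at least 5 characters that
# occur in at least 2 documents.
dtm_new <- DocumentTermMatrix(corpus,
                              control = list(wordLengths = c(5, Inf),
                                             bounds = list(global = c(2, Inf))))

print(Terms(dtm_old))
print(Terms(dtm_new))
```

Comparing `Terms()` of both matrices is a quick way to check whether a control option was actually applied.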
