Find the most frequently occurring words in a text in R

刺人心 2020-12-14 13:33

Can someone help me find the most frequently used two- and three-word phrases in a text using R?

My text is...

text <- c("Th
5 Answers
  •  既然无缘
    2020-12-14 14:14

    Simplest?

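    library(quanteda)

    # Recent quanteda releases no longer accept `ngrams` in dfm();
    # tokenize first, then build the n-grams with tokens_ngrams().
    toks <- tokens(text, remove_punct = TRUE)

    # bi-grams
    topfeatures(dfm(tokens_ngrams(toks, n = 2)))
    ##      of_the     a_phrase the_sentence       may_be         as_a       in_the    in_common    phrase_is
    ##           5            4            4            3            3            3            2            2
    ##  is_usually     group_of
    ##           2            2

    # for tri-grams
    topfeatures(dfm(tokens_ngrams(toks, n = 3)))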
    ##     a_phrase_is   group_of_words    of_a_sentence  of_the_sentence   for_example_in   example_in_the 
    ##               2                2                2                2                2                2 
    ## in_the_sentence   an_orange_bird orange_bird_with      bird_with_a 
    ##               2                2                2                2
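
If quanteda is not available, the same idea can be roughly sketched in base R: lowercase the text, split it into words, paste each run of n consecutive words together, and tabulate the results. The helper name `top_ngrams` and the `sample_text` vector below are made up for illustration and are not part of the original question or of any package. The tokenization is a simplifying assumption (ASCII letters, digits and apostrophes only), so counts may differ from what quanteda reports.

```r
# Hypothetical helper: count the k most frequent n-word sequences with base R only.
top_ngrams <- function(x, n = 2, k = 10) {
  # Lowercase, then split each document on runs of characters that are
  # not ASCII letters, digits or apostrophes (an assumed, simplistic tokenizer).
  words_by_doc <- strsplit(tolower(x), "[^a-z0-9']+")

  grams <- unlist(lapply(words_by_doc, function(w) {
    w <- w[nzchar(w)]                         # drop empty tokens
    if (length(w) < n) return(character(0))   # document too short for an n-gram
    starts <- seq_len(length(w) - n + 1)      # start index of every window
    vapply(starts,
           function(i) paste(w[i:(i + n - 1)], collapse = "_"),
           character(1))
  }))

  # Tabulate and keep the k most frequent n-grams.
  head(sort(table(grams), decreasing = TRUE), k)
}

# Made-up sample data, since the text in the question is cut off.
sample_text <- c("The cat sat on the mat.",
                 "The cat ate the fish on the mat.")

top_ngrams(sample_text, n = 2, k = 3)   # most frequent two-word sequences
top_ngrams(sample_text, n = 3, k = 1)   # most frequent three-word sequence
```

N-grams are built per document here, so a sequence never spans the boundary between two elements of the input vector.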
    
