Count word occurrences in R

误落风尘 2020-11-29 06:04

Is there a function for counting the number of times a particular keyword is contained in a dataset?

For example, if dataset <- c("corn", "cornmeal", "

4 answers
  • 2020-11-29 06:27

    If you only want the elements that are exactly equal to "corn" (so "cornmeal" is not counted), you can also do something like the following:

    length(dataset[which(dataset=="corn")])
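
    A shorter equivalent is to sum the logical comparison directly. This is only a sketch: the question's vector is cut off, so the `dataset` below is borrowed from the other answers and is an assumption.

    ```r
    # Example vector borrowed from the other answers (an assumption here)
    dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

    # Original form: subset the exact matches, then take the length
    n_which <- length(dataset[which(dataset == "corn")])

    # Equivalent form: each TRUE counts as 1 when summed
    n_sum <- sum(dataset == "corn")

    n_which
    n_sum
    ```

    Both give the same count for this vector. If the vector may contain NA values, `sum(dataset == "corn", na.rm = TRUE)` is probably what you want, since `which()` silently drops NA while a plain `sum()` returns NA.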
    
  • 2020-11-29 06:28

    Let's assume for the moment that you want the number of elements containing "corn":

    length(grep("corn", dataset))
    [1] 3
    

    After you get the basics of R down better you may want to look at the "tm" package.

    EDIT: I realize that this time around you wanted any-"corn" but in the future you might want to get word-"corn". Over on r-help Bill Dunlap pointed out a more compact grep pattern for gathering whole words:

    grep("\\<corn\\>", dataset)
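
    To turn the whole-word pattern into a count, you could wrap it the same way as before. A minimal sketch, assuming the example vector used in the other answers:

    ```r
    # Example vector borrowed from the other answers (an assumption here)
    dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

    # Indices of elements containing "corn" as a whole word
    whole_idx <- grep("\\<corn\\>", dataset)

    # Number of elements containing the whole word
    n_whole <- length(whole_idx)

    # Number of elements containing "corn" anywhere, via grepl()
    n_any <- sum(grepl("corn", dataset))

    whole_idx
    n_whole
    n_any
    ```

    Note that `\\<` and `\\>` are understood by R's default regex engine; with `perl = TRUE` you would use `\\b` instead.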
    
  • 2020-11-29 06:34

    Another quite convenient and intuitive way to do it is to use the str_count function of the stringr package:

    library(stringr)
    dataset <- c("corn", "cornmeal", "corn on the cob", "meal")
    
    # for mere occurrences of the pattern:
    str_count(dataset, "corn")
    # [1] 1 1 1 0
    
    # for occurrences of the word alone:
    str_count(dataset, "\\bcorn\\b")
    # [1] 1 0 1 0
    
    # summing it up
    sum(str_count(dataset, "corn"))
    # [1] 3
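
    If you would rather not load stringr, roughly the same per-element counts can be had in base R with gregexpr() and regmatches(). This is just a sketch; `count_matches` is a hypothetical helper name, not something from a package.

    ```r
    dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

    # Hypothetical helper: number of pattern matches in each element of x
    count_matches <- function(x, pattern) {
      # gregexpr() finds every match, regmatches() extracts them,
      # and lengths() counts the matches per element (0 if none)
      lengths(regmatches(x, gregexpr(pattern, x)))
    }

    # Occurrences of the pattern anywhere in each element
    per_element <- count_matches(dataset, "corn")

    # Occurrences of the whole word only
    whole_word <- count_matches(dataset, "\\<corn\\>")

    # Unlike grep(), repeated matches within one string are all counted
    repeated <- count_matches("corn and more corn", "corn")

    per_element
    whole_word
    repeated
    ```

    The difference from the grep() approach is that a string containing the pattern twice contributes 2 here, whereas grep() only tells you that the element matched.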
    
  • 2020-11-29 06:39

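    I'd just do it with string division, using the %s/% operator from the roperators package, which counts how many times the pattern occurs in each string: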

    library(roperators)
    
    dataset <- c("corn", "cornmeal", "corn on the cob", "meal")
    
    # for each vector element:
    dataset %s/% 'corn'
    
    # for everything:
    sum(dataset %s/% 'corn') 
    