How do I speed up text searches in R?

予麋鹿 2020-12-30 14:58

I have a large text vector I would like to search for a particular character or phrase. Regular expressions are taking forever. How do I search it quickly?

Sample

2 Answers
  •  暖寄归人
    2020-12-30 15:17

    There's no need for regular expressions here, and their power comes with a computational cost.

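    You can turn off regular expression parsing in R's pattern-matching functions (grep, grepl, sub, gsub and so on) by passing the fixed=TRUE argument. In the benchmark below, the fixed-string search is about 3.6 times faster by median: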

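    library(microbenchmark)
    # `garbage` is the large character vector from the question.
    # Compare fixed-string matching against the default regex matching.
    m <- microbenchmark(
        grep(" ", garbage, fixed = TRUE),
        grep(" ", garbage)
    )
    m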
    Unit: milliseconds
                                 expr       min        lq   median        uq      max neval
     grep(" ", garbage, fixed = TRUE)  491.5634  497.1309  499.109  503.3009 1128.643   100
                   grep(" ", garbage) 1786.8500 1801.9837 1810.294 1825.2755 3620.346   100
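
The question's sample data is not shown here, so the following is only a minimal sketch under assumptions: the `garbage` vector built below is a hypothetical stand-in (random lowercase letters and spaces), not the original data. It checks that fixed-string and regex matching return the same hits for a literal pattern such as a space, and shows the one thing to watch for when switching to fixed=TRUE: regex metacharacters are then matched literally.

```r
# Hypothetical stand-in for the question's `garbage` vector (the original is not shown).
set.seed(1)
garbage <- replicate(
  1000,
  paste(sample(c(letters, " "), 50, replace = TRUE), collapse = "")
)

# For a literal pattern such as a single space, both modes find the same elements.
idx_fixed <- grep(" ", garbage, fixed = TRUE)
idx_regex <- grep(" ", garbage)
same_hits <- identical(idx_fixed, idx_regex)
print(same_hits)

# With fixed = TRUE, "." means a literal dot; as a regex it matches any character.
x <- c("a.b", "axb")
literal_dot <- grepl(".", x, fixed = TRUE)
regex_dot <- grepl(".", x)
print(literal_dot)
print(regex_dot)
```

So fixed=TRUE is a safe drop-in only when the pattern is meant as plain text. If the search relies on character classes, anchors or alternation, it still needs regex matching.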
    
