Imperfect string match using data.table

灰色年华 2020-12-21 07:32

Ok, so I posted a question a while back concerning writing an R function to accelerate string matching of large text files. I had my eyes opened to 'data.table' and my que

2 Answers
  •  一向 (original poster)
    2020-12-21 07:34

    I finally got it.

    The `agrep` function has a `value` argument that needs to be changed from FALSE (the default) to TRUE, so that it returns the matching strings themselves rather than their positions:

    # max.distance is given as a list, the documented form for setting several limits
    dt <- dt[lapply(car.vins, agrep, x = vin.vins, max.distance = list(cost = 2, all = 2), value = TRUE)
             , .(NumTimesFound = .N)
             , by = vin.names]
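
    A small sketch of what `value` changes, using made-up VIN-like strings (`vins` and `pattern` are hypothetical and not taken from the original data):

    ```r
    # Hypothetical VIN-like strings, invented for illustration only
    vins <- c("1HGCM82633A004352", "1HGCM82633A004353", "JH4KA7650MC012345")
    pattern <- "1HGCM82633A004352"

    # Default (value = FALSE): integer positions of the approximate matches in `vins`
    idx <- agrep(pattern, vins, max.distance = 2)

    # value = TRUE: the matching strings themselves
    hits <- agrep(pattern, vins, max.distance = 2, value = TRUE)

    print(idx)
    print(hits)
    ```

    The strings returned with `value = TRUE` are the same elements you would get from `vins[idx]`.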
    

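    Note: `max.distance` can be a single number (an integer distance, or a fraction of the pattern length) or a list that sets separate limits through the components `cost` (generalized Levenshtein distance), `all`, `insertions`, `deletions`, and `substitutions`. `agrep` is a fascinating function!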
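
    As a rough illustration of tightening those limits, the sketch below passes `max.distance` as a list. The strings are made up, and the `all` and `substitutions` values are assumptions chosen only to show the effect:

    ```r
    # Hypothetical VIN-like strings, invented for illustration only
    pattern <- "1HGCM82633A004352"
    candidates <- c("1HGCM82633A004352",   # identical to the pattern
                    "1HGCM82633B004352")   # one character substituted (A -> B)

    # Allow at most one edit of any kind: both candidates match
    loose <- agrep(pattern, candidates,
                   max.distance = list(all = 1), value = TRUE)

    # Allow at most one edit but no substitutions: only the identical string matches
    strict <- agrep(pattern, candidates,
                    max.distance = list(all = 1, substitutions = 0), value = TRUE)

    print(loose)
    print(strict)
    ```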

    Thanks again for all the help!
