R - simple Record Linkage - the next step ?

拈花ヽ惹草 提交于 2019-12-03 20:08:25
claire_

Taken from this post, here's an example that should work for you:

tv3 = as.data.frame(c("TOURDEFRANCE", 'TOURDEFRANCE', "TOURDE FRANCE", 
    "TOURDE FRANZ", "GET FRESH"))
colnames(tv3) <- "name"

tv3 %>% compare.dedup(strcmp = TRUE) %>%
        epiWeights() %>%
        epiClassify(0.5) %>%
        getPairs(show = "links", single.rows = TRUE) -> matches

In result, the matches dataframe should help you determining thresholds (set in epiClassify()).

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!