Python Fuzzy Matching (FuzzyWuzzy) - Keep only Best Match

渐次进展 2020-12-08 23:36

I'm trying to fuzzy match two CSV files, each containing one column of names that are similar but not identical.

My code so far is as follows:

impor         


        
3 Answers
  •  渐次进展
    2020-12-09 00:24

    Several pieces of your code can be greatly simplified by using process.extractOne() from FuzzyWuzzy. It returns only the top match, and it accepts a score threshold (score_cutoff) directly in the function call, so you don't need a separate filtering step. For example:

    process.extractOne(row, data, score_cutoff=60)
    

    If any candidate scores at or above the cutoff, the function returns a tuple of the best match and its score. Otherwise it returns None.
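To show how the None return value can be handled when keeping only the best match per name, here is a minimal sketch. It is not the asker's code: the helper names (extract_one, best_matches) and the sample names are hypothetical, and in-memory lists stand in for the two CSV columns. If fuzzywuzzy is not installed, the sketch falls back to a rough stand-in built on the standard library's difflib, whose scores will differ from FuzzyWuzzy's default scorer.

```python
import difflib

try:
    # Third-party package; assumed to be installed (pip install fuzzywuzzy).
    from fuzzywuzzy import process

    def extract_one(query, choices, score_cutoff=0):
        # Thin wrapper around the call discussed above.
        return process.extractOne(query, choices, score_cutoff=score_cutoff)

except ImportError:

    def extract_one(query, choices, score_cutoff=0):
        # Rough stdlib stand-in (an assumption, not FuzzyWuzzy itself):
        # score each choice with difflib and keep the best one that
        # reaches the cutoff. Scores differ from FuzzyWuzzy's default scorer.
        scored = [
            (c, round(100 * difflib.SequenceMatcher(None, query.lower(), c.lower()).ratio()))
            for c in choices
        ]
        scored = [s for s in scored if s[1] >= score_cutoff]
        return max(scored, key=lambda s: s[1]) if scored else None


def best_matches(names, choices, score_cutoff=60):
    """Map each name to its single best match, or None if nothing reaches the cutoff."""
    results = {}
    for name in names:
        match = extract_one(name, choices, score_cutoff=score_cutoff)
        # match is a (choice, score) tuple, or None when no choice qualifies.
        results[name] = match[0] if match else None
    return results


# Hypothetical sample data standing in for the two CSV columns.
names = ["Jon Smith", "Qzxv"]
choices = ["John Smith", "Mary Jones"]
matched = best_matches(names, choices)
```

Checking for None before indexing the result avoids a TypeError on names that have no sufficiently close counterpart.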
