I'm trying to fuzzy match two CSV files, each containing one column of names, that are similar but not the same.
My code so far is as follows:
impor
fuzzywuzzy's process.extract() returns its matches sorted by score in descending order, so the best match comes first.
To get just the best match, set the limit argument to 1 so that only the top match is returned; if its score is at least 60, you can write it to the CSV, as you are doing now.
Example -
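from fuzzywuzzy import process

# parse_csv, data and writer are assumed to be the ones defined in the question's code.
# For each row in the lookup, keep only the single best match.
for row in parse_csv("names_2.csv"):
    # limit=1 makes extract() return a list holding just the top-scoring match
    for found, score, matchrow in process.extract(row, data, limit=1):
        if score >= 60:
            print('%d%% partial match: "%s" with "%s" ' % (score, row, found))
            Digi_Results = [row, score, found]
            writer.writerow(Digi_Results)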
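
If you want to try the "best match only, above a cutoff" pattern without the CSV files, here is a rough self-contained sketch. Note that it does not use fuzzywuzzy: it swaps in the standard library's difflib.SequenceMatcher as a stand-in scorer, so the scores will not be the same as what process.extract() gives you. The names score, best_match, names_1 and names_2 are made up for illustration, and the sample names are not from your files.

```python
from difflib import SequenceMatcher


def score(a, b):
    """Similarity of two strings as an integer from 0 to 100 (case-insensitive)."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def best_match(name, choices, cutoff=60):
    """Return (choice, score) for the highest-scoring choice, or None if it is below cutoff."""
    if not choices:
        return None
    # Same idea as limit=1: keep only the top-scoring candidate.
    best = max(choices, key=lambda c: score(name, c))
    s = score(name, best)
    return (best, s) if s >= cutoff else None


# Hypothetical stand-ins for the two single-column CSV files.
names_1 = ["John Smith", "Mary Jones", "Robert Brown"]
names_2 = ["Jon Smith", "Marie Jones", "Zzyzx"]

for name in names_2:
    match = best_match(name, names_1)
    if match is not None:
        found, s = match
        print('%d%% match: "%s" with "%s"' % (s, name, found))
```

Names with no candidate at or above the cutoff (like the last one here) are simply skipped, which is the same thing the score >= 60 check does in the loop above.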