Suppose I want to match address records (or person names or whatever) against each other to merge records that are most likely referring to the same address. Basically, I gu
Disclaimer: I don't know any algorithm that does that, but would really be interested in knowing one if it exists. This answer is a naive attempt of trying to solve the problem, with no previous knowledge whatsoever. Comments welcome, please don't laugh too laud.
If you try doing it by hand, I would suggest applying some kind of "normalization" to your strings : lowercase them, remove punctuation, maybe replace common abbreviations with the full words (Dr. => drive, St => street, etc...).
Then, you can try different alignments between the two strings you compare, and compute the correlation by averaging the absolute differences between corresponding letters (eg a = 1, b = 2, etc.. and corr(a, b) = |a - b| = 1
) :
west lawnmover drive
w lawnmower street
Thus, even if some letters are different, the correlation would be high. Then, simply keep the maximal correlation you found, and decide that their are the same if the correlation is above a given threshold.