I am currently working on a project where a data matching algorithm needs to be implemented. An external system passes in all data it knows about a customer, and the syste
For inspiration, look at the Levenshtein distance algorithm. This will give you a reasonable mechanism to weight your comparisons.
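To make that concrete, here is a minimal sketch of the Levenshtein distance in plain Python, plus one possible way to turn it into a 0-to-1 score for weighting. The function names and the normalization by the longer string's length are my own assumptions for illustration, not something the algorithm prescribes.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to 0.0 (nothing alike)
    through 1.0 (identical) by dividing by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

A score like this can be computed per field (name, street, city) and combined with whatever weights suit your data. You will probably want to normalize case and whitespace before comparing.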
I would also add that, in my experience, you can never match two arbitrary pieces of data to the same entity with absolute certainty. You need to present plausible matches to a user, who can then confirm whether John Smith on 1920 E. Pine is the same person as Jon Smith on 192 East Pine Road.
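As a rough sketch of the "present plausible matches" idea: score each stored record against the incoming one, keep only those above a cutoff, and show them to the user sorted best first. For brevity this uses `difflib.SequenceMatcher` from the Python standard library as the string comparison instead of Levenshtein. The field names, weights, and the 0.6 threshold are all assumptions you would need to tune against real data.

```python
from difflib import SequenceMatcher


def field_score(a: str, b: str) -> float:
    """Similarity of two field values, from 0.0 to 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def plausible_matches(incoming, candidates, weights, threshold=0.6):
    """Return (score, candidate) pairs at or above threshold, best first.

    incoming and each candidate are dicts of field name to string;
    weights maps field name to relative importance (hypothetical values).
    """
    total_weight = sum(weights.values())
    scored = []
    for candidate in candidates:
        weighted_sum = sum(
            weight * field_score(incoming.get(field, ""), candidate.get(field, ""))
            for field, weight in weights.items()
        )
        score = weighted_sum / total_weight
        if score >= threshold:
            scored.append((score, candidate))
    # Highest score first, so the most likely match is shown at the top.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Example data: the record passed in and the records already stored.
incoming = {"name": "John Smith", "address": "1920 E. Pine"}
stored = [
    {"name": "Jon Smith", "address": "192 East Pine Road"},
    {"name": "Mary Jones", "address": "45 Oak Avenue"},
]
for score, record in plausible_matches(incoming, stored, {"name": 1.0, "address": 1.0}):
    print(f"{score:.2f}  {record['name']}, {record['address']}")
```

The result is a shortlist for a human to confirm or reject, not a decision. Nothing here merges records automatically.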