Strategies for finding duplicate mailing addresses

悲哀的现实 2021-02-10 02:08

I'm trying to come up with a method for finding duplicate addresses based on a similarity score. Consider these duplicate addresses:

addr_1 = '# 3 FAIRMONT LIN


        
6 answers
  •  不要未来只要你来
    2021-02-10 02:43

    To do this right, you need to standardize your addresses according to USPS standards (your example addresses appear to be US-based). Many direct marketing service providers offer CASS (Coding Accuracy Support System) certified processing of postal addresses. The CASS process standardizes all of your addresses and appends the ZIP+4 code to them. Any undeliverable addresses are flagged, which further reduces your mailing costs, if that is your intent. Once all of your addresses are standardized, eliminating duplicates becomes trivial, since duplicate records then share the same standardized form.
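
To illustrate why standardization makes deduplication easy, here is a minimal Python sketch. It is not CASS processing: the `normalize` and `group_duplicates` helpers, the small abbreviation table, and the sample addresses are all hypothetical stand-ins. A real CASS-certified service applies the full USPS rule set and validates each address against USPS data.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative subset of USPS-style abbreviations.
# A CASS-certified service uses the complete standard plus address validation.
ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "LANE": "LN",
    "ROAD": "RD",
    "DRIVE": "DR",
    "APARTMENT": "APT",
    "SUITE": "STE",
}


def normalize(address: str) -> str:
    """Uppercase, replace punctuation with spaces, collapse whitespace,
    and abbreviate the words listed in ABBREVIATIONS."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", address.upper())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


def group_duplicates(addresses):
    """Return lists of original addresses that share one normalized form."""
    groups = defaultdict(list)
    for addr in addresses:
        groups[normalize(addr)].append(addr)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data, not taken from the question.
sample = [
    "12 Main Street, Apt. 4, Springfield, IL 62701",
    "12 MAIN ST APT 4 SPRINGFIELD IL 62701",
    "98 Oak Avenue, Springfield, IL 62701",
]
duplicates = group_duplicates(sample)
```

After normalization the first two sample addresses collapse to the same string, so grouping by that string finds them with a plain dictionary lookup and no similarity score. Typos and misspellings would still slip through this toy version, which is one reason a validating service is the more robust route.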
