Open Source Address Scrubber?

前端 未结 5 1152
甜味超标
甜味超标 2020-12-16 01:40

I have set of names and addresses that have been entered into and excel spreadsheet, but the problem is that the many people that entered the addresses entered them in many

相关标签:
5条回答
  • 2020-12-16 01:51

    Most of the software that I've worked with to do this is very expensive (or to put it another way, marketing departments are naive and have huge budgets).

    This sort of work is a precursor to Geo-coding. This linked Wiki article includes a list of Geocoding software, some of which is free. If you're lucky, some of the free ones may include address standardizing routines.

    If you find a good one, let me know.

    0 讨论(0)
  • 2020-12-16 01:55

    A .NET wrapper for the USPS APIs

    http://www.codeproject.com/KB/cs/USPS_Web_Tools_Wrapper.aspx

    0 讨论(0)
  • 2020-12-16 01:56

    We use Accuzip. It's a lot cheaper than most solutions (~$700/year) and comes with bi-monthly updates. It uses the USPS address standardization API, for which I've written a .NET wrapper. This allows me to run it in real-time (Accuzip, by default, comes only with a batch mode).

    0 讨论(0)
  • 2020-12-16 02:01

    Since I work in the mailing business ...

    A mailable address is not geo-coding. One allows the USPS to deliver mail to and the other tells you where on earth that point is. The USPS does not geo-code their mailable addresses. It's useful for marking areas/regions of people for targeting.

    You're not buying a license to the software, you're buying the data. The post office has lots of rules especially if you're doing this commercially and trying to get a better rate than first class. See USPS Domestic Mail Manual for the complete list of rules. The USPS moves zips and households between zips all the time. The company (I work for) pays the USPS for its updated mailing list so we can keep our DBs updated. Weekly.

    Back to your question. Do you want to change the data into a common format (street -> st) or are you looking for duplicates and want to only store real mailable addresses ?

    for common format; you can break the address into pieces, clean up the white space and apply a dictionary of terms/translations. Then apply some sql to find the duplicates. Keep in mind households (1 main st) are different from persons (john doe, 1 main st).

    for the mailable addresses, well some of you (the readers) won't like this answer, but you want information and that isn't free. Someone spends time or money to acquire and maintain these lists. So, find a business model to acquire funds for the list or go to someone who will do it for you. Data and mail management

    Realistically, Semaphore is pretty cheap, just keep in mind that the address db will have to be updated quarterly and $19/quarter is pretty cheap.

    Another Address Scrubbing product. SAP PostalSoft. I don't know what the data will cost though.

    0 讨论(0)
  • 2020-12-16 02:02

    I actually work in the address verification industry... Jim's answer is a smart accept. Unfortunately for those of us with low budgets, official USPS data is pricey and the systems are complicated. (I know by experience, since the company I work for, SmartyStreets, provides address verification at lower rates than most.)

    The best I can do here to help is recommend a low-cost/free alternative (depending on your volume) such as LiveAddress, where for a list of addresses there's no minimum purchase, and the API is super-cheap and super-easy, comparatively.

    0 讨论(0)
提交回复
热议问题