How to detect duplicate data?

后端 未结 11 1694
半阙折子戏
半阙折子戏 2021-02-01 08:24

I have got a simple contacts database but I\'m having problems with users entering in duplicate data. I have implemented a simple data comparison but unfortunately the duplicate

11条回答
  •  小蘑菇
    小蘑菇 (楼主)
    2021-02-01 08:28

    This may or may not be related but, minor misspellings might be detected by a Soundex search, e.g., this will allow you to consider Britney Spears, Britanny Spares, and Britny Spears as duplicates.

    Nickname contractions, however, are difficult to consider as duplicates and I doubt if it is wise. There are bound to be multiple people named Bill Smith and William Smith, and you would have to iterate that with Charles->Chuck, Robert->Bob, etc.

    Also, if you are considering, say, Muslim users, the problems become more difficult (there are too many Muslims, for example, that are named Mohammed/Mohammad).

提交回复
热议问题