Probability of getting a duplicate value when calling GetHashCode() on strings

遥遥无期 2020-11-30 10:49

I want to know the probability of getting duplicate values when calling the GetHashCode() method on string instances. For instance, according to th

6 answers
  •  一向 (original poster)
    2020-11-30 11:50

    Large.

    (Sorry Jon!)

    The probability of getting a hash collision among short strings is extremely large. Given a set of only ten thousand distinct short strings drawn from common words, the probability of there being at least one collision in the set is approximately 1%. If you have eighty thousand strings, the probability of there being at least one collision is over 50%.
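The figures above can be checked with a short birthday-problem calculation. This is only a sketch: it assumes the hash values are spread uniformly over all 2^32 possible 32-bit results, which a real string hash only approximates, and the function name `collision_probability` is made up for this illustration. Python is used purely for the arithmetic; nothing here calls .NET.

```python
import math

HASH_BITS = 32  # assumption: GetHashCode() yields a 32-bit value, so 2**32 possible hashes


def collision_probability(n: int, bits: int = HASH_BITS) -> float:
    """Chance that at least two of n distinct inputs share a hash value.

    Assumes every hash value is equally likely (idealized uniform hash).
    """
    buckets = 2 ** bits
    if n > buckets:
        return 1.0  # pigeonhole principle: a collision is certain
    # P(no collision) = product over i in [0, n) of (1 - i / buckets).
    # Summing logarithms keeps the product numerically stable.
    log_no_collision = sum(math.log1p(-i / buckets) for i in range(n))
    return -math.expm1(log_no_collision)  # equals 1 - exp(log_no_collision)


if __name__ == "__main__":
    for n in (10_000, 80_000):
        print(f"{n} strings: {collision_probability(n):.2%}")
```

Under that uniformity assumption the result is a little over 1% for ten thousand strings and a little over 52% for eighty thousand, which agrees with the numbers quoted in the answer.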

    For a graph showing the relationship between set size and probability of collision, see my article on the subject:

    https://docs.microsoft.com/en-us/archive/blogs/ericlippert/socks-birthdays-and-hash-collisions
