MAD method compression function

故事扮演 submitted on 2019-12-11 12:48:52

Question


I ran across the question below in an old exam. My answer just feels a bit short and inadequate. Any extra ideas I can look into, or reasons I have overlooked, would be great. Thanks.

Consider the MAD method compression function, mapping an object with hash code i to element [(3i + 7) mod 9027] mod 6000 of the 6000-element bucket array. Explain why this is a poor choice of compression function, and how it could be improved.

I basically just say that the function could be improved by changing the value of p (here 9027) to a prime number, and that choosing a different constant for a (here 3) could also help.
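The change proposed here can be tried out directly. What follows is only a rough sketch: the helper names `is_prime` and `make_mad` are made up for illustration, and the parameters (p = 9029, a = 1234, b = 7) are arbitrary picks, not values from the exam. It builds ((a*i + b) mod p) mod N with a prime p larger than N and counts bucket usage over one full period of hash codes.

```python
from collections import Counter

N = 6000  # bucket count from the question


def is_prime(n: int) -> bool:
    """Trial division; fine for the small moduli used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def make_mad(a: int, b: int, p: int, n: int):
    """Build i -> ((a*i + b) mod p) mod n, insisting on a prime p > n."""
    if not (is_prime(p) and p > n and 0 < a < p and 0 <= b < p):
        raise ValueError("need prime p > n, 0 < a < p, 0 <= b < p")
    return lambda i: ((a * i + b) % p) % n


# Hypothetical parameters: 9029 is a prime just above 9027; a and b are arbitrary.
h = make_mad(a=1234, b=7, p=9029, n=N)

# One full period of inputs: with p prime and 0 < a < p, (a*i + b) mod p
# visits every value 0..p-1 exactly once.
counts = Counter(h(i) for i in range(9029))

print("buckets used:", len(counts))            # 6000
print("largest bucket:", max(counts.values()))  # 2
print("smallest bucket:", min(counts.values())) # 1
```

Every bucket is now reachable, but with p this close to N the final mod N still puts two values into each of the low buckets and one into the rest. A much larger prime (for example 2^31 - 1 = 2147483647) would make that imbalance negligible.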


Answer 1:


Rup's comment is essentially the correct answer. 3 and 9027 are both divisible by 3, so (3i + 7) mod 9027 maps onto only 1/3 of the range 0-9026 (the values congruent to 1 mod 3). Because 6000 is also divisible by 3, only every third bucket can ever be hit. Then the mapping mod 6000 sends about 2/3 of the values to the lower half of the bucket array. So bucket 1 will contain roughly 1/1500 of the values [if I've done the math right] rather than the 1/6000 you would want. Bucket 0 will be empty.
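The figures in this answer can be checked with a few lines of Python. This is a minimal sketch, assuming the hash codes are simply the integers 0 to 9026 (one full period of the function); the name `mad_original` is made up for illustration.

```python
from collections import Counter


def mad_original(i: int) -> int:
    """Compression function from the exam question."""
    return ((3 * i + 7) % 9027) % 6000


# 3 and 9027 share the factor 3, so i and i + 3009 always collide;
# range(9027) therefore walks through the 3009-long pattern three times.
counts = Counter(mad_original(i) for i in range(9027))

used = len(counts)
print("buckets used:", used, "of 6000")                     # 2000 of 6000
print("bucket 0:", counts[0])                               # 0 (never hit)
print("bucket 1:", counts[1])                               # 6 of 9027 inputs
print("bucket 5998:", counts[5998])                         # 3 of 9027 inputs
print("all used buckets are 1 mod 3:", all(b % 3 == 1 for b in counts))
```

Bucket 1 receives 6 of the 9027 inputs, i.e. 2/3009, which is about 1/1500 as stated, and two thirds of the bucket array is never used at all.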




Answer 2:


If i is uniformly distributed over a large enough range, then (3i + 7) mod 9027 will be spread evenly over the values it can reach in 0-9026, but then taking mod 6000 means about two thirds of the hashes will be in the first half of the range (the values 0 to 3026 and 6000 to 9026 inclusive all land in buckets 0 to 3026), and one third in the second half (3027 to 5999 inclusive).
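The folding effect of the final mod 6000 can be seen in isolation. The sketch below assumes, purely for illustration, that the intermediate value r takes every value in 0 to 9026 exactly once, and counts how many of those values fold onto each bucket.

```python
from collections import Counter

P, N = 9027, 6000  # modulus and bucket count from the question

# Assumption: the intermediate value r is uniform over 0..P-1.
fold = Counter(r % N for r in range(P))

doubled = [b for b in range(N) if fold[b] == 2]  # hit by r and by r + 6000
single = [b for b in range(N) if fold[b] == 1]   # hit by r only

print("doubled buckets:", doubled[0], "to", doubled[-1])  # 0 to 3026
print("single buckets:", single[0], "to", single[-1])     # 3027 to 5999
print("share in doubled buckets:", 2 * len(doubled) / P)
```

The doubled buckets take 6054 of the 9027 values, a little over two thirds, which matches the split described in this answer.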



Source: https://stackoverflow.com/questions/3017108/mad-method-compression-function
