Question
I'm using Python 2.6. I have very little knowledge of hash functions.
I want to use a CRC hash function to hash an IP address like '128.0.0.5' into the range [0, H). Currently I'm thinking of doing
zlib.crc32('128.0.0.5')%H.
Is this okay? There are a few questions you could try to answer:
1. Does it make any difference whether I hash '128.0.0.5', its binary form ('0001110101010..', whatever that is), or the string without the '.'s?
2. zlib.crc32 returns a signed integer. Does taking a negative number modulo (%) a positive H always give a non-negative result?
3. Does taking the result modulo H affect how good the hash function is? (I mean, is that the best I can do for the available space with zlib.crc32?)
Thanks!
Answer 1:
Does it make any difference whether I hash '128.0.0.5', its binary form, or the string without the '.'s?
Not really.
zlib.crc32 returns a signed integer. Does taking a negative number modulo (%) a positive H always give a non-negative result?
Yes.
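A minimal sketch of why this holds: in Python, the result of % takes the sign of the divisor, so a negative checksum modulo a positive H still lands in [0, H). The helper name ip_bucket and the value H = 1000 are assumptions made for illustration, and the mask is only there so the checksum is the same unsigned value on Python 2 and Python 3.

```python
import zlib

def ip_bucket(ip, H):
    """Map a dotted-quad string to a bucket in [0, H) via CRC-32 (illustrative helper)."""
    # Mask to 32 bits: Python 2 may return a negative value, Python 3 never does.
    crc = zlib.crc32(ip.encode('ascii')) & 0xffffffff
    return crc % H

# The result of % has the sign of the divisor, so it is never negative here.
print((-7) % 5)                      # 3
print(ip_bucket('128.0.0.5', 1000))  # some value in [0, 1000)
```

The mask is not needed for the range guarantee itself; it only makes the bucket number reproducible across Python versions.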
Does taking the result modulo H affect how good the hash function is? (I mean, is that the best I can do for the available space with zlib.crc32?)
You'd better use all the bits of the checksum to make up for CRC's lack of an "avalanche effect". Single-digit variations such as 192.168.1.1, 192.168.1.2, etc. might produce differences only in the first bits of the checksum, and since % cares only about the last bits (exactly so when H is a power of two), hashes will collide.
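One possible way to let every bit of the checksum influence the bucket is to fold the high half into the low half before reducing modulo H. This is only a sketch under that assumption: folded_bucket is a hypothetical name, and whether folding actually improves the distribution for a given set of addresses is something to measure rather than take for granted.

```python
import zlib

def folded_bucket(ip, H):
    """Bucket in [0, H) that mixes the high CRC-32 bits into the low ones (illustrative)."""
    crc = zlib.crc32(ip.encode('ascii')) & 0xffffffff
    # XOR the upper 16 bits into the lower 16 bits so that a difference
    # confined to the high bits still changes the low bits used by % H.
    mixed = crc ^ (crc >> 16)
    return mixed % H

for ip in ('192.168.1.1', '192.168.1.2', '192.168.1.3'):
    print(ip, folded_bucket(ip, 256))
```

The fold matters most when H is a power of two, because then % H keeps only the lowest bits of its input.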
Answer 2:
Why do you want to hash an IP address into a number? They already have a native integer representation. For example, using netaddr:
>>> import netaddr
>>> ip = netaddr.IPAddress('192.168.1.1')
>>> ip.value
3232235777
>>> netaddr.IPAddress(3232235777)
IPAddress('192.168.1.1')
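If installing netaddr is not an option, the same integer can be obtained with the standard library alone. The sketch below assumes IPv4 dotted-quad input; ip_to_int and int_to_ip are hypothetical helper names, not part of any library.

```python
import socket
import struct

def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its 32-bit unsigned integer value."""
    # inet_aton packs the address into 4 bytes in network (big-endian) order;
    # '!I' unpacks those bytes as one unsigned 32-bit integer.
    return struct.unpack('!I', socket.inet_aton(ip))[0]

def int_to_ip(value):
    """Convert a 32-bit unsigned integer back to a dotted-quad string."""
    return socket.inet_ntoa(struct.pack('!I', value))

print(ip_to_int('192.168.1.1'))   # 3232235777
print(int_to_ip(3232235777))      # 192.168.1.1
```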
Answer 3:
ad 1) It will yield different results, but it does not affect the quality of the hash.
ad 2) It will always yield a positive number or zero.
ad 3) As you limit the number of possible buckets, it does affect the quality of the hash.
In general: how large is your H? Remember that an IPv4 address is nothing more than a 32-bit value; 192.168.0.1 is just a more human-readable, byte-wise representation of it. So if your H is larger than 4294967295, there is no need for hashing: the address's integer value already lies in [0, H).
Source: https://stackoverflow.com/questions/6756063/hashing-an-ip-address-to-a-number-in-0-h