可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I'm changing some tables to store IP addresses as numbers rather than strings. This is simple with IPv4 where the 32 bit address can fit into an integer column. However, an IPv6 address is 128 bits.
The MySQL documentation only shows numeric types up to 64 bits ("bigint").
Should I stick with char/varchar for IPv6? (Ideally I'd like to use the same column for IPv4 and IPv6, so I'd prefer not to do this).
Is there anything better than using two bigint columns? I would prefer not to have to break the value into upper and lower /64 whenever using the address.
I'm using MariaDB 5.1 - if there's a better solution in a later version of MySQL then that would be nice to know, although not helpfully immediately.
[EDIT] Note that I'm after a recommendation for the best way to do this - it's obvious that there are various ways of doing this (including the existing string representation), but which is (in terms of performance) best? (i.e. if someone has done the analysis already, that would save me doing it, or if I'm missing something obvious, that would be great to know too).
回答1:
I found myself asking this question and from all the posts I read never found any performance comparisons. So here's my attempt.
I've created the following tables, populated with 2,000,000 random ip address from 100 random networks.
CREATE TABLE ipv6_address_binary ( id SERIAL NOT NULL AUTO_INCREMENT PRIMARY KEY, addr BINARY(16) NOT NULL UNIQUE ); CREATE TABLE ipv6_address_twobigints ( id SERIAL NOT NULL AUTO_INCREMENT PRIMARY KEY, haddr BIGINT UNSIGNED NOT NULL, laddr BIGINT UNSIGNED NOT NULL, UNIQUE uidx (haddr, laddr) ); CREATE TABLE ipv6_address_decimal ( id SERIAL NOT NULL AUTO_INCREMENT PRIMARY KEY, addr DECIMAL(39,0) NOT NULL UNIQUE );
Then I SELECT all ip addresses for each network and record the response time. Average response time on the twobigints table is about 1 second while on the binary table it is about one-hundredth of a second.
Here are the queries.
Note:
X_[HIGH/LOW] is the most/least significant 64-bits of X
when NETMASK_LOW is 0 the AND condition is omitted as it always yields true. doesn't affect performance very much.
SELECT COUNT(*) FROM ipv6_address_twobigints WHERE haddr & NETMASK_HIGH = NETWORK_HIGH AND laddr & NETMASK_LOW = NETWORK_LOW SELECT COUNT(*) FROM ipv6_address_binary WHERE addr >= NETWORK AND addr = NETWORK AND addr
Average response times:
Graph:

BINARY_InnoDB 0.0119529819489 BINARY_MyISAM 0.0139244818687 DECIMAL_InnoDB 0.017379629612 DECIMAL_MyISAM 0.0179929423332 BIGINT_InnoDB 0.782350552082 BIGINT_MyISAM 1.07809265852
回答2:
I have always used either a string or two 64-bit integers. The former in the case where I just want to record it, the latter in the case where I need to do calculations on whether a certain address is contained in a certain network, or even whether two networks overlap.
When storing it as integer, the only option is indeed to split it into two 64-bit numbers. As this makes comparisons more cumbersome, I wouldn't do this unless you need numerical calculations, to see whether an IP falls within a certain network.
I would not be too concerned about performance for storing IPv6 addresses in a string - depending on how many lookups you do for the data. Usually, there are very few, or there is simply very little data. Yes, the storage and lookups are less efficient than with numbers, but it's not much more painful than storing e-mail addresses, person names or usernames.
And why would you not be able to mix IPv4 and IPv6 in string fields? They are easy to distinguish when retrieving one. Their range of possible values do not overlap.
In short: use numbers for checking overlaps, use strings elsewhere. The inefficiency of strings is irrelevant compared to the ease of use.
回答3:
To quote: "Have you considered binary (64)"
Storing very large integers in MySQL