I am trying to get a sequential number table from 1 to 20 million. (or 0 to 20 million)
I am rather awestruck at how difficult it's been to get a MySQL-compatible solution to this common problem.
Similar to this: Creating a "Numbers Table" in mysql
But the the answer only goes to 1 million. I am not really understanding the bit shift calculations.
I've seen many SQL answers but most are for databases that aren't MySQL, so I can't adopt the code due to lack of knowledge of both MySQL and the other.
Some references:
SQL, Auxiliary table of numbers
What is the best way to create and populate a numbers table?
Please make sure the code you post is compatible in MySQL and is semicolon delimited so I can run it in PhpMyAdmin. I'd appreciate the table to be named numbers with the column called i
I will benchmark each solution, so the it's archived and hopefully will show up for the next time someone tries to search for this problem.
Benchmarks thus far:
Times are in seconds.
+---------------+------------------+---------+-----------+------------+
| Author | Method | 10,000 | 1,000,000 | 20,000,000 |
+---------------+------------------+---------+-----------+------------+
| Devon Bernard | PHP Many Queries | 0.38847 | 39.32716 | ~ 786.54 |
| Bhare | PHP Few Queries | 0.00831 | 0.94738 | 19.58823 |
| psadac,Bhare | LOAD DATA | 0.00549 | 0.43855 | 10.55236 |
| kjtl | Bitwise | 1.36076 | 1.48300 | 4.79226 |
+---------------+------------------+---------+-----------+------------+
-- To use the bitwise solution you need a view of 2 to the power 25.
-- the following solution is derived from http://stackoverflow.com/questions/9751318/creating-a-numbers-table-in-mysql
-- the following solution ran in 43.8 seconds with the primary key, without it 4.56 seconds.
-- create a view that has 2 to the power 25 minus 1
-- 2 ^ 1
CREATE or replace VIEW `two_to_the_power_01_minus_1` AS select 0 AS `n` union all select 1 AS `1`;
-- 2 ^ 2
CREATE or replace VIEW `two_to_the_power_02_minus_1`
AS select
((`hi`.`n` << 1) | `lo`.`n`) AS `n`
from (`two_to_the_power_01_minus_1` `lo` join `two_to_the_power_01_minus_1` `hi`) ;
-- 2 ^ 4
CREATE or replace VIEW `two_to_the_power_04_minus_1`
AS select
((`hi`.`n` << 2 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_02_minus_1` `lo` join `two_to_the_power_02_minus_1` `hi`) ;
-- 2 ^ 8
CREATE or replace VIEW `two_to_the_power_08_minus_1`
AS select
((`hi`.`n` << 4 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_04_minus_1` `lo` join `two_to_the_power_04_minus_1` `hi`) ;
-- 2 ^ 12
CREATE or replace VIEW `two_to_the_power_12_minus_1`
AS select
((`hi`.`n` << 8 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_08_minus_1` `lo` join `two_to_the_power_04_minus_1` `hi`) ;
-- 2 ^ 13
CREATE or replace VIEW `two_to_the_power_13_minus_1`
AS select
((`hi`.`n` << 1) | `lo`.`n`) AS `n`
from (`two_to_the_power_01_minus_1` `lo` join `two_to_the_power_12_minus_1` `hi`);
-- create a table to store the interim results for speed of retrieval
drop table if exists numbers_2_to_the_power_13_minus_1;
create table `numbers_2_to_the_power_13_minus_1` (
`i` int(11) unsigned
) ENGINE=myisam DEFAULT CHARSET=latin1 ;
-- faster 2 ^ 13
insert into numbers_2_to_the_power_13_minus_1( i )
select n from `two_to_the_power_13_minus_1` ;
-- faster 2 ^ 12
CREATE or replace view `numbers_2_to_the_power_12_minus_1`
AS select
`numbers_2_to_the_power_13_minus_1`.`i` AS `i`
from `numbers_2_to_the_power_13_minus_1`
where (`numbers_2_to_the_power_13_minus_1`.`i` < (1 << 12));
-- faster 2 ^ 25
CREATE or replace VIEW `numbers_2_to_the_power_25_minus_1`
AS select
((`hi`.`i` << 12) | `lo`.`i`) AS `i`
from (`numbers_2_to_the_power_12_minus_1` `lo` join `numbers_2_to_the_power_13_minus_1` `hi`);
-- create table for results
drop table if exists numbers ;
create table `numbers` (
`i` int(11) signed
, primary key(`i`)
) ENGINE=myisam DEFAULT CHARSET=latin1;
-- insert the numbers
insert into numbers(i)
select i from numbers_2_to_the_power_25_minus_1
where i <= 20000000 ;
drop view if exists numbers_2_to_the_power_25_minus_1 ;
drop view if exists numbers_2_to_the_power_12_minus_1 ;
drop table if exists numbers_2_to_the_power_13_minus_1 ;
drop view if exists two_to_the_power_13_minus_1 ;
drop view if exists two_to_the_power_12_minus_1 ;
drop view if exists two_to_the_power_08_minus_1 ;
drop view if exists two_to_the_power_04_minus_1 ;
drop view if exists two_to_the_power_02_minus_1 ;
drop view if exists two_to_the_power_01_minus_1 ;
If speed is a concern, you should use LOAD DATA INFILE which is faster than INSERT according to mysql doc :
http://dev.mysql.com/doc/refman/5.5/en/insert-speed.html
When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times
faster than using INSERT statements. See Section 13.2.6, “LOAD DATA INFILE Syntax”.
Basically you generate a 20 million lines using your favorite language (php ?), then you load it with LOAD DATA INFILE.
I typical way to create such a table is to start with:
select 0 as num union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
Now, in most databases, you can use a with statement and do the following:
with digits as (above query)
select d1.num+10*d2.num+100*d3.num+1000*d4.num+10000*d5.num+100000*d6.num+1000000*d7.num+10000000*87.num as num
from digits d1 cross join
digits d2 cross join
digits d3 cross join
digits d4 cross join
digits d5 cross join
digits d6 cross join
digits d7 cross join
(select 0 as num union all select 1) d8
Unfortunately, in MySQL you either need to create a temporary table or repeat the union all statement:
select d1.num+10*d2.num+100*d3.num+1000*d4.num+10000*d5.num+100000*d6.num+1000000*d7.num+10000000*d7.num as num
from (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d1 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d2 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d3 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d4 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d5 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d6 cross join
(select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
) d7 cross join
(select 0 as num union all select 1) d8
In MySQL, if you want to put this into a table, you can just use create table numbers as before the select. However, different databases have different syntax for dumping the results of a select into a table.
I am not sure if you are trying to make one call make 20 million rows or call a row making call 20 million times. An example of the second case would be:
<?php
$i =0;
while($i <= 20000000){
$sql = mysql_query("INSERT INTO table_name VALUES ('$i')");
$i +=1;
}
?>
If you are looking for a SQL solution couldn't you try an adaption of
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
SELECT TOP 20000000 IDENTITY(int,1,1) AS Number
INTO NumbersTest
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest
Taken from this post and is reported to make 10,000 rows in an average of 56.3 milliseconds.
Apologies if this answer is referenced already. This took 18.79 secs on my machine (a DELL laptop if it matters)...
It's adapted from an old solution at http://datacharmer.blogspot.co.uk/2006/06/filling-test-tables-quickly.html but, crucially, this won't work with the default 'vanilla' InnoDB engine, and it will be much slower if you try to establish the PK at the outset.
On the plus side you get an extra 13.5 M rows for free!
drop table if exists numbers;
create table numbers ( id int not null) engine = myisam;
delimiter $$
drop procedure if exists fill_numbers $$
create procedure fill_numbers()
deterministic
begin
declare counter int default 1;
insert into numbers values (1);
while counter < 20000000
do
insert into numbers (id)
select id + counter
from numbers;
select count(*) into counter from numbers;
select counter;
end while;
end $$
delimiter ;
call fill_numbers();
Adpoting psadac's answer of using LOAD DATA INFILE and the idea of BULK insertion applied to fwrite:
$fh = fopen("data_num.txt", 'a') or die("can't open file");
$i =1;
while($i <= 20000000) {
$num_string .= "$i\n";
if($i % 1000000 == 0) {
fwrite($fh, $num_string);
$num_string = "";
}
$i +=1;
}
fclose($fh);
$dbh->beginTransaction();
$query = "LOAD DATA INFILE '" . addslashes(realpath("data_num.txt")) . "' INTO TABLE numbers LINES TERMINATED BY '\n';";
$sth = $dbh->prepare($query);
$sth->execute();
$dbh->commit();
unlink("data_num.txt");
I had to use addslashes as I am using windows enviorment.
It's interesting to note that doing the BULK technique by writing only 20 times to file over 20 million resulted in ~10 seconds compared to ~75 seconds by just writing 20 million times. Using string concatenation over pushing values into an array and imploding yielded almost twice as fast.
In response to Devon Bernard's answer, I decided to approach it using PDO Mysql PHP and use the concept of just a few queries. At first I tried to do it with just 1 big query but PHP ran out of memory with default settings, so I decided to tweak to run every 100,000th. Even if you allocate enough memory to hold, there is no significant improvement.
$i = 1;
$inserts = array();
while($i <= 20000000) {
$inserts[] = "($i)";
if($i % 100000 == 0) {
$dbh->beginTransaction();
$query = "INSERT INTO numbers(i) VALUES " . implode(',', $inserts) . ";";
$sth = $dbh->prepare($query);
$sth->execute();
$dbh->commit();
$inserts = array();
}
$i +=1;
}
A simpler & faster solution
(Original code here)
CREATE TABLE `numbers` (
`i` INT(11) SIGNED
, PRIMARY KEY(`i`)
) ENGINE=myisam DEFAULT CHARSET=latin1;
INSERT INTO numbers(i) SELECT @row := @row + 1 FROM
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t3,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t4,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t5,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t6,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t7,
(SELECT 0 UNION ALL SELECT 1) t8,
(SELECT @row:=0) ti;
On my laptop with MySQL 5.5.29 installed, compare the accepted answer with this one:
+-----------------+-------+---------------+
| Method | Rows | Time consumed |
+-----------------+-------+---------------+
| Accepted answer | 20M+1 | 42.4s |
+-----------------+-------+---------------+
| This one | 20M | 35.9s |
+-----------------+-------+---------------+
About 15% less time, no intermediate views or tables, and easier to read.
来源:https://stackoverflow.com/questions/14298154/mysql-sequential-number-table