MySQL select 10 random rows from 600K rows fast

粉色の甜心 2020-11-21 05:06

How can I best write a query that selects 10 rows randomly from a total of 600k?

26 answers
  •  不要未来只要你来
    2020-11-21 05:29

    I used the approach from http://jan.kneschke.de/projects/mysql/order-by-rand/, posted by Riedsio (I used the stored-procedure variant that returns one or more random values):

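       -- Wrapper added so the snippet runs standalone (procedure name is illustrative).
       DELIMITER $$
       DROP PROCEDURE IF EXISTS get_rands$$
       -- cnt: number of random ids to collect into the temporary table rands
       CREATE PROCEDURE get_rands(IN cnt INT)
       BEGIN
         DROP TEMPORARY TABLE IF EXISTS rands;
         CREATE TEMPORARY TABLE rands ( rand_id INT );

         loop_me: LOOP
           IF cnt < 1 THEN
             LEAVE loop_me;
           END IF;

           -- pick the first row whose id is at or above a random point in [0, MAX(id))
           INSERT INTO rands
             SELECT r1.id
               FROM random AS r1 JOIN
                    (SELECT (RAND() *
                                  (SELECT MAX(id)
                                     FROM random)) AS id)
                     AS r2
              WHERE r1.id >= r2.id
              ORDER BY r1.id ASC
              LIMIT 1;

           SET cnt = cnt - 1;
         END LOOP loop_me;
       END$$
       DELIMITER ;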
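
    To make the result of the loop above concrete, here is a rough usage sketch. It assumes the loop is wrapped in a stored procedure called `get_rands(IN cnt INT)` (a hypothetical name used only for illustration) and that the source table is `random` with an integer `id` column, as in the snippet:

    ```sql
    -- Assumption: get_rands(IN cnt INT) contains the loop shown above.
    CALL get_rands(10);

    -- rands is a session-scoped temporary table, so it is still visible here.
    -- Each pass picks independently, so the same id can show up more than once;
    -- DISTINCT collapses repeats, which may leave fewer than 10 rows.
    SELECT r.*
      FROM random AS r
      JOIN (SELECT DISTINCT rand_id FROM rands) AS picked
        ON picked.rand_id = r.id;
    ```

    If exactly 10 distinct rows are required, one option is to call the procedure with a slightly larger `cnt` and add `LIMIT 10` to the final query.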
    

    In the article he solves the problem of gaps in the ids, which make the results not so random, by maintaining a separate table (using triggers, etc.; see the article). I solve it by adding another column, populated with contiguous numbers starting from 1. (Edit: this column is added to the temporary table that the subquery creates at runtime; it doesn't affect your permanent table.)

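       -- Wrapper added so the snippet runs standalone (procedure name is illustrative).
       DELIMITER $$
       DROP PROCEDURE IF EXISTS get_rands$$
       -- cnt: number of random ids to collect into the temporary table rands
       CREATE PROCEDURE get_rands(IN cnt INT)
       BEGIN
         DROP TEMPORARY TABLE IF EXISTS rands;
         CREATE TEMPORARY TABLE rands ( rand_id INT );

         loop_me: LOOP
           IF cnt < 1 THEN
             LEAVE loop_me;
           END IF;

           -- restart the gap-free numbering for every pick
           SET @no_gaps_id := 0;

           -- number the rows 1..COUNT(*) at query time, then pick by that number
           INSERT INTO rands
             SELECT r1.id
               FROM (SELECT id, @no_gaps_id := @no_gaps_id + 1 AS no_gaps_id FROM random) AS r1 JOIN
                    (SELECT (RAND() *
                                  (SELECT COUNT(*)
                                     FROM random)) AS id)
                     AS r2
              WHERE r1.no_gaps_id >= r2.id
              ORDER BY r1.no_gaps_id ASC
              LIMIT 1;

           SET cnt = cnt - 1;
         END LOOP loop_me;
       END$$
       DELIMITER ;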
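
    A small worked example may help show why gaps skew the pick and what the runtime numbering changes. The table `gap_demo` and its four ids are made up for illustration; nothing here comes from the article:

    ```sql
    -- Hypothetical demo table with a gap between ids 3 and 10.
    DROP TEMPORARY TABLE IF EXISTS gap_demo;
    CREATE TEMPORARY TABLE gap_demo ( id INT PRIMARY KEY );
    INSERT INTO gap_demo (id) VALUES (1), (2), (3), (10);

    -- With RAND() * MAX(id) the random point falls in [0, 10).
    -- Any point above 3 resolves to id 10, so that row is chosen
    -- about 70% of the time, while ids 1, 2 and 3 get about 10% each.

    -- Numbering the rows at query time gives every row a slot of equal width:
    -- RAND() * COUNT(*) falls in [0, 4) and each number 1..4 covers one unit.
    SET @no_gaps_id := 0;
    SELECT id, @no_gaps_id := @no_gaps_id + 1 AS no_gaps_id
      FROM gap_demo;
    ```

    Note that assigning user variables inside a `SELECT` is deprecated in MySQL 8.0; on that version `ROW_NUMBER() OVER (ORDER BY id)` should be able to produce the same numbering.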
    

    In the article I can see he went to great lengths to optimize the code; I have no idea if or how much my changes affect performance, but it works very well for me.
