Optimizing my mysql statement! - RAND() TOO SLOW

孤独总比滥情好 2020-12-11 03:26

I have a table called system with over 80,000 records, and another table called follows.

I need my statement to randomly select records from

6 answers
  •  悲哀的现实
    2020-12-11 04:06

    There are two main reasons for the slowness:

    • MySQL must first generate a random number for each of the rows
    • The rows must then be sorted by this number to select the top 200

    There is a trick that helps in this situation. It requires a bit of prep work, and how to implement it (and how worthwhile it is) depends on your actual use case.

    ==> Introduce an extra column with a "random category" value to filter out most rows

    The idea is to have an integer-valued column whose values are randomly assigned, once at prep time, within a small range, say 0 to 9 (or 1 to 25... whatever). This column then needs to be added to the index used in the query. Finally, by modifying the query to include a filter on this column = a particular value (say 3), the number of rows which SQL needs to handle is reduced by a factor of 10 (or 25, depending on the number of distinct values we have in the "random category").
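
The prep step described above could be sketched roughly as follows. This is a minimal sketch, not the author's exact setup: the column type, the default value, and the 0 to 9 range are assumptions, and the column name RandPreFilter is simply the one this answer uses further down.

```sql
-- One-time prep: add the bucket column (type and default are assumptions).
-- `system` is backtick-quoted because SYSTEM is a reserved word in MySQL 8.0;
-- the quoting is harmless on older versions.
ALTER TABLE `system`
  ADD COLUMN RandPreFilter TINYINT UNSIGNED NOT NULL DEFAULT 0;

-- Assign every row a bucket from 0 to 9.
-- RAND() returns a value in [0, 1) and is evaluated once per row here.
UPDATE `system`
SET RandPreFilter = FLOOR(RAND() * 10);

-- Optional sanity check: the ten buckets should hold roughly equal row counts.
SELECT RandPreFilter, COUNT(*) AS row_count
FROM `system`
GROUP BY RandPreFilter
ORDER BY RandPreFilter;
```

With about 80,000 rows and ten buckets, each bucket should end up with roughly 8,000 rows. Rows inserted later would need a bucket assigned at insert time, for example by the application.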

    Assuming this new column is called RandPreFilter, we could introduce an index like

    CREATE [UNIQUE ?] INDEX idx_system_randprefilter  -- index name is illustrative; MySQL requires one
    ON system (id, RandPreFilter)
    

    And alter the query as follows:

    SELECT system.id
         , system.username
         , system.password
         , system.followed
         , system.isvalid
         , follows.userid
         , follows.systemid
    FROM system
    LEFT JOIN follows ON system.id = follows.systemid
       AND follows.userid = 2
    WHERE system.followed = 0 AND system.isvalid = 1
       AND follows.systemid IS NULL
       AND system.RandPreFilter = 1 -- or other numbers, or possibly
                                    -- FLOOR(1 + RAND() * 25)
    ORDER BY RAND()
    LIMIT 200;
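
One caveat about using a random bucket in the filter: MySQL evaluates RAND() in a WHERE clause for each row, so an inline FLOOR(...RAND()...) expression would compare every row against a different number instead of selecting a single bucket. A possible workaround, sketched here under the assumption that RandPreFilter holds values 0 to 9, is to pick the bucket once in a user variable (@bucket is a hypothetical name) and filter on that:

```sql
-- Pick one bucket for this query. The 0-9 range is an assumption and
-- must match however RandPreFilter was populated.
SET @bucket = FLOOR(RAND() * 10);

-- Same shape as the query above, shortened to two columns for brevity.
-- `system` is backtick-quoted because SYSTEM is reserved in MySQL 8.0.
SELECT s.id
     , s.username
FROM `system` AS s
LEFT JOIN follows AS f
       ON f.systemid = s.id
      AND f.userid = 2
WHERE s.followed = 0
  AND s.isvalid = 1
  AND f.systemid IS NULL
  AND s.RandPreFilter = @bucket   -- constant for the whole statement
ORDER BY RAND()                   -- now sorts only the rows in one bucket
LIMIT 200;
```

If a bucket contains fewer than 200 qualifying rows, the query returns fewer than 200 results, so the number of buckets should be chosen with the other filters in mind.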
    
