Select random row from a PostgreSQL table with weighted row probabilities

耶瑟儿~ 2020-12-03 17:30

Example input:

SELECT * FROM test;
 id | percent   
----+----------
  1 | 50 
  2 | 35   
  3 | 15   
(3 rows)

How would you write such a query, that

6 Answers
  •  南笙 (OP)
    2020-12-03 17:42

    This should do the trick:

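    -- Draw one random threshold R in [0, total weight).
    WITH CTE AS (
        SELECT random() * (SELECT SUM(percent) FROM YOUR_TABLE) AS R
    )
    SELECT *
    FROM (
        -- S is the running total of the weights, ordered by id.
        SELECT id, SUM(percent) OVER (ORDER BY id) AS S, R
        FROM YOUR_TABLE CROSS JOIN CTE
    ) Q
    WHERE S >= R   -- keep rows whose running total reaches R
    ORDER BY id    -- ...and take the first of them
    LIMIT 1;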
    

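    For the example table, the sub-query Q yields the following id and running sum S (the R column, which holds the same random value on every row, is omitted here):

     id |  S
    ----+-----
      1 |  50
      2 |  85
      3 | 100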
    

    We then simply generate a random number in the range [0, 100) and pick the first row whose running sum is at or beyond that number (the WHERE clause, together with ORDER BY id and LIMIT 1). We use a common table expression (WITH) to ensure the random number is calculated only once.

    BTW, the SELECT SUM(percent) FROM YOUR_TABLE allows you to have any weights in the percent column: they don't strictly need to be percentages (i.e. add up to 100).

    [SQL Fiddle]
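
To make the selection logic of the query above easier to follow, here is a minimal Python sketch of the same running-sum idea. It is only an illustration and not part of the original answer: the function names pick_weighted and pick_random are hypothetical, and the rows are hard-coded from the example table rather than read from PostgreSQL.

```python
import random
from itertools import accumulate


def pick_weighted(rows, r):
    """Return the id of the first row whose running sum S is >= r.

    rows: list of (id, weight) pairs; r: a number in [0, total weight).
    """
    ordered = sorted(rows)                        # ORDER BY id
    running = accumulate(w for _, w in ordered)   # SUM(percent) OVER (ORDER BY id)
    for (row_id, _), s in zip(ordered, running):
        if s >= r:                                # WHERE S >= R ... LIMIT 1
            return row_id
    return None  # only reached if r exceeds the total weight


def pick_random(rows):
    """Draw the threshold once, as the CTE does, then pick a row."""
    total = sum(w for _, w in rows)               # SELECT SUM(percent) ...
    return pick_weighted(rows, random.random() * total)


rows = [(1, 50), (2, 35), (3, 15)]  # the example table from the question
```

With a fixed threshold, pick_weighted(rows, 60) returns 2, because 60 lies beyond the first running sum (50) but not beyond the second (85). The weights need not sum to 100 either: pick_weighted([(1, 5), (2, 3), (3, 2)], 9.5) returns 3.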
