Pick a random attribute from group in Redshift

攒了一身酷 2021-01-23 04:37

I have a data set in the following form:

id  |   attribute
-----------------
1   |   a
2   |   b
2   |   a
2   |   a
3   |   c

Desired output:

4 answers
  •  野性不改
    2021-01-23 05:06

    I found a way to pick a random attribute for each id, but it is fairly convoluted. I don't think it's a good approach, but it works.

    SQL:

    -- (1) uniq dataset: one row per distinct (id, attr) pair
    WITH uniq_dataset AS (SELECT DISTINCT id, attr FROM dataset)
    SELECT
      uds.id, rds.attr
    FROM
    -- (2) generate random rank for each id, uniform over 1..n (n = distinct attrs for that id);
    --     floor() is used because round() would pick the first and last rank only half as often when n > 2
      (SELECT id, floor(random() * count(*))::int + 1 AS random_rk FROM uniq_dataset GROUP BY id) uds
    JOIN
    -- (3) rank table: numbers each id's attrs 1..n
      (SELECT rank() OVER (PARTITION BY id ORDER BY attr) AS rk, id, attr FROM uniq_dataset) rds
    ON
      uds.id = rds.id
    AND
      uds.random_rk = rds.rk
    ORDER BY
      uds.id;
    

    Result:

     id | attr
    ----+------
      1 | a
      2 | a
      3 | c
    
    OR
    
     id | attr
    ----+------
      1 | a
      2 | b
      3 | c
    

    Here are the intermediate tables used in this SQL:

    -- dataset (original table)
     id | attr
    ----+------
      1 | a
      2 | b
      2 | a
      2 | a
      3 | c
    
    -- (1) uniq dataset
     id | attr
    ----+------
      1 | a
      2 | a
      2 | b
      3 | c
    
    -- (2) generate random rank for each id
     id | random_rk
    ----+-----------
      1 |         1
      2 |         1  <- 1 or 2, chosen at random
      3 |         1
    
    -- (3) rank table
     rk | id | attr
    ----+----+------
      1 |  1 | a
      1 |  2 | a
      2 |  2 | b
      1 |  3 | c
    
