Window functions and more “local” aggregation

旧巷少年郎 2020-12-11 04:31

Suppose I have this table:

select * from window_test;

 k | v
---+---
 a | 1
 a | 2
 b | 3
 a | 4

Ultimately I want to get:



        
3 answers
  • 2020-12-11 04:58

    EDIT: I've come up with the following query, which uses no window functions at all:

    WITH RECURSIVE tree AS (
      SELECT k, v, ''::text as next_k, 0 as next_v, 0 AS level FROM window_test
      UNION ALL
      SELECT c.k, c.v, t.k, t.v + level, t.level + 1
        FROM tree t JOIN window_test c ON c.k = t.k AND c.v + 1 = t.v),
    partitions AS (
      SELECT t.k, t.v, t.next_k,
             coalesce(nullif(t.next_v, 0), t.v) AS next_v, t.level
        FROM tree t
       WHERE NOT EXISTS (SELECT 1 FROM tree WHERE next_k = t.k AND next_v = t.v))
    SELECT min(k) AS k, v AS min_v, max(next_v) AS max_v
      FROM partitions p
     GROUP BY v
     ORDER BY 2;
    

    I've now provided two working queries; I hope one of them suits you.

    SQL Fiddle for this variant.


    Another way to achieve this is to use a support sequence.

    1. Create a support sequence:

      CREATE SEQUENCE wt_rank START WITH 1;
      
    2. The query:

      WITH source AS (
        SELECT k, v,
               coalesce(lag(k) OVER (ORDER BY v), k) AS prev_k
          FROM window_test
          CROSS JOIN (SELECT setval('wt_rank', 1)) AS ri),
      ranking AS (
        SELECT k, v, prev_k,
               CASE WHEN k = prev_k THEN currval('wt_rank')
                    ELSE nextval('wt_rank') END AS rank
          FROM source)
      SELECT r.k, min(s.v) AS min_v, max(s.v) AS max_v
          FROM ranking r
          JOIN source s ON r.v = s.v
         GROUP BY r.rank, r.k
         ORDER BY 2;
      
  • 2020-12-11 05:02

    Would this not do the job for you, without the need for windows, partitions or coalescing? It just uses a traditional SQL trick for finding nearest tuples via a self-join and a min on the difference:

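    SELECT k, min(v), max(v) FROM (
        -- lim = v of the nearest later row with a different k (NULL for the last run).
        -- Grouping by (k, v) only keeps one lim per row even with more than two distinct keys.
        SELECT k, v, v + min(d) AS lim FROM (
            SELECT x.*, y.v - x.v AS d FROM window_test x
            LEFT JOIN window_test y ON x.k <> y.k AND y.v - x.v > 0)
        z GROUP BY k, v)
    w GROUP BY k, lim ORDER BY 2;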
    

    I think this is probably a more 'relational' solution, but I'm not sure about its efficiency.
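
    To make the self-join trick concrete, here is a minimal sketch that runs a variant of it through Python's sqlite3 module. Several things here are assumptions and not part of the answer: SQLite stands in for PostgreSQL, the five-row data set with a third key `c` is hypothetical, and the inner query groups by (k, v) only so that each row gets a single `lim` even when more than two distinct keys exist.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE window_test (k TEXT, v INTEGER)")
# Hypothetical data: the thread's sample extended with a third key.
conn.executemany(
    "INSERT INTO window_test VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)],
)

NEAREST_SQL = """
SELECT k, min(v) AS min_v, max(v) AS max_v
FROM (
    -- lim = v of the nearest later row whose k differs (NULL in the last run)
    SELECT k, v, v + min(d) AS lim
    FROM (
        SELECT x.k, x.v, y.v - x.v AS d
        FROM window_test x
        LEFT JOIN window_test y ON x.k <> y.k AND y.v > x.v
    ) z
    GROUP BY k, v
) w
-- rows of one run share the same k and the same lim
GROUP BY k, lim
ORDER BY min_v
"""

runs = conn.execute(NEAREST_SQL).fetchall()
print(runs)
```

    On efficiency: the join pairs every row with every later row of a different key, so the intermediate result grows quadratically with the table size.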

  • 2020-12-11 05:14

    This returns your desired result with the sample data. I'm not sure whether it will work for real-world data:

    select k, 
           min(v) over (partition by group_nr) as min_v,
           max(v) over (partition by group_nr) as max_v
    from (
        select *,
               sum(group_flag) over (order by v,k) as group_nr
        from (
            select *,
                   case
                      when lag(k) over (order by v) = k then null
                      else 1
                   end as group_flag
            from window_test
        ) t1
    ) t2
    order by min_v;
    

    I left out the DISTINCT though.
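
    Without DISTINCT, the query above returns one row per input row, so a run of two rows appears twice. As a sketch of one way to collapse each run to a single row, the variant below replaces the outer window aggregates with a plain GROUP BY on the run number. It is run through Python's sqlite3 module purely so the example is self-contained; using SQLite (3.25 or later, for window functions) in place of PostgreSQL is an assumption, as are the column types.

```python
import sqlite3

# Assumption: an in-memory SQLite database (3.25+) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE window_test (k TEXT, v INTEGER)")
conn.executemany(
    "INSERT INTO window_test VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("a", 4)],
)

ISLANDS_SQL = """
SELECT k, min(v) AS min_v, max(v) AS max_v
FROM (
    SELECT k, v,
           -- running count of the "k changed" flags = number of the run
           sum(group_flag) OVER (ORDER BY v) AS group_nr
    FROM (
        SELECT k, v,
               -- 1 on the first row of each run, NULL on the rows that continue it
               CASE WHEN lag(k) OVER (ORDER BY v) = k THEN NULL ELSE 1 END
                   AS group_flag
        FROM window_test
    ) t1
) t2
GROUP BY group_nr, k
ORDER BY min_v
"""

rows = conn.execute(ISLANDS_SQL).fetchall()
print(rows)
```

    For the sample table this yields three rows: (a, 1, 2), (b, 3, 3) and (a, 4, 4).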
