Min and Max values grouping by consecutive ranges

Submitted by 霸气de小男生 on 2021-01-29 05:13:43

Question


I have a table that records an error type and the line number where each error occurred. (The process that produces it is irrelevant here.) I need to group by error type and show the first and last line of each run of consecutive lines, giving one range per run for each error type. Gaps in the line numbers must split a range.

My table and queries are:

create table errors (
    err_type varchar(10),
    line integer);

insert into errors values
('type_A', 1),('type_A', 2),('type_A', 3),
('type_A', 6),('type_A', 7),
('type_B', 9),('type_B', 10),
('type_B', 12),('type_B', 13),('type_B', 14),('type_B', 15),
('type_C', 21);

select * from errors;

My data:

err_type    line
----------------
type_A      1
type_A      2
type_A      3
type_A      6
type_A      7
type_B      9
type_B     10
type_B     12
type_B     13
type_B     14
type_B     15
type_C     21

I need a query to do this:

err_type    line_start   line_end
-------------------------------
type_A      1             3
type_A      6             7
type_B      9            10
type_B     12            15
type_C     21            21

I'm using PostgreSQL, but Oracle has similar syntax for window functions (OVER with PARTITION BY).

Any suggestions?


Answer 1:


This is a gaps-and-islands problem. I think the simplest method is row_number() and group by:

select err_type, min(line) as line_start, max(line) as line_end
from (select e.*, row_number() over (partition by err_type order by line) as seqnum
      from errors e
     ) e
group by err_type, (line - seqnum)
order by err_type, min(line);

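Why grouping by (line - seqnum) works: within one err_type, consecutive lines and their row numbers both increase by 1, so their difference stays constant until a gap appears. The following is a minimal Python sketch of that idea, not part of the original answer. The function name `islands` is hypothetical, and it assumes no duplicate (err_type, line) rows.

```python
from itertools import groupby

errors = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]

def islands(rows):
    """Return (err_type, line_start, line_end) for each run of consecutive lines."""
    result = []
    # Partition by err_type, ordered by line (like the window definition).
    for err_type, group in groupby(sorted(rows), key=lambda r: r[0]):
        lines = [line for _, line in group]
        # seqnum plays the role of row_number(); line - seqnum is the island key.
        keyed = [(line - seqnum, line) for seqnum, line in enumerate(lines, start=1)]
        for _, run in groupby(keyed, key=lambda k: k[0]):
            run_lines = [line for _, line in run]
            result.append((err_type, min(run_lines), max(run_lines)))
    return result

for row in islands(errors):
    print(row)
```

For type_A the keys are 0, 0, 0, 2, 2, which splits the lines into the runs 1 to 3 and 6 to 7.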




Answer 2:


You could build up a query like this:

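with base as (
    -- is_start: 1 when there is a gap before this line, 0 when the line
    -- directly follows the previous one. The first row of each err_type is
    -- compared with the lag default of 1, so its flag may be -1, 0 or 1;
    -- that only shifts the part numbers, not the grouping.
    select errors.*,
           sign(line - 1 - lag(line, 1, 1) over (
                 partition by err_type
                 order by line)) as is_start
    from   errors
), parts as (
    -- part: running sum of is_start, constant within each run of lines
    select base.*,
           sum(is_start) over (
                 partition by err_type
                 order by line) as part
    from   base
)
select   err_type,
         min(line) as line_start,
         max(line) as line_end
from     parts
group by err_type, part
order by err_type, part;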
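The query above flags each row that begins a new run and then takes a running sum of those flags, so every row in the same run ends up with the same part number. A rough Python sketch of that flag-and-accumulate idea follows. The function name `islands_by_running_sum` is hypothetical, and unlike the SQL it simply treats the first row of each err_type as a start.

```python
from itertools import accumulate, groupby

errors = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]

def islands_by_running_sum(rows):
    """Return (err_type, line_start, line_end) using a gap flag plus a running sum."""
    result = []
    for err_type, group in groupby(sorted(rows), key=lambda r: r[0]):
        lines = [line for _, line in group]
        # Flag is 1 where the previous line is more than 1 behind (a gap), else 0.
        is_start = [1] + [int(cur - prev > 1) for prev, cur in zip(lines, lines[1:])]
        # The running sum gives every row in the same run the same part number.
        parts = list(accumulate(is_start))
        for _, run in groupby(zip(parts, lines), key=lambda p: p[0]):
            run_lines = [line for _, line in run]
            result.append((err_type, min(run_lines), max(run_lines)))
    return result

for row in islands_by_running_sum(errors):
    print(row)
```

For type_B the flags are 1, 0, 1, 0, 0, 0 and the running sums are 1, 1, 2, 2, 2, 2, which gives the runs 9 to 10 and 12 to 15.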



Answer 3:


If you don't want to use window or aggregate functions, you can use self-joins instead:

WITH
  -- Island starts: rows with no row at line - 1 for the same err_type
  table_min AS
  (
    SELECT
      a.err_type, a.line
    FROM errors a
    LEFT JOIN errors b ON a.err_type = b.err_type AND a.line = b.line + 1
    WHERE b.err_type IS NULL
  ),
  -- Island ends: rows with no row at line + 1 for the same err_type
  table_max AS
  (
    SELECT
      a.err_type, a.line
    FROM errors a
    LEFT JOIN errors b ON a.err_type = b.err_type AND a.line + 1 = b.line
    WHERE b.err_type IS NULL
  )
-- Pair each start with the nearest end at or after it: no other end may
-- fall between the start and the chosen end
SELECT
  a.err_type, a.line AS line_start, b.line AS line_end
FROM table_min a
INNER JOIN table_max b ON a.err_type = b.err_type AND a.line <= b.line
WHERE NOT EXISTS (
  SELECT 1
  FROM table_max c
  WHERE c.err_type = a.err_type
    AND c.line >= a.line
    AND c.line < b.line
)
ORDER BY a.err_type, a.line;
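The self-join approach rests on two boundary tests: a line starts a run when the same err_type has no row at line - 1, and ends a run when it has no row at line + 1. As an illustrative Python sketch of that boundary logic (the function name `islands_by_boundaries` is hypothetical, and pairing the sorted starts with the sorted ends is a shortcut that stands in for the SQL join):

```python
from collections import defaultdict

errors = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]

def islands_by_boundaries(rows):
    """Return (err_type, line_start, line_end) by locating run boundaries."""
    lines_by_type = defaultdict(set)
    for err_type, line in rows:
        lines_by_type[err_type].add(line)

    result = []
    for err_type in sorted(lines_by_type):
        lines = lines_by_type[err_type]
        # A start has no predecessor; an end has no successor.
        starts = sorted(n for n in lines if n - 1 not in lines)
        ends = sorted(n for n in lines if n + 1 not in lines)
        # Every run has exactly one start and one end, so the i-th start
        # belongs with the i-th end.
        result.extend((err_type, s, e) for s, e in zip(starts, ends))
    return result

for row in islands_by_boundaries(errors):
    print(row)
```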


Source: https://stackoverflow.com/questions/55658675/min-and-max-values-grouping-by-consecutive-ranges
