Oracle SQL - Identify sequential value ranges

萌比男神i 2020-11-29 12:44

Here is my table:

ID  Name      Department
1   Michael   Marketing
2   Alex      Marketing
3   Tom       Marketing
4   John      Sales
5   Brad      Marketing


        
2 answers
  • 2020-11-29 13:33

    This is easy to do with a technique called Tabibitosan.

    The technique compares each row's position within its group to its position in the overall set of rows, in order to work out whether rows in the same group are next to each other.

    E.g., with your example data, this looks like:

    WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                        SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
    -- end of mimicking your table with data in it. See the SQL below:
    SELECT ID,
           NAME,
           department,
           row_number() OVER (ORDER BY ID) overall_rn,
           row_number() OVER (PARTITION BY department ORDER BY ID) department_rn,
           row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
    FROM   your_table;
    
            ID NAME    DEPARTMENT OVERALL_RN DEPARTMENT_RN        GRP
    ---------- ------- ---------- ---------- ------------- ----------
             1 Michael Marketing           1             1          0
             2 Alex    Marketing           2             2          0
             3 Tom     Marketing           3             3          0
             4 John    Sales               4             1          3
             5 Brad    Marketing           5             4          1
             6 Leo     Marketing           6             5          1
             7 Kevin   Production          7             1          6
    

    Here, I've given all the rows across the entire set of data a row number in ascending id order (the overall_rn column), and I've given the rows in each department a row number (the department_rn column), again in ascending id order.

    Now that I've done that, we can subtract one from the other (the grp column).

    Notice how the number in the grp column stays the same for department rows that are next to each other, but changes each time there's a gap.

    E.g. for the Marketing department, rows 1-3 are next to each other and have grp = 0, but the 4th Marketing row is actually on the 5th row of the overall result set, so it now has a different grp number (5 - 4 = 1). Since the 5th Marketing row is on the 6th row of the overall set, it has the same grp number as the 4th Marketing row (6 - 5 = 1), so we know they're next to each other.
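
To make the subtraction concrete, here is a minimal sketch of the same arithmetic in plain Python rather than Oracle SQL, so it can be run without a database. The function name `label_rows` is made up for illustration, and the rows are the sample data from this answer with the name column left out.

```python
from collections import defaultdict

# (id, department) pairs copied from the sample data in this answer.
ROWS = [
    (1, "Marketing"), (2, "Marketing"), (3, "Marketing"), (4, "Sales"),
    (5, "Marketing"), (6, "Marketing"), (7, "Production"),
]


def label_rows(rows):
    """Return (id, department, overall_rn, department_rn, grp) for each row."""
    seen_per_department = defaultdict(int)
    labelled = []
    # enumerate() over the id-sorted rows plays the role of
    # row_number() OVER (ORDER BY id).
    for overall_rn, (row_id, department) in enumerate(sorted(rows), start=1):
        # The running count per department plays the role of
        # row_number() OVER (PARTITION BY department ORDER BY id).
        seen_per_department[department] += 1
        department_rn = seen_per_department[department]
        grp = overall_rn - department_rn
        labelled.append((row_id, department, overall_rn, department_rn, grp))
    return labelled


for row in label_rows(ROWS):
    print(row)
```

The last element of each printed tuple corresponds to the grp column in the query output above.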

    Once we have that grp information, it's a simple matter of doing an aggregate query grouping on both the department and our new grp column, using min and max to find the start and end ids:

    WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                        SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
    -- end of mimicking your table with data in it. See the SQL below:
    SELECT department,
           MIN(ID) start_id,
           MAX(ID) end_id
    FROM   (SELECT ID,
                   NAME,
                   department,
                   row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
            FROM   your_table)
    GROUP BY department, grp;
    
    DEPARTMENT   START_ID     END_ID
    ---------- ---------- ----------
    Marketing           1          3
    Marketing           5          6
    Sales               4          4
    Production          7          7
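
As a cross-check on these ranges, the same result can be derived without the row-number subtraction. The sketch below is plain Python, not Oracle SQL, and it is not the Tabibitosan method itself: it relies on `itertools.groupby`, which starts a new run whenever the key changes between neighbouring items. The function name `consecutive_ranges` is hypothetical, and the rows are the sample data from this answer with the name column left out.

```python
from itertools import groupby

# (id, department) pairs copied from the sample data in this answer.
ROWS = [
    (1, "Marketing"), (2, "Marketing"), (3, "Marketing"), (4, "Sales"),
    (5, "Marketing"), (6, "Marketing"), (7, "Production"),
]


def consecutive_ranges(rows):
    """Return (department, start_id, end_id) for each run of adjacent rows."""
    ranges = []
    # groupby() only merges neighbouring items, so the rows are sorted by id
    # first; a department that reappears later starts a fresh run.
    for department, run in groupby(sorted(rows), key=lambda row: row[1]):
        ids = [row_id for row_id, _department in run]
        ranges.append((department, ids[0], ids[-1]))
    return ranges


print(consecutive_ranges(ROWS))
```

If the two approaches ever disagree on real data, that would suggest the ordering column is not unique or the rows are not being sorted the way the query assumes.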
    

    N.B. I've assumed that gaps in the id column aren't important, i.e. if there were no row for id = 6 (so Leo's and Kevin's ids were 7 and 8 respectively), then Brad and Leo would still appear in the same group, with start id = 5 and end id = 7.

    If gaps in the id column do indicate a new group, then you could just use the id to label the overall set of rows (i.e. there's no need to calculate overall_rn; just use the id column instead).

    That means your query would become:

    WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                        SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 7 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                        SELECT 8 ID, 'Kevin' NAME, 'Production' department FROM dual)
    -- end of mimicking your table with data in it. See the SQL below:
    SELECT department,
           MIN(ID) start_id,
           MAX(ID) end_id
    FROM   (SELECT ID,
                   NAME,
                   department,
                   ID - row_number() OVER (PARTITION BY department ORDER BY ID) grp
            FROM   your_table)
    GROUP BY department, grp;
    
    DEPARTMENT   START_ID     END_ID
    ---------- ---------- ----------
    Marketing           1          3
    Sales               4          4
    Marketing           5          5
    Marketing           7          7
    Production          8          8
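
The difference between the two interpretations can be seen side by side in a small sketch. This is plain Python rather than Oracle SQL, under the assumption that ids are unique integers; the function `id_ranges` and its `gaps_split_groups` flag are hypothetical names used only for illustration, and the rows are the gapped sample data (no id = 6) with the name column left out.

```python
from collections import defaultdict

# (id, department) pairs copied from the gapped sample data (no id = 6).
ROWS_WITH_GAP = [
    (1, "Marketing"), (2, "Marketing"), (3, "Marketing"), (4, "Sales"),
    (5, "Marketing"), (7, "Marketing"), (8, "Production"),
]


def id_ranges(rows, gaps_split_groups):
    """Return sorted (start_id, end_id, department) ranges."""
    seen_per_department = defaultdict(int)
    bounds = {}  # (department, grp) -> [start_id, end_id]
    for overall_rn, (row_id, department) in enumerate(sorted(rows), start=1):
        seen_per_department[department] += 1
        # Use the id itself as the position when a missing id should break
        # a group; otherwise use the overall row number.
        position = row_id if gaps_split_groups else overall_rn
        grp = position - seen_per_department[department]
        # Rows arrive in ascending id order, so the first id stored for a key
        # is its MIN and the most recent one is its MAX.
        start_end = bounds.setdefault((department, grp), [row_id, row_id])
        start_end[1] = row_id
    return sorted((start, end, dept) for (dept, _grp), (start, end) in bounds.items())


print(id_ranges(ROWS_WITH_GAP, gaps_split_groups=False))
print(id_ranges(ROWS_WITH_GAP, gaps_split_groups=True))
```

With the flag off, Brad and Leo share the range 5 to 7; with it on, they fall into separate single-row ranges, matching the output of the id-based query.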
    
  • 2020-11-29 13:43

    I don't have an environment available at the moment, but you can try something like this:

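    -- Returns only the rows with the lowest and highest Marketing ids,
    -- not the start and end of each consecutive range.
    SELECT *
    FROM   tab1
    WHERE  id IN (SELECT MIN(id) FROM tab1 WHERE department = 'Marketing'
                  UNION
                  SELECT MAX(id) FROM tab1 WHERE department = 'Marketing');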
    