Redshift SQL: add and reset a counter with date and group considered

问题

Suppose I have a table below. I'd like to have a counter to count the # of times when a Customer (there are many) is in Segment A. If the Customer jumps to a different Segment between 2 quarters, the counter will reset when the Customer jumps back to Segment A. I am sure there are many ways to do it, but I just can't figure this out..Please help. Thank you!

Quarter    Segment    Customer    *Counter*
Q1 2018    A          A1          1
Q2 2018    A          A1          2
Q3 2018    A          A1          3
Q4 2018    B          A1          1
Q1 2019    B          A1          2
Q2 2019    A          A1          1
Q1 2020    A          A1          *1* I want 1 not 2 here because it's not consecutive

回答1:

This is a type of gaps-and-islands problem. You can solve this with a difference of row numbers. The real problem is dealing with the quarters. But string functions can handle that.

select quarter, customer, segment,
       row_number() over (partition by customer, segment, seqnum - seqnum_cs order by right(quarter, 4), left(quarter, 2)) as counter
from (select t.*,
             row_number() over (partition by customer order by right(quarter, 4), left(quarter, 2)) as seqnum,
             row_number() over (partition by customer, segment order by right(quarter, 4), left(quarter, 2)) as seqnum_cs
      from t
     ) t
order by customer, seqnum;

The key idea here is that the difference of row numbers defines the adjacent rows for a customer with the same status. It can be a bit hard to see why this is the case. However, if you look at the results of the subquery, you will no doubt see and understand why this is works.

来源：https://stackoverflow.com/questions/59402755/redshift-sql-add-and-reset-a-counter-with-date-and-group-considered

标签

sql

amazon-redshift

counter

window-functions

lag