Combine multiple rows with different dates with overlapping variables (to capture first and last change dates)

僤鯓⒐⒋嵵緔 提交于 2021-01-29 22:22:08

问题


I have the following data represented in a table like this:

User Type Date
A Mobile 2019-01-10
A Mobile 2019-01-20
A Desktop 2019-03-01
A Desktop 2019-03-20
A Email 2021-01-01
A Email 2020-01-02
A Desktop 2021-01-03
A Desktop 2021-01-04
A Desktop 2021-01-05

Using PostgreSQL - I want to achieve the following:

User First_Type First Type Initial Date Last_Type Last_Type_Initial_Date
A Mobile 2019-01-10 Desktop 2021-01-03

So for each user, I want to capture the initial date and type but then also, on the same row (but diff columns), have their last type they "switched" to but with the first date the switch occurred and not the last record of activity on that type.


回答1:


Consider using a LAG window function and conditional aggregation join via multiple CTEs and self-joins:

WITH sub AS (
  SELECT "user"
       , "type"
       , "date"
       , CASE 
            WHEN LAG("type") OVER(PARTITION BY "user" ORDER BY "date") = "type"
            THEN 0
            ELSE 1
         END "shift"
  FROM myTable 
), agg AS (
   SELECT "user"
         , MIN(CASE WHEN shift = 1 THEN "date" END) AS min_shift_dt
         , MAX(CASE WHEN shift = 1 THEN "date" END) AS max_shift_dt
   FROM sub
   GROUP BY "user"
)


SELECT agg."user"
     , s1."type" AS first_type
     , s1."date" AS first_type_initial_date
     , s2."type" AS last_type
     , s2."date" AS last_type_initial_date
FROM agg
INNER JOIN sub AS s1
  ON agg."user" = s1."user"
  AND agg.min_shift_dt = s1."date"
  
INNER JOIN sub AS s2
  ON agg."user" = s2."user"
  AND agg.max_shift_dt = s2."date"

Online Demo

user first_type first_type_initial_date last_type last_type_initial_date
A Mobile 2019-01-10 00:00:00 Desktop 2021-01-03 00:00:00



回答2:


Here is my solution with only windows functions and no joins:

with 
  prep as (
select *, 
  lag("Type") over(partition by "User" order by "Date") as "Lasttype"
from your_table_name
)
select distinct "User", 
  first_value("Type") over(partition by "User") as "First_Type", 
  first_value("Date") over(partition by "User") as "First_Type_Initial_Date", 
  last_value("Type") over(partition by "User") as "Last_Type", 
  last_value("Date") over(partition by "User") as "Last_Type_Initial_Date"
from prep 
where "Type" <> "Lasttype" or "Lasttype" is null
;



回答3:


I think this will work, but it sure feels ugly. There might be a better way to do this.

SELECT a.User, a.Type AS First_Type, a.Date AS FirstTypeInitialDate, b.Type AS Last_Type, b.LastTypeInitialDate
FROM table a
INNER JOIN table b ON a.User = b.User
WHERE a.Date = (SELECT MIN(c.Date) FROM table c WHERE c.User = a.User)
    AND b.Date = (SELECT MIN(d.Date) FROM table d WHERE d.User = b.User
       AND d.Type = (SELECT e.Type FROM table e WHERE e.User = d.User 
          AND e.Date = (SELECT MAX(f.Date) FROM table f WHERE f.User = e.User)))


来源:https://stackoverflow.com/questions/65585137/combine-multiple-rows-with-different-dates-with-overlapping-variables-to-captur

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!