Join by nearest date for the table with duplicate records in BigQuery

Submitted by Deadly on 2021-02-15 07:48:19

Question


I have an installs table in which several rows share the same user_id but have different install_date values. I want to join every revenue record with the nearest install record whose install_date is less than the revenue_date, because I need its source field value for further processing. That means the output row count should equal the number of records in the revenue table. How can this be achieved in BigQuery?

Here is the data:

installs
install_date    user_id     source
--------------------------------
2020-01-10      user_a      source_I           
2020-01-15      user_a      source_II
2020-01-20      user_a      source_III
***info about other users***

revenue
revenue_date    user_id     revenue
--------------------------------------------
2020-01-11      user_a      10
2020-01-21      user_a      20
***info about other users***

Answer 1:


Consider the solution below:

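-- Pair each revenue row with every earlier install of the same user,
-- then keep only the latest of those installs.
select any_value(r).*,
    array_agg(
        (select as struct i.* except(user_id))
        order by i.install_date desc  -- latest install first
        limit 1
    )[offset(0)].*
from `project.dataset.revenue` r
join `project.dataset.installs` i
on i.user_id = r.user_id
and i.install_date < r.revenue_date
-- format('%t', r) serializes the whole row, so each distinct revenue row forms one group
group by format('%t', r)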

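If applied to the sample data in your question, the output is:

revenue_date    user_id     revenue     install_date    source
------------------------------------------------------------------
2020-01-11      user_a      10          2020-01-10      source_I
2020-01-21      user_a      20          2020-01-20      source_III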
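
As a cross-check that does not depend on BigQuery, the matching rule (latest install strictly before the revenue date, per user) can be sketched in plain Python. The helper name `nearest_earlier_install` and the in-memory tuples are illustrative assumptions, not part of the answer. Dates are kept as ISO strings, which compare in chronological order.

```python
from bisect import bisect_left
from collections import defaultdict

# Sample rows copied from the question: (install_date, user_id, source)
installs = [
    ("2020-01-10", "user_a", "source_I"),
    ("2020-01-15", "user_a", "source_II"),
    ("2020-01-20", "user_a", "source_III"),
]
# (revenue_date, user_id, revenue)
revenue = [
    ("2020-01-11", "user_a", 10),
    ("2020-01-21", "user_a", 20),
]


def nearest_earlier_install(installs, revenue):
    """Pair each revenue row with the latest install strictly before it.

    Hypothetical helper for illustration. Revenue rows with no earlier
    install get (None, None), so the output has one row per revenue row.
    """
    by_user = defaultdict(list)
    for install_date, user_id, source in installs:
        by_user[user_id].append((install_date, source))
    for rows in by_user.values():
        rows.sort()  # ISO date strings sort chronologically

    result = []
    for revenue_date, user_id, amount in revenue:
        rows = by_user.get(user_id, [])
        dates = [d for d, _ in rows]
        # bisect_left returns the number of installs dated before revenue_date
        pos = bisect_left(dates, revenue_date)
        install_date, source = rows[pos - 1] if pos else (None, None)
        result.append((revenue_date, user_id, amount, install_date, source))
    return result


matched = nearest_earlier_install(installs, revenue)
for row in matched:
    print(row)
```

One difference from the query above: the query uses an inner join, so a revenue row with no earlier install would be dropped, whereas this sketch keeps it with empty install fields.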




Answer 2:


You may be able to use left join for this:

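select r.*, i.* except (user_id)
from revenue r left join
     -- give each install the date of the user's next install, so that
     -- [install_date, next_install_date) is the period it covers
     (select i.*,
             lead(install_date) over (partition by user_id order by install_date) as next_install_date
      from installs i
     ) i
     on r.user_id = i.user_id and
        r.revenue_date >= i.install_date and
        -- the user's last install has no next_install_date and stays open-ended
        (r.revenue_date < i.next_install_date or i.next_install_date is null);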

I have had problems in the past with left joins and inequalities. However, I think this will work now in BigQuery.
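
To try the interval-join idea without a BigQuery project, the sketch below runs a similar query in SQLite through Python's `sqlite3` module. It is a rough port under stated assumptions, not the BigQuery query itself. It assumes SQLite 3.25 or later (for window functions), lists the output columns explicitly because SQLite has no `except (...)`, and stores dates as ISO text. The `user_b` revenue row is made up to show what the left join does when a user has no install.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table installs (install_date text, user_id text, source text);
    create table revenue (revenue_date text, user_id text, revenue integer);
    insert into installs values
        ('2020-01-10', 'user_a', 'source_I'),
        ('2020-01-15', 'user_a', 'source_II'),
        ('2020-01-20', 'user_a', 'source_III');
    -- user_b is a made-up row with no install, added for illustration
    insert into revenue values
        ('2020-01-11', 'user_a', 10),
        ('2020-01-21', 'user_a', 20),
        ('2020-01-12', 'user_b', 5);
""")

# Same join shape as the answer: each install covers the half-open
# interval [install_date, next_install_date) for its user.
query = """
    select r.revenue_date, r.user_id, r.revenue, i.install_date, i.source
    from revenue r
    left join (
        select i.*,
               lead(install_date) over (
                   partition by user_id order by install_date
               ) as next_install_date
        from installs i
    ) i
      on r.user_id = i.user_id
     and r.revenue_date >= i.install_date
     and (r.revenue_date < i.next_install_date
          or i.next_install_date is null)
    order by r.user_id, r.revenue_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The `user_b` row comes back with empty install fields, so the row count matches the revenue table. Note that the condition uses `>=`, so revenue dated the same day as an install matches that install. The question asks for installs strictly before the revenue date; one option there would be to change the bounds to `>` and `<=`.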



Source: https://stackoverflow.com/questions/65886871/join-by-nearest-date-for-the-table-with-duplicate-records-in-bigquery
