Question
We have two columns, id and monthid.
The output I'm looking for splits monthid into year and quarter. The output column should be per quarter: if the id is active in that quarter, the output should be 1, otherwise 0. If the id appears in any month of a quarter (e.g. in only one month of the 1st quarter), the output is still 1.
The data looks like this:
id month
-----------------------------------
100 2012-03-01 00:00:00.0
100 2015-09-01 00:00:00.0
100 2016-10-01 00:00:00.0
100 2015-11-01 00:00:00.0
100 2014-01-01 00:00:00.0
100 2013-04-01 00:00:00.0
100 2014-12-01 00:00:00.0
100 2015-02-01 00:00:00.0
100 2014-06-01 00:00:00.0
100 2013-01-01 00:00:00.0
100 2014-05-01 00:00:00.0
100 2016-05-01 00:00:00.0
100 2013-07-01 00:00:00.0
The result should be something like:
ID YEAR QTR output (1 or 0)
--------------------------------------------------
100 2012 1 1
100 2012 2 0
100 2012 3 0
100 2012 4 0
100 2013 1 1
100 2013 2 1
100 2013 3 1
100 2013 4 0
Below is what I tried, but it doesn't return the expected results. Please help me achieve this. I also want the rows where the output is 0.
select a.id,a.year,a.month,
CASE WHEN a.month BETWEEN 1 AND 4 THEN 1
ELSE 0 END as output
from
(select id,trim(substring(claim_month_id,1,4)) as year,(INT((MONTH(monthid)-1)/3)+1) as month from test) a
group by a.id,a.year,a.month
Any help would be appreciated.
Answer 1:
@Ani: Hive has no hierarchical query that could generate the four quarters (1, 2, 3, 4), so I created a small table for them. Then I take every patient id and year that exists in the ims_patient_activity_diagnosis table and combine them with the four quarters. Finally, I right join the activity table to that set of all possible id, year and quarter combinations. If a combination has no match in the activity table, there was no activity for that id, year and quarter, and I assign activity=0 to those rows. I also inserted patient id=200 to check that the query works with more than one patient id in the table. Hope this helps. Thanks.
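-- Helper table holding the four quarters; the column is named month but stores the quarter number.
create table dbo.qtrs(month int);
insert into dbo.qtrs values (1),(2),(3),(4);
-- Subquery qtr: every (id, year) present in the data, crossed with the four quarters.
-- The right join keeps all of those rows; ims.id is NULL where there was no activity.
-- Note: SORT BY orders rows within each reducer only; use ORDER BY if a total order is required.
select DISTINCT NVL(ims.id, qtr.id) as patient_id,
qtr.year as year,
qtr.month as month,
CASE WHEN ims.id > 0 THEN 1 ELSE 0 END as activity
from sandbox_grwi.ims_patient_activity_diagnosis ims
right join (select distinct ims.id,YEAR(ims.month_dt) as year,qtrs.month from sandbox_grwi.ims_patient_activity_diagnosis ims join dbo.qtrs qtrs) qtr
on (ims.id=qtr.id and YEAR(ims.month_dt)=qtr.year and INT((MONTH(month_dt)-1)/3)+1=qtr.month)
sort by patient_id, year, month;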
Sample Result:
p_id year month activity
100 2012 1 1
100 2012 2 0
100 2012 3 0
100 2012 4 0
100 2013 1 1
100 2013 2 1
100 2013 3 1
100 2013 4 0
100 2014 1 1
100 2014 2 1
100 2014 3 0
100 2014 4 1
100 2015 1 1
100 2015 2 0
100 2015 3 1
100 2015 4 1
100 2016 1 0
100 2016 2 1
100 2016 3 0
100 2016 4 1
200 2012 1 1
200 2012 2 0
200 2012 3 0
200 2012 4 0
200 2013 1 0
200 2013 2 1
200 2013 3 0
200 2013 4 0
Additional sample data (the rows inserted for patient id=200):
insert into sandbox_grwi.ims_patient_activity_diagnosis values
(200, '2012-03-01'),
(200, '2013-04-01');
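As a cross-check outside Hive, here is a minimal Python sketch of the same logic, using the sample rows from the question plus the two extra rows for id 200. The names rows, quarter, active and result are made up for this illustration and are not part of the query above. It is not a replacement for the Hive query, just a way to verify the expected 1/0 values.

```python
from datetime import date

# Sample rows from the question (id 100) plus the two extra rows for id 200.
rows = [
    (100, date(2012, 3, 1)), (100, date(2015, 9, 1)), (100, date(2016, 10, 1)),
    (100, date(2015, 11, 1)), (100, date(2014, 1, 1)), (100, date(2013, 4, 1)),
    (100, date(2014, 12, 1)), (100, date(2015, 2, 1)), (100, date(2014, 6, 1)),
    (100, date(2013, 1, 1)), (100, date(2014, 5, 1)), (100, date(2016, 5, 1)),
    (100, date(2013, 7, 1)),
    (200, date(2012, 3, 1)), (200, date(2013, 4, 1)),
]


def quarter(d):
    # Same arithmetic as INT((MONTH(d) - 1) / 3) + 1 in the query.
    return (d.month - 1) // 3 + 1


# (id, year, quarter) combinations that actually occur in the data.
active = {(pid, d.year, quarter(d)) for pid, d in rows}

# Scaffold: every (id, year) seen in the data crossed with quarters 1..4,
# mirroring the subquery that joins the activity table to the quarters table.
id_years = sorted({(pid, d.year) for pid, d in rows})
result = [
    (pid, year, q, 1 if (pid, year, q) in active else 0)
    for pid, year in id_years
    for q in (1, 2, 3, 4)
]

for row in result:
    print(*row)
```

Printing result gives one line per id, year and quarter, in the same layout as the sample result.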
Source: https://stackoverflow.com/questions/48738432/query-to-divide-data