Question
I would like to replace the NULL values in a particular column with values from the same column.
I have tried the query below:
select
  d_day,
  COALESCE(
    val,
    LAST_VALUE(val, TRUE) OVER (ORDER BY d_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
  ) as val
from data_table
Answer 1:
One way to do it is with two window functions. Here is an example:
with tmp_table as (
  select 1 as ts, 3 as val
  union all
  select 2 as ts, NULL as val
  union all
  select 3 as ts, NULL as val
  union all
  select 4 as ts, 4 as val
  union all
  select 5 as ts, NULL as val
  union all
  select 6 as ts, 5 as val
  union all
  select 7 as ts, 6 as val
)
, rank_table as (
  -- rnk: sum of val from the current row to the last row; a NULL adds nothing,
  -- so each NULL row gets the same rnk as the next non-NULL row
  select *, SUM(val) OVER (ORDER BY ts ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from tmp_table
)
-- every row takes the non-NULL val of its rnk group
select *, max(val) over (partition by rnk) as filled_val
from rank_table
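To see why this works, it can help to trace the two window functions by hand. The following is a minimal Python sketch, not Hive code. The helper name `backfill_by_suffix_sum` is hypothetical, and the rows are the same as in tmp_table above. It computes the sum from the current row to the last row (rnk), groups rows by that sum, and takes the maximum per group.

```python
from collections import defaultdict

def backfill_by_suffix_sum(vals):
    """Mimic SUM(val) OVER (... CURRENT ROW AND UNBOUNDED FOLLOWING)
    followed by MAX(val) OVER (PARTITION BY rnk). None stands for NULL."""
    # rnk: sum of the non-NULL values from each row to the last row.
    # SQL SUM ignores NULLs and yields NULL when it sees no value at all.
    rnk = [None] * len(vals)
    running = None
    for i in range(len(vals) - 1, -1, -1):
        if vals[i] is not None:
            running = vals[i] if running is None else running + vals[i]
        rnk[i] = running

    # MAX(val) per rnk group; MAX also ignores NULLs.
    group_max = defaultdict(lambda: None)
    for v, r in zip(vals, rnk):
        if v is not None:
            group_max[r] = v if group_max[r] is None else max(group_max[r], v)

    return rnk, [group_max[r] for r in rnk]

vals = [3, None, None, 4, None, 5, 6]   # val for ts = 1..7
rnk, filled = backfill_by_suffix_sum(vals)
print(rnk)     # [18, 15, 15, 15, 11, 11, 6]
print(filled)  # [3, 4, 4, 4, 5, 5, 6]
```

Each NULL is replaced by the next non-NULL value in ts order. Trailing NULLs, if there are any, stay NULL because their rnk is NULL as well.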
So in your case:
with rank_table as (
  -- rnk is the same for a NULL row and the next non-NULL row in d_day order
  select *, SUM(val) OVER (ORDER BY d_day ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from your_table
)
select *, max(val) over (partition by rnk) as filled_val
from rank_table
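One caveat worth checking against your data: grouping by the sum is only guaranteed to separate rows correctly when every non-NULL val is positive. A 0 leaves the sum unchanged, so that row lands in the next row's group and is overwritten by the MAX; negative values can cause similar collisions. The sketch below is plain Python with a hypothetical helper `fill_by_group`, not Hive code. It compares grouping by the sum with grouping by the count of non-NULL values from the current row onward. In the query that would presumably mean COUNT(val) in place of SUM(val); this is an assumption to verify, not part of the original answer.

```python
def fill_by_group(vals, key):
    """Group rows by an aggregate of the non-NULL values from the current
    row to the end, then give every row the MAX of its group.
    None stands for NULL; key is sum (as in the query) or len (a count)."""
    groups = []
    for i in range(len(vals)):
        rest = [v for v in vals[i:] if v is not None]
        groups.append(key(rest) if rest else None)

    out = []
    for g in groups:
        members = [v for v, h in zip(vals, groups) if h == g and v is not None]
        out.append(max(members) if members else None)
    return out

vals = [0, None, 5, None, 2]
print(fill_by_group(vals, sum))  # [5, 5, 5, 2, 2]  the leading 0 is overwritten
print(fill_by_group(vals, len))  # [0, 5, 5, 2, 2]  the leading 0 survives
```

If val is known to be strictly positive, as in the example data, both groupings give the same result.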
Keep in mind that the ORDER BY d_day in the first window function has no PARTITION BY, so that step of the job will run on a single reducer. If your data is really large, it may take a while to finish.
Source: https://stackoverflow.com/questions/51820146/in-hive-replacing-the-null-value-by-the-same-column-values-using-coalesce