Question
My Hive version is 0.13. I have two tables, table_1 and table_2.
table_1 contains:
customer_id | items | price | updated_date
------------+-------+-------+-------------
10          | watch | 1000  | 20170626
11          | bat   | 400   | 20170625
table_2 contains:
customer_id | items    | price | updated_date
------------+----------+-------+-------------
10          | computer | 20000 | 20170624
I want to merge table_1 into table_2: if a customer_id already exists in table_2, its record should be updated, and if it does not, the row should be appended to table_2.
Hive 0.13 does not support UPDATE, so I tried using a join, but it fails.
Answer 1:
You can use row_number() or a FULL JOIN. Here is an example using row_number():
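-- table_1 holds the incoming rows and table_2 is the target, as in the question.
-- new_flag = 1 marks incoming rows, so for each customer_id the incoming row
-- gets rn = 1 whenever one exists. Otherwise the existing row is kept.
insert overwrite table table_2
select customer_id, items, price, updated_date
from
(
    select customer_id, items, price, updated_date,
           row_number() over (partition by customer_id order by new_flag desc) rn
    from
    (
        select customer_id, items, price, updated_date, 1 as new_flag
        from table_1
        union all
        select customer_id, items, price, updated_date, 0 as new_flag
        from table_2
    ) all_data
) s
where rn = 1;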
Also see this answer for an update using FULL JOIN: https://stackoverflow.com/a/37744071/2700344
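For reference, the FULL JOIN variant might look roughly like the sketch below. It is not copied from the linked answer. It assumes that table_1 holds the incoming rows, that table_2 is the target (as in the question), and that customer_id is unique within each table. The aliases n and o are just illustrative names.

```sql
-- Sketch only, under the assumptions stated above.
-- n = incoming rows (table_1), o = existing rows (table_2).
-- When an incoming row exists for a customer_id, all of its columns are taken.
-- Otherwise the existing row is carried over unchanged.
insert overwrite table table_2
select
    case when n.customer_id is not null then n.customer_id  else o.customer_id  end as customer_id,
    case when n.customer_id is not null then n.items        else o.items        end as items,
    case when n.customer_id is not null then n.price        else o.price        end as price,
    case when n.customer_id is not null then n.updated_date else o.updated_date end as updated_date
from table_2 o
full outer join table_1 n
    on o.customer_id = n.customer_id;
```

With the sample data from the question, table_2 should end up with two rows: customer 10 replaced by the watch row (1000, 20170626) and customer 11 appended with the bat row (400, 20170625). The case expressions are used instead of a per-column coalesce so that a NULL in an incoming row is not silently filled in from the old row.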
Source: https://stackoverflow.com/questions/44753544/how-to-update-table-in-hive-0-13