I have installed Hadoop, Hive, and Hive JDBC, and they are running fine for me. But I still have a problem: how do I delete or update a single record using Hive? UPDATE and DELETE of a record aren't allowed in Hive, but INSERT INTO is acceptable.
A snippet from Hadoop: The Definitive Guide (3rd edition):
Updates, transactions, and indexes are mainstays of traditional databases. Yet, until recently, these features have not been considered a part of Hive's feature set. This is because Hive was built to operate over HDFS data using MapReduce, where full-table scans are the norm and a table update is achieved by transforming the data into a new table. For a data warehousing application that runs over large portions of the dataset, this works well.
Hive doesn't support updates (or deletes), but it does support INSERT INTO, so it is possible to add new rows to an existing table.
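For reference, the INSERT INTO pattern the book refers to only appends rows; it never touches existing ones. A minimal sketch (the employee / employee_staging tables and their columns are made-up examples, not from the book):
-- Appends the selected rows to employee; existing rows are left untouched.
INSERT INTO TABLE employee
SELECT id, name, dept FROM employee_staging;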
Yes, rightly said: Hive does not support the UPDATE operation. But the following alternative can be used to achieve the same result:
Update records in a partitioned Hive table:
Join the two tables (the main and staging tables) using a LEFT OUTER JOIN operation, as below:
INSERT OVERWRITE TABLE main_table PARTITION (c, d)
SELECT t2.a, t2.b, t2.c, t2.d
FROM staging_table t2
LEFT OUTER JOIN main_table t1 ON t1.a = t2.a;
In the above example, main_table and staging_table are partitioned on the (c, d) keys. The tables are joined via a LEFT OUTER JOIN and the result is used to OVERWRITE the partitions in main_table.
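One practical note: a dynamic-partition INSERT OVERWRITE like the one above usually requires dynamic partitioning to be enabled first. These are standard Hive settings, though they are not mentioned in the original answer:
-- Enable dynamic-partition inserts before running the statement above.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;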
A similar approach can be used for UPDATE operations on an un-partitioned Hive table too; see the sketch below.
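For an un-partitioned table the whole table has to be rewritten in a single INSERT OVERWRITE. A minimal sketch, assuming the same main_table / staging_table layout as above with a as the join key (the FULL OUTER JOIN / COALESCE pattern here is one common way to do it and is my addition, not part of the original answer):
-- Rewrite main_table, preferring the staging row wherever the key matches
-- and keeping the existing row otherwise.
INSERT OVERWRITE TABLE main_table
SELECT
  COALESCE(t2.a, t1.a) AS a,
  COALESCE(t2.b, t1.b) AS b,
  COALESCE(t2.c, t1.c) AS c,
  COALESCE(t2.d, t1.d) AS d
FROM main_table t1
FULL OUTER JOIN staging_table t2 ON t1.a = t2.a;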
If you want to delete all records, then as a workaround you can load an empty file into the table in OVERWRITE mode:
hive> LOAD DATA LOCAL INPATH '/root/hadoop/textfiles/empty.txt' OVERWRITE INTO TABLE employee;
Loading data to table default.employee
Table default.employee stats: [numFiles=1, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 0.19 seconds
hive> SELECT * FROM employee;
OK
Time taken: 0.052 seconds
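If you only need to remove particular rows rather than the whole table, the same OVERWRITE idea works by re-selecting everything except the rows you want gone. A sketch, assuming a hypothetical id column on the employee table:
-- Rewrite the table, keeping every row except the one being "deleted".
INSERT OVERWRITE TABLE employee
SELECT * FROM employee WHERE id <> 1001;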