Hive loading in partitioned table


长发绾君心

I have a log file in HDFS whose values are comma-delimited. For example:

2012-10-11 12:00,opened_browser,userid111,deviceid222

Now I want to load

5 Answers

梦谈多话

2020-12-05 06:03
Ning Zhang has a great response on this topic at http://grokbase.com/t/hive/user/114frbfg0y/can-i-use-hive-dynamic-partition-while-loading-data-into-tables.

The short version:
1. LOAD DATA simply copies (or moves) the files into the table's location. It does not read the rows, so it cannot work out which partition each row belongs to.
2. Load the data into an intermediate table first (or define an external table pointing to all the files), then let a dynamic partition insert write it into the partitioned table.
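
The two-step approach above can be sketched as follows. This is a minimal illustration, not the original poster's schema: the table names (`logs_stg`, `logs`), the column names, the location `/path/to/logs`, and the choice to partition by day are all assumptions based on the sample log line in the question.

```sql
-- Hypothetical names throughout: logs_stg, logs, the columns, and /path/to/logs.
-- Step 1: an external staging table over the directory holding the raw files.
CREATE EXTERNAL TABLE logs_stg (
  event_time STRING,   -- e.g. '2012-10-11 12:00'
  action     STRING,   -- e.g. 'opened_browser'
  user_id    STRING,
  device_id  STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/to/logs';

-- Step 2: the partitioned target table (partitioning by day is an assumed choice).
CREATE TABLE logs (
  event_time STRING,
  action     STRING,
  user_id    STRING,
  device_id  STRING
)
PARTITIONED BY (dt STRING);

-- Allow Hive to derive partition values from the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Step 3: dynamic partition insert. The partition column must be the last
-- expression in the SELECT list; here it is the date part of event_time.
INSERT OVERWRITE TABLE logs PARTITION (dt)
SELECT event_time, action, user_id, device_id,
       substr(event_time, 1, 10) AS dt
FROM logs_stg;
```

Unlike LOAD DATA, the INSERT ... SELECT runs as a job that reads every row, which is why it can route each row to the right partition.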
心在旅途

2020-12-05 06:03

I worked on this very same scenario, but we took a different approach: we created a separate HDFS data file for each partition we needed to load.

Since our data comes from a MapReduce job, we used MultipleOutputs in our Reducer class to multiplex the data into the corresponding partition files. Afterwards, it is just a matter of building the load script, taking the partition value from the HDFS file name.
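
A generated load script of this kind might look like the sketch below. The table name `logs`, its columns, the partition column `dt`, and the output paths are hypothetical; the only assumption that matters is that each file holds rows for exactly one partition and that its name carries the partition value.

```sql
-- Hypothetical table; each input file is assumed to contain one day's rows only.
CREATE TABLE IF NOT EXISTS logs (
  event_time STRING,
  action     STRING,
  user_id    STRING,
  device_id  STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- One statement per file; the partition value is taken from the file name.
-- The paths below are placeholders for the MapReduce job's output files.
LOAD DATA INPATH '/path/to/output/2012-10-11' INTO TABLE logs PARTITION (dt='2012-10-11');
LOAD DATA INPATH '/path/to/output/2012-10-12' INTO TABLE logs PARTITION (dt='2012-10-12');
```

Because every statement names its partition explicitly, no dynamic partition settings are needed with this approach.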


無奈伤痛

2020-12-05 06:08

CREATE TABLE India (
    OFFICE_NAME       STRING,
    OFFICE_STATUS     STRING,
    PINCODE           INT,
    TELEPHONE         BIGINT,
    TALUK             STRING,
    DISTRICT          STRING,
    POSTAL_DIVISION   STRING,
    POSTAL_REGION     STRING,
    POSTAL_CIRCLE     STRING
)
PARTITIONED BY (STATE STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

5. Instruct Hive to load partitions dynamically

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
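
With these settings in place, the load into India could be completed along the following lines. This is only a sketch: the staging table `india_stg`, the file path, and the assumption that the source file carries the state as its last field are not part of the answer above.

```sql
-- Hypothetical staging table: the same columns as India, with STATE stored
-- as an ordinary trailing column instead of a partition column.
CREATE TABLE india_stg (
    OFFICE_NAME       STRING,
    OFFICE_STATUS     STRING,
    PINCODE           INT,
    TELEPHONE         BIGINT,
    TALUK             STRING,
    DISTRICT          STRING,
    POSTAL_DIVISION   STRING,
    POSTAL_REGION     STRING,
    POSTAL_CIRCLE     STRING,
    STATE             STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- '/path/to/india.csv' is a placeholder for the real HDFS file.
LOAD DATA INPATH '/path/to/india.csv' INTO TABLE india_stg;

-- Dynamic partition insert: STATE comes last so Hive uses it as the partition value.
INSERT OVERWRITE TABLE India PARTITION (STATE)
SELECT OFFICE_NAME, OFFICE_STATUS, PINCODE, TELEPHONE, TALUK, DISTRICT,
       POSTAL_DIVISION, POSTAL_REGION, POSTAL_CIRCLE, STATE
FROM india_stg;
```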


醉酒成梦

2020-12-05 06:10
1. As mentioned in @Denny Lee's answer, we need a staging table (invites_stg), either managed or external, and then an INSERT from the staging table into the partitioned table (India in the statement below).
2. Make sure these two properties are set:
```
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
```
3. Finally, insert into the partitioned table. The partition column (STATE) must be the last column in the SELECT list:
```
-- Replace <data columns> with the table's non-partition columns, in table order.
INSERT OVERWRITE TABLE India PARTITION (STATE) SELECT <data columns>, STATE FROM invites_stg;
```
See this link for more help: http://www.edupristine.com/blog/hive-partitions-example
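
After the insert finishes, one way to confirm that the partitions were created is sketched below, assuming the India table from the statement above.

```sql
-- List the partitions Hive created, one line per STATE value.
SHOW PARTITIONS India;

-- Row count per partition, to check that rows landed where expected.
SELECT STATE, COUNT(*) AS row_count
FROM India
GROUP BY STATE;
```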
谎友^

2020-12-05 06:15

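How about loading straight into a static partition? This works when the file holds data for a single, known partition value: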

LOAD DATA INPATH '/path/to/HDFS/dir/file.csv' OVERWRITE INTO TABLE DB.EXAMPLE_TABLE PARTITION (PARTITION_COL_NAME='PARTITION_VALUE');
