Exporting a Hive Table to an S3 bucket

user495732 Why Me

Yes, you have to export your data at the end of your Hive session and import it again at the start of the next one.

To do this you need to create a table that is mapped onto an S3 bucket and directory:

CREATE TABLE csvexport (
  id BIGINT, time STRING, log STRING
  )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
 LINES TERMINATED BY '\n'
 STORED AS TEXTFILE
 LOCATION 's3n://bucket/directory/';

Insert data into the S3 table; when the insert is complete, the directory will have a CSV file:

 INSERT OVERWRITE TABLE csvexport 
 SELECT id, time, log
 FROM csvimport;

Your table is now preserved, and when you create a new Hive instance you can reimport your data.
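For example, in the new session you can map an external table onto the same S3 location and copy the rows back into a local table. This is only a minimal sketch, assuming the same id/time/log schema as above; the csvreimport name is just for illustration:

CREATE EXTERNAL TABLE csvreimport (
  id BIGINT, time STRING, log STRING
  )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
 LINES TERMINATED BY '\n'
 STORED AS TEXTFILE
 LOCATION 's3n://bucket/directory/';

-- copy the preserved rows back into a local (HDFS-backed) table
CREATE TABLE csvimport (id BIGINT, time STRING, log STRING);
INSERT OVERWRITE TABLE csvimport SELECT id, time, log FROM csvreimport;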

Your table can be stored in a few different formats depending on where you want to use it.
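For example (again just a sketch, with a hypothetical seqexport table), the same data could be written as a SequenceFile, a Hadoop format convenient for other MapReduce jobs:

-- same columns, different storage format
CREATE EXTERNAL TABLE seqexport (id BIGINT, time STRING, log STRING)
STORED AS SEQUENCEFILE
LOCATION 's3n://bucket/seq-directory/';
INSERT OVERWRITE TABLE seqexport SELECT id, time, log FROM csvimport;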

Thejas

The above query needs to use the EXTERNAL keyword, i.e.:

CREATE EXTERNAL TABLE csvexport ( id BIGINT, time STRING, log STRING ) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' 
STORED AS TEXTFILE LOCATION 's3n://bucket/directory/';
INSERT OVERWRITE TABLE csvexport SELECT id, time, log FROM csvimport;

Another alternative is to use the query

INSERT OVERWRITE DIRECTORY 's3n://bucket/directory/' SELECT id, time, log FROM csvimport;

The table's data is then stored in the S3 directory with Hive's default delimiters.
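To read that data back later you need a table whose row format matches. Hive's default field delimiter is the '\001' control character, so a minimal sketch (assuming the same columns; the reimported name is just for illustration) would be:

CREATE EXTERNAL TABLE reimported (id BIGINT, time STRING, log STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE
LOCATION 's3n://bucket/directory/';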

hadooper

If you can access the AWS console and have the "Access Key Id" and "Secret Access Key" for your account, you can try this too:

CREATE TABLE csvexport(id BIGINT, time STRING, log STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3n://"access id":"secret key"@bucket/folder/path';

Now insert the data as the others stated above:

INSERT OVERWRITE TABLE csvexport SELECT id, time, log FROM csvimport;
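If you would rather not embed the keys in the LOCATION URI, an alternative (a sketch with placeholder values) is to set the s3n credentials as Hadoop properties for the session and use a plain S3 location:

SET fs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY_ID;
SET fs.s3n.awsSecretAccessKey=YOUR_SECRET_ACCESS_KEY;

-- now the LOCATION can omit the credentials
CREATE TABLE csvexport (id BIGINT, time STRING, log STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3n://bucket/folder/path';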