Hive Export和Import介绍及操作示例

不想你离开。 提交于 2020-05-08 03:16:38

在hive-0.8.0后引入了import/export命令。
Export命令可以导出一张表或分区的数据和元数据信息到一个输出位置,并且导出数据可以被移动到另一个hadoop集群或hive实例,并且可以通过import命令导入数据。
当导出一个分区表,原始数据可能在hdfs的不同位置,export/import命令也支持导出分区表的不同子分区。
本次不考虑复制功能。

语法

EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])]
  TO 'export_target_path'
  
IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
  FROM 'source_path'
  [LOCATION 'import_target_path']  

简单的导出导入

# 导出非分区表
export table tb_insert_multi_03 to '/tmp/hive/export/tb_insert_multi_03';
import from  '/tmp/hive/export/tb_insert_multi_03';
# 操作过程
0: jdbc:hive2://node225:10000/db01> desc tb_insert_multi_03;
OK
+-----------+---------------------+----------+--+
| col_name  |      data_type      | comment  |
+-----------+---------------------+----------+--+
| id        | int                 |          |
| name      | string              |          |
| hobby     | array<string>       |          |
| add       | map<string,string>  |          |
+-----------+---------------------+----------+--+
4 rows selected (0.127 seconds)
0: jdbc:hive2://node225:10000/db01> export table tb_insert_multi_03 to '/tmp/hive/export/tb_insert_multi_03';
Copying data from file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_09-30-38_538_7055450388292522332-21/-local-10000/_metadata
Copying file: file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_09-30-38_538_7055450388292522332-21/-local-10000/_metadata
Copying data from hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03/000000_0
OK
# 导入前要求表没数据或者没有该表
No rows affected (0.224 seconds)
0: jdbc:hive2://node225:10000/db01> import from  '/tmp/hive/export/tb_insert_multi_03';
FAILED: SemanticException [Error 10119]: Table exists and contains data files
Error: Error while compiling statement: FAILED: SemanticException [Error 10119]: Table exists and contains data files (state=42000,code=10119)
0: jdbc:hive2://node225:10000/db01> truncate table tb_insert_multi_03;
OK
No rows affected (0.24 seconds)
0: jdbc:hive2://node225:10000/db01> import from  '/tmp/hive/export/tb_insert_multi_03';
Copying data from hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data/000000_0
Loading data to table db01.tb_insert_multi_03
OK
No rows affected (0.337 seconds)
0: jdbc:hive2://node225:10000/db01> select * from tb_insert_multi_03;
OK
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
| tb_insert_multi_03.id  | tb_insert_multi_03.name  | tb_insert_multi_03.hobby  |           tb_insert_multi_03.add            |
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
| 1                      | xiaoming                 | ["book","TV","code"]      | {"beijing":"chaoyang","shagnhai":"pudong"}  |
| 2                      | lilei                    | ["book","code"]           | {"nanjing":"jiangning","taiwan":"taibei"}   |
| 3                      | lihua                    | ["music","book"]          | {"heilongjiang":"haerbin"}                  |
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
3 rows selected (0.199 seconds)
0: jdbc:hive2://node225:10000/db01> drop table tb_insert_multi_03;
OK
No rows affected (0.269 seconds)
0: jdbc:hive2://node225:10000/db01> import from  '/tmp/hive/export/tb_insert_multi_03';
Copying data from hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data/000000_0
Loading data to table db01.tb_insert_multi_03
OK
No rows affected (0.477 seconds)
0: jdbc:hive2://node225:10000/db01> select * from tb_insert_multi_03;
OK
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
| tb_insert_multi_03.id  | tb_insert_multi_03.name  | tb_insert_multi_03.hobby  |           tb_insert_multi_03.add            |
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
| 1                      | xiaoming                 | ["book","TV","code"]      | {"beijing":"chaoyang","shagnhai":"pudong"}  |
| 2                      | lilei                    | ["book","code"]           | {"nanjing":"jiangning","taiwan":"taibei"}   |
| 3                      | lihua                    | ["music","book"]          | {"heilongjiang":"haerbin"}                  |
+------------------------+--------------------------+---------------------------+---------------------------------------------+--+
3 rows selected (0.187 seconds)

导入时修改表名

# 相关命令
export table tb_insert_multi_03 to '/tmp/hive/export/tb_insert_multi_03';
import table tb_insert_multi_03_new from '/tmp/hive/export/tb_insert_multi_03';

# 操作过程
0: jdbc:hive2://node225:10000/db01> desc tb_insert_multi_03_new;
FAILED: SemanticException [Error 10001]: Table not found tb_insert_multi_03_new
Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Table not found tb_insert_multi_03_new (state=42S02,code=10001)
0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_03_new from '/tmp/hive/export/tb_insert_multi_03';
Copying data from hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_03/data/000000_0
Loading data to table db01.tb_insert_multi_03_new
OK
No rows affected (0.518 seconds)
0: jdbc:hive2://node225:10000/db01> desc tb_insert_multi_03_new;
OK
+-----------+---------------------+----------+--+
| col_name  |      data_type      | comment  |
+-----------+---------------------+----------+--+
| id        | int                 |          |
| name      | string              |          |
| hobby     | array<string>       |          |
| add       | map<string,string>  |          |
+-----------+---------------------+----------+--+
4 rows selected (0.11 seconds)
0: jdbc:hive2://node225:10000/db01> select * from tb_insert_multi_03_new;
OK
+----------------------------+------------------------------+-------------------------------+---------------------------------------------+--+
| tb_insert_multi_03_new.id  | tb_insert_multi_03_new.name  | tb_insert_multi_03_new.hobby  |         tb_insert_multi_03_new.add          |
+----------------------------+------------------------------+-------------------------------+---------------------------------------------+--+
| 1                          | xiaoming                     | ["book","TV","code"]          | {"beijing":"chaoyang","shagnhai":"pudong"}  |
| 2                          | lilei                        | ["book","code"]               | {"nanjing":"jiangning","taiwan":"taibei"}   |
| 3                          | lihua                        | ["music","book"]              | {"heilongjiang":"haerbin"}                  |
+----------------------------+------------------------------+-------------------------------+---------------------------------------------+--+
3 rows selected (0.204 seconds)

导出分区

# 相关命令
export table tb_insert_multi_02 partition(part_tag5='first',part_tag6=100) to '/tmp/hive/export/tb_insert_multi_02';
import from '/tmp/hive/export/tb_insert_multi_02';

# 操作过程
0: jdbc:hive2://node225:10000/db01> desc tb_insert_multi_02;
OK
+--------------------------+-----------------------+-----------------------+--+
|         col_name         |       data_type       |        comment        |
+--------------------------+-----------------------+-----------------------+--+
| id                       | int                   |                       |
| name                     | string                |                       |
| hobby                    | array<string>         |                       |
| add                      | map<string,string>    |                       |
| part_tag5                | string                |                       |
| part_tag6                | int                   |                       |
|                          | NULL                  | NULL                  |
| # Partition Information  | NULL                  | NULL                  |
| # col_name               | data_type             | comment               |
|                          | NULL                  | NULL                  |
| part_tag5                | string                |                       |
| part_tag6                | int                   |                       |
+--------------------------+-----------------------+-----------------------+--+
12 rows selected (0.14 seconds)
0: jdbc:hive2://node225:10000/db01> export table tb_insert_multi_02 partition(part_tag5='first',part_tag6=100) to '/tmp/hive/export/tb_insert_multi_02';
Copying data from file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-11-24_297_4948890082470094094-21/-local-10000/_metadata
Copying file: file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-11-24_297_4948890082470094094-21/-local-10000/_metadata
Copying data from hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0_copy_1
OK
No rows affected (0.387 seconds)
0: jdbc:hive2://node225:10000/db01> truncate table tb_insert_multi_02;
OK
No rows affected (0.437 seconds)

# 导入前表不能存在,清空也不行
0: jdbc:hive2://node225:10000/db01> import from '/tmp/hive/export/tb_insert_multi_02';
FAILED: SemanticException [Error 10118]: Partition already exists part_tag5=first,part_tag6=100
Error: Error while compiling statement: FAILED: SemanticException [Error 10118]: Partition already exists part_tag5=first,part_tag6=100 (state=42000,code=10118)
0: jdbc:hive2://node225:10000/db01> drop table tb_insert_multi_02;
OK
No rows affected (0.413 seconds)
0: jdbc:hive2://node225:10000/db01> import from '/tmp/hive/export/tb_insert_multi_02';
Copying data from hdfs://ns1/tmp/hive/export/tb_insert_multi_02/part_tag5=first/part_tag6=100
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0_copy_1
Loading data to table db01.tb_insert_multi_02 partition (part_tag5=first, part_tag6=100)
OK
No rows affected (1.094 seconds)

导入分区

# 相关命令
export table tb_insert_multi_02  to '/tmp/hive/export/tb_insert_multi_02_new_part';
import table tb_insert_multi_02 partition(part_tag5='first',part_tag6=100) from '/tmp/hive/export/tb_insert_multi_02_new_part';

#操作过程
0: jdbc:hive2://node225:10000/db01> export table tb_insert_multi_02  to '/tmp/hive/export/tb_insert_multi_02_new_part';
Copying data from file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-23-26_282_5284710576756694959-21/-local-10000/_metadata
Copying file: file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-23-26_282_5284710576756694959-21/-local-10000/_metadata
Copying data from hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_02/part_tag5=first/part_tag6=100/000000_0_copy_1
OK
No rows affected (0.461 seconds)

0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_02 partition(part_tag5='second',part_tag6=200) from '/tmp/hive/export/tb_insert_multi_02_new_part';
FAILED: SemanticException [Error 10006]: Partition not found  - Specified partition not found in import directory
Error: Error while compiling statement: FAILED: SemanticException [Error 10006]: Partition not found  - Specified partition not found in import directory (state=42000,code=10006)
0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_02 partition(part_tag5='first',part_tag6=100) from '/tmp/hive/export/tb_insert_multi_02_new_part';
FAILED: SemanticException [Error 10118]: Partition already exists part_tag5=first,part_tag6=100
Error: Error while compiling statement: FAILED: SemanticException [Error 10118]: Partition already exists part_tag5=first,part_tag6=100 (state=42000,code=10118)
# 导入的目标表不能存在
0: jdbc:hive2://node225:10000/db01> drop table tb_insert_multi_02;
OK
No rows affected (0.31 seconds)
0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_02 partition(part_tag5='first',part_tag6=100) from '/tmp/hive/export/tb_insert_multi_02_new_part';
Copying data from hdfs://ns1/tmp/hive/export/tb_insert_multi_02_new_part/part_tag5=first/part_tag6=100
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_02_new_part/part_tag5=first/part_tag6=100/000000_0
Copying file: hdfs://ns1/tmp/hive/export/tb_insert_multi_02_new_part/part_tag5=first/part_tag6=100/000000_0_copy_1
Loading data to table db01.tb_insert_multi_02 partition (part_tag5=first, part_tag6=100)
OK
No rows affected (1.032 seconds)

导入指定location

试验操作持续不成功

export table tb_insert_multi_03 to '/tmp/hive/export/new/tb_insert_multi_03';
# 持续不成功
import table tb_insert_multi_03 from '/tmp/hive/export/new/tb_insert_multi_03' location '/usr/local/hive-2.1.1/data_dir/tb_insert_multi_03';

# 操作过程
0: jdbc:hive2://node225:10000/db01> export table tb_insert_multi_03 to '/tmp/hive/export/new/tb_insert_multi_03';
Copying data from file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-33-00_381_204953706829274488-21/-local-10000/_metadata
Copying file: file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-33-00_381_204953706829274488-21/-local-10000/_metadata
Copying data from hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03/000000_0
OK
No rows affected (0.243 seconds)
0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_03 from '/tmp/hive/export/new/tb_insert_multi_03' location '/usr/local/hive-2.1.1/data_dir/tb_insert_multi_03';
FAILED: SemanticException [Error 10120]: The existing table is not compatible with the import spec.   Location does not match
Error: Error while compiling statement: FAILED: SemanticException [Error 10120]: The existing table is not compatible with the import spec.   Location does not match (state=42000,code=10120)
0: jdbc:hive2://node225:10000/db01> drop table tb_insert_multi_03;
OK
No rows affected (0.287 seconds)
0: jdbc:hive2://node225:10000/db01> import table tb_insert_multi_03 from '/tmp/hive/export/new/tb_insert_multi_03' location '/usr/local/hive-2.1.1/data_dir/tb_insert_multi_03';
FAILED: SemanticException [Error 10324]: Import Semantic Analyzer Error
Error: Error while compiling statement: FAILED: SemanticException [Error 10324]: Import Semantic Analyzer Error (state=42000,code=10324)

导入为外部表

# 相关命令
export table tb_insert_multi_03 to '/tmp/hive/export/tb_insert_multi_03_local';
import external table tb_insert_multi_03_external from '/tmp/hive/export/tb_insert_multi_03_local';

# 操作过程
0: jdbc:hive2://node225:10000/db01> export table tb_insert_multi_03 to '/tmp/hive/export/tb_insert_multi_03_local';
Copying data from file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-28-26_583_4212076159541457251-21/-local-10000/_metadata
Copying file: file:/tmp/root/a88cded1-b448-4be9-b3b5-1fd40e12e3a2/hive_2018-10-12_10-28-26_583_4212076159541457251-21/-local-10000/_metadata
Copying data from hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03
Copying file: hdfs://ns1/user/hive/warehouse/db01.db/tb_insert_multi_03/000000_0
OK
No rows affected (0.235 seconds)

0: jdbc:hive2://node225:10000/db01> import external table tb_insert_multi_03_external from '/tmp/hive/export/tb_insert_multi_03_local';
OK
No rows affected (0.173 seconds)
0: jdbc:hive2://node225:10000/db01> desc tb_insert_multi_03_external;
OK
+-----------+---------------------+----------+--+
| col_name  |      data_type      | comment  |
+-----------+---------------------+----------+--+
| id        | int                 |          |
| name      | string              |          |
| hobby     | array<string>       |          |
| add       | map<string,string>  |          |
+-----------+---------------------+----------+--+
4 rows selected (0.107 seconds)
0: jdbc:hive2://node225:10000/db01> select * from tb_insert_multi_03_external;
OK
+---------------------------------+-----------------------------------+------------------------------------+---------------------------------------------+--+
| tb_insert_multi_03_external.id  | tb_insert_multi_03_external.name  | tb_insert_multi_03_external.hobby  |       tb_insert_multi_03_external.add       |
+---------------------------------+-----------------------------------+------------------------------------+---------------------------------------------+--+
| 1                               | xiaoming                          | ["book","TV","code"]               | {"beijing":"chaoyang","shagnhai":"pudong"}  |
| 2                               | lilei                             | ["book","code"]                    | {"nanjing":"jiangning","taiwan":"taibei"}   |
| 3                               | lihua                             | ["music","book"]                   | {"heilongjiang":"haerbin"}                  |
+---------------------------------+-----------------------------------+------------------------------------+---------------------------------------------+--+
3 rows selected (0.258 seconds)

 

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!