How to reduce the number of files generated by SQL “Alter Table/Partition Concatenate” in Hive?

Submitted by 吃可爱长大的小学妹 on 2019-12-07 04:39:22

Question


Hive version: 1.2.1

Configuration:

set hive.execution.engine=tez;
set hive.merge.mapredfiles=true;
set hive.merge.smallfiles.avgsize=256000000;
set hive.merge.tezfiles=true;

HQL:

ALTER TABLE `table_name` PARTITION (partion_name1 = 'val1', partion_name2='val2', partion_name3='val3', partion_name4='val4') CONCATENATE;

I use this HQL to merge the files of a specific table partition. However, after it runs there are still many files in the output directory, and their sizes are far below 256000000 bytes. How can I decrease the number of output files?

BTW, using MapReduce instead of Tez didn't work either.


Answer 1:


You can set the number of reducers to 1, so that the job creates only one output file.

You can do that with the following:

set mapred.reduce.tasks=1;



Answer 2:


You could try INSERT OVERWRITE TABLE ... PARTITION ( ... ) SELECT * FROM ... instead.

That statement can use the merge setting for Tez files (hive.merge.tezfiles).
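
As a rough sketch of this suggestion, the partition from the question could be rewritten onto itself as shown below. The columns col_a and col_b are hypothetical stand-ins for the table's real non-partition columns, and hive.merge.size.per.task is an extra setting not mentioned in the question, so treat both as assumptions. With a static PARTITION spec the SELECT list should not include the partition columns, which is why the columns are listed explicitly here rather than using SELECT *.

```sql
-- Merge settings taken from the question.
set hive.execution.engine=tez;
set hive.merge.tezfiles=true;
set hive.merge.smallfiles.avgsize=256000000;
-- Assumption: target size of each merged file; not part of the original question.
set hive.merge.size.per.task=256000000;

-- col_a and col_b are hypothetical; replace them with the table's
-- actual non-partition columns, in table order.
INSERT OVERWRITE TABLE `table_name`
PARTITION (partion_name1 = 'val1', partion_name2 = 'val2', partion_name3 = 'val3', partion_name4 = 'val4')
SELECT col_a, col_b
FROM `table_name`
WHERE partion_name1 = 'val1'
  AND partion_name2 = 'val2'
  AND partion_name3 = 'val3'
  AND partion_name4 = 'val4';
```

Because this overwrites the partition with data read from the same partition, it is probably worth trying on a copy of the table first.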



Source: https://stackoverflow.com/questions/33166387/how-to-reduce-generating-files-of-sql-alter-table-partition-concatenate-in-hiv
