Hive: Merging Configuration Settings not working

天命终不由人 2020-12-20 01:09

On Hive 2.2.0, I am populating an ORC table from a 1.34 GB source table using the query

INSERT INTO TABLE TableOrc SELECT * FROM Table; ---- (1)
         


        
1 answer
  • 2020-12-20 01:36

    Your initial average file size is smaller than hive.merge.smallfiles.avgsize, which is why the merge task started. The first two files were merged: 65.01 MB + 67.48 MB = 132.49 MB. That is larger than hive.merge.size.per.task, so the merge task stops adding more files to that result. The merged file is not split afterwards to be exactly 128 MB. The mechanism is quite simple.
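
    To make the arithmetic concrete, here is a rough Python sketch of the behavior described above. It is a simplified model, not Hive's actual implementation: the function name, the greedy accumulation, and both 128 MB thresholds are assumptions chosen to match the numbers in this answer, and the third 60 MB file is hypothetical.

```python
def merge_small_files(sizes_mb, avg_threshold_mb, size_per_task_mb):
    """Simplified model of the merge step; returns the output file sizes in MB.

    avg_threshold_mb stands in for hive.merge.smallfiles.avgsize and
    size_per_task_mb for hive.merge.size.per.task (values are assumptions).
    """
    if not sizes_mb:
        return []

    # The merge is only triggered when the average output file is too small.
    average = sum(sizes_mb) / len(sizes_mb)
    if average >= avg_threshold_mb:
        return list(sizes_mb)

    merged = []
    current = 0.0
    for size in sizes_mb:
        current += size
        # Once the running total reaches the target, close the file.
        # It is not split back down to exactly the target size.
        if current >= size_per_task_mb:
            merged.append(round(current, 2))
            current = 0.0
    if current > 0:
        merged.append(round(current, 2))
    return merged


# Two files from the answer plus one hypothetical 60 MB file.
result = merge_small_files([65.01, 67.48, 60.0], 128, 128)
print(result)  # [132.49, 60.0]
```

    Under this model the first merged file ends up at 132.49 MB rather than 128 MB, because files are only combined, never split.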
