External table does not return the data in its folder

别跟我提以往 2021-01-21 03:02

I have created an external table in Hive at this location:

CREATE EXTERNAL TABLE tb 
(
...
) 
PARTITIONED BY (datehour INT)
ROW FORMAT SERDE 'com.cloudera


        
2 Answers
  •  遇见更好的自我
    2021-01-21 03:22

    When we create a partitioned EXTERNAL TABLE, Hive does not pick up partition data on its own: each partition has to be registered with an ALTER TABLE ... ADD PARTITION statement that gives the data location for that partition. That location need not be under the path specified when the EXTERNAL TABLE was created.

    hive> ALTER TABLE tb ADD PARTITION (datehour=0909201401)
        > LOCATION '/user/cloudera/data/somedatafor_datehour';
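    To make the statement safe to re-run and to check where a partition points, something like the following should work. This is a minimal sketch that reuses the table name, partition value and path from the example above; the exact output layout depends on the Hive version.

```sql
-- Register the partition only if the metastore does not already have it,
-- so the statement can be re-run (for example by a scheduled ETL job).
ALTER TABLE tb ADD IF NOT EXISTS
  PARTITION (datehour=0909201401)
  LOCATION '/user/cloudera/data/somedatafor_datehour';

-- Inspect the partition's metadata; the "Location:" field in the output
-- shows the directory Hive will read for this partition.
DESCRIBE FORMATTED tb PARTITION (datehour=0909201401);
```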
    

    When we specify LOCATION '/user/cloudera/data' while creating an EXTERNAL TABLE (the clause is optional), we can take advantage of repair operations on that table. When a process such as an ETL job copies files into that directory, we can sync the partitions with the EXTERNAL TABLE instead of writing an ALTER TABLE statement for each new partition.

    If we already know the directory structure that Hive would create for a partition (one key=value subdirectory under the table location), we can simply place the data file there, for example '/user/cloudera/data/datehour=0909201401/data.txt', and run the statement shown below:

    hive> MSCK REPAIR TABLE tb;  
    

    The statement above scans the table's location for partition directories and adds any that are missing to the Hive metastore entry for table "tb".
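    A quick way to confirm that the repair picked up the new directory is to list the partitions before and after, then query one of them. This is a sketch that assumes the data file has already been placed under '/user/cloudera/data/datehour=0909201401/' as described above.

```sql
-- Partitions the metastore knows about before the repair.
SHOW PARTITIONS tb;

-- Scan the table location and register partition directories that
-- are present on the file system but missing from the metastore.
MSCK REPAIR TABLE tb;

-- The new datehour partition should now appear in the list.
SHOW PARTITIONS tb;

-- Filtering on the partition column restricts the read to that
-- partition's directory.
SELECT COUNT(*) FROM tb WHERE datehour = 0909201401;
```

    If the count query still returns nothing, the usual things to check are that the directory name matches the partition column name exactly and that the files can be read by the table's SerDe.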
