Drop column of hive table stored as orc

后端 未结 4 527
栀梦
栀梦 2021-01-01 00:03

I have hive table created as below:

create table alpha001(id int, name string) clustered by (id) into 2 buckets store         


        
相关标签:
4条回答
  • 2021-01-01 00:43

    you cant drop column for hive table. On using "ALTER TABLE alpha001 DROP name" it gives error because in hive dropping column is not supported but we cant drop partition.So here ,it searches for Partition name near drop which 'name' column is not.

    So in order to drop column ,you have to 1.drop table and re-create table with column eliminated(but may result in loss of data if its internal table) 2.you can use REPLACE command(refer this https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/Partition/Column)

    0 讨论(0)
  • 2021-01-01 00:48

    You are using the replace command in a wrong way, you need to omit the columns that you want to drop while using replace command.

    describe formatted bucketed_table; OK col_name data_type comment

    col_name data_type comment

    id int firstname string last_name string

    hive>alter table bucketed_table drop column id;

    MismatchedTokenException(58!=196) at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:61 7) at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(Hiv eParser_IdentifiersParser.java:10839) at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:49854 ) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(Hive Parser.java:13140) at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.ja va:9370) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:8803) at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4057) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1786) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1265) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:233) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) FAILED: ParseException line 1:32 mismatched input 'column' expecting PARTITION near 'drop' in drop partition statement

    hive> alter table bucketed_table replace columns(firstname string , lastname string); OK Time taken: 0.164 seconds

    hive> select * from bucketed_table; OK bucketed_table.firstname bucketed_table.lastname 3 nikhil 3 nikhil 1 tutu 1 avi 4 tanvi 10 vivek 1 tutu 1 avi 4 tanvi 10 vivek 5 akshay 5 akshay Time taken: 0.138 seconds, Fetched: 12 row(s)

    hive> describe bucketed_table; OK col_name data_type comment firstname string lastname string

    so from above even if you replace the column , if the bucketing is done on that column it will still persist but if that column doesn't have bucketing on it then it will be removed both from metadata and hdfs.

    0 讨论(0)
  • 2021-01-01 00:52

    I just found out that we are not able to drop the columns in table which is stored as ORC. And only way is we can use "create table as select" option to create a new table with desired columns.

    0 讨论(0)
  • 2021-01-01 00:59

    Unfortunately, you can't! The only way you can delete column from existing table is by using REPLACE COLUMNS keyword. But this can be done only for tables with a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe).

    Your best bet is recreating the schema. Follows the steps.

    1. Check if the table is external. If it isn't, use the following statement to make it external.

      alter table alpha001 set tblproperties('EXTERNAL'='TRUE');
      
    2. Drop the table. Since the table is an external table, you can drop it without dropping the actual table.

    3. Recreate the table with the new schema. You should be able to access the table with new schema.

    Follows a quick sample.

    create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
    
    --assuming your table is not EXTERNAL already
    alter table alpha001 set tblproperties('EXTERNAL'='TRUE');
    
    insert into alpha001 values(1,"A");
    
    select * from alpha001;
    
    OK
    1       A
    
    drop table alpha001;
    
    create table alpha001(id int) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
    
    select * from alpha001;
    
    OK
    1
    Time tak
    

    Hope that helps!

    0 讨论(0)
提交回复
热议问题