Is it possible to compress json in hive external table?

冷暖自知 submitted on 2021-02-10 13:33:16

Question


I want to know how to compress JSON data in a Hive external table. How can it be done? I have created the external table like this:

    CREATE EXTERNAL TABLE tweets (
      id BIGINT, created_at STRING, source STRING, favorited BOOLEAN
    )
    ROW FORMAT SERDE "com.cloudera.hive.serde.JSONSerDe"
    LOCATION "/user/cloudera/tweets";

and I have set the compression properties:

set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;

Input file: test

{ "id": 596344698102419451, "created_at": "MonApr0101: 32: 06+00002013", "source": "blank", "favorited": false }

After that I loaded my JSON file into the HDFS location "/user/cloudera/tweets", but it is not compressed.

Can someone please explain how to do compression in a Hive external table?

Thanks in advance.


Answer 1:


Just gzip your files and put them as-is (*.gz) into the table location.
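A minimal sketch of that approach, assuming the local file is named `test` and using the table location from the question (`/user/cloudera/tweets`):

```shell
# Compress the local JSON file; gzip replaces it with test.gz
gzip test

# Copy the compressed file into the external table's HDFS location
hdfs dfs -put test.gz /user/cloudera/tweets/
```

Hive's text input format recognizes the .gz extension and decompresses records on the fly, so queries against the table work unchanged. Note that gzip files are not splittable, so each .gz file is processed by a single mapper.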




Answer 2:


You would need to uncompress the data before you can select it as JSON. You can't use both at once (the JSON SerDe and gzip).



Source: https://stackoverflow.com/questions/37654258/is-it-possible-to-compress-json-in-hive-external-table
