How to see contents of Hive ORC files in Linux

大城市里の小女人 posted on 2019-11-29 02:33:09

Question


Is there a way to see the contents of an ORC file, the format used by Hive 0.11 and above? I usually cat gz files and decompress them to see the contents, e.g.:

cat part-0000.gz | pigz -d | more

Note: pigz is a parallel gzip program.

I would like to know if there is something similar to this for ORC files.


Answer 1:


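The ORC file dump utility comes with Hive (0.11 or higher):

hive --orcfiledump <hdfs-location-of-orc-file>

By default this prints the file's metadata. In Hive 1.1.0 and later, adding -d dumps the row data instead:

hive --orcfiledump -d <hdfs-location-of-orc-file>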
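To get something close to the cat | pigz -d | more habit from the question, the dump can be piped through a pager. The following is a minimal sketch, not part of the original answer: orc_peek is a hypothetical helper name, it assumes the hive launcher is on PATH, and it relies on the -d option, which dumps row data in Hive 1.1.0 and later.

```shell
#!/bin/sh
# orc_peek (hypothetical helper): page through the rows of an ORC file,
# in the spirit of `cat part-0000.gz | pigz -d | more`.
# Assumption: the `hive` launcher is on PATH and supports `-d` (Hive 1.1.0+).
orc_peek() {
    if [ "$#" -ne 1 ]; then
        echo "usage: orc_peek <hdfs-location-of-orc-file>" >&2
        return 2
    fi
    if ! command -v hive >/dev/null 2>&1; then
        echo "orc_peek: hive not found on PATH" >&2
        return 127
    fi
    # -d dumps row data instead of metadata; the pager defaults to `more`.
    hive --orcfiledump -d "$1" | ${PAGER:-more}
}
```

Once the function is defined in the current shell, it is called with the same location that would be passed to hive --orcfiledump. Setting PAGER (for example to less) swaps the pager without editing the function.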





Answer 2:


There is now also a native executable for Linux and macOS that prints the contents of an ORC file as JSON. See the ORC project (http://orc.apache.org/) and build the C++ tools.

% orc-contents examples/TestOrcFile.test1.orc

There is also a native metadata tool:

% orc-metadata ../examples/TestOrcFile.test1.orc

The ORC project also has a standalone uber jar that can do the same from Java.

% java -jar orc-tools-1.2.3-uber.jar data myfile.orc
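For a quick look at only the first few rows, the jar's output can be piped through head. This is a rough sketch under stated assumptions: orc_head and ORC_TOOLS_JAR are hypothetical names introduced here, the jar path defaults to the file name used above, and the data subcommand is assumed to print one JSON record per line.

```shell
#!/bin/sh
# ORC_TOOLS_JAR (hypothetical variable): path to the ORC tools uber jar.
ORC_TOOLS_JAR="${ORC_TOOLS_JAR:-orc-tools-1.2.3-uber.jar}"

# orc_head (hypothetical helper): print the first N rows of an ORC file.
# Assumption: `data` writes one JSON record per line, so `head` counts rows.
orc_head() {
    file="$1"
    rows="${2:-10}"
    if [ -z "$file" ]; then
        echo "usage: orc_head <file.orc> [rows]" >&2
        return 2
    fi
    java -jar "$ORC_TOOLS_JAR" data "$file" | head -n "$rows"
}
```

Calling orc_head myfile.orc 5 would then show at most five records. Because head stops reading early, the whole file does not have to be printed just to inspect a sample.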


Source: https://stackoverflow.com/questions/20847024/how-to-see-contents-of-hive-orc-files-in-linux
