Inspect Parquet from command line

再見小時候 2020-12-07 20:26

How do I inspect the content of a Parquet file from the command line?

The only option I see now is

$ hadoop fs -get my-path local-file
$ parquet-tool         


        
9 Answers
  •  自闭症患者
    2020-12-07 20:47

    I'd rather use the HDFS NFS Gateway plus autofs, which makes it easy to inspect HDFS files.

    My setup:

    • The HDFS NFS Gateway service is running on the namenode.
    • The distribution-bundled autofs service is enabled, with the following change made to auto.master:
    /net    -hosts nobind
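
To confirm the entry is actually present, something like the sketch below could help. It only checks that a non-commented line maps /net to the built-in -hosts map. The function name has_net_hosts_map and the default path /etc/auto.master are assumptions for illustration; your distribution may keep the master map elsewhere.

```shell
#!/bin/sh
# Succeed if the given autofs master file maps /net to the built-in -hosts map.
# has_net_hosts_map is a hypothetical helper; /etc/auto.master is an assumed default.
has_net_hosts_map() {
    master_file="${1:-/etc/auto.master}"
    # Match a non-commented line: optional indent, /net, whitespace, then -hosts.
    grep -Eq '^[[:space:]]*/net[[:space:]]+-hosts([[:space:]]|$)' "$master_file"
}

# Example usage (left commented out so that sourcing this file has no side effects):
# has_net_hosts_map && echo "/net is mapped to -hosts"
```

If the check fails, add the line shown above and restart autofs the way your distribution expects.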
    

    I can easily run the following commands to inspect any HDFS file:

    head /net//path/to/hdfs/file
    parquet-tools head /net//path/to/hdfs/par-file
    rsync -rv /local/directory/ /net//path/to/hdfs/parentdir/
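
The commands above can be wrapped in a small helper. This is only a sketch under a few assumptions: with the -hosts map, autofs normally exposes NFS exports under /net/ followed by the server's host name, so the helper takes that host as an argument. The names nfs_path, inspect_parquet, gateway_host and hdfs_path are made up for illustration and do not come from the answer.

```shell
#!/bin/sh
# Build the NFS-side path for an HDFS file exposed through the /net automount.
# gateway_host and hdfs_path are hypothetical parameters.
nfs_path() {
    gateway_host="$1"
    hdfs_path="$2"
    # Drop one leading slash (if any) so the result has a single separator.
    trimmed="${hdfs_path#/}"
    printf '/net/%s/%s\n' "$gateway_host" "$trimmed"
}

# Show the first records of a Parquet file through the NFS mount.
# Assumes parquet-tools is on PATH, as in the commands above.
inspect_parquet() {
    file="$(nfs_path "$1" "$2")"
    if [ ! -r "$file" ]; then
        echo "cannot read $file (is the NFS gateway mounted?)" >&2
        return 1
    fi
    parquet-tools head "$file"
}

# Example usage (host name and path are placeholders):
# inspect_parquet namenode.example.com /path/to/hdfs/par-file
```

The readability check is there so that a missing mount produces a clear message instead of a parquet-tools stack trace.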
    

    Forget about the hadoop* and hdfs* commands ;)
