How do I inspect the content of a Parquet file from the command line?
The only option I see right now is to copy the file out of HDFS and inspect the local copy:
$ hadoop fs -get my-path local-file
$ parquet-tools head local-file
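For repeated use, the two steps above could be wrapped in a small shell function. This is only a sketch: `pq_head` is a hypothetical name, and it assumes both `hadoop` and `parquet-tools` are on the PATH and that `parquet-tools head <file>` prints the first records.

```shell
# pq_head (hypothetical helper): copy one file out of HDFS into a scratch
# directory, show its first records, then remove the local copy.
pq_head() {
    if [ -z "$1" ]; then
        echo "usage: pq_head <hdfs-path>" >&2
        return 2
    fi
    pq_tmpdir="$(mktemp -d)" || return 1
    pq_local="$pq_tmpdir/data.parquet"
    # Same two commands as above, run against the scratch copy.
    if hadoop fs -get "$1" "$pq_local"; then
        parquet-tools head "$pq_local"
        pq_status=$?
    else
        pq_status=1
    fi
    rm -rf "$pq_tmpdir"
    return "$pq_status"
}
```

It would be called as `pq_head my-path`. The whole file is still downloaded first, so this only saves typing, not transfer time.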
I'd rather use the HDFS NFS Gateway together with autofs, which makes inspecting HDFS files easy.
My setup (an entry in the autofs master map):
/net -hosts nobind
With that in place, I can run ordinary commands like the following against any HDFS file:
head /net//path/to/hdfs/file
parquet-tools head /net//path/to/hdfs/par-file
rsync -rv /local/directory/ /net//path/to/hdfs/parentdir/
Forget about the hadoop* and hdfs* commands ;)