I am running a few tests on the storage formats available with Hive, with Parquet and ORC as the main options. I included ORC once with default compression and once with Sn
We ran some benchmarks comparing the different file formats (Avro, JSON, ORC, and Parquet) across different use cases:
https://www.slideshare.net/oom65/file-format-benchmarks-avro-json-orc-parquet
The data is all publicly available, and the benchmark code is open source at:
https://github.com/apache/orc/tree/branch-1.4/java/bench