I am running a few tests on the storage formats available with Hive and using Parquet and ORC as major options. I included ORC once with default compression and once with Sn
You are seeing this because:
Hive has a vectorized ORC reader but no vectorized parquet reader.
Spark has a vectorized parquet reader and no vectorized ORC reader.
Spark performs best with parquet, hive performs best with ORC.
I've seen similar differences when running ORC and Parquet with Spark.
Vectorization means that rows are decoded in batches, dramatically improving memory locality and cache utilization.
(correct as of Hive 2.0 and Spark 2.1)