Just wondering whether Parquet predicate pushdown also works on S3, not only on HDFS, specifically when using Spark outside EMR.
Further explanation might be helpful since it m
Spark uses the same HDFS Parquet and S3 libraries in both cases, so the same pushdown logic works on S3. (Spark 1.6 also added an even faster shortcut for Parquet files with a flat schema.)
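
To illustrate why the storage layer does not matter, here is a minimal pure-Python sketch of the idea, not Spark's or Parquet's actual code. The reader checks per-row-group min/max statistics and only asks the store for row groups that could match, so the same skipping happens whichever backend sits behind the read call. `RowGroup`, `CountingStore` and `scan_greater_than` are hypothetical names made up for this example. In Spark itself this behavior is controlled by the `spark.sql.parquet.filterPushdown` setting.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RowGroup:
    """One Parquet-style row group: min/max statistics plus the key of its data."""
    min_value: int
    max_value: int
    data_key: str


class CountingStore:
    """Hypothetical stand-in for any storage backend (HDFS, S3, local disk)."""

    def __init__(self, blobs: Dict[str, List[int]]) -> None:
        self._blobs = blobs
        self.reads = 0  # number of row groups actually fetched

    def read(self, key: str) -> List[int]:
        self.reads += 1
        return self._blobs[key]


def scan_greater_than(store: CountingStore,
                      row_groups: List[RowGroup],
                      threshold: int) -> List[int]:
    """Return values > threshold, skipping row groups the statistics rule out."""
    result: List[int] = []
    for group in row_groups:
        if group.max_value <= threshold:
            continue  # no row in this group can match, so no read is issued
        result.extend(v for v in store.read(group.data_key) if v > threshold)
    return result


# Assumed sample data: three row groups holding sorted integers.
blobs = {"rg0": [1, 5, 9], "rg1": [10, 15, 19], "rg2": [20, 25, 29]}
row_groups = [RowGroup(min(v), max(v), k) for k, v in blobs.items()]

# Two separate stores play the role of "HDFS" and "S3"; the reader logic is shared.
hdfs_like = CountingStore(blobs)
s3_like = CountingStore(blobs)

hdfs_rows = scan_greater_than(hdfs_like, row_groups, 18)
s3_rows = scan_greater_than(s3_like, row_groups, 18)
```

With a threshold of 18, the first row group is skipped on both stores because its maximum is 9, so each store serves two reads instead of three. What differs between HDFS and S3 in practice is the cost of each read, not whether the filter is applied.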