Does Parquet predicate pushdown work on S3 when using Spark (non-EMR)?

-上瘾入骨i 2020-12-05 21:08

Just wondering whether Parquet predicate pushdown also works on S3, not only on HDFS, specifically when using Spark outside EMR.

Further explanation might be helpful since it m

5 answers
  •  星月不相逢
    2020-12-05 21:35

    Yes. Filter pushdown does not depend on the underlying file system. It depends only on the spark.sql.parquet.filterPushdown setting and on the type of filter (not all filters can be pushed down).

    See https://github.com/apache/spark/blob/v2.2.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L313 for the pushdown logic.
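To illustrate why the storage layer does not matter, here is a minimal sketch of the idea behind pushdown. It is a toy model, not Spark's or Parquet's actual code. Parquet files keep min/max statistics per row group in the file footer, and a reader can use those to skip row groups that cannot match a filter. The `RowGroup` class and `scan_greater_than` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for one Parquet row group plus its footer statistics."""
    min_value: int
    max_value: int
    rows: List[int]


def scan_greater_than(row_groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Evaluate `value > threshold`; return matching rows and how many groups were read."""
    matched: List[int] = []
    groups_read = 0
    for group in row_groups:
        # The statistics alone prove that no row in this group can match,
        # so the group is skipped without touching its rows.
        if group.max_value <= threshold:
            continue
        groups_read += 1
        matched.extend(v for v in group.rows if v > threshold)
    return matched, groups_read


# Three row groups with non-overlapping value ranges (made-up data).
row_groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]

matched, groups_read = scan_greater_than(row_groups, 18)
print(matched, groups_read)  # [20, 21, 25, 30] 2
```

The skip decision uses only metadata stored inside the file, so it makes no difference whether the bytes come from HDFS or S3. In Spark itself, you can check whether a given filter was pushed down by calling `explain()` on the filtered DataFrame and looking for the `PushedFilters` entry in the Parquet scan node.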
