Filtering a Spark partitioned table is not working in PySpark

試著忘記壹切 submitted on 2019-12-05 21:25:56

I ran into this issue as well. What helped me was to run this line:

spark.sql("SET spark.sql.hive.manageFilesourcePartitions=False")

and then use spark.sql(query) instead of the DataFrame API.

I do not know what happens under the hood, but this solved my problem.
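
For anyone who wants to try it, here is a minimal sketch of the two steps put together. The helper name filter_partitioned_table, the table name and the partition filter are made up for illustration, and spark is assumed to be an existing SparkSession with Hive support. As far as I can tell from the Spark configuration docs, this setting controls whether partitions of file source tables are managed through the Hive metastore, which may be why turning it off changes how the filter behaves, but I have not verified that this is the cause.

```python
def filter_partitioned_table(spark, table_name, partition_filter):
    """Apply the workaround, then filter the table through SQL.

    spark            -- an existing SparkSession (assumed to be created elsewhere)
    table_name       -- hypothetical table name, e.g. "db.events"
    partition_filter -- hypothetical WHERE clause, e.g. "dt = '2019-01-01'"
    """
    # Step 1: turn off metastore partition management for file source tables.
    spark.sql("SET spark.sql.hive.manageFilesourcePartitions=False")

    # Step 2: run the filter as a SQL query instead of using the DataFrame API.
    # The query is built by plain string formatting, so pass trusted values only.
    query = f"SELECT * FROM {table_name} WHERE {partition_filter}"
    return spark.sql(query)
```

You would call it like filter_partitioned_table(spark, "db.events", "dt = '2019-01-01'") and get a DataFrame back, so the rest of your code can stay the same.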

Although it might be too late for you (since this question was asked 8 months ago), this might help other people.
