Window function is not working on Pyspark sqlcontext

Submitted by 强颜欢笑 on 2019-11-29 17:29:54
eliasah

The error pretty much says it all:

py4j.protocol.Py4JJavaError: An error occurred while calling o138.select.
: org.apache.spark.sql.AnalysisException: Could not resolve window function 'min'. Note that, using window functions currently requires a HiveContext;

You'll need a version of Spark that supports Hive (i.e. built with Hive support); then you can declare a HiveContext. In Scala:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

and then use that context to perform your window function.

In python :

# sc is an existing SparkContext.
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)

You can read further about the difference between SQLContext and HiveContext here.

SparkSQL has a SQLContext and a HiveContext. HiveContext is a superset of SQLContext, and the Spark community suggests using the HiveContext. You can see that when you run spark-shell, which is your interactive driver application, it automatically creates a SparkContext defined as sc and a HiveContext defined as sqlContext. The HiveContext allows you to execute SQL queries as well as Hive commands. The same behavior occurs for pyspark.
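With the HiveContext in place, the original window-function query should resolve. A minimal sketch of computing min over a window in PySpark; the DataFrame and its column names (group, value) are hypothetical, invented here for illustration:

```python
# sc is an existing SparkContext.
from pyspark.sql import HiveContext
from pyspark.sql import functions as F
from pyspark.sql.window import Window

sqlContext = HiveContext(sc)

# Hypothetical example data: (group, value) pairs.
df = sqlContext.createDataFrame(
    [("a", 3), ("a", 1), ("b", 2)], ["group", "value"])

# min over a window partitioned by "group"; this is the kind of
# expression that fails with a plain SQLContext but works here.
w = Window.partitionBy("group")
df.select("group", "value",
          F.min("value").over(w).alias("group_min")).show()
```

Each row keeps its own value alongside the minimum of its partition, which is what distinguishes a window aggregate from a plain GROUP BY.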
