I have a DataFrame called df with a column named employee_id. I am doing:

df.registerTempTable("d_f")
val query = """SELECT *, ROW_NUMBER() OVER (ORDER B
Spark 2.0+
Spark 2.0 introduces a native implementation of window functions (SPARK-8641), so HiveContext
should no longer be required. Nevertheless, similar errors, unrelated to window functions, can still be attributed to differences between the SQL parsers.
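On Spark 2.0+ the same query can be run through a plain SparkSession, with no Hive support needed. A minimal sketch, assuming the truncated query in the question was ordering by employee_id (that column choice is an assumption, since the original query is cut off):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("window test").getOrCreate()

// Register the DataFrame as a temp view (2.0+ naming for registerTempTable)
df.createOrReplaceTempView("d_f")

// ORDER BY employee_id is an assumption; the question's query is truncated
val result = spark.sql(
  """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) AS rn FROM d_f""")
result.show()
```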
Spark <= 1.6
Window functions were introduced in Spark 1.4.0 and require a HiveContext to work. A plain SQLContext won't work here.
Be sure you use Spark >= 1.4.0 and create a HiveContext:
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
Yes, it is true.
I am using Spark 1.6.0, and there you need a HiveContext to use the dense_rank window function.
From Spark 2.0.0 onwards, a HiveContext is no longer needed for dense_rank.
So for Spark >= 1.4 and < 2.0 you should do it like this.
Assume a Hive table hive_employees with three fields: place: String, name: String, salary: Int.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("denseRank test") //.setMaster("local")
val sc = new SparkContext(conf)
// A HiveContext is required for window functions on Spark < 2.0
val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val result = hqlContext.sql(
  "select place, name, dense_rank() over (partition by place order by salary) as rank from hive_employees")
result.show()
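The same ranking can be expressed with the DataFrame API instead of SQL, via a Window specification. A sketch assuming the table fields described above (place, name, salary):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.dense_rank

// Rank rows within each place by salary
val w = Window.partitionBy("place").orderBy("salary")

val ranked = hqlContext.table("hive_employees")
  .withColumn("rank", dense_rank().over(w))
ranked.show()
```

On Spark 2.0+ the same code works with a SparkSession in place of the HiveContext.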