Zeppelin - Cannot query with %sql a table I registered with pyspark

没有蜡笔的小新 2021-01-12 09:36

I am new to Spark/Zeppelin and wanted to complete a simple exercise: convert a CSV file from a pandas DataFrame to a Spark DataFrame, then register it as a table and query it with %sql.

4 Answers
  • 2021-01-12 09:58

    You didn't say which interpreter group you are using. If it's Livy, you can't access tables registered in %livy.pyspark from %livy.sql. I got this from here:

    for now %livy.sql can only access tables registered %livy.spark, but not %livy.pyspark and %livy.sparkr.
    

    If you switch to the standard Spark interpreter group, it should work. I can confirm it works for me with Spark 1.6.3 and Zeppelin 0.7.0. Hopefully the people working on the Livy interpreter will fix this restriction.

  • 2021-01-12 10:02

    This may also be related to the different contexts that Spark creates. Check the following setting in the Spark interpreter:

    zeppelin.spark.useHiveContext = false
    

    Make sure it is set to 'false'.

  • 2021-01-12 10:13

    The correct syntax would be (note that the query must be passed as a string):

    sqlContext.registerDataFrameAsTable(spark_clean_df, 'table1')
    sqlContext.sql("select * from table1 where ...")
    
  • 2021-01-12 10:24

    Zeppelin can create different contexts for different interpreters. If you ran some code with the %spark interpreter and some with %pyspark, your Zeppelin instance may have two contexts, and %sql may be looking in a context other than the one %pyspark used. Try restarting Zeppelin, then run the %pyspark code as the first statement and the %sql query as the second.

    On the 'Interpreters' tab you can add zeppelin.spark.sql.stacktrace (set to true). After restarting Zeppelin, you will see the full stack trace in the place where you currently see 'Table not found'.

    Actually, this is probably the answer to your question: "When registering a table using the %pyspark interpreter in Zeppelin, I can't access the table in %sql".

    Try adding

        %pyspark
        sqlContext = sqlc
    

    as the first two lines of your paragraph.

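To illustrate why the `sqlContext = sqlc` suggestion in the last answer can help, here is a toy model in plain Python. It does not use Spark or Zeppelin: `ToySQLContext` and `run_sql_paragraph` are hypothetical names made up for this sketch, and the assumption that %sql reads from the context bound to `sqlc` is taken from that answer, not from verified Zeppelin internals. The only point is that a table registered in one context is invisible to a lookup in another.

```python
# Toy model only: hypothetical stand-ins, not Spark or Zeppelin code.

class ToySQLContext:
    """Minimal stand-in for a SQL context that owns its own table registry."""

    def __init__(self, label):
        self.label = label
        self._tables = {}  # table name -> data, private to this context

    def registerDataFrameAsTable(self, df, tableName):
        # Named after the PySpark SQLContext method used in the thread.
        self._tables[tableName] = df

    def tableNames(self):
        return sorted(self._tables)


def run_sql_paragraph(context, table_name):
    """Pretend %sql paragraph: look the table up in the given context."""
    if table_name not in context.tableNames():
        raise LookupError("Table not found: " + table_name)
    return context._tables[table_name]


rows = [("a", 1), ("b", 2)]

# Case 1: the pyspark code and %sql hold two different contexts.
sqlc = ToySQLContext("context assumed to be used by %sql")
sqlContext = ToySQLContext("separate context seen by the pyspark code")
sqlContext.registerDataFrameAsTable(rows, "table1")
try:
    run_sql_paragraph(sqlc, "table1")
    visible_before = True
except LookupError:
    visible_before = False

# Case 2: rebind the name first, as the answer suggests, then register.
sqlContext = sqlc
sqlContext.registerDataFrameAsTable(rows, "table1")
visible_after = run_sql_paragraph(sqlc, "table1") == rows

print(visible_before, visible_after)  # prints: False True
```

In a real notebook, the equivalent check would be to register the table in a %pyspark paragraph and then list the tables from a %sql paragraph to see whether it shows up.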