apache-zeppelin

SparkSession returns nothing with a HiveServer2 connection through JDBC

两盒软妹~` posted on 2019-12-21 04:53:10
Question: I have an issue reading data from a remote HiveServer2 using JDBC and SparkSession in Apache Zeppelin. Here is the code:

    %spark
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.SparkSession

    val prop = new java.util.Properties
    prop.setProperty("user", "hive")
    prop.setProperty("password", "hive")
    prop.setProperty("driver", "org.apache.hive.jdbc.HiveDriver")

    val test = spark.read.jdbc("jdbc:hive2://xxx.xxx.xxx.xxx:10000/", "tests.hello_world", prop)
    test.select("*").show()

When I

How to set up Zeppelin to work with remote EMR Yarn cluster

…衆ロ難τιáo~ posted on 2019-12-21 02:01:07
Question: I have an Amazon EMR Hadoop v2.6 cluster with Spark 1.4.1 and the YARN resource manager. I want to deploy Zeppelin on a separate machine so that the EMR cluster can be turned off when no jobs are running. I tried following the instructions at https://zeppelin.incubator.apache.org/docs/install/yarn_install.html without much success. Can somebody demystify the steps for connecting Zeppelin to an existing YARN cluster from a different machine?

Answer 1: [1] Install Zeppelin with the proper params: git clone https:

Getting NullPointerException when running Spark Code in Zeppelin 0.7.1

 ̄綄美尐妖づ posted on 2019-12-20 17:41:55
Question: I have installed Zeppelin 0.7.1. When I tried to execute the example Spark program (which is available in the Zeppelin Tutorial notebook), I got the following error:

    java.lang.NullPointerException
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:391)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext

How can I pass datasets between %pyspark interpreter and %python interpreters in Zeppelin?

时光毁灭记忆、已成空白 posted on 2019-12-20 05:14:18
Question: I'm writing code that fetches a dataset using an internal library and the %pyspark interpreter. However, I am unable to pass the dataset to the %python interpreter. I tried using string variables and that works fine, but for the dataset I'm using the following code to put it into the Zeppelin context: z.put("input_data", input_data), and it throws the following error: AttributeError: 'DataFrame' object has no attribute '_get_object_id'. Can you please tell me how I can do this? Thanks in
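Since the question reports that string variables pass through the Zeppelin context without trouble, one possible workaround is to turn the dataset into a string before handing it over. The sketch below is only an illustration under that assumption, not the accepted answer: the helper names `encode_rows` and `decode_rows` and the sample records are hypothetical, and the `z.put` / `z.get` calls are shown only in comments because they exist solely inside a Zeppelin notebook. Converting a Spark DataFrame to plain records first (for example by collecting its rows) is assumed to happen beforehand.

```python
import json


def encode_rows(rows):
    """Serialize a list of plain dict records into a single JSON string."""
    return json.dumps(rows)


def decode_rows(payload):
    """Rebuild the list of dict records from the JSON string."""
    return json.loads(payload)


# Hypothetical records standing in for the rows of the real dataset.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# In the %pyspark paragraph (assumption): z.put("input_data", payload)
payload = encode_rows(records)

# In the %python paragraph (assumption): payload = z.get("input_data")
restored = decode_rows(payload)
```

This only suits datasets small enough to fit in driver memory, and whether the two interpreters actually share the same context depends on the Zeppelin setup.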

How to group time column into 5 second intervals and count rows using Presto?

夙愿已清 posted on 2019-12-20 04:52:48
Question: I am using Presto and Zeppelin. There is a lot of raw data that I have to summarize. I want to group the rows by time in 5-second intervals.

serviceType  logType  date
------------------------------------------------------
service1     log1     2017-10-24 23:00:23.206
service1     log1     2017-10-24 23:00:23.207
service1     log1     2017-10-24 23:00:25.206
service2     log1     2017-10-24 23:00:24.206
service1     log2     2017-10-24 23:00:27.206
service1     log2     2017-10-24 23:00:29.302

Then the result:

serviceType  logType  date  cnt
-------------
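The bucketing the question asks for can be sketched outside Presto to make the arithmetic explicit: floor each timestamp to the start of its 5-second interval, then count rows per (serviceType, logType, interval). This is a minimal Python illustration, not Presto SQL and not the answer from the original thread; the function names `floor_to_interval` and `count_by_interval` are hypothetical, and the rows are copied from the sample table above.

```python
from collections import Counter
from datetime import datetime, timedelta

# Fixed reference point so that buckets align to multiples of the interval.
EPOCH = datetime(1970, 1, 1)

# Sample rows copied from the question's table.
ROWS = [
    ("service1", "log1", "2017-10-24 23:00:23.206"),
    ("service1", "log1", "2017-10-24 23:00:23.207"),
    ("service1", "log1", "2017-10-24 23:00:25.206"),
    ("service2", "log1", "2017-10-24 23:00:24.206"),
    ("service1", "log2", "2017-10-24 23:00:27.206"),
    ("service1", "log2", "2017-10-24 23:00:29.302"),
]


def floor_to_interval(ts, seconds=5):
    """Round a timestamp down to the start of its `seconds`-wide bucket."""
    step = timedelta(seconds=seconds)
    buckets_elapsed = (ts - EPOCH) // step  # whole buckets since EPOCH
    return EPOCH + buckets_elapsed * step


def count_by_interval(rows, seconds=5):
    """Count rows per (serviceType, logType, bucket start)."""
    counts = Counter()
    for service_type, log_type, raw_ts in rows:
        ts = datetime.strptime(raw_ts, "%Y-%m-%d %H:%M:%S.%f")
        counts[(service_type, log_type, floor_to_interval(ts, seconds))] += 1
    return counts


if __name__ == "__main__":
    for (service_type, log_type, bucket), cnt in sorted(count_by_interval(ROWS).items()):
        print(service_type, log_type, bucket, cnt)
```

With the sample rows, the two service1/log1 entries at 23:00:23 fall into the 23:00:20 bucket, while the 23:00:25.206 entry starts a new bucket at 23:00:25. In Presto the same floor is typically built from unix-time arithmetic (for example with to_unixtime and from_unixtime), though the exact query depends on the column type.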

Is it possible to customize the skin on Zeppelin?

[亡魂溺海] posted on 2019-12-18 21:18:42
Question: Is it possible to customize the skin on Zeppelin? In other words, replace the Zeppelin logo with something else?

Answer 1: Yes, it is very much possible. As you know, Apache Zeppelin (incubating) is an open source project, so you just need to:

- clone it from github.com/apache/incubator-zeppelin
- make modifications inside the zeppelin-web sub-module (it is a standard Angular web application, so you can change anything)
- build it

That is basically it. There are at least 2 companies who are known to successfully

How to put a variable into z ZeppelinContext in javascript in Zeppelin?

て烟熏妆下的殇ゞ posted on 2019-12-17 19:00:39
Question: In Scala and Python it's:

    z.put("varname", variable)

But in JavaScript I get (in the console):

    Uncaught ReferenceError: z is not defined

What I really want to do is access a JavaScript variable from Scala code using z.angular("varname") in Zeppelin, but I'm having no luck :( In full, I need something like this in one cell:

    %angular
    <script>
    var myVar = "hello world";
    // some magic code here!
    </script>

Then in another cell:

    println(z.angular("myVar"))

UPDATE: This is what I have so far, I'm completely

JavaPackage object is not callable error: Pyspark

≯℡__Kan透↙ posted on 2019-12-14 03:55:48
Question: Operations like dataframe.show() and sQLContext.read.json work fine, but most functions give a "JavaPackage object is not callable" error. For example, when I do dataFrame.withColumn(field_name, monotonically_increasing_id()) I get an error:

    File "/tmp/spark-cd423f35-9572-45ee-b159-1b2732afa2a6/userFiles-3a6e1729-95f4-468b-914c-c706369bf2a6/Transformations.py", line 64, in add_id_column
        self.dataFrame = self.dataFrame.withColumn(field_name, monotonically_increasing_id())
    File "/home/himaprasoon/apps

Failed to run task: 'bower --allow-root install' failed

一世执手 posted on 2019-12-14 01:47:13
Question: I am trying to build Apache Zeppelin from source, but the build breaks at "zeppelin-web" with the following error:

    [ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.23:bower (bower install) on project zeppelin-web: Failed to run task: 'bower --allow-root install' failed. (error code 8) -> [Help 1]

Here is the full debug log:

    [ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.23:bower (bower install) on project zeppelin-web: Failed to

Is it possible to see some error output in Zeppelin paragraphs?

廉价感情. posted on 2019-12-13 20:09:07
Question: I have a Zeppelin installation and am using the Spark interpreter. However, if I have a syntax or runtime error, I cannot find any details except the word "Error". For example, I have this code: And I only see the word "ERROR" in the top-right corner. On my own computer, scala would instead print something like:

    $ scala example.sc
    ./example.sc:1: error: recursive value a needs type
    val a = this is an error
    ^
    .example.sc:1: error: not found: value an
    val a = this is an error
    ^
    two errors found