apache-zeppelin

Zeppelin throwing NullPointerException while configuring

无人久伴 submitted on 2019-12-31 04:28:09
Question: I am trying to set up zeppelin-0.8.0 on my Windows 8 R2 OS. I already have Spark running on my console, i.e. SPARK_HOME, JAVA_HOME, and HADOOP_HOME are set up and running fine. But when I try to execute println("hello") in the Zeppelin Spark interpreter, it throws the error below ... I have already set SPARK_HOME and JAVA_HOME in the zeppelin-env.cmd file. Error: DEBUG [2019-01-22 10:05:34,129] ({pool-2-thread-2} RemoteInterpreterManagedProcess.java[start]:153) - callbackServer is serving now INFO [2019-01
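On Windows, Zeppelin reads environment settings from conf\zeppelin-env.cmd. A minimal sketch of that file, assuming local installs of Spark, the JDK, and Hadoop winutils (all paths below are hypothetical placeholders, not values from the question):

```bat
REM conf\zeppelin-env.cmd -- hypothetical paths; adjust to your installation
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_191
set SPARK_HOME=C:\spark-2.3.2-bin-hadoop2.7
set HADOOP_HOME=C:\hadoop-2.7.7

REM On Windows, Spark needs winutils.exe under %HADOOP_HOME%\bin
set PATH=%PATH%;%HADOOP_HOME%\bin
```

A missing or wrong winutils.exe is a frequent cause of startup NPEs for Spark on Windows, so it is worth checking alongside the variables themselves.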

Zeppelin - Cannot query with %sql a table I registered with pyspark

大憨熊 submitted on 2019-12-30 08:36:07
Question: I am new to Spark/Zeppelin and wanted to complete a simple exercise where I transform a CSV file from pandas to a Spark data frame, then register the table to query it with SQL and visualise it using Zeppelin. But I seem to be failing at the last step. I am using Spark 1.6.1. Here is my code: %pyspark spark_clean_df.registerTempTable("table1") print spark_clean_df.dtypes print sqlContext.sql("select count(*) from table1").collect() Here is the output: [('id', 'bigint'), ('name',
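A common cause of this symptom is registering the temp table against a different SQLContext than the one Zeppelin's %sql interpreter queries; the registration must go through the sqlContext Zeppelin injects. A minimal Scala sketch of the same flow (the sample rows and column names are made up for illustration):

```scala
// %spark paragraph -- uses the sqlContext that Zeppelin injects,
// which is the same context the %sql interpreter queries
import sqlContext.implicits._

// Hypothetical sample data standing in for the pandas-derived frame
val cleanDf = Seq((1L, "alice"), (2L, "bob")).toDF("id", "name")

// Register against the SAME shared context (Spark 1.x API)
cleanDf.registerTempTable("table1")

// Sanity check from code; a %sql paragraph running
//   select count(*) from table1
// should now return the same count
println(sqlContext.sql("select count(*) from table1").collect().mkString)
```

If a new SQLContext is created inside a %pyspark paragraph, tables registered on it are invisible to %sql, which matches the failure described above.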

Apache Zeppelin - How to use Helium framework in Apache Zeppelin

拟墨画扇 submitted on 2019-12-30 02:17:07
Question: Since Zeppelin 0.7, Zeppelin has supported Helium plugins/packages via the Helium framework. However, I am not able to view any of the plugins on the Helium page (localhost:8080/#/helium). As per this JIRA, I placed a sample helium.json (available on S3) under /local-repo/helium-registry-cache. However, after that I got an NPE while restarting the Apache Zeppelin service. I have tried Zeppelin 0.7 as well as Zeppelin 0.8.0 snapshot versions. In particular, I want to use the map Helium package - Helium-Map

Remove Temporary Tables from Apache Spark SQL

血红的双手。 submitted on 2019-12-29 04:17:25
Question: I have registered a temp table in Apache Spark using Zeppelin as below: val hvacText = sc.textFile("...") case class Hvac(date: String, time: String, targettemp: Integer, actualtemp: Integer, buildingID: String) val hvac = hvacText.map(s => s.split(",")).filter(s => s(0) != "Date").map( s => Hvac(s(0), s(1), s(2).toInt, s(3).toInt, s(6))).toDF() hvac.registerTempTable("hvac") After I am done with my queries on this temp table, how do I remove it? I checked all the docs and it seems I am getting
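In Spark 1.x, the SQLContext exposes dropTempTable for exactly this; a short sketch continuing the example above:

```scala
// Spark 1.x: removes the "hvac" registration from the catalog.
// This only drops the temp-table name; the underlying DataFrame
// and source data are untouched.
sqlContext.dropTempTable("hvac")

// Spark 2.x equivalent, for reference (catalog API):
// spark.catalog.dropTempView("hvac")
```

Temp tables are also scoped to the context's lifetime, so they disappear on their own when the Zeppelin interpreter is restarted.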

AWS Redshift driver in Zeppelin

耗尽温柔 submitted on 2019-12-24 17:17:03
Question: I want to explore my data in Redshift using a Zeppelin notebook. A small EMR cluster with Spark is running behind it. I am loading Databricks' spark-redshift library %dep z.reset() z.load("com.databricks:spark-redshift_2.10:0.6.0") and then import org.apache.spark.sql.DataFrame val query = "..." val url = "..." val port=5439 val table = "..." val database = "..." val user = "..." val password = "..." val df: DataFrame = sqlContext.read .format("com.databricks.spark.redshift") .option("url", s"jdbc
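For context, a hedged sketch of what a complete spark-redshift read usually looks like once the options are filled in. The cluster host, credentials, table, and S3 temp dir below are all placeholders, not values from the question:

```scala
import org.apache.spark.sql.DataFrame

// All connection values are hypothetical placeholders
val url = "jdbc:redshift://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/mydb"

val df: DataFrame = sqlContext.read
  .format("com.databricks.spark.redshift")
  .option("url", s"$url?user=myuser&password=mypassword")
  .option("query", "select * from my_table limit 10")  // or .option("dbtable", "my_table")
  .option("tempdir", "s3n://my-bucket/tmp")            // spark-redshift stages data through S3
  .load()

df.printSchema()
```

The tempdir option is mandatory for this library: it unloads Redshift data to S3 and reads it from there, so the EMR cluster also needs S3 credentials visible to Spark.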

Dynamic interactive dashboard with Zeppelin notebook

点点圈 submitted on 2019-12-24 08:01:14
Question: I want to have a more interactive dashboard: read data from a database, feed it to a select box, and on change of the select box send the value and run the query. I want to achieve this using Zeppelin because I have to display analytics for the selected value. What would be the way to achieve this, and is it possible in Zeppelin? I tried with a select box, but I could not save the selected value, send it to the next query, and execute that. Something like select age, count(1) value
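Zeppelin's dynamic forms cover this pattern: z.select renders a select box in the paragraph and returns the chosen value, which the rest of the paragraph can use. A Scala sketch with made-up options (in real use the option Seq would be built from a database query):

```scala
// Hypothetical options; in practice build this Seq from a query result.
// Each entry is (value, displayed label).
val ageOptions = Seq(("20", "20"), ("30", "30"), ("40", "40"))

// Renders a select box; re-running the paragraph picks up the new selection
val selectedAge = z.select("age", ageOptions).toString

// Use the selection in the next query
val result = sqlContext.sql(
  s"select age, count(1) value from people where age = $selectedAge group by age")
```

As an alternative, %sql paragraphs can embed a form directly with the ${age=20,20|30|40} template syntax, which avoids passing the value between paragraphs at all.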

ImportError: No module named sparkdl.image.imageIO

强颜欢笑 submitted on 2019-12-24 07:58:36
Question: I'm doing image classification using Spark. I have already imported the sparkdl jar (added the path of the jar in conf/spark.default) ImportError: No module named sparkdl.image.imageIO at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193) at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234) at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152) at org.apache.spark.sql.execution.python.BatchEvalPythonExec$$anonfun$doExecute$1.apply
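sparkdl ships as a Spark package containing both JVM classes and a Python package, so adding the jar to the classpath alone does not make the Python side importable. A hedged sketch of the usual setup via spark.jars.packages, which also wires up the Python files (the version coordinate is an example, not taken from the question; pick the one matching your Spark version):

```properties
# conf/spark-defaults.conf -- example coordinate from the spark-packages repo
spark.jars.packages  databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11
```

The same effect can be had on the command line with --packages, or in Zeppelin by adding the coordinate to the Spark interpreter's dependency settings.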

Zeppelin 0.7.2 version does not support spark 2.2.0

你说的曾经没有我的故事 submitted on 2019-12-24 03:15:55
Question: How do I downgrade the Spark version? What could the other solutions be? I have to connect my Hive tables to Spark using a Spark session, but the Spark version is not supported by Zeppelin. Answer 1: Here are 2 reasons. [1] Zeppelin 0.7.2 marked Spark 2.2+ as an unsupported version. https://github.com/apache/zeppelin/blob/v0.7.2/spark/src/main/java/org/apache/zeppelin/spark/SparkVersion.java#L40 public static final SparkVersion UNSUPPORTED_FUTURE_VERSION = SPARK_2_2_0; [2] Even if you change the

Zeppelin imported classes not found when using

寵の児 submitted on 2019-12-24 02:16:58
Question: I get a weird error when using Spark on Zeppelin: the imported classes are not found when I use them. The code sample is: %spark import java.io.Serializable import java.text.{ParseException, SimpleDateFormat} import java.util.{Calendar, SimpleTimeZone} class Pos(val pos: String) extends Serializable { if (pos.length != 12) { throw new IllegalArgumentException(s"[${pos}] seems not a valid pos string") } private val cstFormat = new SimpleDateFormat("yyyyMMddHHmm") private val utcFormat = new
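This is a known quirk of the Spark REPL wrapping that Zeppelin relies on: paragraph-level imports are sometimes not visible inside a class body once the paragraph is compiled and serialized. A common workaround is to use fully qualified names inside the class instead of relying on the imports; a sketch based on the code above:

```scala
// Workaround sketch: avoid depending on paragraph-level imports inside the class
class Pos(val pos: String) extends java.io.Serializable {
  if (pos.length != 12) {
    throw new IllegalArgumentException(s"[${pos}] seems not a valid pos string")
  }
  // Fully qualified, so the REPL-wrapped class compiles regardless of
  // whether the earlier import statements are in scope
  private val cstFormat = new java.text.SimpleDateFormat("yyyyMMddHHmm")
  private val utcFormat = new java.text.SimpleDateFormat("yyyyMMddHHmm")
}
```

Moving the class definition into its own paragraph, or compiling it into a jar added as an interpreter dependency, are other routes around the same issue.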

Building the zeppelin-0.7.0 master branch with Spark 2.0 fails with 'yarn install --no-lockfile' failed

自作多情 submitted on 2019-12-23 23:51:37
Question: I tried to build the zeppelin-0.7.0 master branch downloaded from GitHub, but it failed. The build command: mvn package -Pyarn -Pbuild-distr -Pspark-2.0 -Dspark.version=2.0.1 -Phadoop-2.6 -Dhadoop.version=2.6.0 -Pscala-2.11 -Ppyspark -DskipTests -X The output stack trace is: [ERROR] error Command failed with exit code 1. [INFO] info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command. [INFO] ------------------------------------------------------------------------
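The zeppelin-web module drives yarn through the frontend-maven-plugin, and an "exit code 1" at the 'yarn install' step is very often a network or proxy problem rather than a code problem. One hedged workaround is to give the bundled yarn/npm working proxy and registry settings (all values below are placeholders, only needed behind a proxy):

```properties
# ~/.npmrc -- placeholder proxy values
registry=https://registry.npmjs.org/
proxy=http://proxy.example.com:8080
https-proxy=http://proxy.example.com:8080
```

If the web UI is not needed for the build at hand, another common escape hatch is to exclude that module from the Maven reactor (e.g. mvn ... -pl '!zeppelin-web'); rerunning with -X, as the question already does, shows the underlying yarn error in full.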