apache-zeppelin

Spark on Kubernetes with Zeppelin

Submitted by 情到浓时终转凉″ on 2020-06-27 16:59:09
Question: I am following this guide to bring up a Zeppelin container in a local Kubernetes cluster set up with Minikube: https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/quickstart/kubernetes.html. I am able to set up Zeppelin and run some sample code there. I have downloaded the Spark 2.4.5 and 2.4.0 source code and built it with Kubernetes support using the following command:

    ./build/mvn -Pkubernetes -DskipTests clean package

Once Spark was built, I created a Docker container as explained in the article: bin…
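A quick way to confirm, from inside a Zeppelin paragraph, that the interpreter is really talking to the Kubernetes master rather than running locally is to print the Spark master URL and the container image setting. This is a minimal sketch that assumes the default sc provided by Zeppelin's %pyspark interpreter; the property names are standard Spark-on-Kubernetes settings, not anything taken from this question.

    %pyspark
    # Minimal verification sketch (assumes the default sc from %pyspark).
    # On a working Spark-on-Kubernetes setup, the master should look like
    # k8s://https://<minikube-ip>:8443 and the image should be whatever tag
    # the Docker build step described in the quickstart produced.
    print(sc.master)
    print(sc.getConf().get("spark.kubernetes.container.image", "<not set>"))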

Zeppelin k8s: change interpreter pod configuration

Submitted by 风流意气都作罢 on 2020-06-17 09:51:48
Question: I've configured my Zeppelin on Kubernetes using:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: zeppelin
      labels: [...]
    spec:
      replicas: 1
      selector:
        matchLabels:
          app.kubernetes.io/name: zeppelin
          app.kubernetes.io/instance: zeppelin
      template:
        metadata:
          labels:
            app.kubernetes.io/name: zeppelin
            app.kubernetes.io/instance: zeppelin
        spec:
          serviceAccountName: zeppelin
          containers:
            - name: zeppelin
              image: "apache/zeppelin:0.9.0"
              imagePullPolicy: IfNotPresent
              ports:
                - name: http
                  containerPort: …
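When changing interpreter pod settings, it can help to confirm what the running interpreter pod actually picked up. The sketch below is inspection only and does not modify any pod spec; it assumes the default sc from the %pyspark interpreter and relies on HOSTNAME, which inside a Kubernetes container is the pod name.

    %pyspark
    # Inspection-only sketch: show which pod this interpreter runs in and which
    # spark.kubernetes.* settings actually reached it, to verify that a
    # configuration change was applied.
    import os
    print("interpreter pod:", os.environ.get("HOSTNAME"))
    for key, value in sc.getConf().getAll():
        if key.startswith("spark.kubernetes."):
            print(key, "=", value)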

How does Apache Zeppelin compute the Spark job progress bar?

Submitted by 淺唱寂寞╮ on 2020-06-17 07:40:29
Question: When starting a Spark job from the Apache Zeppelin notebook interface, it shows a progress bar for the job execution. But what does this progress actually mean? Sometimes it shrinks or grows. Is it the progress of the current stage or of the whole job? Answer 1: In the web interface, the progress bar shows the value returned by the getProgress function (not implemented for every interpreter, e.g. Python). This function returns a percentage. When using the Spark interpreter, the value seems to be the…
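For the Spark interpreter, a rough mental model of where such a percentage can come from is the ratio of completed tasks to total tasks across the jobs that are currently active. The sketch below reproduces that idea with PySpark's public status API; it illustrates the concept and is not Zeppelin's actual getProgress implementation.

    %pyspark
    # Illustrative sketch only: approximate a progress percentage as
    # completed tasks / total tasks over the currently active jobs.
    tracker = sc.statusTracker()
    total_tasks, completed_tasks = 0, 0
    for job_id in tracker.getActiveJobsIds():
        job = tracker.getJobInfo(job_id)
        if job is None:
            continue
        for stage_id in job.stageIds:
            stage = tracker.getStageInfo(stage_id)
            if stage is None:
                continue
            total_tasks += stage.numTasks
            completed_tasks += stage.numCompletedTasks
    progress = int(100.0 * completed_tasks / total_tasks) if total_tasks else 0
    print(progress, "%")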

How to import the Delta Lake module in a Zeppelin notebook with PySpark?

Submitted by 左心房为你撑大大i on 2020-06-14 07:56:11
Question: I am trying to use Delta Lake in a Zeppelin notebook with PySpark, and it seems the module cannot be imported successfully. For example, %pyspark from delta.tables import * fails with the following error: ModuleNotFoundError: No module named 'delta'. However, there is no problem saving/reading the data frame using the delta format, and the module loads successfully when using Scala Spark (%spark). Is there any way to use Delta Lake in Zeppelin with PySpark? Answer 1: Finally managed to load it in Zeppelin PySpark…
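One hedged workaround sketch, not necessarily the exact fix from this thread: the Python delta package is bundled inside the delta-core JAR, so after adding that JAR through spark.jars.packages in the Spark interpreter setting, pointing the Python path at the downloaded JAR can make the import resolve. The artifact version and the ivy cache path below are placeholders; use whatever your Spark and Scala versions actually pulled down.

    %pyspark
    # Hedged sketch: expose the Python module bundled in the delta-core JAR.
    # Assumes spark.jars.packages (interpreter setting) already lists something
    # like io.delta:delta-core_2.11:0.6.1 and that the JAR landed in the usual
    # ivy cache location; both values are placeholders, not verified.
    sc.addPyFile("/root/.ivy2/jars/io.delta_delta-core_2.11-0.6.1.jar")  # a JAR is a zip, so addPyFile can put it on the Python path
    from delta.tables import DeltaTable
    print(DeltaTable)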

Zeppelin: How to restart the SparkContext in Zeppelin

Submitted by 不想你离开。 on 2020-05-25 15:27:17
Question: I am using the isolated mode of Zeppelin's Spark interpreter; in this mode it starts a new job for each notebook on the Spark cluster. I want to kill the job via Zeppelin when the notebook execution is completed. For this I called sc.stop, which stopped the SparkContext, and the job was also stopped on the Spark cluster. But the next time I try to run the notebook, it does not start the SparkContext again. How can I do that? Answer 1: It's a bit counter-intuitive, but you need to access the interpreter menu tab…
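As an alternative to stopping sc from inside the note, Zeppelin also exposes a REST endpoint for restarting an interpreter setting, which gives a fresh SparkContext on the next run. The sketch below assumes a Zeppelin server reachable at localhost:8080 with no authentication; the interpreter setting id at the end of the URL (visible on the Interpreter page) is a placeholder.

    # Hedged sketch using Zeppelin's documented REST API:
    # PUT /api/interpreter/setting/restart/<settingId>
    # Host, port and the setting id are placeholders for your deployment.
    import requests

    setting_id = "spark"  # replace with the actual interpreter setting id
    resp = requests.put("http://localhost:8080/api/interpreter/setting/restart/" + setting_id)
    print(resp.status_code, resp.text)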
