Apache Zeppelin & Spark Streaming: Twitter Example only works locally

Posted by 百般思念 on 2019-12-06 13:35:58

Question


I just added the example project from http://zeppelin-project.org/docs/tutorial/tutorial.html (section "Tutorial with Streaming Data") to my Zeppelin notebook. The problem is that the application only seems to work locally. If I change the Spark interpreter setting "master" from "local[*]" to "spark://master:7077", the same SQL statement no longer returns any results. Am I doing something wrong? I have already restarted the Zeppelin interpreter, the whole Zeppelin daemon, and the Spark cluster, but nothing solved the issue. Can someone help?

I use the following installation:

  • Spark 1.5.1 (prebuilt for Hadoop 2.6+), Master + 2x Slaves
  • Zeppelin 0.5.5 (installed on Spark's master node)

EDIT: The following installation doesn't work for me either:

  • Spark 1.5.0 (prebuilt for Hadoop 2.6+), Master + 2x Slaves
  • Zeppelin 0.5.5 (installed on Spark's master node)

Screenshot: local setting (works!)

Screenshot: cluster setting (doesn't work!)

The job seems to run correctly in cluster mode:


Answer 1:


I figured it out after two days of trying!

The difference between the local Zeppelin Spark interpreter and the Spark cluster seems to be that the local one includes the Twitter utils, which are needed to run the Twitter streaming example, while the Spark cluster doesn't have this library by default.

Therefore you have to add the dependency manually in the Zeppelin notebook before starting the application with the Spark cluster as master. The first paragraph of the notebook must be:

%dep
z.reset
z.load("org.apache.spark:spark-streaming-twitter_2.10:1.5.1")

If an error occurs when running this paragraph, try restarting the Zeppelin server via ./bin/zeppelin-daemon.sh stop (and then start).
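
To check whether the Twitter library is visible after loading the dependency, a small class lookup can help. This is only a minimal sketch and not part of the original answer: the helper name classAvailable is made up for illustration, and the class name org.apache.spark.streaming.twitter.TwitterUtils is assumed to be the entry point shipped in the spark-streaming-twitter artifact.

```scala
import scala.util.Try

// Hypothetical helper (not from the original answer): returns true if the
// named class can be found by the current thread's context class loader.
// The class is only looked up, not initialized.
def classAvailable(name: String): Boolean =
  Try(Class.forName(name, false, Thread.currentThread.getContextClassLoader)).isSuccess

// Assumed class name of the Twitter streaming utilities.
val twitterUtils = "org.apache.spark.streaming.twitter.TwitterUtils"

println("TwitterUtils visible: " + classAvailable(twitterUtils))
```

Run in a notebook paragraph, this only reports on the interpreter process itself. Since the answer suggests the library is missing on the cluster side, the same lookup would have to be executed inside a Spark job to say anything about the executors.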



Source: https://stackoverflow.com/questions/34296894/apache-zeppelin-spark-streaming-twitter-example-only-works-local
