How to set up Zeppelin to work with remote EMR Yarn cluster

守給你的承諾、 提交于 2019-12-03 08:30:25

[1] install Zeppelin with proper params:

git clone https://github.com/apache/incubator-zeppelin.git ~/zeppelin;
cd ~/zeppelin;
mvn clean package -Pspark-1.4 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests

[2] Update EMR_MASTER EC2 security groups to accept incoming requests from all ports, to communicate with Zeppelin (should be specific port, not yet know which)

[3] Copy directory EMR_MASTER:/etc/hadoop/conf to MY_STANDALONE_SERVER:/home/zeppelin/hadoop-conf.

[4] zeppelin/conf/zeppelin-env.sh should contain:

export MASTER=yarn-client
export HADOOP_CONF_DIR=/home/zeppelin/hadoop-conf

Note: Spark parameters like spark.executor.instances are taken from Interpreter settings, is specified there.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!