Spark not installed on EMR cluster

北战南征 submitted on 2019-12-08 13:13:23

Question


I have been using Spark on an EMR cluster for a few weeks without problems. The setup used AMI 3.8.0 with Spark 1.3.1, and I passed '-x' as an argument to Spark (without it, Spark didn't seem to be installed).

I want to upgrade to a more recent version of Spark, so today I spun up a cluster with the emr-4.1.0 AMI, which contains Spark 1.5.0. Once the cluster is up, it claims to have installed Spark successfully (at least on the cluster management page on AWS). But when I ssh in as 'hadoop@[IP address]', I don't see anything in the 'hadoop' directory, which is where Spark was installed in the previous version. I had the same result with other applications, and Spark is not there when I ssh in as ec2-user either.

With the emr-4.1.0 AMI I also don't have the option to pass the '-x' argument to Spark, so I'm wondering whether I'm missing something.

Does anyone know what I'm doing wrong here?

Many thanks.


Answer 1:


This turned out to have a rather trivial solution.

In the previous AMI, the paths to Spark and the other applications were soft links in the hadoop folder. In the newer AMI these links have been removed, but the applications are still installed and can be run directly from the command line, for example with 'spark-shell'.
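To confirm this on a running cluster, a small check like the one below may help. It is only a sketch: `locate_cmd` is a hypothetical helper name, `spark-submit` and `pyspark` are standard Spark launchers added here for illustration, and `readlink -f` assumes GNU coreutils. On emr-4.x releases the real files are usually expected under /usr/lib/spark, but treat that path as an assumption to verify rather than a guarantee.

```shell
#!/bin/sh
# Print where a command on PATH actually lives, following any symlinks.
# locate_cmd is a hypothetical helper, not part of EMR or Spark.
locate_cmd() {
    cmd_path=$(command -v "$1") || { echo "$1: not found on PATH"; return 1; }
    # readlink -f resolves the whole symlink chain to the real file
    echo "$1 -> $(readlink -f "$cmd_path")"
}

# spark-shell is the launcher named in the answer; the other two are
# assumed to be installed alongside it.
for cmd in spark-shell spark-submit pyspark; do
    locate_cmd "$cmd" || true
done
```

If every launcher is reported as not found, Spark was probably not selected as an application when the cluster was created.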



Source: https://stackoverflow.com/questions/33619400/spark-not-installed-on-emr-cluster
