How do I set an environment variable in a YARN Spark job?

…衆ロ難τιáo~ submitted on 2019-11-29 08:03:33

So I found the answer to this while writing up the question (sorry, reputation seekers). The problem is that CDH5 ships Spark 1.0.0 and that I was running the job via YARN. Apparently, in that version YARN mode pays no attention to the executor environment settings and instead reads the environment variable SPARK_YARN_USER_ENV to set up the environment of the processes it launches. So making sure SPARK_YARN_USER_ENV contains ACCUMULO_CONF_DIR=/etc/accumulo/conf works: ACCUMULO_CONF_DIR is then visible in the environment at the point indicated in the question's source example.
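As a rough illustration of that workaround, here is a minimal Python sketch that prepares the environment for a job launcher. It assumes SPARK_YARN_USER_ENV holds a comma-separated list of KEY=VALUE pairs, as described in the Spark 1.0 YARN documentation. The helper name with_yarn_user_env is hypothetical and not part of Spark; it only shows one way to add ACCUMULO_CONF_DIR without clobbering entries that are already set.

```python
import os


def with_yarn_user_env(base_env, name, value):
    """Return a copy of base_env with name=value merged into SPARK_YARN_USER_ENV.

    Hypothetical helper, not part of Spark. Assumes the variable is a
    comma-separated list of KEY=VALUE pairs.
    """
    env = dict(base_env)
    existing = env.get("SPARK_YARN_USER_ENV", "")
    # Keep existing entries, dropping any earlier definition of the same name.
    entries = [e for e in existing.split(",")
               if e and not e.startswith(name + "=")]
    entries.append(name + "=" + value)
    env["SPARK_YARN_USER_ENV"] = ",".join(entries)
    return env


# Build the environment to hand to whatever launches the Spark job.
env = with_yarn_user_env(os.environ, "ACCUMULO_CONF_DIR", "/etc/accumulo/conf")
print(env["SPARK_YARN_USER_ENV"])
```

The resulting dictionary can be passed as the env argument of subprocess.run when invoking your usual job submission command. Exporting the variable in the shell before submitting achieves the same thing.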

This difference in how standalone mode and YARN mode work resulted in SPARK-1680, which is reported as fixed in Spark 1.1.0.
