Spark submit to YARN as another user

一向 2020-12-16 02:10

Is it possible to submit a Spark job to a YARN cluster and choose, either on the command line or inside the jar, which user will "own" the job?

The spark-submit

4 Answers
  •  再見小時候
    2020-12-16 02:49

    Another (much safer) approach is to use proxy authentication: you create a service account and then allow it to impersonate other users.

    $ spark-submit --help 2>&1 | grep proxy
      --proxy-user NAME           User to impersonate when submitting the application.
    

    This assumes a Kerberized (secured) cluster.

    I say it's much safer because you don't need to store (and manage) keytabs for all the users you will have to impersonate; only the service account needs one.
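With only the service account's keytab in hand, a submission on behalf of an end user might look roughly like the sketch below. The Kerberos realm, keytab path, user name, class name, and jar name are all hypothetical placeholders, and the `run` / `submit_as` helpers are illustrative names rather than part of Spark or Hadoop.

```shell
#!/bin/sh
# Sketch: authenticate as the service account, then submit as another user.
# The principal and keytab path are assumptions; adjust to your environment.
SERVICE_PRINCIPAL="svc_spark_prd@EXAMPLE.COM"
SERVICE_KEYTAB="/etc/security/keytabs/svc_spark_prd.keytab"

# Run a command, or just print it when DRY_RUN=1 (handy for checking flags).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = user to impersonate, $2 = main class, $3 = application jar
submit_as() {
  # Obtain a Kerberos ticket for the service account from its keytab.
  run kinit -k -t "$SERVICE_KEYTAB" "$SERVICE_PRINCIPAL"
  # Submit to YARN; the application runs as (is "owned" by) the proxy user.
  run spark-submit --master yarn --deploy-mode cluster \
    --proxy-user "$1" --class "$2" "$3"
}
```

For example, `DRY_RUN=1; submit_as alice com.example.MyApp my-app.jar` prints the two commands without executing them, while leaving `DRY_RUN` unset runs them for real.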

    To enable impersonation, you need to configure several settings on the Hadoop side that specify which account(s) can impersonate which users or groups, and from which servers. Let's say you have created a svc_spark_prd service account/user.

    hadoop.proxyuser.svc_spark_prd.hosts - list of fully-qualified domain names of the servers that are allowed to submit impersonated Spark applications. * is allowed (meaning any host) but not recommended.

    Also specify either hadoop.proxyuser.svc_spark_prd.users or hadoop.proxyuser.svc_spark_prd.groups to list the users or groups that svc_spark_prd is allowed to impersonate. * is allowed (meaning any user/group) but not recommended.
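As an illustration of the settings above, the following sketch prints the corresponding `<property>` entries, which would normally go into Hadoop's core-site.xml. The host names and group names are made up for the example, and `proxyuser_properties` is a hypothetical helper, not a Hadoop tool.

```shell
#!/bin/sh
# Sketch: print core-site.xml entries that let a service account impersonate
# members of the given groups, but only from the given hosts.
# $1 = service account, $2 = comma-separated hosts, $3 = comma-separated groups
proxyuser_properties() {
  svc="$1"; allowed_hosts="$2"; allowed_groups="$3"
  cat <<EOF
<property>
  <name>hadoop.proxyuser.${svc}.hosts</name>
  <value>${allowed_hosts}</value>
</property>
<property>
  <name>hadoop.proxyuser.${svc}.groups</name>
  <value>${allowed_groups}</value>
</property>
EOF
}

# Hypothetical edge nodes and groups for the svc_spark_prd account.
proxyuser_properties svc_spark_prd \
  "edge01.example.com,edge02.example.com" "analysts,data_eng"
```

After changing these values, the NameNode and ResourceManager typically need to pick them up, for instance via `hdfs dfsadmin -refreshSuperUserGroupsConfiguration` and `yarn rmadmin -refreshSuperUserGroupsConfiguration`, or a restart.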

    Also, check out the documentation on proxy authentication.

    Apache Livy, for example, uses this approach to submit Spark jobs on behalf of other end users.
