Hadoop jobs fail when submitted by users other than yarn (MRv2) or mapred (MRv1)

廉价感情. Posted on 2019-12-01 06:03:21

You need to set up a staging directory for each user in the cluster. This is not as complicated as it sounds.

Check the following properties:

<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
<source>core-default.xml</source>
</property>

This basically sets up a tmp directory for each user.

Tie this to your staging directory:

<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>${hadoop.tmp.dir}/mapred/staging</value>
<source>mapred-default.xml</source>
</property>
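To see what the two values above expand to, here is a minimal sketch. It assumes both properties keep the default values quoted above; staging_root_for is a hypothetical helper name, not a Hadoop command. Keep in mind that ${user.name} is the Java system property of whichever process reads the configuration, so pass the account that process runs as.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print what
# mapreduce.jobtracker.staging.root.dir expands to for a given
# value of user.name, assuming the default values quoted above.
staging_root_for() {
  user_name="$1"
  hadoop_tmp_dir="/tmp/hadoop-${user_name}"          # hadoop.tmp.dir
  printf '%s\n' "${hadoop_tmp_dir}/mapred/staging"   # ${hadoop.tmp.dir}/mapred/staging
}

staging_root_for alice   # prints /tmp/hadoop-alice/mapred/staging
```

If the printed path does not exist or is not writable by the submitting user, that is the first thing to check.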

Let me know if this works, or if it is already set up this way.

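These properties normally belong in core-site.xml (hadoop.tmp.dir) and mapred-site.xml (the staging property) rather than yarn-site.xml, going by the <source> entries above.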

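This worked for me; I just set this property in MRv1:

<property>
    <name>hadoop.security.authorization</name>
    <value>simple</value>
</property>

Note that hadoop.security.authorization is documented as a boolean, while simple is a value of hadoop.security.authentication, so this setting most likely just leaves service-level authorization at its default of false (disabled).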

Please go through this:

Access Control Lists

${HADOOP_CONF_DIR}/hadoop-policy.xml defines an access control list for each Hadoop service. Every access control list has a simple format:

The list of users and the list of groups are both comma-separated lists of names. The two lists are separated by a space.

Example: user1,user2 group1,group2.

Add a blank at the beginning of the line if only a list of groups is to be provided; equivalently, a comma-separated list of users followed by a space or nothing implies only a set of given users.

A special value of * implies that all users are allowed to access the service.
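The format rules above can be sketched as a small shell function. This is a simplified illustration of the rules as described here, not Hadoop's own ACL implementation; acl_allows and its three arguments are hypothetical names, and the user's group list is passed in by hand instead of being looked up.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the user, or one of the
# user's groups, is allowed by an ACL of the form "user1,user2 group1,group2".
# $1 = ACL string, $2 = user name, $3 = comma-separated groups of that user.
acl_allows() {
  acl="$1"; user="$2"; groups="$3"

  # A value of "*" allows everyone.
  [ "$acl" = "*" ] && return 0

  # Split on the first space: users before it, groups after it.
  case "$acl" in
    *" "*) acl_users="${acl%% *}"; acl_groups="${acl#* }" ;;
    *)     acl_users="$acl";       acl_groups="" ;;
  esac

  # Check the user list (commas added so only whole names match).
  if [ -n "$user" ]; then
    case ",$acl_users," in
      *",$user,"*) return 0 ;;
    esac
  fi

  # Check each of the user's groups against the group list.
  old_ifs="$IFS"; IFS=","
  for g in $groups; do
    [ -n "$g" ] || continue
    case ",$acl_groups," in
      *",$g,"*) IFS="$old_ifs"; return 0 ;;
    esac
  done
  IFS="$old_ifs"
  return 1
}

acl_allows "user1,user2 group1,group2" user3 "group2" && echo "allowed via group2"
acl_allows " group1" user1 "staff" || echo "denied: groups-only ACL, no matching group"
```

The second call shows the leading blank in action: with " group1" the user list is empty, so only group membership counts.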

Refreshing Service Level Authorization Configuration

The service-level authorization configuration for the NameNode and JobTracker can be changed without restarting either of the Hadoop master daemons. The cluster administrator can change ${HADOOP_CONF_DIR}/hadoop-policy.xml on the master nodes and instruct the NameNode and JobTracker to reload their respective configurations via the -refreshServiceAcl switch to the dfsadmin and mradmin commands respectively.

Refresh the service-level authorization configuration for the NameNode:

$ bin/hadoop dfsadmin -refreshServiceAcl

Refresh the service-level authorization configuration for the JobTracker:

$ bin/hadoop mradmin -refreshServiceAcl

Of course, one can use the security.refresh.policy.protocol.acl property in ${HADOOP_CONF_DIR}/hadoop-policy.xml to restrict access to the ability to refresh the service-level authorization configuration to certain users/groups.

Examples

Allow only users alice, bob and users in the mapreduce group to submit jobs to the MapReduce cluster:

<property>
     <name>security.job.submission.protocol.acl</name>
     <value>alice,bob mapreduce</value>
</property>

Allow only DataNodes running as the users who belong to the group datanodes to communicate with the NameNode:

<property>
     <name>security.datanode.protocol.acl</name>
     <value> datanodes</value>
</property>

Allow any user to talk to the HDFS cluster as a DFSClient:

<property>
     <name>security.client.protocol.acl</name>
     <value>*</value>
</property>
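If you write several of these entries, a small generator keeps the users-then-space-then-groups layout consistent. This is only a sketch: policy_property is a hypothetical helper, it does no XML escaping, and it assumes plain user and group names.

```shell
#!/bin/sh
# Hypothetical helper: print a hadoop-policy.xml <property> element.
# $1 = property name, $2 = comma-separated users (may be empty),
# $3 = comma-separated groups (may be empty).
policy_property() {
  name="$1"; users="$2"; groups="$3"
  if [ -n "$groups" ]; then
    value="${users} ${groups}"   # with no users this keeps the leading blank
  else
    value="$users"
  fi
  printf '<property>\n     <name>%s</name>\n     <value>%s</value>\n</property>\n' \
    "$name" "$value"
}

policy_property security.job.submission.protocol.acl "alice,bob" "mapreduce"
```

After editing hadoop-policy.xml with the generated entries, reload it with the -refreshServiceAcl commands shown earlier.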