Question
Oozie SSH action issue:
We are trying to run a few commands on a particular host in our cluster, and we chose the Oozie SSH action for this. The action has been failing with an SSH authentication error for some time now. What is the real cause here? Please point me towards the solution.
Logs:
AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 USER@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added host,1.2.3.4 (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
org.apache.oozie.action.ActionExecutorException: AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added 1.2.3.4,192.168.34.208 (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
at org.apache.oozie.action.ssh.SshActionExecutor.execute(SshActionExecutor.java:589)
at org.apache.oozie.action.ssh.SshActionExecutor.start(SshActionExecutor.java:204)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
at org.apache.oozie.command.XCommand.call(XCommand.java:277)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added '1.2.3.4,1.2.3.4' (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
at org.apache.oozie.action.ssh.SshActionExecutor.executeCommand(SshActionExecutor.java:340)
at org.apache.oozie.action.ssh.SshActionExecutor.setupRemote(SshActionExecutor.java:373)
at org.apache.oozie.action.ssh.SshActionExecutor$1.call(SshActionExecutor.java:206)
at org.apache.oozie.action.ssh.SshActionExecutor$1.call(SshActionExecutor.java:204)
at org.apache.oozie.action.ssh.SshActionExecutor.execute(SshActionExecutor.java:547)
... 10 more
2013-10-09 12:48:25,982 WARN org.apache.oozie.command.wf.ActionStartXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[0000000-131008185935754-oozie-oozi-W@action1] Suspending Workflow Job id=0000000-131008185935754-oozie-oozi-W
2013-10-09 12:48:27,204 WARN org.apache.oozie.command.coord.CoordActionUpdateXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[0000000-131008185935754-oozie-oozi-W@action1] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
2013-10-09 12:59:57,477 INFO org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] STARTED WorkflowKillXCommand for jobId=0000000-131008185935754-oozie-oozi-W
2013-10-09 12:59:57,685 WARN org.apache.oozie.command.coord.CoordActionUpdateXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
2013-10-09 12:59:57,686 INFO org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] ENDED WorkflowKillXCommand for jobId=0000000-131008185935754-oozie-oozi-W
2013-10-09 13:41:32,654 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725
2013-10-09 13:41:45,199 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725
2013-10-09 13:42:04,869 WARN org.apache.oozie.command.wf.ResumeXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [workflow's status is KILLED is not SUSPENDED], Error Code: E1100
2013-10-09 13:45:56,357 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725
Approaches tried:
- Set up password-less SSH
- Set up user proxies
- Granted permissions on the required folders
Thanks,
Kasa.
Answer 1:
I just hit a similar problem. I had a case where I could run as USER:
ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 USER@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/
by hand on the command line and it worked, but when launched via Oozie as USER it failed.
In my case it failed because I had set up passwordless ssh between USER on the Oozie server and USER on the remote machine. What is actually needed is passwordless ssh between the oozie user on the Oozie server and USER on the remote machine. In other words, su to oozie on the Oozie server and run the above command by hand. If it fails there, it will fail in Oozie. If it works, it should work in Oozie (assuming everything else is correct, such as directory permissions).
Take a look at what user your oozie server is running as:
ps -ef | grep oozie
Whatever user that is needs passwordless ssh to USER on the remote machine.
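The manual check described above can be scripted. The following is a minimal sketch, assuming the Oozie server runs as a local account (called oozie here) and that sudo is available on the Oozie server. The function name check_oozie_ssh and the SSH_OPTS variable are made up for illustration. It reuses the ssh options shown in the Oozie log but runs the harmless remote command true instead of mkdir.

```shell
#!/bin/sh
# Sketch: repeat Oozie's ssh call as the account that runs the Oozie server.
# The options below are copied from the command shown in the Oozie log.
SSH_OPTS="-o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20"

# Usage: check_oozie_ssh OOZIE_ACCOUNT REMOTE_USER@REMOTE_HOST
# Exit status 0 means key-based login works for that account.
check_oozie_ssh() {
  # SSH_OPTS is left unquoted on purpose so it splits into separate arguments.
  sudo -u "$1" ssh $SSH_OPTS "$2" true
}
```

For example, check_oozie_ssh oozie USER@1.2.3.4 should print "Permission denied" while the keys are missing and return silently with status 0 once passwordless ssh is in place for that account.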
Answer 2:
quux00's answer is right; I am just adding a few points to it. Because the ssh command in the SSH action is executed by the oozie user, you need to give the oozie user a bash login shell.
To do that, change the /etc/passwd file on all the nodes of the cluster. Look for a line similar to the following in /etc/passwd:
oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/false
and change it to
oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/bash
This gives the oozie user a bash login shell. Then set up password-less authentication between the oozie user and whichever user you need on the target host.
Then rerun the Oozie job and let me know if it works. Hope it helps!
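Before editing /etc/passwd it can help to check what login shell the account currently has. The helper below is a small sketch that parses a passwd-format line like the ones shown above. The function names login_shell_of and shell_status are hypothetical, and treating /bin/false and nologin as "blocked" is an assumption about how such accounts are usually locked down.

```shell
#!/bin/sh
# Print the login shell (7th colon-separated field) of a passwd-format line.
login_shell_of() {
  printf '%s\n' "$1" | cut -d: -f7
}

# Print "blocked" if the shell is empty, false, or nologin; otherwise "usable".
shell_status() {
  case "$(login_shell_of "$1")" in
    ''|*/false|*/nologin) echo "blocked" ;;
    *) echo "usable" ;;
  esac
}
```

On a Linux node this could be called as shell_status "$(getent passwd oozie)". If it prints blocked, running sudo usermod -s /bin/bash oozie is an alternative to editing /etc/passwd by hand, assuming the usermod tool is installed.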
Answer 3:
This is a very tricky problem and I could only hack around it. I wasn't satisfied with the answers given, so here is my version. The following failed for me (I could see it in the logs):
ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@XXX.XX.XX.XXX mkdir -p oozie-oozi/0000067-130808155814753-oozie-oozi-W/mysshjob--ssh/
But when I ran the same command with KbdInteractiveDevices=no removed, or changed to KbdInteractiveDevices=pam, it worked:
ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=pam -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@XXX.XX.XX.XXX mkdir -p oozie-oozi/0000067-130808155814753-oozie-oozi-W/mysshjob--ssh/
In any case, I think there was an issue with an old ssh key, so I tried the following and it works:
$ ssh-keygen -t dsa
$ cat ~/.ssh/id_dsa.pub > ~/.ssh/authorized_keys2
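Note that the redirect with > above replaces whatever keys were already in the file. A more cautious variant, sketched below under the assumption that sshd runs with its default StrictModes check (which rejects key files in directories that are writable by others), appends the key only if it is missing and tightens the permissions. The function name install_pubkey is hypothetical.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file without clobbering existing entries.
# Usage: install_pubkey PUBKEY_FILE AUTHORIZED_KEYS_FILE
install_pubkey() {
  pub="$1"
  auth="$2"
  dir=$(dirname "$auth")
  # sshd's StrictModes check expects a private directory and key file.
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  touch "$auth" && chmod 600 "$auth" || return 1
  # -x matches whole lines, -F treats the key as a fixed string, -f reads it from the file.
  # Append only when the key is not already listed.
  grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
}
```

With the file names from this answer, the call would be install_pubkey ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys2.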
Answer 4:
After following the suggestion above, that is, changing
oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/false
to
oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/bash
try these steps.
Set up password-less communication as the oozie user:
sudo su - oozie
oozie@localhost: ssh-keygen -t dsa
Copy the generated public key to the account on your remote server, for example
apps@XXXXXXX
Then try
ssh apps@XXXXXXX
You should log in to the remote host without an error. After that:
- go to HUE, select the SSH action, and give your bash command, for example
bash -x yourscript
- add any parameters and save
- submit
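The key setup in these steps can be sketched as one function, meant to be run as the oozie user after sudo su - oozie. This is an illustration, not the answerer's exact procedure: it uses an RSA key instead of the DSA key above, because newer OpenSSH releases disable DSA keys by default, and it relies on the ssh-copy-id helper being installed. TARGET, KEY, RUN, and the function name are hypothetical. Setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: create a key for the current (oozie) user, install it remotely, verify login.
TARGET="${TARGET:-apps@XXXXXXX}"     # placeholder remote account from the answer
KEY="${KEY:-$HOME/.ssh/id_rsa}"      # assumed key location
RUN="${RUN:-}"                       # set RUN=echo for a dry run

setup_passwordless_ssh() {
  # Generate a key pair with an empty passphrase only if none exists yet.
  if [ ! -f "$KEY" ]; then
    $RUN ssh-keygen -t rsa -N '' -f "$KEY" || return 1
  fi
  # Append the public key to the remote authorized_keys (asks for the password once).
  $RUN ssh-copy-id -i "$KEY.pub" "$TARGET" || return 1
  # BatchMode=yes makes ssh fail instead of prompting, as Oozie's call would.
  $RUN ssh -o BatchMode=yes "$TARGET" true
}
```

A dry run with RUN=echo setup_passwordless_ssh shows the three commands; a real run that ends with exit status 0 indicates that the SSH action should be able to authenticate as that remote user.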
Source: https://stackoverflow.com/questions/19272430/oozie-ssh-action