Unable to execute MPICH2 on multiple machines on Ubuntu 12.04 (HYDU_sock_connect issue)

Posted by 烂漫一生 on 2019-12-25 08:48:33

Question


I am having trouble running an MPI program across two machines. The OS is Ubuntu 12.04 and the MPI implementation is MPICH2.

ssh is working fine:

root@ubuntu:/home# ssh 192.168.1.9
root@gpuguy's password: 
Welcome to Ubuntu 12.04.3 LTS (GNU/Linux 3.8.0-29-generic i686)

 * Documentation:  https://help.ubuntu.com/

131 packages can be updated.
67 updates are security updates.

Last login: Thu Oct 24 17:36:25 2013 from ubuntu.local
root@gpuguy:~# 

But when I run my MPI program, it fails:

root@ubuntu:/home# mpiexec -f hosts.cfg -n 4 hello
root@192.168.1.9's password:
[proxy:0:0@gpuguy] HYDU_sock_connect (./utils/sock/sock.c:171): unable to get host address for ubuntu (1)
[proxy:0:0@gpuguy] main (./pm/pmiserv/pmip.c:209): unable to connect to server ubuntu at port 42104 (check for firewalls!)

I have already disabled the firewall on both machines, which is why I can ssh successfully. How can I solve this issue?

My MPI code runs successfully on a single machine.


Answer 1:


For MPICH (or any MPI implementation that launches remote processes over SSH) to work, you need passwordless SSH set up, so that mpiexec can start processes on the other machine without stopping at a password prompt. I should also mention that you shouldn't need to be logged in as root to make this work. Staying logged in as root all the time is generally a very bad idea.
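
A minimal sketch of that setup, assuming OpenSSH with ssh-copy-id is installed on the launching machine. The address 192.168.1.9 and the host name ubuntu are taken from the question. The function names are hypothetical helpers, not part of MPICH. Because the error output also says "unable to get host address for ubuntu", the sketch includes a name-resolution check. Whether resolution is the actual cause here is an assumption.

```shell
#!/bin/sh
# Values from the question; override them through the environment if yours differ.
REMOTE="${REMOTE:-192.168.1.9}"        # node that mpiexec reaches over SSH
LAUNCH_HOST="${LAUNCH_HOST:-ubuntu}"   # host name of the machine running mpiexec

# Create a passphrase-less key pair (only if none exists) and install the
# public key on the remote node. ssh-copy-id asks for the password one last time.
setup_key() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    ssh-copy-id "$REMOTE"
}

# Succeeds only if login works with no prompt at all: BatchMode makes ssh
# fail instead of asking for a password.
check_passwordless() {
    ssh -o BatchMode=yes "$REMOTE" true
}

# Succeeds if the given host name resolves on the machine where it runs.
can_resolve() {
    getent hosts "$1" >/dev/null
}
```

One possible way to use it: source the file, run setup_key once, then confirm with check_passwordless. To see whether the remote node can look up the launching machine, run `ssh "$REMOTE" getent hosts "$LAUNCH_HOST"`. If that prints nothing, adding a line for that name and its IP address to /etc/hosts on the remote node is a commonly suggested fix.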



Source: https://stackoverflow.com/questions/19565795/unable-to-execute-mpich2-on-multiple-machines-on-ubuntu-12-04-hydu-sock-connect
