slurm: How to connect front-end with compute nodes?

Submitted by 岁酱吖の on 2019-12-06 00:14:40

In the configuration file, try removing ControlAddr=127.0.0.1, or replace it with the IP address of ebloc. The address 127.0.0.1 basically means 'myself', and ControlAddr is what slurmd uses to connect to the controller, so with that setting each compute node looks for the controller on itself.

Also remove NodeHostName=localhost NodeAddr=127.0.0.1, for the same reason.
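
A quick way to spot leftover loopback entries is to search the configuration file for them. The following is only a sketch: find_loopback is a hypothetical helper name, not part of Slurm, and /etc/slurm/slurm.conf is an assumed path that differs between installations.

```shell
# find_loopback FILE: print (with line numbers) any slurm.conf lines that
# still point the controller or a node at the loopback address.
# Hypothetical helper, not part of Slurm.
find_loopback() {
    grep -niE '(ControlAddr|NodeAddr)[[:space:]]*=[[:space:]]*127\.0\.0\.1|NodeHostName[[:space:]]*=[[:space:]]*localhost' "$1"
}

# Example (assumed path; adjust to your installation):
#   find_loopback /etc/slurm/slurm.conf
```

If the function prints nothing, none of the loopback settings mentioned above remain in that file.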

Also make sure that ebloc, ebloc1, and ebloc2 are indeed what hostname -s returns on those machines.
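
The hostname check can be scripted and run on each machine. This is a minimal sketch: check_name is a hypothetical helper, and the list of names is taken from this thread, so replace it with whatever your slurm.conf actually uses.

```shell
# check_name NAME EXPECTED...: succeed if NAME equals one of the expected names.
# Hypothetical helper for illustration only.
check_name() {
    name="$1"
    shift
    for expected in "$@"; do
        [ "$name" = "$expected" ] && return 0
    done
    return 1
}

# Compare this machine's short hostname with the names used in slurm.conf.
if check_name "$(hostname -s)" ebloc ebloc1 ebloc2; then
    echo "short hostname matches a name in slurm.conf"
else
    echo "short hostname '$(hostname -s)' is not one of the expected names"
fi
```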

Also make sure no firewall blocks the Slurm ports in either direction between those machines, and that SELinux is disabled or permissive. Make sure slurmd and munge are both running.
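
These checks can be gathered into one function to run on a compute node. Treat it as a sketch under several assumptions: slurm_node_checks and have are hypothetical helper names, the machine is assumed to use systemd, the port is Slurm's default SlurmctldPort (6817) and may be overridden in your slurm.conf, and nc -z is only available in some netcat variants.

```shell
# have CMD: true if CMD is installed on this machine.
have() { command -v "$1" >/dev/null 2>&1; }

# slurm_node_checks CONTROLLER: run on a compute node. CONTROLLER is the
# controller's hostname (ebloc in this thread). Hypothetical helper.
slurm_node_checks() {
    controller="$1"

    # SELinux should report Disabled or Permissive.
    if have getenforce; then getenforce; fi

    # Both daemons should be reported as active (assumes systemd).
    if have systemctl; then systemctl is-active munge slurmd; fi

    # A credential encoded locally should decode without errors.
    if have munge && have unmunge; then munge -n | unmunge; fi

    # 6817 is the default SlurmctldPort; change it if slurm.conf overrides it.
    if have nc; then
        nc -z -w 3 "$controller" 6817 && echo "slurmctld port reachable"
    fi

    # Ask the controller whether it responds.
    if have scontrol; then scontrol ping; fi
}

# Example (not run automatically):
#   slurm_node_checks ebloc
```

For the other direction, the same nc test can be run from the controller against each compute node on the default SlurmdPort, 6818.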

Brian Andrus

You can only have one PartitionName line per partition. Remove the first one and put:

PartitionName=debug Nodes=ebloc2,ebloc4 Default=YES MaxTime=INFINITE State=UP

or use a hostlist expression (Slurm's bracket syntax, which is not a true regular expression):

PartitionName=debug Nodes=ebloc[2,4] Default=YES MaxTime=INFINITE State=UP
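
To confirm which nodes a bracket expression such as ebloc[2,4] covers before putting it in slurm.conf, scontrol can expand it. The wrapper below is only a sketch: expand_nodes is a hypothetical name, and it assumes scontrol is installed on the machine where it runs.

```shell
# expand_nodes EXPR: print one hostname per line for a Slurm hostlist expression.
# Hypothetical wrapper around "scontrol show hostnames".
expand_nodes() {
    if command -v scontrol >/dev/null 2>&1; then
        scontrol show hostnames "$1"
    else
        echo "scontrol not found; run this on a machine with Slurm installed" >&2
        return 1
    fi
}

# Example (not run automatically):
#   expand_nodes 'ebloc[2,4]'
```

For ebloc[2,4] this should list ebloc2 and ebloc4, the same two nodes as the comma-separated form.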