HPX minimal two node example set-up?

痴心易碎 submitted on 2019-12-03 03:10:00

Option 1: When using TCP/IP for the networking layer (usually the default):

In order for an HPX application to be able to find all connected nodes, the following information has to be provided outside of batch environments:

locality 0:
./yourapp --hpx:localities=2 --hpx:agas=node0:7910 --hpx:hpx=node0:7910 

locality 1:
./yourapp --hpx:agas=node0:7910 --hpx:hpx=node1:7910 --hpx:worker

Where node0 and node1 are the hostnames of those nodes and 7910 is an (arbitrary) TCP/IP port to use.

In other words,

  • on node0 you specify the port where HPX will listen for incoming messages on this node (--hpx:hpx=node0:7910) and the port where the main instance of the Active Global Address Space (AGAS) engine will listen (--hpx:agas=node0:7910); the other nodes use the latter to establish the initial connection. You also specify that 2 localities will connect overall (--hpx:localities=2).
  • on node1 (and all other nodes you want to connect) you specify the port where HPX will listen for incoming messages on this node (--hpx:hpx=node1:7910) and the port where the main AGAS engine can be reached on locality 0 (--hpx:agas=node0:7910). You also specify that this locality is a worker (not the 'console'), which is done by the --hpx:worker command line option.
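The two command lines described above can be assembled mechanically. Here is a minimal sketch, assuming a hypothetical helper named `hpx_cmd` (it is not part of HPX); it only prints the commands so they can be checked before being started by hand on each node.

```shell
#!/bin/sh
# hpx_cmd is a hypothetical helper for illustration; it is not part of HPX.
# It prints (does not run) the command line for one locality.
# Arguments: app, role (console|worker), AGAS host, this node's host, port, locality count
hpx_cmd() {
    app=$1 role=$2 agas_host=$3 this_host=$4 port=$5 nlocs=$6
    if [ "$role" = console ]; then
        # The console hosts the main AGAS instance and states how many localities connect.
        printf '%s --hpx:localities=%s --hpx:agas=%s:%s --hpx:hpx=%s:%s\n' \
            "$app" "$nlocs" "$agas_host" "$port" "$this_host" "$port"
    else
        # A worker points at the console's AGAS address and listens on its own host.
        printf '%s --hpx:agas=%s:%s --hpx:hpx=%s:%s --hpx:worker\n' \
            "$app" "$agas_host" "$port" "$this_host" "$port"
    fi
}

hpx_cmd ./yourapp console node0 node0 7910 2
hpx_cmd ./yourapp worker  node0 node1 7910 2
```

The only things that change between localities are the role and the host given to --hpx:hpx; the AGAS address is the same everywhere.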

Note that all of those options have one-letter shortcuts (--hpx:localities == -l, --hpx:hpx == -x, --hpx:agas == -a, and --hpx:worker == -w).

You can also run more than one locality on the same physical compute node (or your laptop). In this case it's a bit less tedious to specify things, for instance:

./yourapp -l2 -0 &
./yourapp -1

If you want to use the extended command line options in this case, make sure the ports used for -x are unique across all localities which run on the same node.
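As a rough sketch of the unique-port requirement, the loop below prints one command per locality on a single machine, giving each its own --hpx:hpx port. The function name `same_node_cmds`, the host name `localhost`, and the base port 7910 are assumptions chosen for illustration.

```shell
#!/bin/sh
# same_node_cmds is a hypothetical helper; it prints (does not run) the commands.
# "localhost" and the base port are illustrative assumptions.
# Arguments: app, number of localities, base port
same_node_cmds() {
    app=$1 nlocs=$2 base=$3
    i=0
    while [ "$i" -lt "$nlocs" ]; do
        port=$((base + i))   # unique --hpx:hpx port for each locality
        if [ "$i" -eq 0 ]; then
            echo "$app --hpx:localities=$nlocs --hpx:agas=localhost:$base --hpx:hpx=localhost:$port"
        else
            echo "$app --hpx:agas=localhost:$base --hpx:hpx=localhost:$port --hpx:worker"
        fi
        i=$((i + 1))
    done
}

same_node_cmds ./yourapp 2 7910
```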

Option 2: When using MPI (requires special build time configuration):

Just use mpirun to execute your application. It will pick up the settings either from your batch environment or from the command-line options. For instance:

mpirun -N1 -np2 ./yourapp

This will run two instances of your application on the current compute node.

I am unable to comment on an existing answer, so I will repeat some information from @hkaiser's answer: on the console/master node (what we would normally think of as rank 0) you should use a command of the form

`bin/hello_world -l2 --hpx:agas=xx.xx.xx.AA:7910 --hpx:hpx=xx.xx.xx.AA:7910`

and on the worker node you should use

`bin/hello_world --hpx:agas=xx.xx.xx.AA:7910 --hpx:hpx=xx.xx.xx.BB:7910 --hpx:worker`

But it is important that the IP address you use is the one on the nodes' external network and not one on an internal network (in the case of multiple NICs/IP addresses). To be sure I get the right address, I usually run the command

ip route get 8.8.8.8 | awk 'NR==1 {print $NF}'

on each node and use the output from that when testing.
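On some systems the `ip route get` line may carry extra fields after the source address (for example a trailing `uid` entry), in which case the last field is not the IP. A sketch that picks the value following the `src` keyword instead; `route_src` is a hypothetical helper name and the sample line is made up for illustration.

```shell
#!/bin/sh
# route_src is a hypothetical helper: it reads `ip route get` output on stdin
# and prints the field that follows the "src" keyword on the first matching line.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Sample line invented for illustration (not captured from a real node).
sample='8.8.8.8 via 192.168.1.1 dev eth0 src 192.168.1.23 uid 1000'
printf '%s\n' "$sample" | route_src

# On a real node: ip route get 8.8.8.8 | route_src
```

Whichever variant you use, check that the printed address is the one the other node can actually reach.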

Note that this IP address specification is only necessary when you are launching jobs by hand. It is not needed with mpirun or srun, as those commands spawn the jobs on the nodes allocated by the batch system and the communication is handled correctly by the HPX internals. When using a batch system but launching jobs by hand anyway (from within an interactive shell, for example), you will find that adding the option --hpx:ignore-batch-env to your command line helps stop HPX from picking up unwanted parameters.
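For hand launches from inside an allocation, a small sketch of adding the flag only when needed. It assumes that a set SLURM_JOB_ID or PBS_JOBID variable means the shell is inside a batch job; `batch_flag` is a hypothetical helper, and the command is printed rather than run.

```shell
#!/bin/sh
# batch_flag is a hypothetical helper. Assumption: SLURM_JOB_ID or PBS_JOBID
# being set means this shell runs inside a batch allocation.
batch_flag() {
    if [ -n "${SLURM_JOB_ID:-}" ] || [ -n "${PBS_JOBID:-}" ]; then
        echo "--hpx:ignore-batch-env"
    fi
}

# Print (not run) a hand-launched console command; the flag is appended
# only when one of the variables above is set.
echo ./yourapp --hpx:localities=2 --hpx:agas=node0:7910 --hpx:hpx=node0:7910 $(batch_flag)
```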

I tried with git commit 0c3174572ef5d2c from the HPX repo this morning and my result looks as follows:

Master Node

bin/hello_world --hpx:agas=148.187.68.38:7910 --hpx:hpx=148.187.68.38:7910 -l2 --hpx:threads=4
hello world from OS-thread 3 on locality 1
hello world from OS-thread 6 on locality 1
hello world from OS-thread 2 on locality 1
hello world from OS-thread 7 on locality 1
hello world from OS-thread 5 on locality 1
hello world from OS-thread 0 on locality 1
hello world from OS-thread 4 on locality 1
hello world from OS-thread 1 on locality 1
hello world from OS-thread 0 on locality 0
hello world from OS-thread 2 on locality 0
hello world from OS-thread 1 on locality 0
hello world from OS-thread 3 on locality 0

Worker Node

bin/hello_world --hpx:agas=148.187.68.38:7910 --hpx:hpx=148.187.68.36:7910 --hpx:worker --hpx:threads=8

Note that it is ok to use different numbers of threads on different nodes as I have done here (but usually the nodes are homogeneous so you use the same number of threads).

Parcelport

If you have compiled with support for MPI (for example) and you want to be sure that the TCP parcelport is used, then add

-Ihpx.parcel.tcp.enable=1 -Ihpx.parcel.mpi.enable=0

to your command line (on all nodes) so that HPX selects the TCP parcelport.
