MPI_ERR_TRUNCATE: On Broadcast

Anonymous (unverified), submitted on 2019-12-03 02:31:01

Question:

I have an int that I intend to broadcast from the root (rank == FIELD, where FIELD is 0).

    int winner;
    if (rank == FIELD) {
        winner = something;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Bcast(&winner, 1, MPI_INT, FIELD, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank != FIELD) {
        cout << rank << " informed that winner is " << winner << endl;
    }

But it appears I get

    [JM:6892] *** An error occurred in MPI_Bcast
    [JM:6892] *** on communicator MPI_COMM_WORLD
    [JM:6892] *** MPI_ERR_TRUNCATE: message truncated
    [JM:6892] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

I found that I can avoid the error by increasing the count passed to MPI_Bcast:

MPI_Bcast(&winner, NUMPROCS, MPI_INT, FIELD, MPI_COMM_WORLD); 

where NUMPROCS is the number of running processes (actually, it seems I only need it to be 2). Then it runs, but gives unexpected output:

    1 informed that winner is 103
    2 informed that winner is 103
    3 informed that winner is 103
    5 informed that winner is 103
    4 informed that winner is 103

When I print winner with cout, it should be -1.

Answer 1:

There is an error early in your code:

    if (rank == FIELD) {
        // randomly place ball, then broadcast to players
        ballPos[0] = rand() % 128;
        ballPos[1] = rand() % 64;
        MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);
    }

This is a very common mistake. MPI_Bcast is a collective operation, so every process in the communicator must call it for it to complete. In your case the broadcast is called only by the root rather than by all processes in MPI_COMM_WORLD, and so it interferes with the next broadcast operation, namely the one inside the loop. The second broadcast actually receives the message sent by the first one (two int elements) into a buffer sized for just one int, hence the truncation error.

In Open MPI every broadcast internally uses the same message tag values, so different broadcasts can interfere with each other if they are not issued in sequence. This is compliant with the (old) MPI standard: one cannot have more than one outstanding collective operation in MPI-2.2 (in MPI-3.0 one can have several outstanding non-blocking collective operations). You should rewrite the code as:

    if (rank == FIELD) {
        // randomly place ball, then broadcast to players
        ballPos[0] = rand() % 128;
        ballPos[1] = rand() % 64;
    }
    MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);

