I wonder when I need to use a barrier. Do I need it before/after a scatter/gather, for example? Or should OMPI ensure all processes have reached that point before scatter/ga
MPI_Barrier() may not be used often, but it is useful. In fact, even if you use synchronous communication, MPI_Send()/MPI_Recv() can only make sure that the two processes involved are synchronized, not the rest of the group. In my project, a CUDA+MPI project, I use only asynchronous communication. I found that in some cases, if I don't use MPI_Barrier() followed by the Wait() function, it is very likely that two processes (GPUs) try to transmit data to each other at the same time, which can badly reduce the program's efficiency. This bug once drove me mad and took me a few days to find. So think carefully about whether you need MPI_Barrier() when you use MPI_Isend()/MPI_Irecv() in your program. Sometimes synchronizing the processes is not only necessary but a MUST, especially when your program is dealing with a device.