Super EDIT:
Adding the broadcast step, will result in ncols to get printed by the two processes by the master node (from which I can check the output). But why? I mean, all variables that are broadcast have already a value in the line of their declaration!!! (off-topic image).
I have some code based on this example.
I checked that cluster configuration is OK, with this simple program, which also printed the IP of the machine that it would run onto:
int main (int argc, char *argv[])
{
  int rank, size;
  MPI_Init (&argc, &argv);      /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
  printf( "Hello world from process %d of %d\n", rank, size );
  // removed code that printed IP address
  MPI_Finalize();
  return 0;
}
which printed every machine's IP twice.
EDIT_2
If I print (only) the grid, like in the example, I am getting for one computer:
Processes grid pattern:
0 1
2 3
and for two:
Processes grid pattern:
Executables were not the same in both nodes!
When I configured the cluster, I had a very hard time, so when I had a problem with mounting, I just skipped it. So now, the changes would appear only in one node. The code would behave weirdly, unless some (or all) of the code was the same.
来源:https://stackoverflow.com/questions/31833360/mpi-code-does-not-work-with-2-nodes-but-with-1