openmpi

Configure MPI hostsfile to use multiple user identities

戏子无情 submitted on 2019-12-01 00:16:26
I want to run a program with mpirun on different sets of machines (all Linux machines with Open MPI 1.5). I have one set of machines where I log in with username A, and another set where I use username B. All machines are accessible via SSH, but I can't figure out how to achieve this. My hostsfile would be like this:

    localhost        #username local
    machine_set_A_1  #username A
    machine_set_A_2  #username A
    ...
    machine_set_B_1  #username B
    machine_set_B_2  #username B
    ...

Is it possible to achieve this? Thank you.

Answer 1: The OpenSSH client supports per-host configurations, something …
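A minimal sketch of that per-host approach, assuming the hostnames from the question and that mpirun is launched from the local machine: entries in the client's ~/.ssh/config select the login name per host pattern, so Open MPI's ssh-based launcher picks up the right identity while the hostsfile keeps plain hostnames:

    # ~/.ssh/config on the machine running mpirun
    Host machine_set_A_*
        User A
    Host machine_set_B_*
        User B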

Is it possible to send data from a Fortran program to Python using MPI?

帅比萌擦擦* submitted on 2019-11-30 22:39:34
Question: I am working on a tool to model wave energy converters, where I need to couple two software packages to each other. One program is written in Fortran, the other in C++. I need to send information from the Fortran program to the C++ program at each time step; however, the data first needs to be processed in Python before it is sent on to the C++ program. I have received a tip to use MPI to transfer the data between the programs. I am now trying to send a simple string from the Fortran code to …
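For the Python end of such a coupling, a receiver sketch using mpi4py (an assumption, since the question doesn't name a Python MPI binding), matched to a hypothetical Fortran sender that does MPI_SEND of 100 MPI_CHARACTER elements with tag 0:

    # receiver.py -- sketch of the Python side; assumes both executables are
    # started under one mpirun as an MPMD job, with the Fortran sender as rank 0:
    #   mpirun -np 1 ./fortran_sender : -np 1 python receiver.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    buf = bytearray(100)                        # must match the sender's length
    comm.Recv([buf, MPI.CHAR], source=0, tag=0)
    print("received:", buf.decode().rstrip("\x00 "))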

mpirun - not enough slots available

守給你的承諾、 submitted on 2019-11-30 17:09:37
Usually when I use mpirun, I can "overload" it, using more processors than there actually are on my computer. For example, on my four-core Mac, I can run mpirun -np 29 python -c "print 'hey'" with no problem. I'm on another machine now, which throws the following error:

    $ mpirun -np 25 python -c "print 'hey'"
    --------------------------------------------------------------------------
    There are not enough slots available in the system to satisfy the 25 slots
    that were requested by the application:
      python
    Either request fewer slots for your application, or make more slots
    available for use.
    --------------------------------------------------------------------------
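Two common ways around the slot check, sketched with the flag and hostfile syntax of Open MPI 1.8 and later (adapt to the version actually installed):

    # Ask the launcher to allow more ranks than detected slots
    mpirun --oversubscribe -np 25 python -c "print 'hey'"

    # Or declare the slots explicitly in a hostfile
    echo "localhost slots=25" > myhosts
    mpirun --hostfile myhosts -np 25 python -c "print 'hey'"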

Having Open MPI related issues while making CUDA 5.0 samples (Mac OS X ML)

雨燕双飞 submitted on 2019-11-30 09:11:11
Question: When I'm trying to make the CUDA 5.0 samples, an error appears:

    Makefile:79: * MPI not found, not building simpleMPI. Stop.

I've tried to download and build the latest version of Open MPI, referring to the Open MPI "FAQ / Platforms / OS X / 6. How do I not use the OS X-bundled Open MPI?" page, and it did not solve the error:

    make -j 4 2>&1 | tee make.out
    [ lots of output ]
    make[2]: *** [mpi/man/man3/MPI_Comm_disconnect.3] Error 127
    make[2]: *** Waiting for unfinished jobs....
    make[1]: *** [all-recursive] Error 1
    make: *** [all-recursive] Error 1

I'm really confused; for now I have no idea what to do. As …
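Error 127 from make means the shell could not find a command named in the failing recipe, which usually points at a PATH problem rather than the compiler itself. A first check worth trying, assuming Open MPI was configured with a hypothetical prefix of /usr/local/openmpi (substitute whatever --prefix was actually used):

    export PATH=/usr/local/openmpi/bin:$PATH
    which mpicc mpirun        # both should resolve to the new installation
    make clean && make -j 4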

fault tolerance in MPICH/OpenMPI

核能气质少年 submitted on 2019-11-30 08:17:49
Question: I have two questions.

Q1. Is there a more efficient way to handle error situations in MPI than checkpoint/rollback? I see that if a node "dies", the program halts abruptly. Is there any way to continue execution after a node dies (no issue if it comes at the cost of accuracy)?

Q2. I read in "http://stackoverflow.com/questions/144309/what-is-the-best-mpi-implementation" that Open MPI has better fault tolerance, and that recently MPICH-2 has also come up with similar features …
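One concrete piece of Q1 that is easy to demonstrate: MPI's default error handler (MPI_ERRORS_ARE_FATAL) aborts the whole job, while switching to MPI_ERRORS_RETURN at least turns failures into errors the program can observe. A sketch in Python with mpi4py (illustrative only; genuinely surviving a dead node still needs checkpointing or a fault-tolerant runtime):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    comm.Set_errhandler(MPI.ERRORS_RETURN)   # raise exceptions instead of aborting

    try:
        # Deliberately invalid destination rank, to provoke a recoverable error.
        comm.send(b"ping", dest=comm.Get_size())
    except MPI.Exception as err:
        print("rank", comm.Get_rank(), "caught:", err.Get_error_string())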

Initializing MPI cluster with snowfall R

早过忘川 submitted on 2019-11-30 05:16:25
Question: I've been trying to run Rmpi and snowfall on my university's clusters, but for some reason, no matter how many compute nodes I get allocated, my snowfall initialization keeps running on only one node. Here's how I'm initializing it:

    sfInit(parallel=TRUE, cpus=10, type="MPI")

Any ideas? I'll provide clarification as needed.

Answer 1: To run an Rmpi-based program on a cluster, you need to request multiple nodes using your batch queueing system, and then execute your R script from the job script via a …
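A job-script sketch of that pattern, assuming a PBS-style scheduler (the directives and script name are placeholders); the usual Rmpi convention is to launch a single R process and let sfInit/Rmpi spawn the workers across the allocated nodes:

    #!/bin/bash
    #PBS -l nodes=3:ppn=4     # hypothetical request: 3 nodes, 12 slots total
    cd $PBS_O_WORKDIR
    mpirun -np 1 R --slave -f my_snowfall_script.R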

What does it mean to configure MPI for shared memory?

老子叫甜甜 submitted on 2019-11-30 03:56:13
I have a bit of a research-related question. I have just finished implementing a structural skeleton framework based on MPI (specifically using openmpi 6.3). The framework is meant to be used on a single machine. Now I am comparing it with other, earlier skeleton implementations (such as Skandium, FastFlow, …). One thing I have noticed is that the performance of my implementation is not as good as the other implementations'. I think this is because my implementation is based on MPI (thus two-sided communication that requires the matching of send and receive operations), while the …
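For context on the question in the title: with Open MPI of that generation, "configuring for shared memory" usually means restricting the byte-transfer layers (BTLs) so on-node traffic uses the shared-memory transport instead of TCP loopback. A sketch with the 1.x-era MCA names (later series replaced sm with vader; the application name is a placeholder):

    # Single-machine run using only the shared-memory and self transports
    mpirun --mca btl self,sm -np 4 ./my_skeleton_app

    # Or persistently, in $HOME/.openmpi/mca-params.conf:
    # btl = self,sm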

When do I need to use MPI_Barrier()?

穿精又带淫゛_ submitted on 2019-11-30 01:31:35
I wonder when I need to use a barrier. Do I need it before/after a scatter/gather, for example? Or should Open MPI ensure all processes have reached that point before scatter/gather-ing? Similarly, after a broadcast, can I expect all processes to have already received the message?

Answer 1 (Markus Mayr): All collective operations in MPI before MPI-3.0 are blocking, which means that it is safe to reuse all buffers passed to them after they return. In particular, this means that all data has been received when one of these functions returns. (However, it does not imply that all data has been sent!) So MPI_Barrier is not necessary …
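A small mpi4py illustration of the answer's point (the same reasoning applies to the C and Fortran APIs): the blocking broadcast alone guarantees every rank has the data, and an explicit barrier earns its keep mainly for things like synchronizing ranks before a timing measurement:

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Blocking collective: when bcast returns, every rank holds the data.
    data = comm.bcast("hello" if rank == 0 else None, root=0)
    print(rank, "got", data)    # correct without any MPI_Barrier

    comm.Barrier()              # only needed here so all ranks start timing together
    t0 = MPI.Wtime()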
