mpi

Parallel Merge Sort using MPI

狂风中的少年 submitted on 2021-02-08 10:12:16
Question: I implemented a parallel merge sort in this code using a tree-structured scheme, but it doesn't sort the array. Could you take a look at it and tell me what is wrong? For communication among the processes I used the normal MPI_Send() and MPI_Recv(), and I used the numbers 0, 1 and 2 as tags for the fifth argument of MPI_Recv(). For 8 processes, the tree-structured scheme gives the array to the process with rank 0, which then splits the array in half and gives the right half to
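
A minimal sketch of the tree-structured scheme being described, in C with MPI (this is my own illustration, not the asker's code: the power-of-two problem size, the merge() helper and the use of tags 0 and 1 only are all assumptions). Rank 0 owns the whole array; at each level a rank hands the right half to a partner rank, sorts its own left half, receives the partner's sorted half back, and merges:

#include <mpi.h>
#include <stdlib.h>
#include <string.h>

static int cmp(const void *x, const void *y) {
    return (*(const int *)x > *(const int *)y) - (*(const int *)x < *(const int *)y);
}

/* Merge the two already-sorted halves a[0..mid-1] and a[mid..n-1]. */
static void merge(int *a, int mid, int n) {
    int *t = malloc((size_t)n * sizeof *t);
    int i = 0, j = mid, k = 0;
    while (i < mid && j < n) t[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) t[k++] = a[i++];
    while (j < n)   t[k++] = a[j++];
    memcpy(a, t, (size_t)n * sizeof *a);
    free(t);
}

/* Sort a[0..n-1]; 'height' levels of the tree remain below this rank. */
static void tree_sort(int *a, int n, int rank, int height) {
    if (height == 0) {                            /* leaf: sort locally */
        qsort(a, (size_t)n, sizeof *a, cmp);
        return;
    }
    int mid = n / 2, child = rank + (1 << (height - 1));
    MPI_Send(a + mid, n - mid, MPI_INT, child, 0, MPI_COMM_WORLD);
    tree_sort(a, mid, rank, height - 1);          /* sort the left half locally */
    MPI_Recv(a + mid, n - mid, MPI_INT, child, 1, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);                  /* sorted right half comes back */
    merge(a, mid, n);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size, height = 0, n = 1 << 20;      /* n and size assumed powers of two */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    while ((1 << height) < size) height++;

    if (rank == 0) {
        int *a = malloc((size_t)n * sizeof *a);
        for (int i = 0; i < n; i++) a[i] = rand();
        tree_sort(a, n, 0, height);               /* root owns the whole array */
        free(a);
    } else {
        /* A non-root rank is activated at the height of its lowest set bit,
           receives its chunk from its parent (its rank minus that bit),
           sorts it recursively, and sends the result back with tag 1. */
        int h = 0;
        while (((rank >> h) & 1) == 0) h++;
        int parent = rank - (1 << h);
        int count = n >> (height - h);
        int *a = malloc((size_t)count * sizeof *a);
        MPI_Recv(a, count, MPI_INT, parent, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        tree_sort(a, count, rank, h);
        MPI_Send(a, count, MPI_INT, parent, 1, MPI_COMM_WORLD);
        free(a);
    }
    MPI_Finalize();
    return 0;
}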

Using MPI-IO to write Fortran-formatted files

我与影子孤独终老i submitted on 2021-02-08 03:21:14
Question: I am trying to save a solution using the OVERFLOW-PLOT3D q-file format (defined here: http://overflow.larc.nasa.gov/files/2014/06/Appendix_A.pdf). For a single grid, it is basically:
READ(1) NGRID
READ(1) JD,KD,LD,NQ,NQC
READ(1) REFMACH,ALPHA,REY,TIME,GAMINF,BETA,TINF, &
        IGAM,HTINF,HT1,HT2,RGAS1,RGAS2, &
        FSMACH,TVREF,DTVREF
READ(1) ((((Q(J,K,L,N),J=1,JD),K=1,KD),L=1,LD),N=1,NQ)
All of the variables are double-precision numbers, except for NGRID, JD, KD, LD, NQ, NQC and IGAM, which are
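
For reference, the key to reproducing this with MPI-IO is that a Fortran unformatted sequential file brackets every record with a leading and trailing byte-count marker (4 bytes with most compilers' defaults). A rough C sketch under that assumption (the file name, the example dimensions and the write_record() helper are made up for illustration, not part of the original post):

#include <mpi.h>
#include <stdint.h>

/* Write one Fortran-style unformatted record: 4-byte length marker,
   payload, 4-byte length marker again, starting at byte offset *off. */
static void write_record(MPI_File fh, MPI_Offset *off,
                         const void *buf, int32_t nbytes)
{
    MPI_File_write_at(fh, *off, &nbytes, 4, MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_write_at(fh, *off + 4, (void *)buf, nbytes, MPI_BYTE,
                      MPI_STATUS_IGNORE);
    MPI_File_write_at(fh, *off + 4 + nbytes, &nbytes, 4, MPI_BYTE,
                      MPI_STATUS_IGNORE);
    *off += 8 + nbytes;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "q.save",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Offset off = 0;
    if (rank == 0) {                      /* rank 0 writes the small header records */
        int32_t ngrid = 1;
        int32_t dims[5] = {65, 65, 65, 6, 0};   /* JD,KD,LD,NQ,NQC (example values) */
        write_record(fh, &off, &ngrid, sizeof ngrid);
        write_record(fh, &off, dims, sizeof dims);
    }
    /* The large Q-array record would follow the same pattern: rank 0 writes the
       leading marker, every rank writes its own slab of Q at the appropriate
       offset (e.g. with MPI_File_write_at_all), and rank 0 writes the trailing
       marker. */

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}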

Spreading a job over different nodes of a cluster in Sun Grid Engine (SGE)

三世轮回 submitted on 2021-02-07 22:40:47
Question: I'm trying to get Sun Grid Engine (SGE) to spread the separate processes of an MPI job over all of the nodes of my cluster. What is happening is that each node has 12 processors, so SGE is packing my 60 processes onto 5 separate nodes, 12 per node. I'd like it to assign 2 processes to each of the 30 nodes available, because with 12 processes (DNA sequence alignments) running on each node, the nodes run out of memory. So I'm wondering if it's possible to explicitly get SGE to assign the processes to
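
For what it's worth, the knob that usually controls this in SGE is the parallel environment's allocation_rule: a fixed integer there pins that many slots to each execution host. A sketch (the PE name mpi_2per, the slot total and the script name are only examples):

# A parallel environment configured so every host gets exactly 2 slots
# (inspect with qconf -sp, create/modify with qconf -ap / qconf -mp):
pe_name            mpi_2per
slots              999
allocation_rule    2
control_slaves     TRUE
job_is_first_task  FALSE

# Then request 60 slots from that PE: 2 per host across 30 hosts.
qsub -pe mpi_2per 60 job_script.sh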

Reusable private dynamically allocated arrays in OpenMP

风流意气都作罢 submitted on 2021-02-07 08:44:45
Question: I am using OpenMP and MPI to parallelize some matrix operations in C. Some of the functions operating on the matrix are written in Fortran. The Fortran functions require a buffer array to be passed in, which is only used internally by the function. Currently I am allocating a buffer in each parallel section, similar to the code below.
int i = 0;
int n = 1024; // Actually this is read from the command line
double **a = createNbyNMat(n);
#pragma omp parallel
{
double *buf;
buf = malloc(sizeof(double)
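
One common pattern for making such buffers reusable is to allocate one per thread up front and pick the right one with omp_get_thread_num() inside each region; a sketch of that approach in C (fortran_func_(), its call signature and the buffer size are placeholders for the asker's actual Fortran routine, not taken from the post):

#include <omp.h>
#include <stdlib.h>

/* Hypothetical Fortran routine: works on one row, using buf as scratch. */
extern void fortran_func_(double *row, double *buf, const int *n);

void process(double **a, int n)
{
    int nthreads = omp_get_max_threads();

    /* One private scratch buffer per thread, allocated once and reused. */
    double **bufs = malloc((size_t)nthreads * sizeof *bufs);
    for (int t = 0; t < nthreads; t++)
        bufs[t] = malloc((size_t)n * sizeof **bufs);

    #pragma omp parallel
    {
        double *buf = bufs[omp_get_thread_num()];   /* this thread's buffer */
        #pragma omp for
        for (int i = 0; i < n; i++)
            fortran_func_(a[i], buf, &n);
    }

    /* The same bufs[] can be handed to any later parallel regions,
       so nothing is re-malloc'ed per region or per iteration. */

    for (int t = 0; t < nthreads; t++) free(bufs[t]);
    free(bufs);
}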

MPI communication complexity

霸气de小男生 submitted on 2021-02-07 03:59:37
Question: I'm studying the communication complexity of a parallel implementation of quicksort in MPI and I've found something like this in a book: "A single process gathers p regular samples from each of the other p-1 processes. Since relatively few values are being passed, message latency is likely to be the dominant term of this step. Hence the communication complexity of the gather is O(log p)" (the O is actually a Theta, and p is the number of processors). The same claim is made for the broadcast
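
One way to see why the claim can be Θ(log p) (a sketch of the standard argument, not the book's exact derivation): model each message as a latency term λ plus a per-word term, and let the gather run over a binomial tree, so the root collects everything in ⌈log₂ p⌉ rounds. Because the sample sets are tiny, the per-word term is negligible and

\[ T_{\text{gather}} \;\approx\; \lambda \,\lceil \log_2 p \rceil \;=\; \Theta(\log p), \]

whereas a naive gather in which the root posts p-1 separate receives would pay the latency p-1 times, i.e. Θ(p).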

How to install Open MPI for Xcode?

谁说胖子不能爱 submitted on 2021-02-06 13:46:47
Question: I'm trying to run some MPI programs in Xcode 4. I installed Open MPI from MacPorts by typing sudo port install openmpi, and the installation finished normally. Then I added opt/local/include/openmpi to my user header search paths and dragged "libmpi.dylib" and "libmpi_cxx.dylib" into my project. But when I tried to run the program, I got the following error message: Undefined symbols for architecture x86_64: "_MPI_Comm_accept", referenced from: MPI::Intracomm::Accept(char const*, MPI::Info
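
For context, missing MPI C++ binding symbols like these usually mean the project is not getting all the flags the Open MPI wrapper compiler would supply. A hedged way to check is to ask the wrapper what it uses and mirror that in Xcode's header/library search paths and Other Linker Flags (the wrapper's exact name and install path depend on how MacPorts set it up):

mpicxx -showme:compile   # header search paths the wrapper passes to the compiler
mpicxx -showme:link      # library paths and libraries, e.g. -lmpi and -lmpi_cxx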

NBody problem parallelization gives different results for the same input

牧云@^-^@ submitted on 2021-02-05 09:23:27
Question: This is an MPI version of the N-body problem. I already have an OpenMP version, and its results match the serial N-body version when run with one thread, but the MPI results differ, above all at the last iterations. At the first iterations the outputs are quite similar, but at the end they differ a lot.
#include <stdlib.h>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <sstream>
#include <cstring>
#include <vector>
#include <cstdlib>
#include <chrono>
#include <stdio.h>
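
One frequent cause of this kind of drift (stated as a general observation, not a diagnosis of the posted code) is that floating-point addition is not associative: if the MPI version sums the per-process partial forces in a different order than the serial code, the accelerations differ in the last bits, and those differences compound over many timesteps. A tiny C illustration of the order dependence:

#include <stdio.h>

int main(void)
{
    /* Same three terms, two grouping orders: the sums differ, which is
       enough to make long N-body runs diverge even with correct code. */
    double big = 1.0e16, small = 1.0;
    double left_to_right = (big + small) + small;  /* each small is rounded away */
    double regrouped     = big + (small + small);  /* the 2.0 survives rounding  */
    printf("%.1f\n%.1f\n", left_to_right, regrouped);
    printf("equal: %d\n", left_to_right == regrouped);
    return 0;
}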