openmp | 易学教程

Indent openmp directives as C/C++ code in emacs

Question: In a previous question, I learned how to indent macros as regular C code in Emacs. I need this only because of the #pragma omp directives from OpenMP, and I'd like to keep all other kinds of macros, such as #if and #endif, indented as the default.

    (c-set-offset (quote cpp-macro) 0 nil)

The rule above treats all macros the same. My question is: is there a way to specialize this rule?

Answer 1: If you look at M-x describe-variable c-offsets-alist , which defines a list of variables that represent

OpenMP/C++: Parallel for loop with reduction afterwards - best practice?

Question: Given the following code...

    for (size_t i = 0; i < clusters.size(); ++i) {
        const std::set<int>& cluster = clusters[i];
        // ... expensive calculations ...
        for (int j : cluster)
            velocity[j] += f(j);
    }

...which I would like to run on multiple CPUs/cores. The function f does not use velocity. A simple #pragma omp parallel for before the first for loop will produce unpredictable/wrong results, because the std::vector<T> velocity is modified in the inner loop. Multiple threads may access and (try
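One common way to avoid the race described above is to let each thread accumulate into a private buffer and merge the buffers afterwards. The following is a minimal sketch of that idea, not the answer from the original thread: the function name `accumulate`, the element type `double`, and the stand-in body of `f` are assumptions made for illustration. Without OpenMP enabled the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for the question's f(j); the real f is not shown.
double f(int j) { return 0.5 * j; }

// Each thread sums its contributions into a private buffer `local`,
// then the buffers are added to `velocity` one thread at a time.
void accumulate(const std::vector<std::set<int>>& clusters,
                std::vector<double>& velocity) {
    #pragma omp parallel
    {
        std::vector<double> local(velocity.size(), 0.0);

        // Clusters are distributed over threads; writes go to `local` only.
        #pragma omp for nowait
        for (std::size_t i = 0; i < clusters.size(); ++i) {
            for (int j : clusters[i]) {
                assert(j >= 0 && static_cast<std::size_t>(j) < local.size());
                local[j] += f(j);
            }
        }

        // Merge step: only one thread at a time touches `velocity`.
        #pragma omp critical
        for (std::size_t k = 0; k < velocity.size(); ++k)
            velocity[k] += local[k];
    }
}
```

The private buffer costs one copy of `velocity` per thread, so this trade-off is most attractive when the per-cluster calculation is expensive relative to the size of `velocity`.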

OpenMP Crashing with Large Arrays

Question: I'm using Fortran and OpenMP, but I keep running into a problem when I parallelize loops over large arrays. For example, the following code:

    PROGRAM main
      IMPLICIT NONE
      INTEGER, PARAMETER :: NUMLOOPS = 300000
      REAL(8) :: TESTMAT(NUMLOOPS)
      INTEGER :: i,j
      !$OMP PARALLEL SHARED(TESTMAT)
      !$OMP DO
      DO i=1,NUMLOOPS
        TESTMAT(i) = i
      END DO
      !$OMP END DO
      !$OMP END PARALLEL
      write(*,*) SUM(TESTMAT)/(NUMLOOPS)
    END PROGRAM main

compiled using this Makefile: .SUFFIXES: .f90 F90 =

OpenMP parallel for reduction delivers wrong results

Question: I am working with a signal matrix and my goal is to calculate the sum of all elements of a row. The matrix is represented by the following struct:

    typedef struct matrix {
        float *data;
        int rows;
        int cols;
        int leading_dim;
    } matrix;

Note that the matrix is stored in column-major order (http://en.wikipedia.org/wiki/Row-major_order#Column-major_order), which explains the formula column * tan_hd.rows + row for retrieving the correct indices. for(int row = 0; row < tan_hd.rows; row++)
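For reference, a row sum over a column-major matrix can be written with an OpenMP `reduction` clause roughly as follows. This is a sketch under assumptions, not the code from the question (which is cut off above): the function name `row_sum` is hypothetical, and the index formula `col * rows + row` is taken from the question's description, which implies that `leading_dim` equals `rows`.

```cpp
#include <cassert>
#include <vector>

// Struct layout as given in the question.
typedef struct matrix {
    float *data;
    int rows;
    int cols;
    int leading_dim;
} matrix;

// Sums one row of a column-major matrix.
// `sum` appears in the reduction clause, so every thread works on a
// private copy and the copies are combined when the loop ends.
float row_sum(const matrix& m, int row) {
    assert(row >= 0 && row < m.rows);
    float sum = 0.0f;
    #pragma omp parallel for reduction(+:sum)
    for (int col = 0; col < m.cols; ++col)
        sum += m.data[col * m.rows + row];
    return sum;
}
```

Because floating-point addition is not associative, a parallel reduction may differ from the serial sum in the last bits; a larger discrepancy usually points to a variable that is shared when it should be private or part of the reduction.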

Is the _mm256_store_ps() function atomic when used alongside OpenMP?

Question: I am trying to create a simple program that uses Intel's AVX technology to perform vector multiplication and addition, with OpenMP used alongside it. However, it gets a segmentation fault in the call to _mm256_store_ps(). I have tried OpenMP synchronization features such as atomic and critical, in case the problem is that multiple cores execute this function at the same time, but it still does not work. #include<stdio.h> #include<time.h> #include<stdlib.h>

How to yield/resume OpenMP untied tasks correctly?

Question: I wrote a small C program to assess OpenMP's capability to yield to another task when idle time in a task occurs (e.g. wait for communicated data):

    #include <stdio.h>
    #include <sys/time.h>
    #include <omp.h>

    #define NTASKS 10

    double wallClockTime(void) {
        struct timeval t;
        gettimeofday(&t, NULL);
        return (double)(t.tv_sec + t.tv_usec/1000000.);
    }

    void printStatus(char *status, int taskNum, int threadNum) {
        #pragma omp critical(printStatus)
        {
            int i;
            for (i = 0; i < taskNum; i++) printf(" ");

Compilation error when using Xcode 9.0 with clang (cannot specify -o when generating multiple output files)

Question: I updated Xcode yesterday (to version 9.0) and since then I can no longer compile my code with clang. It works fine with Apple's native compiler, but gives a compilation error with clang from MacPorts. Here are the details. I usually use clang 4.0 because it has OpenMP support, and I select it in Xcode by creating a user-defined setting, as in the following figure. [Image: how to use clang 4.0 from MacPorts in Xcode] This has been working perfectly for some time until I

How to parallelize an array shift with OpenMP?

Question: How can I parallelize an array shift with OpenMP? I've tried a few things but didn't get correct results for the following example (which rotates the elements of an array of Carteira objects, for a permutation algorithm):

    void rotaciona(int i) {
        Carteira aux = this->carteira[i];
        for(int c = i; c < this->size - 1; c++) {
            this->carteira[c] = this->carteira[c+1];
        }
        this->carteira[this->size-1] = aux;
    }

Thank you very much!

Answer 1: This is an example of a loop with loop-carried dependencies,
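The dependency arises because iteration c reads element c+1, which iteration c+1 overwrites; if the iterations run out of order, an element can be read after it has already been replaced. One way to remove the dependency, sketched here as an assumption rather than as the continuation of the truncated answer, is to read from an unmodified copy of the array. The name `rotate_left_from` and the use of `std::vector` in place of the question's `Carteira` array are illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Moves a[i] to the end and shifts a[i+1..] one position to the left.
// All reads go to the snapshot `src`, so no iteration depends on a
// value written by another iteration.
template <typename T>
void rotate_left_from(std::vector<T>& a, std::size_t i) {
    if (i >= a.size()) return;        // nothing to do (covers empty input)
    const std::vector<T> src(a);      // unmodified copy to read from
    const std::size_t last = a.size() - 1;

    #pragma omp parallel for
    for (std::size_t c = i; c < last; ++c)
        a[c] = src[c + 1];

    a[last] = src[i];
}
```

The copy doubles the memory traffic, and a shift like this is typically limited by memory bandwidth, so the parallel version may not be faster than a serial `std::rotate`; measuring on the target machine is advisable.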

Upload with paperclip very slow (unicorn)

Question: I have a simple Rails 3 app with a simple Gallery model, where each gallery has many images. The image model is extended with paperclip, using the following options:

    has_attached_file :local, :styles => {
      :large => "800x800>",
      :medium => "300x300>",
      :thumb => "100x100#",
      :small => "60x60#"
    }

In my galleries_controller I have the following action, which is implemented to work with the jQuery-File-Upload plugin, hence the JSON response. def add_image gallery =

OpenMP and C++11 multithreading

Question: I am currently working on a project that mixes high-performance computing (HPC) and interactivity. The HPC part relies on OpenMP (mainly for-loops with lots of identical computations), but it is included in a larger framework with a GUI and multithreading, currently achieved with C++11 threads ( std::thread and std::async ). I have read Does OpenMP play nice with C++ promises and futures? and Why do c++11 threads become unjoinable when using nested OpenMP pragmas? that it is no good