OpenMP

Red-Black Gauss-Seidel and OpenMP

Submitted on 2019-12-10 21:05:36
Question: I was trying to prove a point about OpenMP compared to MPICH, and I cooked up the following example to demonstrate how easy it was to get high performance with OpenMP. The Gauss-Seidel iteration is split into two separate sweeps, such that within each sweep every operation can be performed in any order, and there should be no dependencies between tasks. So in theory each processor should never have to wait for another processor or perform any kind of synchronization. The problem I am encountering, …

omp max reduction with storage of index

Submitted on 2019-12-10 19:45:36
Question: Using C++ and OpenMP 3.1, I implemented a max reduction which stores the maximum value of an integer variable (score) over a vector of objects (s). But I also want to store the vector index, to access the (s) object with the maximum score. My current unsuccessful implementation looks like this:

// s is a vector of sol objects which contain, apart from other variables, an integer score variable s[].score
int bestscore = 0;
int bestant = 0;
#pragma omp parallel shared(bestant)
{ // start parallel session
…

TBB concurrent_vector with openmp

Submitted on 2019-12-10 19:37:23
Question: Can we use TBB concurrent_vector with OpenMP? Will concurrent updates be allowed?

Answer 1: Yes. TBB's concurrent data structures are meant to be thread-safe, which means any threading paradigm, such as OpenMP, TBB, Cilk, PPL, etc., is fine for using TBB's concurrent data structures. This is because concurrent_vector is simply a data structure class rather than threading-related control code. Furthermore, TBB's mutexes can also be used within OpenMP, Cilk, and PPL.

Answer 2: Per Section 1.11 of …

OpenGL with OpenMP always segfaults

Submitted on 2019-12-10 19:30:12
Question: My program has a loop that fills a 3D cube with pixels (GL_POINTS), so to speed things up a little I thought I could use OpenMP and split this for loop across my multi-core processor. The problem is that any time I use OpenMP in the loop, the program segfaults. Here is the code of the loop:

glBegin(GL_POINTS);
#pragma omp parallel for
for (int a = 0; a < m_width * m_height; a++) {
    uint8_t r, g, b;
    r = m_data[a * m_channels];
    g = m_data[a * m_channels + 1];
    b = m_data[a * m_channels + …

OpenMP custom reduction variable

Submitted on 2019-12-10 18:59:44
Question: I've been assigned to implement the idea of a reduction variable without using the reduction clause. I set up this basic code to test it:

int n = 100000000;
double sum = 0.0;
double val = 0.0;
for (int i = 0; i < n; ++i) {
    val += 1;
}
sum += val;

so at the end sum == n. Each thread should keep val as a private variable, and the addition to sum should be a critical section where the threads converge, e.g.:

int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma …

Using OpenMP critical and ordered

Submitted on 2019-12-10 18:38:11
Question: I'm quite new to Fortran and OpenMP, but I'm trying to get my bearings. I have a piece of code for calculating variograms which I'm attempting to parallelize. However, I seem to be getting race conditions, as some of the results are off by a thousandth or so. The problem seems to be the reductions. Using OpenMP reductions works and gives the correct results, but they are not desirable, because the reductions actually happen in another subroutine (I copied the relevant lines into the OpenMP …

Why do I get undefined behavior when using OpenMP's firstprivate with std::vector on Intel compiler?

Submitted on 2019-12-10 18:29:59
Question: I have a problem when using OpenMP in combination with firstprivate and std::vector on the Intel C++ compiler. Take the following three functions:

#include <omp.h>

void pass_vector_by_value(std::vector<double> p) {
    #pragma omp parallel
    {
        // do sth
    }
}

void pass_vector_by_value_and_use_firstprivate(std::vector<double> p) {
    #pragma omp parallel firstprivate(p)
    {
        // do sth
    }
}

void create_vector_locally_and_use_firstprivate() {
    std::vector<double> p(3, 7);
    #pragma omp parallel firstprivate(p)
    {
        / …

Suppress OpenMP debug messages when running Tensorflow on CPU

Submitted on 2019-12-10 18:23:54
Question: When running a Python program on Linux that includes import tensorflow (installed without GPU support), a bunch of OpenMP debug messages are written to stdout, even when no functions from the tensorflow module are ever called. Here's an excerpt:

OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: …
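These "OMP: Info" lines come from the Intel OpenMP runtime when the KMP_AFFINITY setting includes its verbose modifier, which some TensorFlow builds enable internally. One hedged fix is to override the variable before the import, resetting it to the quiet default (noverbose is a documented KMP_AFFINITY modifier):

```python
import os

# Must run before "import tensorflow": the runtime reads KMP_AFFINITY
# once, at library load time.
os.environ["KMP_AFFINITY"] = "noverbose"

# import tensorflow as tf  # now imports without the OMP: Info chatter
```

Setting the variable in the shell (export KMP_AFFINITY=noverbose) before launching Python has the same effect.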

linking OpenMP statically with GCC

Submitted on 2019-12-10 17:47:06
Question: Given the following file print.cpp:

#include <stdio.h>
int main() {
    printf("asdf\n");
}

I can link this statically like this:

g++ -static print.cpp

or like this:

g++ -static-libgcc -Wl,-Bstatic -lc print.cpp -o print

But now let's add a little OpenMP and call the file print_omp.cpp:

#include <omp.h>
#include <stdio.h>
int main() {
    printf("%d\n", omp_get_num_threads());
}

I can link this statically like this (I checked it with ldd):

g++ -fopenmp -static print_omp.cpp

However, this does not work:

g+ …

Choose OpenMP pragma according to condition

Submitted on 2019-12-10 17:38:14
Question: I have a code that I want to optimise that should run with a varying number of threads. After running some tests using different scheduling techniques in a for loop that I have, I came to the conclusion that what suits best is to perform dynamic scheduling when I have only one thread and guided scheduling otherwise. Is that even possible in OpenMP? To be more precise, I want to be able to do something like the following:

if (omp_get_max_threads() > 1)
    #pragma omp parallel for ... schedule(guided)
else
…