openmp

Atomic access to non-atomic memory location in C++11 and OpenMP?

此生再无相见时 submitted on 2019-12-08 21:45:54
Question: OpenMP, in contrast to C++11, approaches atomicity from the perspective of memory operations, not variables. That allows, e.g., using atomic reads/writes for integers stored in a vector whose size is unknown at compile time:

    std::vector<int> v;

    // non-atomic access (e.g., in a sequential region):
    v.resize(n);
    ...
    v.push_back(i);
    ...

    // atomic access in a multi-threaded region:
    #pragma omp atomic write // seq_cst
    v[k] = ...;
    #pragma omp atomic read // seq_cst
    ... = v[k];

In C++11, this is not
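A minimal compilable sketch of the OpenMP half of this contrast (sizes and indices are illustrative, not from the question):

    #include <cstdio>
    #include <vector>

    int main() {
        int n = 100;                      // size known only at run time
        std::vector<int> v(n, 0);         // plain, non-atomic ints

        #pragma omp parallel for
        for (int k = 0; k < n; ++k) {
            #pragma omp atomic write      // atomic store to a non-atomic location
            v[k] = k * k;
        }

        int x;
        #pragma omp atomic read           // atomic load from the same location
        x = v[n / 2];
        std::printf("%d\n", x);
    }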

Dependency on VCOMP90.DLL in VS2008 Pro OpenMP project

被刻印的时光 ゝ submitted on 2019-12-08 21:08:24
Question: I have a DLL project in VS 2008 Pro which uses OpenMP. I use /MT as the 'code generation' option because I want all my dependencies statically linked into my DLL, since I do not want to distribute many libraries to my clients - everything should be included in this one DLL file. The problem is that my resulting DLL still depends on VCOMP90.DLL. How can I get rid of this dependency? Some information:

- /openmp is set in compiler options
- I statically link against vcomp.lib
- include is set using
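For context, a minimal command-line reproduction of the setup described (file names are illustrative). Note that the vcomp.lib shipped with VS2008 is an import library for VCOMP90.DLL rather than a static runtime, which is why the dependency survives static CRT linking:

    REM build the DLL with a static CRT and OpenMP enabled
    cl /MT /openmp /LD mydll.cpp
    REM inspect runtime dependencies; VCOMP90.DLL is still listed
    dumpbin /dependents mydll.dll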

How to parallelize reading lines from an input file when lines are processed independently?

China☆狼群 submitted on 2019-12-08 20:53:53
Question: I just started off with OpenMP using C++. My serial code in C++ looks something like this:

    #include <iostream>
    #include <string>
    #include <sstream>
    #include <vector>
    #include <fstream>
    #include <stdlib.h>

    int main(int argc, char* argv[]) {
        std::string line;
        std::ifstream inputfile(argv[1]);
        if (inputfile.is_open()) {
            while (getline(inputfile, line)) {
                // Line gets processed and written into an output file
            }
        }
    }

Because each line is processed pretty much independently, I was attempting to use OpenMP
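One common approach (a sketch assuming the lines fit in memory; the per-line work is stubbed as a length sum): read the file sequentially, since an ifstream is not thread-safe, then process the collected lines in a parallel loop:

    #include <cstdio>
    #include <fstream>
    #include <string>
    #include <vector>

    int main(int argc, char* argv[]) {
        if (argc < 2) return 1;
        std::ifstream in(argv[1]);
        std::vector<std::string> lines;
        std::string line;
        while (std::getline(in, line))   // sequential read: the stream is shared state
            lines.push_back(line);

        long total = 0;
        #pragma omp parallel for reduction(+: total)
        for (long i = 0; i < (long)lines.size(); ++i)
            total += (long)lines[i].size();   // stand-in for real per-line work

        std::printf("%ld\n", total);
    }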

OpenMP Several “shared”-directives?

邮差的信 submitted on 2019-12-08 20:21:10
Question: Hey there, I have a very long list of shared variables in OpenMP, so I have to split lines in Fortran and use the "&" syntax to make sure the lines stick together. Something like this:

    !$OMP PARALLEL DEFAULT(private) SHARED(vars...., &
        more_vars..., &
        more_vars... &
        )

That gives me errors when compiling without OpenMP, since only the first line is recognized as a comment! The problem now is that I can't add a "!" in front of those lines with a "&" in front to support compiling without OpenMP:
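For reference, OpenMP's own continuation syntax handles this case (a sketch with placeholder variable names): in free-form Fortran each continued directive line starts with the !$OMP sentinel again, so every line reads as a plain comment when OpenMP is disabled:

    program share_many
        implicit none
        integer :: vars, more_vars, even_more_vars
        vars = 0; more_vars = 0; even_more_vars = 0
    !$OMP PARALLEL DEFAULT(private) SHARED(vars, &
    !$OMP&   more_vars, &
    !$OMP&   even_more_vars)
        continue
    !$OMP END PARALLEL
    end program share_many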

MKL Performance on Intel Phi

本小妞迷上赌 submitted on 2019-12-08 19:47:46
Question: I have a routine that performs a few MKL calls on small matrices (50-100 x 1000 elements) to fit a model, which I then call for different models. In pseudo-code:

    double doModelFit(int model, ...) {
        ...
        while( !done ) {
            cblas_dgemm(...);
            cblas_dgemm(...);
            ...
            dgesv(...);
            ...
        }
        return result;
    }

    int main(int argc, char **argv) {
        ...
        c_start = 1;
        c_stop = nmodel;
        for(int c = c_start; c < c_stop; c++) {
            ...
            result = doModelFit(c, ...);
            ...
        }
    }

Call the above version 1. Since the models are independent
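Since the models are independent, the natural follow-up is to parallelize the model loop; a sketch of that pattern (the fit itself is stubbed out, and mkl_set_num_threads keeps each MKL call sequential so the two levels of parallelism do not oversubscribe the cores):

    #include <mkl.h>

    // Stub standing in for the MKL-based fit of one independent model.
    double doModelFit(int model) { return (double)model; }

    int main() {
        const int nmodel = 100;
        double results[100] = {0};

        mkl_set_num_threads(1);          // sequential MKL inside each OpenMP thread
        #pragma omp parallel for schedule(dynamic)
        for (int c = 1; c < nmodel; c++)
            results[c] = doModelFit(c);  // independent fits run concurrently
        return 0;
    }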

How to implement argmax with OpenMP?

余生长醉 submitted on 2019-12-08 19:17:31
I am trying to implement an argmax with OpenMP. In short, I have a function that computes a floating point value:

    double toOptimize(int val);

I can get the integer maximizing the value with:

    double best = 0;
    #pragma omp parallel for reduction(max: best)
    for(int i = 2 ; i < MAX ; ++i) {
        double v = toOptimize(i);
        if(v > best) best = v;
    }

Now, how can I get the value i corresponding to the maximum? Edit: I am trying this, but would like to make sure it is valid:

    double best_value = 0;
    int best_arg = 0;
    #pragma omp parallel
    {
        double local_best = 0;
        int ba = 0;
        #pragma omp for reduction(max: best
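A pattern widely used for this (a sketch; toOptimize is stubbed with a simple peak so the block compiles on its own): keep a thread-local best/argument pair and merge the pairs under a critical section:

    #include <cstdio>

    const int MAX = 1000;
    double toOptimize(int val) { return -(val - 500.0) * (val - 500.0); }  // stub

    int main() {
        double best_value = toOptimize(2);
        int best_arg = 2;

        #pragma omp parallel
        {
            double local_best = toOptimize(2);
            int local_arg = 2;

            #pragma omp for nowait
            for (int i = 2; i < MAX; ++i) {
                double v = toOptimize(i);
                if (v > local_best) { local_best = v; local_arg = i; }
            }

            // merge per-thread results; critical keeps the pair update consistent
            #pragma omp critical
            if (local_best > best_value) { best_value = local_best; best_arg = local_arg; }
        }

        std::printf("argmax = %d, max = %f\n", best_arg, best_value);
    }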

C++ OpenMP Fibonacci: 1 thread performs much faster than 4 threads

时光总嘲笑我的痴心妄想 submitted on 2019-12-08 13:37:02
Question: I'm trying to understand why the following runs much faster on 1 thread than on 4 threads with OpenMP. The code is based on a similar question, OpenMP recursive tasks, but when I try to implement one of the suggested answers, I don't get the intended speedup, which suggests I've done something wrong (and I'm not sure what it is). Do people get better speed when running the code below on 4 threads than on 1 thread? I'm getting a 10 times slowdown when running on 4 cores (I should be
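The usual culprit in task-based Fibonacci is that task-creation overhead dwarfs the actual work near the leaves of the recursion; a common remedy (a sketch, not the poster's code) adds a sequential cutoff so only large subproblems become tasks:

    #include <cstdio>

    long fib_seq(int n) { return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2); }

    long fib(int n) {
        if (n < 25) return fib_seq(n);   // cutoff: below this, tasking costs more than it saves
        long a, b;
        #pragma omp task shared(a)
        a = fib(n - 1);
        #pragma omp task shared(b)
        b = fib(n - 2);
        #pragma omp taskwait
        return a + b;
    }

    int main() {
        long r;
        #pragma omp parallel
        #pragma omp single
        r = fib(40);
        std::printf("%ld\n", r);
    }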

Use OpenMP to find minimum for sets in parallel, C++

醉酒当歌 submitted on 2019-12-08 12:00:38
Question: I'm implementing Boruvka's algorithm in C++ to find the minimum spanning tree of a graph. This algorithm finds a minimum-weight edge for each supervertex (a supervertex is a connected component; in the first iteration it is simply a vertex) and adds these edges to the MST. Once an edge is added, we update the connected components and repeat the find-min-edge and merge-supervertices steps until all the vertices in the graph are in one connected component. Since find-min-edge for each supervertex
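Since the per-supervertex searches are independent, one natural sketch parallelizes the find-min-edge step over supervertices (the edge and adjacency layout below is an assumption, not from the question):

    #include <limits>
    #include <vector>

    struct Edge { int u, v; double w; };

    // For each supervertex s, scan its candidate edges and record the lightest.
    // edgesOf[s] holds indices into `edges` for supervertex s (assumed layout).
    std::vector<int> findMinEdges(const std::vector<Edge>& edges,
                                  const std::vector<std::vector<int>>& edgesOf) {
        std::vector<int> minEdge(edgesOf.size(), -1);
        #pragma omp parallel for
        for (long s = 0; s < (long)edgesOf.size(); ++s) {
            double best = std::numeric_limits<double>::infinity();
            for (int e : edgesOf[s])
                if (edges[e].w < best) { best = edges[e].w; minEdge[s] = e; }
        }
        return minEdge;
    }

    int main() {
        std::vector<Edge> edges = {{0, 1, 2.0}, {1, 2, 1.0}, {0, 2, 3.0}};
        std::vector<std::vector<int>> edgesOf = {{0, 2}, {0, 1}, {1, 2}};
        return findMinEdges(edges, edgesOf)[0];  // illustrative use
    }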

OpenMP causes heisenbug segfault

柔情痞子 submitted on 2019-12-08 11:17:01
Question: I'm trying to parallelize a pretty massive for-loop with OpenMP. About 20% of the time it runs through fine, but the rest of the time it crashes with various segfaults, such as:

    *** glibc detected *** ./execute: double free or corruption (!prev): <address> ***
    *** glibc detected *** ./execute: free(): invalid next size (fast): <address> ***
    [2] <PID> segmentation fault ./execute

My general code structure is as follows:

    <declare and initialize shared variables here>
    #pragma omp parallel private
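Intermittent glibc heap aborts like these classically come from threads mutating shared scratch storage that should be per-thread; a minimal sketch of the safe pattern (illustrative, not the poster's code):

    #include <vector>

    int main() {
        // "double free or corruption" usually means several threads resize or
        // free one buffer concurrently; giving each thread its own copy fixes it.
        #pragma omp parallel
        {
            std::vector<double> scratch(64);       // private: each thread owns its buffer
            #pragma omp for
            for (int i = 0; i < 1000; ++i)
                scratch.assign(64, (double)i);     // safe: no cross-thread reallocation
        }
        return 0;
    }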

C++ OpenMP working really slow on matrix-vector product

孤者浪人 submitted on 2019-12-08 11:15:16
Question: So, I'm computing a matrix-vector product using OpenMP, but I've noticed it's working really slowly. After some time trying to figure out what's wrong, I just deleted all the code in the parallel section, and it's still SLOW. What can the problem be here? (n = 1000.) Here are the time results for 1, 2, and 4 cores:

    seq_method time = 0.001047194215062
    parrallel_method (1) time = 0.001050273191140
    seq - par = -0.000003078976079
    seq/par = 0.997068404578433
    parrallel_method (2) time = 0.001961992426004
    seq - par = -0
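At n = 1000 a single product is roughly a millisecond of work, so thread start-up and scheduling overhead can swamp any gain; a sketch of a fairer measurement (names are illustrative) repeats the product many times and times it with omp_get_wtime:

    #include <cstdio>
    #include <vector>
    #include <omp.h>

    int main() {
        const int n = 1000, reps = 1000;
        std::vector<double> A(n * n, 1.0), x(n, 1.0), y(n, 0.0);

        double t0 = omp_get_wtime();
        for (int r = 0; r < reps; ++r) {
            #pragma omp parallel for
            for (int i = 0; i < n; ++i) {
                double s = 0.0;
                for (int j = 0; j < n; ++j)
                    s += A[i * n + j] * x[j];   // row i dot x
                y[i] = s;
            }
        }
        std::printf("avg time per product: %g s\n", (omp_get_wtime() - t0) / reps);
    }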