Question
OpenMP has no atomic minimum (min) operation, and the Intel MIC instruction set has no intrinsic for it either.
#pragma omp critical performs very poorly.
Is there a high-performance implementation of atomic minimum for Intel MIC?
Answer 1:
According to the OpenMP 4.0 specification (Section 2.12.6), there are many fast, minimal atomic operations you can perform with the #pragma omp atomic construct in place of #pragma omp critical (thereby avoiding the large overhead of its lock).
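As a rough illustration of swapping one construct for the other, the sketch below counts the multiples of 3 in a range twice: once guarding the shared counter with #pragma omp critical, and once with #pragma omp atomic update. The function names and the counting task are made up for this example; they are not from the question. It makes no performance measurement, it only shows that both versions produce the same result.

```c
#include <assert.h>

/* Hypothetical example: count the multiples of 3 in [0, n).
   The shared counter is protected by a critical section (a lock). */
long count_with_critical(long n)
{
    assert(n >= 0);
    long hits = 0;                 /* shared by all threads */

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        if (i % 3 == 0) {
            #pragma omp critical
            hits += 1;
        }
    }
    return hits;
}

/* Same computation, but the increment is a single atomic update
   (x binop= expr, with expr independent of x). */
long count_with_atomic(long n)
{
    assert(n >= 0);
    long hits = 0;                 /* shared by all threads */

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        if (i % 3 == 0) {
            #pragma omp atomic update
            hits += 1;
        }
    }
    return hits;
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC). Without that flag the pragmas are ignored and both functions simply run serially.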
Overview of the possibilities with the #pragma omp atomic construct
Let x be your thread-shared variable:
With #pragma omp atomic read you can atomically read your shared variable x:

    v = x;

With #pragma omp atomic write you can atomically assign a new value to your shared variable x; the new value expression (expr) has to be independent of x:

    x = expr;

With #pragma omp atomic update you can atomically update your shared variable x; in fact, you can only assign a new value formed by a binary operation (binop) between an x-independent expression and x:

    x++;
    x--;
    ++x;
    --x;
    x binop= expr;
    x = x binop expr;
    x = expr binop x;

With #pragma omp atomic capture you can atomically read and update your shared variable x (in whichever order you want); in fact, capture is a combination of the read and update constructs.

You have short forms for update and then read:

    v = ++x;
    v = --x;
    v = x binop= expr;
    v = x = x binop expr;
    v = x = expr binop x;

And their structured-block analogs:

    {--x; v = x;}
    {x--; v = x;}
    {++x; v = x;}
    {x++; v = x;}
    {x binop= expr; v = x;}
    {x = x binop expr; v = x;}
    {x = expr binop x; v = x;}

You have a few short forms for read and then update:

    v = x++;
    v = x--;

And again their structured-block analogs:

    {v = x; x++;}
    {v = x; ++x;}
    {v = x; x--;}
    {v = x; --x;}

Finally, you have additional read-then-update forms, which exist only as structured blocks:

    {v = x; x binop= expr;}
    {v = x; x = x binop expr;}
    {v = x; x = expr binop x;}
    {v = x; x = expr;}
In the preceding expressions:
x and v are both l-value expressions with scalar type;
expr is an expression with scalar type;
binop is one of +, *, -, /, &, ^, |, << or >>;
binop, binop=, ++ and -- are not overloaded operators.
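The forms listed above can be sketched in a few small C functions. This is only an illustration under assumptions: the function names (hand_out_tickets, publish_and_read, fetch_and_scale) and the "ticket" scenario are hypothetical and do not come from the answer or the OpenMP specification. Note that none of these forms computes a minimum; they only show the read, write, update and capture clauses in use.

```c
#include <assert.h>

/* Hypothetical demo of capture and update: every loop iteration takes a
   unique ticket 0..n-1 from a shared counter, then adds it to a shared sum.
   Returns the sum of all tickets, which is n*(n-1)/2. */
long hand_out_tickets(long n)
{
    assert(n >= 0);
    long next = 0;   /* shared: next free ticket (the x of the text) */
    long sum = 0;    /* shared: running total of handed-out tickets */

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        long mine;   /* private to the iteration (the v of the text) */

        /* capture, short form "v = x++": read the old value and
           increment as one atomic step */
        #pragma omp atomic capture
        mine = next++;

        /* update, form "x binop= expr" with expr independent of x */
        #pragma omp atomic update
        sum += mine;
    }
    return sum;
}

/* Syntax demo of write followed by read. It runs on a single thread here,
   so it shows the form of the constructs rather than any concurrency. */
long publish_and_read(long value)
{
    long x = 0;
    long v;

    #pragma omp atomic write
    x = value;

    #pragma omp atomic read
    v = x;

    return v;
}

/* capture, structured-block form "{v = x; x binop= expr;}":
   returns the old value of *x and multiplies *x by factor. */
long fetch_and_scale(long *x, long factor)
{
    long v;

    #pragma omp atomic capture
    { v = *x; *x *= factor; }

    return v;
}
```

For instance, hand_out_tickets(100) adds up the tickets 0 through 99, and fetch_and_scale called on a variable holding 5 with factor 3 returns 5 and leaves 15 behind. As with any OpenMP code, the pragmas only take effect when the compiler's OpenMP support is enabled.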
Source: https://stackoverflow.com/questions/17938715/high-performance-implement-of-atomic-minimal-operation