Problems with setting OMP_THREAD_LIMIT during runtime (c++ gcc 4.4.7)

Posted by 孤者浪人 on 2019-12-08 05:23:15

Question


Hello,

I have a rather large program that I'm attempting to thread. So far, this has been successful, and the basics are all working as intended.

I now want to do some fancy work with cascading threads in nested mode. Essentially, I want the main parallel region to use any free threads in lower parallel regions.

To detail the current system, the main parallel region starts 10 threads. I have 12 cores, so I can use 2 more threads. There is a second parallel region where some heavy computing happens, and I want the first two threads to reach this point to start a new team there, each with 2 threads. Every new entry to the lower parallel region after this will continue in serial.

So, this should look like the following.
Main region: 10 threads started.
Lower region: 2 new threads started.

Thread 1: 2 threads in lower region.
Thread 2: 2 threads in lower region.
Thread 3-10: 1 thread in lower region.

Please keep in mind that these numbers are for the sake of clarity in providing a concrete description of my situation, and not the absolute and only case in which the program operates.

The code:

int main() {
    ...
    ...
    omp_set_num_threads(n);
    omp_set_dynamic(x);

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < iterations; i++) {
            ...
            Compute();
            ...
        }
    }
}    

And in Compute

bool Compute() {
    ...
    float nThreads = omp_get_thread_limit() - omp_get_num_threads();
    nThreads = ceil(nThreads / omp_get_num_threads());
    omp_set_num_threads((int)nThreads);
    #pragma omp parallel
    {
        ...
        #pragma omp for
        for (int i = 0; i < nReductSize; i++) {
            ...
        }
    }
}

Now, my problem is that setting the uppermost limit for the whole program (i.e. OMP_THREAD_LIMIT) only works from outside the program. Using

export OMP_THREAD_LIMIT=5  

from the bash command line works great. But I want to do it internally. So far, I've tried

putenv("OMP_THREAD_LIMIT=12");
setenv("OMP_THREAD_LIMIT", "12", 1);

but when I call omp_get_thread_limit() or getenv("OMP_THREAD_LIMIT") I get wacky return values. Even when I set the variable with export, calling getenv("OMP_THREAD_LIMIT"); returns 0.
So, I would ask for your help in this: How do I properly set OMP_THREAD_LIMIT at runtime?

This is the main function where I set the thread defaults. It is executed well before any threading occurs:

#ifdef _OPENMP
    const char *name = "OMP_THREAD_LIMIT";
    const char *value = "5";
    int overwrite = 1;
    int success = setenv(name, value, overwrite);
    cout << "Var set (0 is success): " << success << endl;
#endif

Oh, and setenv reports success in setting the variable.

Compiler says
gcc44 (GCC) 4.4.7 20120313 (Red Hat 4.4.7-1)

Flags
CCFLAGS = -c -O0 -fopenmp -g -msse -msse2 -msse3 -mfpmath=sse -std=c++0x

OpenMP version is 3.0.


Answer 1:


This is the correct behavior for an OpenMP implementation: it ignores changes made to the environment from inside the program. As stated in the OpenMP 3.1 Standard, page 159:

Modifications to the environment variables after the program has started, even if modified by the program itself, are ignored by the OpenMP implementation.

You are doing exactly what is said in this paragraph.

OpenMP allows such parameters to be changed only through the omp_set_* functions, but there is no such function for the thread-limit-var ICV:

However, the settings of some of the ICVs can be modified during the execution of the OpenMP program by the use of the appropriate directive clauses or OpenMP API routines.

I think you can use the num_threads clause of #pragma omp parallel to achieve what you want.




Answer 2:


Changing the behavior of OpenMP using OMP_THREAD_LIMIT (or any other OMP_* environment variable) is not possible after the program has started; these are intended for use by the user. You could have the user invoke your program through a script that sets OMP_THREAD_LIMIT and then calls your program, but that's probably not what you need to do in this case.

OMP_NUM_THREADS, omp_set_num_threads, and the num_threads clause are usually used to set the number of threads operating in a region.




Answer 3:


It might be off-topic, but you may want to try the OpenMP collapse clause instead of handcrafting the nesting here.
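For reference, a minimal sketch of what collapse looks like, assuming the two loops are perfectly nested and the inner bounds do not depend on the outer index (which may or may not hold for the code in the question). The function sum_grid and its rectangular grid argument are made up for illustration:

```cpp
#include <vector>

// Hypothetical example: sums a rectangular rows x cols grid.
// collapse(2) merges both loops into a single iteration space of
// rows * cols iterations, which is then divided among the threads.
long sum_grid(const std::vector<std::vector<int> >& grid) {
    const int rows = static_cast<int>(grid.size());
    const int cols = rows > 0 ? static_cast<int>(grid[0].size()) : 0;
    long total = 0;
    #pragma omp parallel for collapse(2) reduction(+:total)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            total += grid[i][j];
        }
    }
    return total;
}
```

With a single collapsed loop, all 12 threads can share one team and there is no need to budget threads between an outer and an inner region. If there is other work between the two loops, as with the "..." around Compute() in the question, collapse does not apply directly.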



Source: https://stackoverflow.com/questions/18654540/problems-with-setting-omp-thread-limit-during-runtime-c-gcc-4-4-7
