OpenMP not waiting for all threads to finish before the C program ends

Posted by 我怕爱的太早我们不能终老 on 2021-02-17 03:19:36

Question


I have the following problem: My C program must count the number of occurrences of a list of words in a text file.

I use OpenMP for this, and the program logic is, in theory, correct. When I put some printf calls inside a for loop, the result of the program is correct and always the same.

When I remove the printf calls, the result is incorrect and changes with each execution. Given this, I think the cause is related to execution time. With the printf calls, execution takes longer, so all threads have time to finish counting and the program works correctly. Without them, the execution time drops sharply (to 0.000893 ms), leaving no time for all threads to finish their calculations, so the program prints a different result on each execution.

The parallelized code is as follows:

#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) private(word, wordExists) shared(keyWordsOcurrences)
for (line = 0; line < NUM_LINES; line++)
{
    // divides the line into words separated by space
    word = strtok(lines[line], " ");
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            #pragma omp critical
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok(NULL, " ");
    }
}

The checkWordOcurrences function called in the loop is where I put the printf that makes my code work properly on every execution (by increasing the execution time).

int checkWordOcurrences(char *word)
{
    int res = 0;
    int i;

    for (i = 0; i < QTD_WORDS; i++)
    {
        // **this is the almighty Printf that makes everything work properly, and without it things stop working :(**
        printf("palavra %d %s - palavra 2 %s \n", i, keyWords[i], word);
        // compares current word with monitored words
        if (!strcmp(keyWords[i], word))
        {
            // if it's monitored word, returns its index (+1 because the first word has index 0 and the return type is checked as true or false)
            res = i + 1;
        }
    }

    // returns word index or 0, if current word is not monitored
    return res;
}

Can someone explain to me what may be happening and / or how to solve it?


Answer 1:


There is an implicit barrier at the end of the OpenMP for construct and at the end of each parallel region, so it is not possible for the program to finish before all the threads have finished their assigned work.

The problem is most likely caused by the use of strtok. It is not a thread-safe function, because the current position in the string being tokenised is stored internally in the C library. When one thread is in the middle of tokenising a string and another thread calls strtok(lines[line], " ");, the call overwrites the stored pointer, and all other threads calling strtok(NULL, " "); now tokenise the newly set string instead of the string they were working on before. It is a classic case of a data race.

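The solution is to use strtok_r (a POSIX function) instead. It keeps the search position in a caller-supplied pointer (saveptr below), so each thread tracks its own string.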

#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) private(word, wordExists) shared(keyWordsOcurrences)
for (line = 0; line < NUM_LINES; line++)
{
    char *saveptr;
    // divides the line into words separated by space
    word = strtok_r(lines[line], " ", &saveptr);
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            #pragma omp critical
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok_r(NULL, " ", &saveptr);
    }
}

On a separate note, critical is a very heavyweight synchronisation construct implemented with locks. Simple increments such as keyWordsOcurrences[wordExists - 1] += 1; can be protected with atomic updates instead, which are much quicker:

if (wordExists)
{
    #pragma omp atomic update
    keyWordsOcurrences[wordExists - 1] += 1;
}

If QTD_WORDS isn't a very large number, you may also use array reduction:

#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) \
                         private(word, wordExists) \
                         reduction(+:keyWordsOcurrences[0:QTD_WORDS])
for (line = 0; line < NUM_LINES; line++)
{
    char *saveptr;
    // divides the line into words separated by space
    word = strtok_r(lines[line], " ", &saveptr);
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok_r(NULL, " ", &saveptr);
    }
}

Array reduction for C and C++ is a relatively new OpenMP feature, though, and requires a compiler that supports OpenMP 4.5. It can be done by hand for older compilers, but that is well beyond the scope of the original question.



Source: https://stackoverflow.com/questions/64745737/openmp-not-waiting-all-threads-finish-before-end-c-program
