Is my Python multithreading code affected by the Global Interpreter Lock [duplicate]


Question


I'm trying to speed up my code using ThreadPool.

The input is a dictionary with about 80000 entries, each of which is a list of 25 elements. I have to produce an output list for each entry in the dictionary, by processing and combining the elements in each list.

All the lists can be analyzed independently, so this setting should be easily parallelizable.

Here's the basic setting I used for pool.map:

from multiprocessing.dummy import Pool as ThreadPool

pool   = ThreadPool(NUM_THREADS)
output = pool.map(thread_compute, iterable_list)
pool.close()
pool.join()
  • Approach 1 (wrong): define the dictionary as global and have each thread get a key of the dictionary as input.

    # in the main
    global input_dict
    iterable_list = input_dict.keys()
    
    # thread_compute function
    def thread_compute(key_i):
        list_to_combine = input_dict[key_i]
        # call a function
        processed_list = do_stuff_function(list_to_combine)
        return processed_list
    

I soon realized that approach 1 will not work, because the global variable input_dict, even though it is never written to (and so should be thread-safe), is protected by the GIL (link 1, link 2), a globally enforced lock on safely accessing Python objects from separate threads.

  • Approach 2 (not working): I created a list containing the same elements as input_dict, where each entry is a 25-item list. This list is not a global variable, so each thread should be able to access an element (a 25-item list) without any GIL overhead. (A simpler way to build this list is noted after the code below.)

    # in the main
    items = list(input_dict.items())
    iterable_list = [items[i][1] for i in range(len(items))]
    
    # making sure the content is correct
    assert(len(iterable_list) == len(input_dict))
    for i, key in enumerate(input_dict.keys()):
        assert(iterable_list[i] == input_dict[key])
    
    # thread_compute function
    def thread_compute(list_of_25_i):
        # call a function
        processed_list = do_stuff_function(list_of_25_i)
        return processed_list
    
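
As a side note, the same list of values can be built more directly; this is only a simplification of the comprehension above, not a fix:

    # equivalent to the comprehension above for the same input_dict:
    # keys(), values() and items() iterate a dict in a consistent order,
    # so this yields the elements in the order the assertions expect
    iterable_list = list(input_dict.values())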

Here are the execution times for 1, 2, 4 and 16 threads:

   1:  Done, (t=36.6810s).
   2:  Done, (t=46.5550s).
   4:  Done, (t=48.2722s).
   16: Done, (t=48.9660s).

Why does adding threads cause such an increase in time? I am certain that this problem can benefit from multithreading, and I don't think the overhead of adding threads can be solely responsible for the increase.


Answer 1:


If your do_stuff_function is CPU-bound, then running it in multiple threads will not help, because the GIL only allows one thread to execute at a time.

The way around this in Python is to use multiple processes: just replace

from multiprocessing.dummy import Pool

with

from multiprocessing import Pool
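
For completeness, here is a minimal, self-contained sketch of the process-based variant. It assumes do_stuff_function (the placeholder name from the question) is a picklable, module-level function; NUM_PROCESSES and the stand-in data are illustrative assumptions, not values from the question:

    # minimal sketch of the process-based pool, under the assumptions above
    from multiprocessing import Pool

    NUM_PROCESSES = 4  # assumed value; typically set near the CPU core count

    def do_stuff_function(list_of_25):
        # placeholder for the real CPU-bound work on one 25-element list
        return sum(x * x for x in list_of_25)

    if __name__ == '__main__':
        # stand-in data shaped like the question's input: 80000 lists of 25 items
        iterable_list = [list(range(25)) for _ in range(80000)]

        pool = Pool(NUM_PROCESSES)
        # each list is pickled, sent to a worker process, processed there,
        # and the result is pickled and sent back to the parent
        output = pool.map(do_stuff_function, iterable_list)
        pool.close()
        pool.join()

The if __name__ == '__main__': guard and the module-level function matter here: worker processes import the module, and the function and its arguments must be picklable.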



Answer 2:


Try using Pool (docs) as it uses processes instead of threads.

  • Threads in Python are concurrent rather than parallel: the interpreter switches very quickly between threads, and while one thread is executing it locks the interpreter for all the others, which only gives the impression of operations running in parallel.

  • Processes, on the other hand, are much more costly to spawn, since each one is its own instance of the interpreter, and the data it works on must be serialized and sent from the main process to the worker. There it is deserialized and processed, and the result is serialized and sent back.

However, the time it takes to spawn the processes and to serialize the data might hurt overall performance, considering the moderate processing times you posted.
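
One way to soften that overhead for such small work items (each task is only a 25-element list) is the chunksize argument of Pool.map, which controls how many items are handed to a worker per dispatch. A hedged sketch, reusing the same assumed setup as the sketch in the previous answer; chunksize=1000 is an arbitrary illustrative value, not a tuned recommendation:

    # batching small tasks to reduce per-item pickling round trips
    from multiprocessing import Pool

    def do_stuff_function(list_of_25):
        # placeholder for the real CPU-bound work
        return sum(x * x for x in list_of_25)

    if __name__ == '__main__':
        iterable_list = [list(range(25)) for _ in range(80000)]  # stand-in data
        pool = Pool(4)
        # larger chunks mean fewer serialization trips between processes
        output = pool.map(do_stuff_function, iterable_list, chunksize=1000)
        pool.close()
        pool.join()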



Source: https://stackoverflow.com/questions/39728974/is-my-python-multithreading-code-affected-by-the-global-interpreter-lock
