Why doesn't a simple Python producer/consumer multi-threaded program speed up when more workers are added?

若如初见. Submitted on 2019-12-10 19:19:10

Question


The code below is almost identical to the official Python Queue example at http://docs.python.org/2/library/queue.html

from Queue import Queue
from threading import Thread
from time import time
import sys

num_worker_threads = int(sys.argv[1])
source = xrange(10000)

def do_work(item):
    for i in xrange(100000):
        pass

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()

for item in source:
    q.put(item)

start = time()

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()

end = time()

print(end - start)

These are the results on a Xeon 12-core processor:

$ ./speed.py 1
12.0873839855

$ ./speed.py 2
15.9101941586

$ ./speed.py 4
27.5713479519

I expected that increasing the number of workers would reduce the run time, but it increases instead. I repeated the experiment several times and the result did not change.

Am I missing something obvious, or do Python's queue and threading modules not work well?


Answer 1:


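Python is rather poor at multi-threading for CPU-bound work. Because of the global interpreter lock (GIL), only one thread normally executes Python bytecode at a time, so extra workers running a pure-Python loop add lock contention and thread switching without adding parallelism. See http://wiki.python.org/moin/GlobalInterpreterLock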
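
A common way to get real parallelism for a CPU-bound loop like do_work is to use processes instead of threads, since each process has its own interpreter and its own GIL. The following is only a minimal sketch under assumptions that are not in the question: it targets Python 3, uses the standard-library concurrent.futures module, and runs a much smaller, hypothetical workload of 100 items. The helper name run is made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor
from time import time


def do_work(item):
    # CPU-bound busy loop similar to the one in the question; it returns
    # the iteration count so the result can be checked.
    count = 0
    for _ in range(100000):
        count += 1
    return count


def run(num_workers, num_items):
    # Hypothetical helper: spread the items over worker processes and
    # return the elapsed wall-clock time plus the results.
    start = time()
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(do_work, range(num_items)))
    return time() - start, results


if __name__ == "__main__":
    # Assumed small workload; the question used 10000 items.
    for n in (1, 2, 4):
        elapsed, _ = run(n, 100)
        print(n, elapsed)
```

Whether this actually speeds things up depends on the machine and on how the per-item work compares with the cost of starting processes and passing data between them, so timings should be measured rather than assumed.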




Answer 2:


Yes, Maxim is right about the GIL. But as soon as the worker does something worth doing, the situation changes in most cases. Typical work done in threads involves waiting for I/O or other operations during which a thread switch works quite well. If the workers simulate work with a sleep instead of just counting numbers, the situation changes dramatically:

#!/usr/bin/env python

from Queue import Queue
from threading import Thread
from time import time, sleep
import sys

num_worker_threads = int(sys.argv[1])
source = xrange(1000)

def do_work(item):
    for i in xrange(10):
        sleep(0.001)

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()

for item in source:
    q.put(item)

start = time()

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()

end = time()

print(end - start)

This gives the following results:

for i in 1 2 3 4 5 6 7 8 9 10; do echo -n "$i "; ./t.py $i; done
1 11.0209097862
2 5.50820493698
3 3.65133094788
4 2.73591113091
5 2.19623804092
6 1.83647704124
7 1.57275605202
8 1.38150596619
9 1.23809313774
10 1.1111137867
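
These timings are close to the ideal of the single-worker time divided by the number of workers. As a quick check, here is a small sketch (written for Python 3; the names measured and speedup are made up for illustration) that computes the speedup from the figures listed above:

```python
# Wall-clock times in seconds, copied from the results above
# (key: number of worker threads).
measured = {
    1: 11.0209097862,
    2: 5.50820493698,
    3: 3.65133094788,
    4: 2.73591113091,
    5: 2.19623804092,
    6: 1.83647704124,
    7: 1.57275605202,
    8: 1.38150596619,
    9: 1.23809313774,
    10: 1.1111137867,
}


def speedup(times, n):
    # Speedup of the n-worker run relative to the single-worker run.
    return times[1] / times[n]


for n in sorted(measured):
    ideal = measured[1] / n  # perfect linear scaling
    print("%2d  measured %.3f s  ideal %.3f s  speedup %.2fx"
          % (n, measured[n], ideal, speedup(measured, n)))
```

With ten workers the speedup comes out just under 10x, which is what one would expect when the threads spend nearly all their time sleeping and release the GIL while they do so.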



Source: https://stackoverflow.com/questions/16665367/why-doesnt-a-simple-python-producer-consumer-multi-threading-program-speed-up-b
