Why doesn't a simple Python producer/consumer multi-threading program speed up when the number of workers is increased?

Question

The code below is almost identical to the official Python Queue example at http://docs.python.org/2/library/queue.html

from Queue import Queue
from threading import Thread
from time import time
import sys

num_worker_threads = int(sys.argv[1])
source = xrange(10000)

def do_work(item):
    for i in xrange(100000):
        pass

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()

for item in source:
    q.put(item)

start = time()

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()

end = time()

print(end - start)

These are the results on a Xeon 12-core processor:

$ ./speed.py 1
12.0873839855

$ ./speed.py 2
15.9101941586

$ ./speed.py 4
27.5713479519

I expected that increasing the number of workers would reduce the run time, but instead it increases. I repeated the experiment several times and the result didn't change.

Am I missing something obvious, or do Python's queue and threading modules just not work well?

Answer 1:

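Python (CPython, to be precise) is rather poor at CPU-bound multi-threading. Because of the Global Interpreter Lock (GIL), only one thread executes Python bytecode at a time, so a pure-Python loop like the one in do_work cannot run in parallel on several cores. Extra threads then tend to add lock contention and switching overhead rather than throughput. See http://wiki.python.org/moin/GlobalInterpreterLock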

Answer 2:

Yeah, Maxim's right concerning the GIL. But as soon as you do something worth doing in the worker, the situation changes in most cases. Typical work done in threads involves waiting for I/O or other blocking operations, during which the GIL is released and another thread can run. If you don't just count numbers in your workers but instead simulate work with a sleep, the situation changes dramatically:

#!/usr/bin/env python

from Queue import Queue
from threading import Thread
from time import time, sleep
import sys

num_worker_threads = int(sys.argv[1])
source = xrange(1000)

def do_work(item):
    for i in xrange(10):
        sleep(0.001)

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()

for item in source:
    q.put(item)

start = time()

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()

end = time()

print(end - start)

This gives the following results:

$ for i in 1 2 3 4 5 6 7 8 9 10; do echo -n "$i "; ./t.py $i; done
1 11.0209097862
2 5.50820493698
3 3.65133094788
4 2.73591113091
5 2.19623804092
6 1.83647704124
7 1.57275605202
8 1.38150596619
9 1.23809313774
10 1.1111137867
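
For anyone who wants to see both effects side by side, here is a minimal sketch, not a definitive benchmark. It assumes Python 3 (the code above is Python 2) and swaps the hand-rolled Queue/Thread loop for concurrent.futures.ThreadPoolExecutor. The names timed_run, cpu_bound and io_like are made up for this illustration, and the item counts are arbitrary.

```python
# Python 3 sketch; timed_run, cpu_bound and io_like are hypothetical names.
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def cpu_bound(item):
    # Pure-Python loop: the running thread holds the GIL while it counts.
    for _ in range(100000):
        pass


def io_like(item):
    # sleep() releases the GIL, standing in for a blocking I/O call.
    sleep(0.01)


def timed_run(work, num_workers, num_items):
    """Return the seconds needed to push num_items through a thread pool."""
    start = perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # list() drains the iterator so an exception in a worker surfaces here.
        list(pool.map(work, range(num_items)))
    return perf_counter() - start


if __name__ == "__main__":
    for workers in (1, 2, 4):
        print(workers,
              "cpu: %.3f s" % timed_run(cpu_bound, workers, 40),
              "sleep: %.3f s" % timed_run(io_like, workers, 40))
```

On a standard CPython build with the GIL, the sleep column should shrink roughly in proportion to the worker count, while the cpu column should stay flat or get worse. Exact numbers depend on the machine and the Python version.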

Source: https://stackoverflow.com/questions/16665367/why-doesnt-a-simple-python-producer-consumer-multi-threading-program-speed-up-b

Tags

python

multithreading

queue

producer-consumer