I've used processing for Python. It mimics the API of the threading module and is thus quite easy to use.
If you happen to use map/imap or a generator/list comprehension, converting your code to use processing is straightforward:

def do_something(x):
    return x**(x*x)

results = [do_something(n) for n in range(10000)]
can be parallelized with
import processing
pool = processing.Pool(processing.cpuCount())
results = pool.map(do_something, range(10000))
which will use however many processors you have to calculate the results. There are also lazy (Pool.imap) and asynchronous (Pool.map_async) variants.
There is also a Queue class that implements the Queue.Queue interface, and Process workers that are used much like threads.
Gotchas
processing is based on fork(), which has to be emulated on Windows. Objects are transferred via pickle/unpickle, so you have to make sure that this works. Forking a process that has already acquired resources might not be what you want (think database connections), but in general it works. It works so well that it has been added to Python 2.6 on the fast track as multiprocessing (cf. PEP 371).
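The pickling gotcha is easy to check up front: anything you hand to a worker must survive a pickle round trip. A quick sketch of what does and does not qualify:

```python
# Plain data and module-level functions pickle fine; lambdas (and things
# like open files or live database connections) do not.
import pickle

def square(x):          # module-level function: picklable by reference
    return x * x

pickle.loads(pickle.dumps(square))      # fine
pickle.loads(pickle.dumps([1, 2, 3]))   # fine

failed = False
try:
    pickle.dumps(lambda x: x * x)       # lambdas have no importable name
except (pickle.PicklingError, AttributeError, TypeError):
    failed = True
```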