I am about to write some computationally-intensive Python code that'll almost certainly spend most of its time inside numpy's linear algebra functions.
The problem at hand is embarrassingly parallel. Long story short, the easiest way for me to take advantage of that would be by using multiple threads. The main barrier is almost certainly going to be the Global Interpreter Lock (GIL).
To help design this, it would be useful to have a mental model for which numpy operations can be expected to release the GIL for their duration. To this end, I'd appreciate any rules of thumb, dos and don'ts, pointers etc.
In case it matters, I'm using 64-bit Python 2.7.1 on Linux, with numpy 1.5.1 and scipy 0.9.0rc2, built with Intel MKL 10.3.1.
You will probably find answers to all your questions regarding NumPy and parallel programming on the official wiki.
Also, have a look at this recipe page -- it contains example code on how to use NumPy with multiple threads.
Quite some numpy routines release GIL, so they can be efficiently parallel in threads (info). Maybe you don't need to do anything special!
You can use this question to find whether the routines you need are among the ones that release GIL. In short, search for ALLOW_THREADS or nogil in the source.
(Also note that MKL has the ability to use multiple threads for a routine, so that's another easy way to get parallelism, although possibly not the fastest kind).
来源:https://stackoverflow.com/questions/6200437/numpy-and-global-interpreter-lock