I have experience in coding OpenMP for Shared Memory machines (in both C and FORTRAN) to carry out simple tasks like matrix addition, multiplication etc. (Just to see how it
"Cython supports native parallelism through the cython.parallel module. To use this kind of parallelism, the GIL must be released (see Releasing the GIL). It currently supports OpenMP, but later on more backends might be supported." Cython Documentation