parallel-processing

Is this proper use of numpy seeding for parallel code?

Submitted by 独自空忆成欢 on 2020-04-30 10:01:03
Question: I am running n instances of the same code in parallel and want each instance to use independent random numbers. For this purpose, before I start the parallel computations I create a list of random states, like this:

    import numpy.random as rand
    rand_states = [(rand.seed(rand.randint(2**32-1)), rand.get_state())[1] for j in range(n)]

I then pass one element of rand_states to each parallel process, in which I basically do:

    rand.set_state(rand_state)
    data = rand.rand(10, 10)

To make things …
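For reference, a minimal sketch of the approach numpy itself now recommends for this problem (SeedSequence.spawn, available since numpy 1.17); this is not the asker's code, and the seed and array shape are arbitrary:

    import numpy as np
    from multiprocessing import Pool

    def worker(seed_seq):
        # Each worker builds its own Generator from a child SeedSequence,
        # so the streams are statistically independent across processes.
        rng = np.random.default_rng(seed_seq)
        return rng.random((10, 10))

    if __name__ == "__main__":
        n = 4
        # spawn() derives n independent child seed sequences from one parent,
        # replacing the manual seed/get_state/set_state juggling above.
        child_seeds = np.random.SeedSequence(12345).spawn(n)
        with Pool(n) as pool:
            results = pool.map(worker, child_seeds)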

Vectorize/accelerate numpy function with two arguments of different dimensions

Submitted by ⅰ亾dé卋堺 on 2020-04-22 02:15:48
Question: I am not sure if this has been asked before; I couldn't find many relevant results on SO or Google. Anyway, here is what I am trying to do. I have a function that is created at runtime and takes four parameters:

    my_func(t, x, p, a)

t is a scalar float, x is a 1D numpy array of floats, and p and a are dictionaries. I have a numpy array T and a 2D numpy array X. I want to call the function with each element of T and each column of X and store the result in another numpy array. The function …
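To make the call pattern concrete, here is a plain-loop baseline under stated assumptions: my_func's body and the dictionary keys below are hypothetical stand-ins (the real function is generated at runtime), and the shapes are illustrative. Any vectorized version would have to reproduce this result:

    import numpy as np

    def my_func(t, x, p, a):
        # Hypothetical stand-in for the runtime-generated function:
        # combines the scalar t and the vector x via the dictionaries.
        return p["gain"] * np.sum(x) + a["offset"] * t

    p, a = {"gain": 2.0}, {"offset": 0.5}
    T = np.linspace(0.0, 1.0, 5)   # one scalar per call
    X = np.random.rand(3, 5)       # one column per call
    # Pair element T[i] with column X[:, i].
    result = np.array([my_func(t, X[:, i], p, a) for i, t in enumerate(T)])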

Scaling issues with OpenMP

Submitted by 删除回忆录丶 on 2020-04-16 05:17:12
Question: I have written a code for a special type of 3D CFD simulation, the Lattice-Boltzmann Method (quite similar to a code supplied with the book "The Lattice Boltzmann Method" by Timm Krüger et al.). Multithreading the program with OpenMP, I have experienced issues that I can't quite understand: the results prove to be strongly dependent on the overall domain size. The basic principle is that each cell of a 3D domain gets assigned certain values for 19 distribution functions (0-18) in discrete …

How to handle really large objects returned from the joblib.Parallel()?

Submitted by 女生的网名这么多〃 on 2020-04-11 23:00:58
Question: I have the following code, where I try to parallelize:

    import numpy as np
    from joblib import Parallel, delayed

    lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
    arr = np.array(lst)
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.dot(v[:, i].reshape(-1, 1), v[:, i].reshape(1, -1))

    proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

proj returns a really large list and it's causing memory issues. Is there a way I could work around this? I had thought about returning a …
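One common workaround for this class of problem (a sketch, not taken from the truncated post): have each worker dump its result to disk with np.save and return only the file path, then load the matrices lazily with memory mapping. The temp-directory layout and file names below are illustrative:

    import os
    import tempfile
    import numpy as np
    from joblib import Parallel, delayed

    arr = np.array([[0.0, 1, 2], [3, 4, 5], [6, 7, 8]])
    w, v = np.linalg.eigh(arr)
    outdir = tempfile.mkdtemp()

    def proj_func(i):
        # Write the (possibly huge) projection matrix to disk instead of
        # returning it, so the parent only collects small path strings.
        proj = np.outer(v[:, i], v[:, i])
        path = os.path.join(outdir, f"proj_{i}.npy")
        np.save(path, proj)
        return path

    paths = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))
    # Load one matrix at a time via memory mapping when needed.
    first = np.load(paths[0], mmap_mode="r")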

Why can GPU do matrix multiplication faster than CPU?

Submitted by 谁都会走 on 2020-04-10 04:00:46
Question: I've been using a GPU for a while without questioning it, but now I'm curious. Why can a GPU do matrix multiplication much faster than a CPU? Is it because of parallel processing? But I didn't write any parallel processing code. Does it do it automatically by itself? Any intuition / high-level explanation would be appreciated! Thanks.

Answer 1: How do you parallelize the computations? GPUs are able to do a lot of parallel computations, a lot more than a CPU could. Look at this example of vector …
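The key intuition: every entry of the output matrix depends only on one row of A and one column of B, never on other output entries, so thousands of GPU threads can each compute one entry simultaneously. A small Python sketch of that independence (serial here, but each loop iteration could run in parallel, which is what a GPU kernel does with one thread per output element):

    import numpy as np

    A = np.random.rand(64, 32)
    B = np.random.rand(32, 48)
    C = np.empty((64, 48))

    # Every C[i, j] is an independent dot product; all 64*48 of them
    # could execute at the same time on a GPU.
    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            C[i, j] = A[i, :] @ B[:, j]

    assert np.allclose(C, A @ B)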

How does random_number() work in parallel?

Submitted by 风流意气都作罢 on 2020-03-25 17:57:14
Question: How does random_number() work in parallel with OpenMP? If I run my program without parallelization I always get the same result, but with parallelization I get different (but similar) results every time.

Answer 1: There is no guarantee about thread safety or threading performance of random_number in general. The Fortran standard does not know about OpenMP at all. Individual compilers may offer you some guarantees, but they will only be valid for the version present in that particular compiler. For …
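The question is about Fortran, but the underlying principle is language-independent: a shared global RNG state is unsafe to mutate from several threads, so each thread should own its own explicitly seeded generator. A sketch of that pattern in Python (the language used elsewhere on this page) rather than Fortran, with arbitrary illustrative seeds:

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def worker(thread_id):
        # Each thread owns a private Generator instead of sharing one
        # global state object, so no thread mutates another's state.
        rng = np.random.default_rng(seed=1000 + thread_id)
        return rng.random(5).sum()

    with ThreadPoolExecutor(max_workers=4) as ex:
        totals = list(ex.map(worker, range(4)))
    # Results are reproducible run to run because each seed is fixed.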

Parallel.ForEach and DataTable - Isn't DataTable.NewRow() a thread safe “read” operation?

Submitted by 情到浓时终转凉″ on 2020-03-25 03:00:45
Question: I'm converting an existing application to take advantage of multiple processors. I have some nested loops, and I've converted the innermost loop into a Parallel.ForEach loop. In the original application, inside the innermost loop, the code would call DataTable.NewRow() to instantiate a new DataRow of the appropriate layout, populate the columns, and add the populated DataRow into the DataTable with DataTable.Add(). But since DataTable is only thread-safe for read operations, I have …

How to return a generator using joblib.Parallel()?

Submitted by 吃可爱长大的小学妹 on 2020-03-21 10:47:07
Question: I have a piece of code below where joblib.Parallel() returns a list:

    import numpy as np
    from joblib import Parallel, delayed

    lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
    arr = np.array(lst)
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.dot(v[:, i].reshape(-1, 1), v[:, i].reshape(1, -1))

    proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

Instead of a list, how do I return a generator using joblib.Parallel()? Edit: I have updated the code as suggested by …
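Worth noting, although it postdates this 2020 question: joblib 1.3 added a return_as parameter to Parallel, and return_as="generator" yields results one by one in submission order as workers finish, so they can be consumed without materializing the whole list. A minimal sketch on the asker's example:

    import numpy as np
    from joblib import Parallel, delayed

    arr = np.array([[0.0, 1, 2], [3, 4, 5], [6, 7, 8]])
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.outer(v[:, i], v[:, i])

    # With return_as="generator", Parallel returns a lazy generator
    # instead of a fully built list (requires joblib >= 1.3).
    proj_gen = Parallel(n_jobs=-1, return_as="generator")(
        delayed(proj_func)(i) for i in range(len(w))
    )
    for proj in proj_gen:
        print(proj.shape)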