parallel-processing

Is this proper use of numpy seeding for parallel code?

Submitted by 独自空忆成欢 on 2020-04-30 10:01:03
Question: I am running n instances of the same code in parallel and want each instance to use independent random numbers. For this purpose, before I start the parallel computations I create a list of random states, like this:

    import numpy.random as rand
    rand_states = [(rand.seed(rand.randint(2**32-1)), rand.get_state())[1] for j in range(n)]

I then pass one element of rand_states to each parallel process, in which I basically do:

    rand.set_state(rand_state)
    data = rand.rand(10, 10)

To make things …
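For reference, a minimal sketch of the approach numpy itself now recommends for this problem (SeedSequence.spawn, available since numpy 1.17); this is not the asker's code, and the seed and array shape are arbitrary:

    import numpy as np
    from multiprocessing import Pool

    def worker(seed_seq):
        # Each worker builds its own Generator from a child SeedSequence,
        # so the streams are statistically independent across processes.
        rng = np.random.default_rng(seed_seq)
        return rng.random((10, 10))

    if __name__ == "__main__":
        n = 4
        # spawn() derives n independent child seed sequences from one parent,
        # replacing the manual seed/get_state/set_state juggling above.
        child_seeds = np.random.SeedSequence(12345).spawn(n)
        with Pool(n) as pool:
            results = pool.map(worker, child_seeds)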

Vectorize/accelerate numpy function with two arguments of different dimensions

Submitted by ⅰ亾dé卋堺 on 2020-04-22 02:15:48
Question: I am not sure if this has been asked before; I couldn't find many relevant results on SO or Google. Anyway, here is what I am trying to do. I have a function that is created at runtime and takes four parameters:

    my_func(t, x, p, a)

t is a scalar float, x is a 1D numpy array of floats, and p and a are dictionaries. I have a numpy array T and a 2D numpy array X. I want to call the function with each element of T and each column of X and store the result in another numpy array. The function …
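To make the call pattern concrete, here is a plain-loop baseline under stated assumptions: my_func's body and the dictionary keys below are hypothetical stand-ins (the real function is generated at runtime), and the shapes are illustrative. Any vectorized version would have to reproduce this result:

    import numpy as np

    def my_func(t, x, p, a):
        # Hypothetical stand-in for the runtime-generated function:
        # combines the scalar t and the vector x via the dictionaries.
        return p["gain"] * np.sum(x) + a["offset"] * t

    p, a = {"gain": 2.0}, {"offset": 0.5}
    T = np.linspace(0.0, 1.0, 5)   # one scalar per call
    X = np.random.rand(3, 5)       # one column per call
    # Pair element T[i] with column X[:, i].
    result = np.array([my_func(t, X[:, i], p, a) for i, t in enumerate(T)])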

Scaling issues with OpenMP

Submitted by 删除回忆录丶 on 2020-04-16 05:17:12
Question: I have written a code for a special type of 3D CFD simulation, the Lattice-Boltzmann Method (quite similar to a code supplied with the book "The Lattice Boltzmann Method" by Timm Krüger et al.). Multithreading the program with OpenMP, I have experienced issues that I can't quite understand: the results prove to be strongly dependent on the overall domain size. The basic principle is that each cell of a 3D domain gets assigned certain values for 19 distribution functions (0-18) in discrete …

How to handle really large objects returned from the joblib.Parallel()?

Submitted by 女生的网名这么多〃 on 2020-04-11 23:00:58
Question: I have the following code, where I try to parallelize:

    import numpy as np
    from joblib import Parallel, delayed

    lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
    arr = np.array(lst)
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.dot(v[:, i].reshape(-1, 1), v[:, i].reshape(1, -1))

    proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

proj returns a really large list and it's causing memory issues. Is there a way I could work around this? I had thought about returning a …
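One common workaround for this class of problem (a sketch, not taken from the truncated post): have each worker dump its result to disk with np.save and return only the file path, then load the matrices lazily with memory mapping. The temp-directory layout and file names below are illustrative:

    import os
    import tempfile
    import numpy as np
    from joblib import Parallel, delayed

    arr = np.array([[0.0, 1, 2], [3, 4, 5], [6, 7, 8]])
    w, v = np.linalg.eigh(arr)
    outdir = tempfile.mkdtemp()

    def proj_func(i):
        # Write the (possibly huge) projection matrix to disk instead of
        # returning it, so the parent only collects small path strings.
        proj = np.outer(v[:, i], v[:, i])
        path = os.path.join(outdir, f"proj_{i}.npy")
        np.save(path, proj)
        return path

    paths = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))
    # Load one matrix at a time via memory mapping when needed.
    first = np.load(paths[0], mmap_mode="r")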

Why can GPU do matrix multiplication faster than CPU?

Submitted by 谁都会走 on 2020-04-10 04:00:46
Question: I've been using a GPU for a while without questioning it, but now I'm curious. Why can a GPU do matrix multiplication much faster than a CPU? Is it because of parallel processing? But I didn't write any parallel processing code. Does it do it automatically by itself? Any intuition / high-level explanation would be appreciated! Thanks.

Answer 1: How do you parallelize the computations? GPUs are able to do a lot of parallel computations, a lot more than a CPU could. Look at this example of vector …
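The key intuition: every entry of the output matrix depends only on one row of A and one column of B, never on other output entries, so thousands of GPU threads can each compute one entry simultaneously. A small Python sketch of that independence (serial here, but each loop iteration could run in parallel, which is what a GPU kernel does with one thread per output element):

    import numpy as np

    A = np.random.rand(64, 32)
    B = np.random.rand(32, 48)
    C = np.empty((64, 48))

    # Every C[i, j] is an independent dot product; all 64*48 of them
    # could execute at the same time on a GPU.
    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            C[i, j] = A[i, :] @ B[:, j]

    assert np.allclose(C, A @ B)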

How does random_number() work in parallel?

Submitted by 风流意气都作罢 on 2020-03-25 17:57:14
Question: How does random_number() work in parallel with OpenMP? If I run my program without parallelization I always get the same result, but with parallelization I get different (but similar) results every time.

Answer 1: There is no guarantee about thread safety or threading performance of random_number in general. The Fortran standard does not know about OpenMP at all. Individual compilers may offer you some guarantees, but they will only be valid for the version present in that particular compiler. For …
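The question is about Fortran, but the underlying principle is language-independent: a shared global RNG state is unsafe to mutate from several threads, so each thread should own its own explicitly seeded generator. A sketch of that pattern in Python (the language used elsewhere on this page) rather than Fortran, with arbitrary illustrative seeds:

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def worker(thread_id):
        # Each thread owns a private Generator instead of sharing one
        # global state object, so no thread mutates another's state.
        rng = np.random.default_rng(seed=1000 + thread_id)
        return rng.random(5).sum()

    with ThreadPoolExecutor(max_workers=4) as ex:
        totals = list(ex.map(worker, range(4)))
    # Results are reproducible run to run because each seed is fixed.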

Parallel.ForEach and DataTable - Isn't DataTable.NewRow() a thread safe “read” operation?

Submitted by 情到浓时终转凉″ on 2020-03-25 03:00:45
Question: I'm converting an existing application to take advantage of multiple processors. I have some nested loops, and I've converted the innermost loop into a Parallel.ForEach loop. In the original application, inside the innermost loop, the code would call DataTable.NewRow() to instantiate a new DataRow of the appropriate layout, populate the columns, and add the populated DataRow into the DataTable with DataTable.Add(). But since DataTable is only thread-safe for read operations, I have …

How to return a generator using joblib.Parallel()?

Submitted by 吃可爱长大的小学妹 on 2020-03-21 10:47:07
Question: I have a piece of code below where joblib.Parallel() returns a list:

    import numpy as np
    from joblib import Parallel, delayed

    lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
    arr = np.array(lst)
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.dot(v[:, i].reshape(-1, 1), v[:, i].reshape(1, -1))

    proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

Instead of a list, how do I return a generator using joblib.Parallel()? Edit: I have updated the code as suggested by …
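Worth noting, although it postdates this 2020 question: joblib 1.3 added a return_as parameter to Parallel, and return_as="generator" yields results one by one in submission order as workers finish, so they can be consumed without materializing the whole list. A minimal sketch on the asker's example:

    import numpy as np
    from joblib import Parallel, delayed

    arr = np.array([[0.0, 1, 2], [3, 4, 5], [6, 7, 8]])
    w, v = np.linalg.eigh(arr)

    def proj_func(i):
        return np.outer(v[:, i], v[:, i])

    # With return_as="generator", Parallel returns a lazy generator
    # instead of a fully built list (requires joblib >= 1.3).
    proj_gen = Parallel(n_jobs=-1, return_as="generator")(
        delayed(proj_func)(i) for i in range(len(w))
    )
    for proj in proj_gen:
        print(proj.shape)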