parallel-processing

Can I run several different functions in parallel at once?

Submitted by 假装没事ソ on 2020-06-29 03:55:28

Question: I have to run several regression models for a project. The workflow looks something like this:

glm(y ~ variables, data=data1)
glm(y ~ variables, data=data2)
glm(y ~ variables, data=data3)
glm(y ~ variables, data=data4)

I then run a different model on the same data:

lm(z ~ other_variables, data=data1)
lm(z ~ other_variables, data=data2)
lm(z ~ other_variables, data=data3)
lm(z ~ other_variables, data=data4)

Running these models takes something like 8 hours, so I want to parallelize this.
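The eight fits are completely independent, so they can be dispatched to worker processes and collected in order. The question is about R, where `parallel::mclapply` (or `parLapply` on Windows) over a list of the four data frames follows the same shape; as a language-neutral sketch, here is the pattern in Python with `concurrent.futures`, using a hypothetical `fit_model` stand-in for one `glm()`/`lm()` call:

```python
from concurrent.futures import ProcessPoolExecutor

def fit_model(dataset):
    """Stand-in for one glm()/lm() call: any pure function of one dataset.
    Here: closed-form least-squares slope for y = b*x."""
    xs, ys = dataset
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

if __name__ == "__main__":
    # Four independent datasets, mirroring data1..data4 in the question.
    datasets = [
        ([1, 2, 3], [2, 4, 6]),
        ([1, 2, 3], [3, 6, 9]),
        ([1, 2, 3], [1, 2, 3]),
        ([1, 2, 3], [4, 8, 12]),
    ]
    # Each fit runs in its own process; map() returns results in input order.
    with ProcessPoolExecutor(max_workers=4) as pool:
        slopes = list(pool.map(fit_model, datasets))
    print(slopes)  # [2.0, 3.0, 1.0, 4.0]
```

In R itself the analogous call would be `parallel::mclapply(list(data1, data2, data3, data4), function(d) glm(y ~ variables, data = d))`, which likewise returns the fitted models as a list in input order.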

Parallel for loop over numpy matrix

Submitted by 你说的曾经没有我的故事 on 2020-06-27 19:41:07

Question: I am looking at the joblib examples, but I can't figure out how to do a parallel for loop over a matrix. I am computing a pairwise distance metric between the rows of a matrix, so I was doing:

N, _ = data.shape
upper_triangle = [(i, j) for i in range(N) for j in range(i + 1, N)]
dist_mat = np.zeros((N, N))
for (i, j) in upper_triangle:
    dist_mat[i, j] = dist_fun(data[i], data[j])
    dist_mat[j, i] = dist_mat[i, j]

where dist_fun takes two vectors and computes a distance. How can I make this loop parallel?
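Each (i, j) pair is independent, so the pairs can be mapped to workers and the symmetric fill done serially once the distances come back. With joblib this is `Parallel(n_jobs=-1)(delayed(dist_fun)(data[i], data[j]) for i, j in upper_triangle)`; below is a standard-library sketch of the same pattern, with plain lists instead of a numpy array and a hypothetical Euclidean `dist_fun`, to keep it self-contained:

```python
import math
from concurrent.futures import ProcessPoolExecutor

def dist_fun(u, v):
    """Hypothetical distance function: Euclidean distance of two row vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pair_dist(args):
    """Worker task: carry the (i, j) indices alongside the computed distance."""
    i, j, u, v = args
    return i, j, dist_fun(u, v)

if __name__ == "__main__":
    data = [[0.0, 0.0], [3.0, 4.0], [0.0, 8.0]]  # 3 rows, 2 columns
    n = len(data)
    upper_triangle = [(i, j, data[i], data[j])
                      for i in range(n) for j in range(i + 1, n)]
    dist_mat = [[0.0] * n for _ in range(n)]
    # Distances are computed in parallel; the matrix is filled in the parent.
    with ProcessPoolExecutor() as pool:
        for i, j, d in pool.map(pair_dist, upper_triangle):
            dist_mat[i][j] = d
            dist_mat[j][i] = d
    print(dist_mat[0][1], dist_mat[1][2])  # 5.0 5.0
```

Returning the indices with each result (rather than writing into shared state from the workers) is what makes the scatter step trivial and process-safe.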

Kotlin: coroutineScope is slower than GlobalScope

Submitted by 不羁岁月 on 2020-06-27 12:51:48

Question: I'm learning coroutines, and I encountered the following surprising (to me) behavior. I want to have a parallel map. I consider four solutions:

1. Just map, no parallelism.
2. pmap from here.
3. A modification of item 2: I removed coroutineScope and use GlobalScope.
4. Java's parallelStream.

The code:

import kotlinx.coroutines.*
import kotlin.streams.toList
import kotlin.system.measureNanoTime

inline fun printTime(msg: String, f: () -> Unit) =
    println("${msg.padEnd(15)} time: ${measureNanoTime(f) / 1e9}")
…

Optimizing Numeric Program with SIMD

Submitted by 孤街醉人 on 2020-06-27 03:58:05

Question: I am trying to optimize the performance of the following naive program without changing the algorithm:

// a and b are two arrays of the given size n
naive(int n, const int *a, const int *b, int *c)
{
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n - k; ++i)
            c[k] += a[i + k] * b[i];
}

My idea is as follows. First, I use OpenMP for the outer loop. For the inner loop, as it is imbalanced, I check n - k to decide whether to use AVX2 SIMD intrinsics or a simple reduction. And finally, I find that it…
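To see what the loop computes (and to have something to check an OpenMP/AVX2 version against), note that c[k] is the dot product of b with a shifted left by k, i.e. a sliding correlation with a shrinking overlap. A plain Python reference implementation, useful only for correctness checks, not speed:

```python
def naive_ref(a, b):
    """Reference for the C loop: c[k] += a[i + k] * b[i] for i in [0, n - k)."""
    n = len(a)
    c = [0] * n
    for k in range(n):
        for i in range(n - k):
            c[k] += a[i + k] * b[i]
    return c

print(naive_ref([1, 2, 3], [4, 5, 6]))  # [32, 23, 12]
```

The shrinking inner trip count n - k is exactly the load imbalance the question mentions: a static OpenMP schedule over k gives early threads far more work than late ones, which is why `schedule(dynamic)` or guided chunking is the usual pairing here.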

Sharing large pandas DataFrame with multiprocessing for loop in Python

Submitted by 别来无恙 on 2020-06-26 14:32:07

Question: Using Python 2.7 on a Windows machine, I have a large pandas DataFrame (about 7 million rows and 20+ columns) from a SQL query that I'd like to filter by looping through IDs, then run calculations on the resulting filtered data. I'd also like to do this in parallel. I know that if I try to do this with standard methods from the multiprocessing package on Windows, each process will generate a new instance of that large DataFrame for its own use and my memory will be eaten up. So I'm trying to…
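On Windows there is no fork, so a common workaround is a `Pool` initializer: the large table is sent to (or loaded by) each worker process once and stored in a module-level global, and each task then ships only a small ID instead of a copy of the table. A standard-library sketch of that pattern, with a list of dicts standing in for the DataFrame to keep it self-contained:

```python
import multiprocessing as mp

_table = None  # one copy per worker process, set by the initializer

def init_worker(table):
    """Runs once in each worker; stores the shared table in a global."""
    global _table
    _table = table

def process_id(the_id):
    # Filter the shared table by ID, then run the per-ID calculation.
    rows = [r for r in _table if r["id"] == the_id]
    return the_id, sum(r["value"] for r in rows)

if __name__ == "__main__":
    table = [{"id": 1, "value": 10}, {"id": 1, "value": 5}, {"id": 2, "value": 7}]
    with mp.Pool(processes=2, initializer=init_worker, initargs=(table,)) as pool:
        results = dict(pool.map(process_id, [1, 2]))
    print(results)  # {1: 15, 2: 7}
```

With a 7-million-row frame, pickling it through `initargs` is itself expensive; in practice it is often cheaper to have each worker re-run the SQL query (or read a saved file) inside the initializer so the table never crosses a process boundary.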

Parallel processing strings Delphi full available CPU usage

Submitted by 不打扰是莪最后的温柔 on 2020-06-25 08:58:33

Question: The goal is to achieve full usage of the available cores when converting floats to strings in a single Delphi application. I think this problem applies to the general processing of strings, but in my example I am specifically using the FloatToStr method. What I am doing (I've kept this very simple so there is little ambiguity around the implementation):

- Using Delphi XE6.
- Create thread objects which inherit from TThread, and start them.
- In the thread's Execute procedure it will convert a large…
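The Delphi-specific part of this problem usually comes down to shared state inside FloatToStr (the global FormatSettings); the overload that takes its own TFormatSettings per thread avoids that contention. The chunking structure itself is language-neutral; as a sketch in Python (a stand-in, not the author's Delphi code): split the float array into one contiguous slice per worker, convert each slice independently, and concatenate in order:

```python
from concurrent.futures import ProcessPoolExecutor

def chunk_to_strings(chunk):
    """Convert one slice of floats; each worker touches only its own slice."""
    return [format(x, ".2f") for x in chunk]

if __name__ == "__main__":
    floats = [i / 4 for i in range(8)]
    workers = 2
    size = (len(floats) + workers - 1) // workers  # ceil division
    chunks = [floats[i:i + size] for i in range(0, len(floats), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(chunk_to_strings, chunks)  # in-order results
    strings = [s for part in parts for s in part]
    print(strings[:3])  # ['0.00', '0.25', '0.50']
```

Because each worker owns a disjoint slice and shares no formatting state, there is no synchronization in the hot loop, which is the property the Delphi version also needs to reach full core usage.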

Gnu Parallel : nested parallelism

Submitted by 放肆的年华 on 2020-06-24 22:19:12

Question: Is it possible to call GNU parallel from within multiple runs of a script that are in turn spawned by GNU parallel? I have a Python script that runs for hundreds of sequential iterations, and somewhere within each iteration, four values are computed in parallel (using GNU parallel). Now I want to spawn multiple such scripts at the same time, again using GNU parallel. Is this possible? Will GNU parallel take care of good utilization of the available cores? For example, if in the inner loop, out of…
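Nesting is possible, but each GNU parallel invocation schedules its own job slots independently (controlled by -j), so total concurrency is roughly outer × inner unless you cap the levels yourself. The same structure can be sketched in Python: an outer process pool standing in for the outer `parallel`, and an inner pool per "script" standing in for the inner one (names here are illustrative, not from the question's script):

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def inner_value(x):
    """Stand-in for one of the 4 values computed in parallel per iteration."""
    return x * x

def run_script(seed):
    """Stand-in for one spawned script: an inner parallel map over 4 tasks."""
    with ThreadPoolExecutor(max_workers=4) as inner:
        return sum(inner.map(inner_value, range(seed, seed + 4)))

if __name__ == "__main__":
    # Outer level: two "scripts" at once, each nesting its own 4-way pool,
    # so up to 2 * 4 = 8 tasks can be in flight simultaneously.
    with ProcessPoolExecutor(max_workers=2) as outer:
        totals = list(outer.map(run_script, [0, 10]))
    print(totals)  # [14, 534]
```

If oversubscription is a concern, the usual remedy is to size the two levels so their product matches the core count, or (with GNU parallel itself) to gate the inner jobs through a shared `sem`/`parallel --semaphore` counter.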

How to merge two pandas dataframe in parallel (multithreading or multiprocessing)

Submitted by …衆ロ難τιáo~ on 2020-06-24 07:38:14

Question: Without parallel programming, I can merge the left and right DataFrames on the key column using the code below, but it will be too slow since both are very large. Is there any way I can do this efficiently in parallel? I have 64 cores, so practically I can use 63 of them to merge these two DataFrames.

left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'C': ['C0', 'C1', 'C2…
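One standard decomposition: because every matching pair of rows shares a key, hashing the key into N buckets sends both sides of each match to the same bucket, and the buckets can then be joined in separate processes and concatenated. A standard-library sketch with lists of dicts standing in for the DataFrames (with pandas you would partition both frames by key hash and `pool.map` a per-bucket `pd.merge` the same way):

```python
from concurrent.futures import ProcessPoolExecutor

def merge_bucket(args):
    """Inner join of one (left_rows, right_rows) bucket on 'key'."""
    left_rows, right_rows = args
    right_by_key = {}
    for r in right_rows:
        right_by_key.setdefault(r["key"], []).append(r)
    return [{**l, **r}
            for l in left_rows
            for r in right_by_key.get(l["key"], [])]

def parallel_merge(left, right, n_buckets=4):
    # Rows with the same key hash land in the same bucket, so each bucket
    # can be joined independently and the results concatenated.
    buckets = [([], []) for _ in range(n_buckets)]
    for row in left:
        buckets[hash(row["key"]) % n_buckets][0].append(row)
    for row in right:
        buckets[hash(row["key"]) % n_buckets][1].append(row)
    with ProcessPoolExecutor() as pool:
        merged = pool.map(merge_bucket, buckets)
    return [row for part in merged for row in part]

if __name__ == "__main__":
    left = [{"key": "K0", "A": "A0"}, {"key": "K1", "A": "A1"}]
    right = [{"key": "K0", "C": "C0"}, {"key": "K1", "C": "C1"}]
    rows = parallel_merge(left, right)
    print(sorted(r["key"] for r in rows))  # ['K0', 'K1']
```

Note that the partitioning pass is serial and the row order of the result differs from a plain merge; whether the parallel version wins in practice depends on how expensive the per-bucket join is relative to shipping the buckets to the workers.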