vectorization

Multiplying every element of one array by every element of another array

送分小仙女 submitted on 2020-03-20 06:35:48
Question: Say I have two arrays (with import numpy as np): x = np.array([1, 2, 3, 4]) and y = np.array([5, 6, 7, 8]). What's the fastest, most Pythonic way to get a new array z, with a number of elements equal to x.size * y.size, in which the elements are the products of every pair of elements (x_i, y_j) from the two input arrays? To rephrase, I'm looking for an array z in which z[k] is x[i] * y[j]. A simple but inefficient way to get this is as follows: z = np.empty(x.size * y.size); counter = 0
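One vectorized way to build z (a sketch of the standard approach, not taken from this thread's answers): np.outer forms all pairwise products at once, and ravel() flattens the result so that z[k] = x[i] * y[j] with k = i * y.size + j.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])

# np.outer computes every pairwise product x[i] * y[j] as a 4x4 matrix;
# ravel() flattens it into a 1-D array of x.size * y.size elements.
z = np.outer(x, y).ravel()

# Equivalent via broadcasting: (x[:, None] * y).ravel()
```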

How to use Vectorization with NumPy arrays to calculate geodesic distance using Geopy library for a large dataset?

核能气质少年 submitted on 2020-03-14 11:04:39
Question: I am trying to calculate geodesic distance from a dataframe that consists of four columns of latitude and longitude data, with around 3 million rows. I used the apply-lambda method, but it took 18 minutes to finish the task. Is there a way to use vectorization with NumPy arrays to speed up the calculation? Thank you for answering. My code using apply and lambda: from geopy import distance; df['geo_dist'] = df.apply(lambda x: distance.distance( (x['start_latitude'], x['start
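geopy's geodesic is computed point by point on an exact ellipsoid, which is why apply is slow here. A common speed-up is to trade it for a NumPy haversine, i.e. a spherical approximation (typically within about 0.5% of the geodesic result) that operates on whole columns at once. This is a sketch under that assumption; the dataframe column names in the usage comment are placeholders, since the question's snippet is truncated.

```python
import numpy as np

def haversine_np(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (spherical) distance in km; a fast approximation to
    geopy's ellipsoidal geodesic, applied to whole arrays at once."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))

# Hypothetical usage on the dataframe (end-point column names assumed):
# df['geo_dist'] = haversine_np(df['start_latitude'].to_numpy(),
#                               df['start_longitude'].to_numpy(),
#                               df['end_latitude'].to_numpy(),
#                               df['end_longitude'].to_numpy())
```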

Speeding up calculation of symmetric matrices; use of outer

ε祈祈猫儿з submitted on 2020-03-05 06:55:12
Question: I need to speed up a calculation that produces a symmetric matrix. Currently I have something like this: X <- 1:50; Y <- 1:50; M <- outer(X, Y, FUN = myfun), where myfun is a quite complicated, vectorized, but symmetric function (myfun(x, y) = myfun(y, x)). So my code unnecessarily wastes time calculating the lower triangular matrix as well as the upper triangular matrix. How can I avoid that duplication without using slow for-loops? Answer 1: If your function is slow and timing scales with size of
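The duplication can be avoided by evaluating the function only on upper-triangle index pairs and mirroring the result. The question is in R (where upper.tri() plays the same role); this sketch translates the idea into NumPy for illustration.

```python
import numpy as np

def symmetric_outer(vals, fn):
    """Like outer(vals, vals, fn) for a symmetric fn, but fn is evaluated
    only on the upper triangle (i <= j) and then mirrored."""
    n = len(vals)
    iu, ju = np.triu_indices(n)       # all index pairs with i <= j
    upper = fn(vals[iu], vals[ju])    # fn must accept array arguments
    M = np.empty((n, n))
    M[iu, ju] = upper
    M[ju, iu] = upper                 # mirror into the lower triangle
    return M
```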

Same operation with einsum and tensordot with Numpy

旧时模样 submitted on 2020-02-29 05:11:58
Question: Let's say I have two 3D arrays A and B of shapes (3, 4, N) and (4, 3, N). I can compute the dot product between slices along the third axis with with_einsum = np.einsum('ikl,kjl->ijl', A, B). Is it possible to perform the same operation with numpy.tensordot? Answer 1: With np.einsum('ikl,kjl->ijl', A, B), there is an axis-alignment requirement in the subscript string: l stays with the inputs and the output. As such, using np.tensordot might not necessarily result in a performance improvement, but since the
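For reference, the aligned l axis that tensordot cannot express (tensordot always sums over its contracted axes) can be handled with a batched matmul once l is moved to the front. A sketch verifying the equivalence with the question's einsum call:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
A = rng.random((3, 4, N))
B = rng.random((4, 3, N))

with_einsum = np.einsum('ikl,kjl->ijl', A, B)

# np.matmul broadcasts over leading axes, so moving the shared axis l to
# the front performs one (3,4) @ (4,3) product per slice, batch-wise.
with_matmul = np.matmul(A.transpose(2, 0, 1),
                        B.transpose(2, 0, 1)).transpose(1, 2, 0)

assert np.allclose(with_einsum, with_matmul)
```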

Efficiently replace elements in array based on dictionary - NumPy / Python

邮差的信 submitted on 2020-02-28 04:04:09
Question: First of all, my apologies if this has been answered elsewhere. All I could find were questions about replacing elements of a given value, not elements of multiple values. Background: I have several thousand large np.arrays, like so: # generate dummy data; input_array = np.zeros((100,100)); input_array[0:10,0:10] = 1; input_array[20:56, 21:43] = 5; input_array[34:43, 70:89] = 8. In those arrays, I want to replace values based on a dictionary: mapping = {1:2, 5:3, 8:6}. Approach: At this time, I am
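One fast way to do this (a sketch, assuming the values are small non-negative integers; note the int dtype added to the question's float zeros so values can be used as indices) is a lookup table indexed by old value, applied in a single fancy-indexing pass:

```python
import numpy as np

# Dummy data as in the question, cast to int for index-based lookup.
input_array = np.zeros((100, 100), dtype=int)
input_array[0:10, 0:10] = 1
input_array[20:56, 21:43] = 5
input_array[34:43, 70:89] = 8

mapping = {1: 2, 5: 3, 8: 6}

# Lookup table: lut[v] is the replacement for value v; values absent
# from the mapping (here 0) map to themselves.
lut = np.arange(input_array.max() + 1)
for old, new in mapping.items():
    lut[old] = new

output = lut[input_array]   # one vectorized pass replaces all values
```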

Vectorizing a “pure” function with numpy, assuming many duplicates

你。 submitted on 2020-02-26 10:32:07
Question: I want to apply a "black box" Python function f to a large array arr. Additional assumptions are: the function f is "pure", i.e. deterministic with no side effects; the array arr has a small number of unique elements. I can achieve this with a decorator that computes f for each unique element of arr as follows: import numpy as np; from time import sleep; from functools import wraps; N = 1000; np.random.seed(0); arr = np.random.randint(0, 10, size=(N, 2)); def vectorize_pure(f): @wraps(f) def f_vec(arr)
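The core trick can be sketched in one dimension with np.unique(..., return_inverse=True): evaluate f once per distinct value, then scatter the results back through the inverse index. The question's decorator generalizes this; slow_square below is a stand-in for the expensive black-box f.

```python
import numpy as np

def cached_apply(f, arr):
    """Evaluate f once per unique value in arr, then broadcast the
    results back to arr's shape via the inverse index from np.unique."""
    uniq, inv = np.unique(arr, return_inverse=True)
    return np.array([f(v) for v in uniq])[inv]

calls = 0
def slow_square(x):          # stand-in for an expensive pure function
    global calls
    calls += 1
    return x * x

arr = np.random.randint(0, 10, size=1000)
out = cached_apply(slow_square, arr)   # at most 10 calls for 1000 elements
```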

R dplyr::mutate with ifelse conditioned on a global variable recycles result from first row

我的梦境 submitted on 2020-02-25 06:10:20
Question: I am curious why an ifelse() statement within a call to dplyr::mutate() seems to apply only to the first row of my data frame. It returns a single value, which is then recycled down the entire column. Since the expressions evaluated in either branch of the ifelse() are only valid in the context of my data frame, I would expect the condition check and the resulting expression evaluations to be performed on the columns as a whole, not just their first elements. Here's an example: I have a variable