vectorization

Multiplying every element of one array by every element of another array

送分小仙女 submitted on 2020-03-20 06:35:48
Question: Say I have two arrays (with import numpy as np): x = np.array([1, 2, 3, 4]) and y = np.array([5, 6, 7, 8]). What's the fastest, most Pythonic way to get a new array z, with a number of elements equal to x.size * y.size, in which the elements are the products of every pair of elements (x_i, y_j) from the two input arrays? To rephrase, I'm looking for an array z in which z[k] is x[i] * y[j]. A simple but inefficient way to get this is as follows: z = np.empty(x.size * y.size); counter = 0
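One vectorized way to build z (a sketch of the standard approach, not taken from this thread's answers): np.outer forms all pairwise products at once, and ravel() flattens the result so that z[k] = x[i] * y[j] with k = i * y.size + j.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])

# np.outer computes every pairwise product x[i] * y[j] as a 4x4 matrix;
# ravel() flattens it into a 1-D array of x.size * y.size elements.
z = np.outer(x, y).ravel()

# Equivalent via broadcasting: (x[:, None] * y).ravel()
```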

How to use Vectorization with NumPy arrays to calculate geodesic distance using Geopy library for a large dataset?

核能气质少年 submitted on 2020-03-14 11:04:39
Question: I am trying to calculate geodesic distance from a dataframe that consists of four columns of latitude and longitude data, with around 3 million rows. I used the apply-lambda method, but it took 18 minutes to finish the task. Is there a way to use vectorization with NumPy arrays to speed up the calculation? Thank you for answering. My code using apply and lambda: from geopy import distance; df['geo_dist'] = df.apply(lambda x: distance.distance( (x['start_latitude'], x['start
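geopy's geodesic is computed point by point on an exact ellipsoid, which is why apply is slow here. A common speed-up is to trade it for a NumPy haversine, i.e. a spherical approximation (typically within about 0.5% of the geodesic result) that operates on whole columns at once. This is a sketch under that assumption; the dataframe column names in the usage comment are placeholders, since the question's snippet is truncated.

```python
import numpy as np

def haversine_np(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (spherical) distance in km; a fast approximation to
    geopy's ellipsoidal geodesic, applied to whole arrays at once."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))

# Hypothetical usage on the dataframe (end-point column names assumed):
# df['geo_dist'] = haversine_np(df['start_latitude'].to_numpy(),
#                               df['start_longitude'].to_numpy(),
#                               df['end_latitude'].to_numpy(),
#                               df['end_longitude'].to_numpy())
```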

Speeding up calculation of symmetric matrices; use of outer

ε祈祈猫儿з submitted on 2020-03-05 06:55:12
Question: I need to speed up a calculation that produces a symmetric matrix. Currently I have something like this: X <- 1:50; Y <- 1:50; M <- outer(X, Y, FUN = myfun), where myfun is a quite complicated, vectorized, but symmetric function (myfun(x, y) = myfun(y, x)). So my code unnecessarily wastes time calculating the lower triangular matrix as well as the upper triangular matrix. How can I avoid that duplication without using slow for-loops? Answer 1: If your function is slow and timing scales with size of
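The duplication can be avoided by evaluating the function only on upper-triangle index pairs and mirroring the result. The question is in R (where upper.tri() plays the same role); this sketch translates the idea into NumPy for illustration.

```python
import numpy as np

def symmetric_outer(vals, fn):
    """Like outer(vals, vals, fn) for a symmetric fn, but fn is evaluated
    only on the upper triangle (i <= j) and then mirrored."""
    n = len(vals)
    iu, ju = np.triu_indices(n)       # all index pairs with i <= j
    upper = fn(vals[iu], vals[ju])    # fn must accept array arguments
    M = np.empty((n, n))
    M[iu, ju] = upper
    M[ju, iu] = upper                 # mirror into the lower triangle
    return M
```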

Same operation with einsum and tensordot with Numpy

旧时模样 submitted on 2020-02-29 05:11:58
Question: Let's say I have two 3D arrays A and B of shapes (3, 4, N) and (4, 3, N). I can compute the dot product between slices along the third axis with with_einsum = np.einsum('ikl,kjl->ijl', A, B). Is it possible to perform the same operation with numpy.tensordot? Answer 1: With np.einsum('ikl,kjl->ijl', A, B), there is an axis-alignment requirement in the subscript string: l stays with the inputs and the output. As such, using np.tensordot might not necessarily result in a performance improvement, but since the
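For reference, the aligned l axis that tensordot cannot express (tensordot always sums over its contracted axes) can be handled with a batched matmul once l is moved to the front. A sketch verifying the equivalence with the question's einsum call:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
A = rng.random((3, 4, N))
B = rng.random((4, 3, N))

with_einsum = np.einsum('ikl,kjl->ijl', A, B)

# np.matmul broadcasts over leading axes, so moving the shared axis l to
# the front performs one (3,4) @ (4,3) product per slice, batch-wise.
with_matmul = np.matmul(A.transpose(2, 0, 1),
                        B.transpose(2, 0, 1)).transpose(1, 2, 0)

assert np.allclose(with_einsum, with_matmul)
```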

Efficiently replace elements in array based on dictionary - NumPy / Python

邮差的信 submitted on 2020-02-28 04:04:09
Question: First of all, my apologies if this has been answered elsewhere. All I could find were questions about replacing elements of a given value, not elements of multiple values. Background: I have several thousand large np.arrays, like so: # generate dummy data; input_array = np.zeros((100,100)); input_array[0:10,0:10] = 1; input_array[20:56, 21:43] = 5; input_array[34:43, 70:89] = 8. In those arrays, I want to replace values based on a dictionary: mapping = {1:2, 5:3, 8:6}. Approach: At this time, I am
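One fast way to do this (a sketch, assuming the values are small non-negative integers; note the int dtype added to the question's float zeros so values can be used as indices) is a lookup table indexed by old value, applied in a single fancy-indexing pass:

```python
import numpy as np

# Dummy data as in the question, cast to int for index-based lookup.
input_array = np.zeros((100, 100), dtype=int)
input_array[0:10, 0:10] = 1
input_array[20:56, 21:43] = 5
input_array[34:43, 70:89] = 8

mapping = {1: 2, 5: 3, 8: 6}

# Lookup table: lut[v] is the replacement for value v; values absent
# from the mapping (here 0) map to themselves.
lut = np.arange(input_array.max() + 1)
for old, new in mapping.items():
    lut[old] = new

output = lut[input_array]   # one vectorized pass replaces all values
```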

Vectorizing a “pure” function with numpy, assuming many duplicates

你。 submitted on 2020-02-26 10:32:07
Question: I want to apply a "black box" Python function f to a large array arr. Additional assumptions are: the function f is "pure", i.e. deterministic with no side effects; the array arr has a small number of unique elements. I can achieve this with a decorator that computes f for each unique element of arr as follows: import numpy as np; from time import sleep; from functools import wraps; N = 1000; np.random.seed(0); arr = np.random.randint(0, 10, size=(N, 2)); def vectorize_pure(f): @wraps(f) def f_vec(arr)
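The core trick can be sketched in one dimension with np.unique(..., return_inverse=True): evaluate f once per distinct value, then scatter the results back through the inverse index. The question's decorator generalizes this; slow_square below is a stand-in for the expensive black-box f.

```python
import numpy as np

def cached_apply(f, arr):
    """Evaluate f once per unique value in arr, then broadcast the
    results back to arr's shape via the inverse index from np.unique."""
    uniq, inv = np.unique(arr, return_inverse=True)
    return np.array([f(v) for v in uniq])[inv]

calls = 0
def slow_square(x):          # stand-in for an expensive pure function
    global calls
    calls += 1
    return x * x

arr = np.random.randint(0, 10, size=1000)
out = cached_apply(slow_square, arr)   # at most 10 calls for 1000 elements
```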

R dplyr::mutate with ifelse conditioned on a global variable recycles result from first row

我的梦境 submitted on 2020-02-25 06:10:20
Question: I am curious why an ifelse() statement within a call to dplyr::mutate() seems to apply only to the first row of my data frame. It returns a single value, which is then recycled down the entire column. Since the expressions evaluated in either branch of the ifelse() are only valid in the context of my data frame, I would expect the condition check and the resulting expression evaluations to be performed on the columns as a whole, not just their first elements. Here's an example: I have a variable