numpy-broadcasting

How to find the pairwise differences between rows of two very large matrices using numpy?

一笑奈何 submitted on 2019-11-27 15:50:39
Given two matrices, I want to compute the pairwise differences between all rows. Each matrix has 1000 rows and 100 columns, so they are fairly large. I tried both a for loop and pure broadcasting, but the for loop seems to be faster. Am I doing something wrong? Here is the code:
    from numpy import *
    import time  # needed for time.time(); missing in the original snippet
    A = random.randn(1000,100)
    B = random.randn(1000,100)
    # for loop over the rows of A
    start = time.time()
    for a in A:
        sum((a - B)**2,1)
    print(time.time() - start)
    # pure broadcasting
    start = time.time()
    ((A[:,newaxis,:] - B)**2).sum(-1)
    print(time.time() - start)
The broadcasting method takes about 1 second longer and it's
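One plausible reason the broadcast version is slower: `A[:,newaxis,:] - B` materializes an intermediate array of shape (1000, 1000, 100), which is 10^8 float64 values, or about 800 MB. The sketch below is one possible way around that, not necessarily what the answers to this question propose. It expands ||a - b||^2 as ||a||^2 + ||b||^2 - 2 a·b so that only (m, n) arrays are created. The function name `pairwise_sq_dists` and the smaller test sizes are assumptions made for illustration.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distance between every row of A and every row of B.

    Hypothetical helper: uses ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b,
    so no (m, n, k) intermediate array is built.
    """
    a2 = np.sum(A * A, axis=1)[:, np.newaxis]   # shape (m, 1)
    b2 = np.sum(B * B, axis=1)[np.newaxis, :]   # shape (1, n)
    d2 = a2 + b2 - 2.0 * (A @ B.T)              # shape (m, n)
    # Floating-point rounding can leave tiny negative values; clip them to zero.
    return np.maximum(d2, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100))   # smaller than in the question, to keep the check cheap
B = rng.standard_normal((300, 100))

reference = ((A[:, np.newaxis, :] - B) ** 2).sum(-1)   # the pure-broadcasting version
fast = pairwise_sq_dists(A, B)
print(np.allclose(fast, reference))
```

The trade-off is a small loss of precision when two rows are nearly identical, since the result is then a difference of large, almost equal numbers.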

Numpy array broadcasting rules

有些话、适合烂在心里 submitted on 2019-11-27 14:35:30
Question: I'm having some trouble understanding the rules for array broadcasting in NumPy. Obviously, if you perform element-wise multiplication on two arrays of the same dimensions and shape, everything is fine. Also, if you multiply a multi-dimensional array by a scalar, it works. This I understand. But if you have two N-dimensional arrays of different shapes, it's unclear to me exactly what the broadcasting rules are. This documentation/tutorial explains that: In order to broadcast, the size of the
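The rule can be sketched in a few lines. NumPy aligns the two shapes at their trailing dimensions, treats missing leading dimensions as size 1, and considers a pair of dimensions compatible when they are equal or one of them is 1. The helper `broadcast_shape` below is a hypothetical re-implementation written only to illustrate that rule; it is checked against NumPy's own `np.broadcast`.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape of two shapes, or raise ValueError.

    Illustrative helper, not a NumPy function.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same length.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"incompatible dimensions {da} and {db}")
    return tuple(result)

examples = [((8, 1, 6, 1), (7, 1, 5)), ((5, 3), (3,)), ((4, 1), (1, 6))]
for sa, sb in examples:
    expected = np.broadcast(np.empty(sa), np.empty(sb)).shape
    print(sa, sb, "->", broadcast_shape(sa, sb), expected)
```

Shapes such as (2,) and (50,) fail the check because neither trailing dimension is 1 and they differ.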

What are the rules for comparing numpy arrays using ==?

£可爱£侵袭症+ submitted on 2019-11-27 04:47:01
Question: For example, trying to make sense of these results:
    >>> x
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> (x == np.array([[1],[2]])).astype(np.float32)
    array([[ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
           [ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)
    >>> (x == np.array([1,2]))
    False
    >>> (x == np.array([[1]])).astype(np.float32)
    array([[ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)
    >>> (x == np.array([1])).astype(np.float32)
    array([ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
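The results above are consistent with `==` being an element-wise operation that follows the usual broadcasting rules. The sketch below reproduces the three compatible cases and prints the resulting shapes. The case `x == np.array([1,2])` is left out on purpose: shapes (10,) and (2,) cannot be broadcast, the excerpt shows a scalar `False` for it, and the behavior in that situation appears to depend on the NumPy version.

```python
import numpy as np

x = np.arange(10)                      # shape (10,)

col = np.array([[1], [2]])             # shape (2, 1)
eq_col = (x == col)                    # (10,) vs (2, 1) -> (2, 10)

eq_row = (x == np.array([[1]]))        # (10,) vs (1, 1) -> (1, 10)
eq_1d = (x == np.array([1]))           # (10,) vs (1,)   -> (10,)

print(eq_col.shape, eq_row.shape, eq_1d.shape)
print(eq_col.astype(np.float32))
```

Each row of `eq_col` compares all of `x` against one value from `col`, which is why row 0 marks the position of 1 and row 1 marks the position of 2.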

Numpy `ValueError: operands could not be broadcast together with shape …`

谁说我不能喝 submitted on 2019-11-26 22:00:06
Question: I'm using Python 2.7 and am attempting forecasting on some random data from 1.00000000 to 3.0000000008. There are approximately 196 items in my array, and I get the error ValueError: operands could not be broadcast together with shape (2) (50). I do not seem to be able to resolve this issue on my own. Any help or links to relevant documentation would be greatly appreciated. Here is the code I am using that generates this error:
    nsample = 50
    sig = 0.25
    x1 = np.linspace(0,20, nsample)
    X = np.c_[x1, np
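Because the code above is cut off, the exact cause cannot be recovered here. The sketch below only reproduces the same kind of error with two hypothetical arrays of lengths 2 and 50, and shows how adding an axis changes the outcome. Whether reshaping is the right fix depends on what the original computation was meant to do.

```python
import numpy as np

a = np.ones(2)     # hypothetical array of shape (2,)
b = np.ones(50)    # hypothetical array of shape (50,)

try:
    a + b          # trailing dimensions 2 and 50 differ and neither is 1
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

# Giving `a` a trailing axis of size 1 makes the shapes compatible:
# (2, 1) with (50,) broadcasts to (2, 50).
outer = a[:, np.newaxis] * b
print(outer.shape)
```

When debugging this error, printing `.shape` for every operand just before the failing line is usually the quickest way to find the mismatch.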

Subtracting numpy arrays of different shape efficiently

可紊 submitted on 2019-11-26 21:19:41
Question: Using the excellent broadcasting rules of numpy you can subtract a shape (3,) array v from a shape (5,3) array X with X - v. The result is a shape (5,3) array in which each row i is the difference X[i] - v. Is there a way to subtract a shape (n,3) array w from X so that each row of w is subtracted from the whole array X without explicitly using a loop?
Answer 1: You need to extend the dimensions of X with None/np.newaxis to form a 3D array and then do the subtraction by w. This would bring in
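A minimal sketch of what the answer describes, using small made-up arrays: inserting a new axis into X gives shape (5, 1, 3), which broadcasts against w of shape (n, 3) to a (5, n, 3) result where entry [i, j] is X[i] - w[j]. Putting the new axis on w instead gives the same numbers with the first two axes swapped.

```python
import numpy as np

X = np.arange(15.0).reshape(5, 3)          # shape (5, 3), example data
w = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 30.0]])         # shape (n, 3) with n = 2

# (5, 1, 3) - (2, 3) -> (5, 2, 3); diff[i, j] == X[i] - w[j]
diff = X[:, np.newaxis, :] - w

# Alternative layout: one (5, 3) block per row of w, shape (2, 5, 3)
diff_by_w = X - w[:, np.newaxis, :]

# Loop version, kept only to check the result
loop = np.array([[X[i] - w[j] for j in range(w.shape[0])]
                 for i in range(X.shape[0])])
print(diff.shape, diff_by_w.shape, np.array_equal(diff, loop))
```

Which layout is more convenient depends on whether later code iterates over rows of X or rows of w.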

NumPy Broadcasting: Calculating sum of squared differences between two arrays

纵饮孤独 submitted on 2019-11-26 19:08:33
I have the following code. It is taking forever in Python. There must be a way to translate this calculation into a broadcast...
    def euclidean_square(a,b):
        squares = np.zeros((a.shape[0],b.shape[0]))
        for i in range(squares.shape[0]):
            for j in range(squares.shape[1]):
                diff = a[i,:] - b[j,:]
                sqr = diff**2.0
                squares[i,j] = np.sum(sqr)
        return squares
You can use np.einsum after calculating the differences in a broadcasted way, like so:
    ab = a[:,None,:] - b
    out = np.einsum('ijk,ijk->ij',ab,ab)
Or use scipy's cdist with its optional metric argument set as 'sqeuclidean' to give us the squared
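To check that the einsum version matches the double loop, the two can be run side by side on small random inputs. The sketch below wraps the answer's two lines in a function; the names `euclidean_square_loop` and `euclidean_square_einsum` and the input sizes are assumptions for illustration.

```python
import numpy as np

def euclidean_square_loop(a, b):
    """Double-loop version from the question, kept as the reference."""
    squares = np.zeros((a.shape[0], b.shape[0]))
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            diff = a[i, :] - b[j, :]
            squares[i, j] = np.sum(diff ** 2.0)
    return squares

def euclidean_square_einsum(a, b):
    """Broadcast the differences, then sum the squares over the last axis."""
    ab = a[:, None, :] - b                      # shape (m, n, k)
    return np.einsum('ijk,ijk->ij', ab, ab)     # sum over k of ab*ab

rng = np.random.default_rng(1)
a = rng.standard_normal((20, 4))
b = rng.standard_normal((30, 4))

out_loop = euclidean_square_loop(a, b)
out_einsum = euclidean_square_einsum(a, b)
print(out_einsum.shape, np.allclose(out_loop, out_einsum))
```

Note that `ab` still holds m * n * k values, so for very large inputs memory can become the limiting factor.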

Numpy broadcasting to the 4th dimension: … vs. : vs None

微笑、不失礼 submitted on 2019-11-26 18:37:57
Question: In a Monte Carlo simulation I have the following 7 poker cards for 2 players and 3 different Monte Carlo runs.
self.cards:
    array([[[ 6., 12.], [ 1., 6.], [ 3., 3.], [ 8., 8.], [ 1., 1.], [ 4., 4.], [ 2., 2.]],
           [[ 6., 7.], [ 1., 1.], [ 3., 3.], [ 2., 2.], [ 12., 12.], [ 5., 5.], [ 10., 10.]],
           [[ 6., 3.], [ 1., 11.], [ 2., 2.], [ 6., 6.], [ 12., 12.], [ 6., 6.], [ 7., 7.]]])
The corresponding suits are:
self.suits
    array([[[ 2., 1.], [ 1., 2.], [ 2., 2.], [ 2., 2.], [ 1., 1.], [ 2., 2.], [ 2., 2.]]
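The question text is cut off, so the sketch below does not attempt to answer it. It only illustrates the three index forms named in the title on an array with the same (3, 7, 2) layout as `self.cards` (runs, cards, players): `...` stands for "all remaining axes", `:` for one full axis, and `None` inserts a new axis of length 1. The zero-filled `cards` array is a stand-in.

```python
import numpy as np

cards = np.zeros((3, 7, 2))            # stand-in with the layout (runs, cards, players)

a = cards[..., None]                   # new axis at the end      -> (3, 7, 2, 1)
b = cards[:, :, :, None]               # same result, spelled out -> (3, 7, 2, 1)
c = cards[:, None]                     # new axis after axis 0    -> (3, 1, 7, 2)
d = cards[None]                        # new axis at the front    -> (1, 3, 7, 2)
e = cards[..., 0]                      # first player only        -> (3, 7)

for name, arr in [("a", a), ("b", b), ("c", c), ("d", d), ("e", e)]:
    print(name, arr.shape)
```

A 4-dimensional result such as `a` can then be broadcast against an array of shape (k,) to compare every card with k candidate values at once.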

Numpy - create matrix with rows of vector

99封情书 submitted on 2019-11-26 18:00:53
I have a vector [x,y,z,q] and I want to create a matrix: [[x,y,z,q], [x,y,z,q], [x,y,z,q], ... [x,y,z,q]] with m rows. I think this could be done in some smart way, using broadcasting, but I can only think of doing it with a for loop.
Certainly possible with broadcasting, by adding the vector to an (m, 1) array of zeros, like so:
    np.zeros((m,1),dtype=vector.dtype) + vector
Now, NumPy already has a built-in function np.tile for exactly that same task:
    np.tile(vector,(m,1))
Sample run:
    In [496]: vector
    Out[496]: array([4, 5, 8, 2])
    In [497]: m = 5
    In [498]: np.zeros((m,1),dtype=vector.dtype) +
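The two approaches from the answer can be compared directly, using the same `vector` and `m` as in the sample run. The sketch also adds `np.broadcast_to`, which is not mentioned in the excerpt: it returns a read-only view with m rows without copying the data, which may be enough when the matrix is only read.

```python
import numpy as np

vector = np.array([4, 5, 8, 2])
m = 5

by_add = np.zeros((m, 1), dtype=vector.dtype) + vector   # (m, 1) + (4,) -> (m, 4)
by_tile = np.tile(vector, (m, 1))                         # repeat the row m times
by_view = np.broadcast_to(vector, (m, vector.size))       # read-only view, no copy

print(by_add)
print(np.array_equal(by_add, by_tile), np.array_equal(by_tile, by_view))
print(by_view.flags.writeable)
```

If the rows need to be modified independently later, `by_add` or `by_tile` (or `by_view.copy()`) is the safer choice.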

Vectorized NumPy linspace for multiple start and stop values

流过昼夜 submitted on 2019-11-26 17:51:41
I need to create a 2D array where each row may start and end with a different number. Assume that the first and last element of each row are given and all other elements are interpolated according to the length of the rows. In a simple case, let's say I want to create a 3x3 array with the same start at 0 but different ends given by W below:
    array([[ 0., 1., 2.],
           [ 0., 2., 4.],
           [ 0., 3., 6.]])
Is there a better way to do this than the following:
    D=np.ones((3,3))*np.arange(0,3)
    D=D/D[:,-1]
    W=np.array([2,4,6]) # last element of each row assumed given
    Res= (D.T*W).T
Here's an approach using broadcasting -
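The answer's own code is cut off above, so the following is only a sketch of one broadcasting approach and may differ from what the answer proposes. It builds a shared interpolation parameter t in [0, 1] and scales it per row using column-shaped start and stop arrays. The function name `linspace_rows` is hypothetical.

```python
import numpy as np

def linspace_rows(start, stop, num):
    """Row i runs from start[i] to stop[i] in `num` evenly spaced steps.

    Illustrative helper, not a NumPy function.
    """
    start = np.asarray(start, dtype=float)[:, np.newaxis]   # shape (rows, 1)
    stop = np.asarray(stop, dtype=float)[:, np.newaxis]     # shape (rows, 1)
    t = np.linspace(0.0, 1.0, num)                           # shape (num,)
    return start + (stop - start) * t                        # shape (rows, num)

W = np.array([2, 4, 6])                  # last element of each row, as in the question
res = linspace_rows(np.zeros(3), W, 3)
print(res)
```

Recent NumPy releases may also accept array-valued `start` and `stop` in `np.linspace` directly, which would make a helper like this unnecessary.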

How to find the pairwise differences between rows of two very large matrices using numpy?

谁都会走 submitted on 2019-11-26 17:18:47
Question: Given two matrices, I want to compute the pairwise differences between all rows. Each matrix has 1000 rows and 100 columns, so they are fairly large. I tried both a for loop and pure broadcasting, but the for loop seems to be faster. Am I doing something wrong? Here is the code:
    from numpy import *
    import time  # needed for time.time(); missing in the original snippet
    A = random.randn(1000,100)
    B = random.randn(1000,100)
    # for loop over the rows of A
    start = time.time()
    for a in A:
        sum((a - B)**2,1)
    print(time.time() - start)
    # pure broadcasting
    start = time.time()
    ((A[:,newaxis,:] - B)