Numpy: find the euclidean distance between two 3-D arrays

北战南征 提交于 2019-12-22 08:47:07

问题


Given, two 3-D arrays of dimensions (2,2,2):

A = [[[ 0,  0],
    [92, 92]],

   [[ 0, 92],
    [ 0, 92]]]

B = [[[ 0,  0],
    [92,  0]],

   [[ 0, 92],
    [92, 92]]]

How do you find the Euclidean distance for each vector in A and B efficiently?

I have tried for-loops but these are slow, and I'm working with 3-D arrays in the order of (>>2, >>2, 2).

Ultimately I want a matrix of the form:

C = [[d1, d2],
     [d3, d4]]

Edit:

I've tried the following loop, but the biggest issue with it is that loses the dimensions I want to keep. But the distances are correct.

[numpy.sqrt((A[row, col][0] - B[row, col][0])**2 + (B[row, col][1] -A[row, col][1])**2) for row in range(2) for col in range(2)]

回答1:


Thinking in a NumPy vectorized way that would be performing element-wise differentiation, squaring and summing along the last axis and finally getting square root. So, the straight-forward implementation would be -

np.sqrt(((A - B)**2).sum(-1))

We could perform the squaring and summing along the last axis in one go with np.einsum and thus make it more efficient, like so -

subs = A - B
out = np.sqrt(np.einsum('ijk,ijk->ij',subs,subs))

Another alternative with numexpr module -

import numexpr as ne
np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

Since, we are working with a length of 2 along the last axis, we could just slice those and feed it to evaluate method. Please note that slicing isn't possible inside the evaluate string. So, the modified implementation would be -

a0 = A[...,0]
a1 = A[...,1]
b0 = B[...,0]
b1 = B[...,1]
out = ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')

Runtime test

Function definitions -

def sqrt_sum_sq_based(A,B):
    return np.sqrt(((A - B)**2).sum(-1))

def einsum_based(A,B):
    subs = A - B
    return np.sqrt(np.einsum('ijk,ijk->ij',subs,subs))

def numexpr_based(A,B):
    return np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

def numexpr_based_with_slicing(A,B):
    a0 = A[...,0]
    a1 = A[...,1]
    b0 = B[...,0]
    b1 = B[...,1]
    return ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')

Timings -

In [288]: # Setup input arrays
     ...: dim = 2
     ...: N = 1000
     ...: A = np.random.rand(N,N,dim)
     ...: B = np.random.rand(N,N,dim)
     ...: 

In [289]: %timeit sqrt_sum_sq_based(A,B)
10 loops, best of 3: 40.9 ms per loop

In [290]: %timeit einsum_based(A,B)
10 loops, best of 3: 22.9 ms per loop

In [291]: %timeit numexpr_based(A,B)
10 loops, best of 3: 18.7 ms per loop

In [292]: %timeit numexpr_based_with_slicing(A,B)
100 loops, best of 3: 8.23 ms per loop

In [293]: %timeit np.linalg.norm(A-B, axis=-1) #@dnalow's soln
10 loops, best of 3: 45 ms per loop



回答2:


for completeness:

np.linalg.norm(A-B, axis=-1)


来源:https://stackoverflow.com/questions/40319433/numpy-find-the-euclidean-distance-between-two-3-d-arrays

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!