Autocorrelation of a multidimensional array in numpy

Anonymous (unverified), submitted on 2019-12-03 01:27:01

Question:

I have a two-dimensional array, i.e. an array of sequences that are themselves arrays. I would like to calculate the autocorrelation of each sequence, so that a (5, 4) array gives 5 results, i.e. an array of shape (5, 7).

I know I could just loop over the first dimension, but that's slow and my last resort. Is there another way?
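For reference, the loop mentioned above could look like the following minimal sketch. The function name autocorr_loop is made up for illustration; it relies on numpy.correlate with mode="full", which returns all 2 * length - 1 lags and is where the (5, 7) shape comes from.

```python
import numpy as np

def autocorr_loop(x):
    """Row-by-row linear autocorrelation (hypothetical slow baseline)."""
    x = np.asarray(x)
    # mode="full" returns every lag from -(length - 1) to +(length - 1)
    return np.array([np.correlate(row, row, mode="full") for row in x])

data = np.arange(5 * 4).reshape(5, 4)
result = autocorr_loop(data)
print(result.shape)  # (5, 7)
# first row, lags -3..3: 0, 3, 8, 14, 8, 3, 0 (zero lag in the middle)
print(result[0])
```

Any faster approach can be checked against this baseline.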

Thanks!

EDIT:

Based on the chosen answer plus the comment from mtrw, I have the following function:

import numpy as np
from numpy.fft import fft, ifft, fftshift

def xcorr(x):
    """FFT based autocorrelation function, which is faster than numpy.correlate"""
    # x is supposed to be an array of sequences, of shape (totalelements, length)
    length = x.shape[1]
    fftx = fft(x, n=(length*2-1), axis=1)
    ret = ifft(fftx * np.conjugate(fftx), axis=1)
    ret = fftshift(ret, axes=1)
    return ret

In my own code length is a global variable; the version above reads it from x.shape[1] instead, so the function is self-contained. I also didn't restrict the result to real numbers, since I need to take complex numbers into account as well.
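As a quick sanity check, the FFT approach can be compared against numpy.correlate on complex input. This is only a sketch: xcorr_rows is a hypothetical name for a self-contained copy of the function above, and the random test data is an assumption.

```python
import numpy as np
from numpy.fft import fft, ifft, fftshift

def xcorr_rows(x):
    """Self-contained FFT autocorrelation of each row (illustrative copy)."""
    x = np.asarray(x)
    length = x.shape[1]
    fftx = fft(x, n=2 * length - 1, axis=1)
    # fftshift moves the zero lag to the middle column
    return fftshift(ifft(fftx * np.conjugate(fftx), axis=1), axes=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))

# reference: direct correlation, one row at a time
reference = np.array([np.correlate(row, row, mode="full") for row in x])
matches = np.allclose(xcorr_rows(x), reference)
print(matches)
```

Both versions order the lags from -(length - 1) to +(length - 1), so the arrays can be compared element by element.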

Answer 1:

Using FFT-based autocorrelation:

import numpy
from numpy.fft import fft, ifft

data = numpy.arange(5*4).reshape(5, 4)
print(data)
##[[ 0  1  2  3]
## [ 4  5  6  7]
## [ 8  9 10 11]
## [12 13 14 15]
## [16 17 18 19]]
# circular (wrap-around) autocorrelation of each row
dataFT = fft(data, axis=1)
dataAC = ifft(dataFT * numpy.conjugate(dataFT), axis=1).real
print(dataAC)
##[[   14.     8.     6.     8.]
## [  126.   120.   118.   120.]
## [  366.   360.   358.   360.]
## [  734.   728.   726.   728.]
## [ 1230.  1224.  1222.  1224.]]

I'm a little confused by your statement about the answer having dimension (5, 7), so maybe there's something important I'm not understanding.

EDIT: At the suggestion of mtrw, a padded version that doesn't wrap around:

import numpy
from numpy.fft import fft, ifft

data = numpy.arange(5*4).reshape(5, 4)
# pad each row with length - 1 = 3 zeros so the result has 2*4 - 1 = 7 lags
padding = numpy.zeros((5, 3))
dataPadded = numpy.concatenate((data, padding), axis=1)
print(dataPadded)
##[[  0.   1.   2.   3.   0.   0.   0.]
## [  4.   5.   6.   7.   0.   0.   0.]
## [  8.   9.  10.  11.   0.   0.   0.]
## [ 12.  13.  14.  15.   0.   0.   0.]
## [ 16.  17.  18.  19.   0.   0.   0.]]
dataFT = fft(dataPadded, axis=1)
dataAC = ifft(dataFT * numpy.conjugate(dataFT), axis=1).real
# columns are lags 0, 1, 2, 3, -3, -2, -1
print(numpy.round(dataAC, 10))
##[[   14.     8.     3.     0.     0.     3.     8.]
## [  126.    92.    59.    28.    28.    59.    92.]
## [  366.   272.   179.    88.    88.   179.   272.]
## [  734.   548.   363.   180.   180.   363.   548.]
## [ 1230.   920.   611.   304.   304.   611.   920.]]

There must be a more efficient way to do this, especially because autocorrelation is symmetric and I don't take advantage of that.
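One way to exploit the symmetry for real-valued input might be numpy.fft.rfft and numpy.fft.irfft, which only compute half of the spectrum. The sketch below is an assumption about what "more efficient" could mean here, not a benchmarked result; autocorr_real is a made-up name, and only the non-negative lags are returned because the negative ones mirror them.

```python
import numpy as np

def autocorr_real(x):
    """Linear autocorrelation of each row for real input, lags 0..length-1."""
    x = np.asarray(x, dtype=float)
    length = x.shape[1]
    n = 2 * length - 1  # zero-padded length that avoids wrap-around
    spectrum = np.fft.rfft(x, n=n, axis=1)
    # power spectrum |X|^2 is real for any input
    power = spectrum.real ** 2 + spectrum.imag ** 2
    ac = np.fft.irfft(power, n=n, axis=1)
    # keep the non-negative lags; negative lags are their mirror image
    return ac[:, :length]

data = np.arange(5 * 4).reshape(5, 4)
print(np.round(autocorr_real(data), 10))
```

The first row should agree with the padded result above at lags 0 to 3, i.e. 14, 8, 3, 0.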



Answer 2:

For really large arrays it becomes important to have n = 2 ** p, where p is an integer. This will save you huge amounts of time. For example:

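import numpy as np
from numpy.fft import fft, ifft, fftshift

def xcorr(x):
    length = x.shape[1]
    # largest power of two not exceeding 2*length - 1 (rounds down)
    l = 2 ** int(np.log2(length * 2 - 1))
    fftx = fft(x, n = l, axis = 1)
    ret = ifft(fftx * np.conjugate(fftx), axis = 1)
    ret = fftshift(ret, axes=1)
    return ret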

This can introduce wrap-around errors, because the FFT length is rounded down and may be smaller than 2 * length - 1. For large arrays the autocorrelation should be insignificant near the edges, though.
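If the wrap-around matters, one alternative is to round the FFT length up to the next power of two instead of down, then cut the result back to 2 * length - 1 lags. This is a sketch under that assumption; xcorr_pow2 is a hypothetical name and not part of the answer above.

```python
import numpy as np
from numpy.fft import fft, ifft

def xcorr_pow2(x):
    """Row-wise autocorrelation with a power-of-two FFT and no wrap-around."""
    x = np.asarray(x)
    length = x.shape[1]
    n_lags = 2 * length - 1
    # smallest power of two that is >= n_lags
    n = 1 << (n_lags - 1).bit_length()
    fftx = fft(x, n=n, axis=1)
    ac = ifft(fftx * np.conjugate(fftx), axis=1)
    # negative lags sit at the end of the circular result, non-negative
    # lags at the start; stitch them into the order -(length-1)..(length-1)
    return np.concatenate((ac[:, n - (length - 1):], ac[:, :length]), axis=1)

data = np.arange(5 * 4).reshape(5, 4)
print(xcorr_pow2(data).real.round(10)[0])
```

For a length of 4 this uses an FFT of size 8 rather than 4. The transform is a little larger, but the output matches the linear autocorrelation exactly.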



Answer 3:

Maybe it's just a preference, but I wanted to follow from the definition. I personally find it a bit easier to follow that way. This is my implementation for an arbitrary nd array.

from itertools import product
from numpy import asarray, empty, roll

def autocorrelate(x):
    """
    Compute the multidimensional autocorrelation of an nd array.
    input: an nd array of floats
    output: an nd array of autocorrelations
    """

    # used for transposes
    t = roll(range(x.ndim), 1)

    # pairs of indexes
    # the first is for the autocorrelation array
    # the second is the shift
    ii = [list(enumerate(range(1, s - 1))) for s in x.shape]

    # initialize the resulting autocorrelation array
    acor = empty(shape=[len(s0) for s0 in ii])

    # iterate over all combinations of directional shifts
    for i in product(*ii):
        # extract the indexes for
        # the autocorrelation array
        # and original array respectively
        i1, i2 = asarray(i).T

        x1 = x.copy()
        x2 = x.copy()

        for i0 in i2:
            # clip the unshifted array at the end
            x1 = x1[:-i0]
            # and the shifted array at the beginning
            x2 = x2[i0:]

            # prepare to do the same for
            # the next axis
            x1 = x1.transpose(t)
            x2 = x2.transpose(t)

        # normalize shifted and unshifted arrays
        x1 -= x1.mean()
        x1 /= x1.std()
        x2 -= x2.mean()
        x2 /= x2.std()

        # compute the autocorrelation directly
        # from the definition
        acor[tuple(i1)] = (x1 * x2).mean()

    return acor
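In one dimension, the definition used above amounts to the Pearson correlation between the clipped and the shifted copy of the sequence. The following sketch illustrates this for a single lag; lag_correlation is a made-up helper, the random signal is an assumption, and the lag k must be at least 1.

```python
import numpy as np

def lag_correlation(x, k):
    """Normalized autocorrelation of a 1-D sequence at lag k >= 1."""
    x = np.asarray(x, dtype=float)
    a = x[:-k]  # unshifted copy, clipped at the end
    b = x[k:]   # shifted copy, clipped at the beginning
    # normalize each copy to zero mean and unit standard deviation
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return (a * b).mean()

rng = np.random.default_rng(1)
signal = rng.standard_normal(50)

value = lag_correlation(signal, 3)
# Pearson correlation of the same two slices, for comparison
expected = np.corrcoef(signal[:-3], signal[3:])[0, 1]
print(np.isclose(value, expected))
```

Because each copy is normalized separately, the values lie between -1 and 1, unlike the unnormalized FFT-based results in the other answers.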


