Question:
I have a two dimensional array, i.e. an array of sequences which are also arrays. For each sequence I would like to calculate the autocorrelation, so that for a (5,4) array, I would get 5 results, or an array of dimension (5,7).
I know I could just loop over the first dimension, but that's slow and my last resort. Is there another way?
Thanks!
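For reference, the loop-based baseline mentioned above can be sketched as follows. It assumes `numpy.correlate` with `mode="full"`, which returns all `2 * length - 1` lags per row; that is where the (5, 7) shape for a (5, 4) input comes from. The name `autocorr_loop` is only illustrative.

```python
import numpy as np


def autocorr_loop(x):
    """Row-by-row linear autocorrelation using numpy.correlate (slow baseline)."""
    x = np.asarray(x)
    # mode="full" returns every lag from -(length - 1) to +(length - 1),
    # i.e. 2 * length - 1 values per row
    return np.array([np.correlate(row, row, mode="full") for row in x])


data = np.arange(5 * 4).reshape(5, 4)
result = autocorr_loop(data)
print(result.shape)  # (5, 7)
print(result[0])     # [ 0  3  8 14  8  3  0]
```

The zero-lag value sits in the middle column (index `length - 1`), with negative lags to its left and positive lags to its right.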
EDIT:
Based on the chosen answer plus the comment from mtrw, I have the following function:
    import numpy as np
    from numpy.fft import fft, ifft, fftshift

    def xcorr(x):
        """FFT based autocorrelation function, which is faster than numpy.correlate"""
        # x is supposed to be an array of sequences, of shape (totalelements, length)
        fftx = fft(x, n=(length*2-1), axis=1)
        ret = ifft(fftx * np.conjugate(fftx), axis=1)
        ret = fftshift(ret, axes=1)
        return ret
Note that length is a global variable in my code, so be sure to declare it. I also didn't restrict the result to real numbers, since I need to take into account complex numbers as well.
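If relying on a global is inconvenient, a variant that reads the sequence length from the input's shape might look like the sketch below. The name `xcorr_rows` is made up for this example, and it assumes `x` is a 2-D array of shape (totalelements, length). It keeps the complex result, as above.

```python
import numpy as np
from numpy.fft import fft, ifft, fftshift


def xcorr_rows(x):
    """FFT-based autocorrelation of each row of a 2-D array."""
    x = np.asarray(x)
    length = x.shape[1]   # taken from the input instead of a global
    n = 2 * length - 1    # zero-pad so the result does not wrap around
    fftx = fft(x, n=n, axis=1)
    ret = ifft(fftx * np.conjugate(fftx), axis=1)
    # move the zero-lag term from column 0 to the middle column
    return fftshift(ret, axes=1)


data = np.arange(5 * 4).reshape(5, 4)
out = xcorr_rows(data)
print(out.shape)  # (5, 7)
```

Up to floating-point error, each row should agree with `np.correlate(row, row, mode="full")`, including for complex input.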
Answer 1:
Using FFT-based autocorrelation:
    import numpy
    from numpy.fft import fft, ifft

    data = numpy.arange(5*4).reshape(5, 4)
    print(data)
    ##[[ 0  1  2  3]
    ## [ 4  5  6  7]
    ## [ 8  9 10 11]
    ## [12 13 14 15]
    ## [16 17 18 19]]
    dataFT = fft(data, axis=1)
    dataAC = ifft(dataFT * numpy.conjugate(dataFT), axis=1).real
    print(dataAC)
    ##[[   14.     8.     6.     8.]
    ## [  126.   120.   118.   120.]
    ## [  366.   360.   358.   360.]
    ## [  734.   728.   726.   728.]
    ## [ 1230.  1224.  1222.  1224.]]
I'm a little confused by your statement about the answer having dimension (5, 7), so maybe there's something important I'm not understanding.
EDIT: At the suggestion of mtrw, a padded version that doesn't wrap around:
    import numpy
    from numpy.fft import fft, ifft

    data = numpy.arange(5*4).reshape(5, 4)
    # pad each row with length - 1 = 3 zeros, giving 2 * length - 1 = 7 columns
    padding = numpy.zeros((5, 3))
    dataPadded = numpy.concatenate((data, padding), axis=1)
    print(dataPadded)
    ##[[  0.   1.   2.   3.   0.   0.   0.]
    ## [  4.   5.   6.   7.   0.   0.   0.]
    ## [  8.   9.  10.  11.   0.   0.   0.]
    ## [ 12.  13.  14.  15.   0.   0.   0.]
    ## [ 16.  17.  18.  19.   0.   0.   0.]]
    dataFT = fft(dataPadded, axis=1)
    dataAC = ifft(dataFT * numpy.conjugate(dataFT), axis=1).real
    # columns 0..3 are lags 0..3; columns 4..6 are lags -3..-1
    # (slice with [:, :4] to keep only the non-negative lags)
    print(numpy.round(dataAC, 10))
    ##[[   14.     8.     3.     0.     0.     3.     8.]
    ## [  126.    92.    59.    28.    28.    59.    92.]
    ## [  366.   272.   179.    88.    88.   179.   272.]
    ## [  734.   548.   363.   180.   180.   363.   548.]
    ## [ 1230.   920.   611.   304.   304.   611.   920.]]
There must be a more efficient way to do this, especially because autocorrelation is symmetric and I don't take advantage of that.
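One possible way to exploit that symmetry for real-valued input is to use `numpy.fft.rfft` and `numpy.fft.irfft`, which only compute the non-redundant half of the spectrum. The following is a sketch under that assumption (real input only, so it does not cover the complex case raised in the question); `autocorr_real` is an illustrative name.

```python
import numpy as np


def autocorr_real(x):
    """Non-negative lags of the linear autocorrelation of each row (real input)."""
    x = np.asarray(x, dtype=float)
    length = x.shape[1]
    n = 2 * length - 1                      # padded length, avoids wrap-around
    spectrum = np.fft.rfft(x, n=n, axis=1)  # half spectrum of the padded rows
    power = spectrum * np.conjugate(spectrum)
    # n must be passed explicitly because the padded length is odd
    ac = np.fft.irfft(power, n=n, axis=1)
    return ac[:, :length]                   # lags 0 .. length - 1


data = np.arange(5 * 4).reshape(5, 4)
half = autocorr_real(data)
# mirror the positive lags to rebuild all 2 * length - 1 lags if needed
full = np.concatenate((half[:, :0:-1], half), axis=1)
print(half.shape, full.shape)  # (5, 4) (5, 7)
```

Since the autocorrelation of a real sequence is even in the lag, `half` already carries all the information; `full` is built only to match the (5, 7) layout asked for in the question.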
Answer 2:
For really large arrays it becomes important to have n = 2 ** p, where p is an integer. This will save you huge amounts of time. For example:
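    import numpy as np
    from numpy.fft import fft, ifft, fftshift

    def xcorr(x):
        # FFT length: the largest power of two not exceeding 2 * length - 1
        l = 2 ** int(np.log2(length * 2 - 1))
        fftx = fft(x, n = l, axis = 1)
        ret = ifft(fftx * np.conjugate(fftx), axis = 1)
        ret = fftshift(ret, axes=1)
        return ret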
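This might give you wrap-around errors, because the FFT length is rounded down to a power of two and is therefore shorter than 2 * length - 1. For large arrays the autocorrelation should be insignificant near the edges, though.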
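If the wrap-around is not acceptable, one alternative is to round the FFT length up to the next power of two instead of down, and then keep only the valid lags. This is a sketch of that idea, not the code from the answer; `xcorr_pow2` is a hypothetical name and the input is assumed to be 2-D.

```python
import numpy as np
from numpy.fft import fft, ifft


def xcorr_pow2(x):
    """Row-wise autocorrelation with a power-of-two FFT length and no wrap-around."""
    x = np.asarray(x)
    length = x.shape[1]
    n_lags = 2 * length - 1
    # smallest power of two that is >= n_lags
    n_fft = 1 << (n_lags - 1).bit_length()
    fftx = fft(x, n=n_fft, axis=1)
    ac = ifft(fftx * np.conjugate(fftx), axis=1)
    # lags 0 .. length-1 are at the start, lags -(length-1) .. -1 at the end;
    # the columns in between hold only zeros from the extra padding
    negative = ac[:, n_fft - (length - 1):]
    positive = ac[:, :length]
    return np.concatenate((negative, positive), axis=1)


data = np.arange(5 * 4).reshape(5, 4)
out = xcorr_pow2(data)
print(out.shape)  # (5, 7)
```

The cost is a somewhat longer transform (at most about twice `2 * length - 1`), in exchange for output that should match the unpadded-to-power-of-two version from the question up to floating-point error.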
Answer 3:
Maybe it's just a preference, but I wanted to follow from the definition. I personally find it a bit easier to follow that way. This is my implementation for an arbitrary nd array.
    from itertools import product
    from numpy import asarray, empty, roll


    def autocorrelate(x):
        """
        Compute the multidimensional autocorrelation of an nd array.
        input: an nd array of floats
        output: an nd array of autocorrelations
        """
        # used for transposes: a cyclic permutation that brings
        # the next axis to the front
        t = roll(range(x.ndim), -1)

        # pairs of indexes
        # the first is for the autocorrelation array
        # the second is the shift
        ii = [list(enumerate(range(1, s - 1))) for s in x.shape]

        # initialize the resulting autocorrelation array
        acor = empty(shape=[len(s0) for s0 in ii])

        # iterate over all combinations of directional shifts
        for i in product(*ii):
            # extract the indexes for
            # the autocorrelation array
            # and original array respectively
            i1, i2 = asarray(i).T

            x1 = x.copy()
            x2 = x.copy()

            for i0 in i2:
                # clip the unshifted array at the end
                x1 = x1[:-i0]
                # and the shifted array at the beginning
                x2 = x2[i0:]

                # prepare to do the same for
                # the next axis
                x1 = x1.transpose(t)
                x2 = x2.transpose(t)

            # normalize shifted and unshifted arrays
            x1 -= x1.mean()
            x1 /= x1.std()
            x2 -= x2.mean()
            x2 /= x2.std()

            # compute the autocorrelation directly
            # from the definition
            acor[tuple(i1)] = (x1 * x2).mean()

        return acor