Fast interpolation over 3D array

Question

I have a 3D array that I need to interpolate along one axis (the last dimension). Say y.shape = (nx, ny, nz); I want to interpolate along nz for every (i, j) pair. However, the value I interpolate to is different for each [i, j].

Here's some code to illustrate. If I wanted to interpolate to a single value, say new_z, I'd use scipy.interpolate.interp1d like this:

# y is a 3D ndarray
# x is a 1D ndarray with the abscissa values
# new_z is a number
f = scipy.interpolate.interp1d(x, y, axis=-1, kind='linear')
result = f(new_z)

However, for this problem what I actually want is to interpolate to a different new_z for each y[i, j]. So I do this:

# y is a 3D ndarray
# x is a 1D ndarray with the abscissa values
# new_z is a 2D array
result = numpy.empty(y.shape[:-1])
for i in range(nx):
    for j in range(ny):
        f = scipy.interpolate.interp1d(x, y[i, j], axis=-1, kind='linear')
        result[i, j] = f(new_z[i, j])

Unfortunately, with multiple loops this becomes inefficient and slow. Is there a better way to do this kind of interpolation? Linear interpolation is sufficient. A possibility is to implement this in Cython, but I was trying to avoid that because I want to have the flexibility of changing to cubic interpolation and don't want to do it by hand in Cython.

Answer 1:

To speed up higher-order interpolation, you can call interp1d() only once, then use its _spline attribute with the low-level function _bspleval() from the _fitpack module. Both are private SciPy internals, so this may break in other SciPy versions. Here is the code:

from scipy.interpolate import interp1d
import numpy as np

nx, ny, nz = 30, 40, 50
x = np.arange(0, nz, 1.0)
y = np.random.randn(nx, ny, nz)
new_x = np.random.randint(1, (nz-1)*10 + 1, size=(nx, ny))/10.0

def original_interpolation(x, y, new_x):
    result = np.empty(y.shape[:-1])
    for i in range(nx):
        for j in range(ny):
            f = interp1d(x, y[i, j], axis=-1, kind=3)
            result[i, j] = f(new_x[i, j])
    return result

def fast_interpolation(x, y, new_x):
    from scipy.interpolate._fitpack import _bspleval
    f = interp1d(x, y, axis=-1, kind=3)
    xj,cvals,k = f._spline
    result = np.empty_like(new_x)
    for (i, j), value in np.ndenumerate(new_x):
        result[i, j] = _bspleval(value, x, cvals[:, i, j], k, 0)
    return result

r1 = original_interpolation(x, y, new_x)
r2 = fast_interpolation(x, y, new_x)

>>> np.allclose(r1, r2)
True

%timeit original_interpolation(x, y, new_x)
1 loops, best of 3: 3.78 s per loop

%timeit fast_interpolation(x, y, new_x)
100 loops, best of 3: 15.4 ms per loop
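
The same "fit once, evaluate per (i, j)" idea can be sketched with SciPy's public spline API instead of the private helpers. This is only a rough sketch, assuming a SciPy recent enough to provide make_interp_spline and BSpline; the function name spline_interpolation is made up for illustration, and the loop over (i, j) still runs in Python, so it is not a claim about speed.

```python
import numpy as np
from scipy.interpolate import BSpline, make_interp_spline

def spline_interpolation(x, y, new_x, k=3):
    """Fit one spline along the last axis, then evaluate it at new_x[i, j]."""
    spl = make_interp_spline(x, y, k=k, axis=-1)
    # Assumption: spl.c has shape (n_coefficients, nx, ny), i.e. the
    # interpolation axis is moved to the front of the coefficient array.
    t, c = spl.t, spl.c
    result = np.empty(new_x.shape)
    for (i, j), value in np.ndenumerate(new_x):
        # Rebuild a 1D spline from the coefficients belonging to (i, j).
        result[i, j] = BSpline(t, c[:, i, j], k)(value)
    return result

rng = np.random.default_rng(0)
nx, ny, nz = 3, 4, 20
x = np.arange(0, nz, 1.0)
y = rng.standard_normal((nx, ny, nz))
new_x = rng.uniform(x[0], x[-1], size=(nx, ny))
r = spline_interpolation(x, y, new_x)
```

The fit happens once for the whole array; only the cheap evaluation is repeated per element.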

Answer 2:

I don't think interp1d has a method for doing this fast, so you can't avoid the loop here.

You can probably still avoid Cython by coding up the linear interpolation with np.searchsorted, something like this (not tested):

def interp3d(x, y, new_x):
    assert x.ndim == 1 and y.ndim == 3 and new_x.ndim == 2
    assert y.shape[:2] == new_x.shape and x.shape == y.shape[2:]

    nx, ny = y.shape[:2]
    new_x = new_x.ravel()
    j = np.arange(len(new_x))
    k = np.searchsorted(x, new_x).clip(1, len(x) - 1)
    y = y.reshape(-1, x.shape[0])
    p = (new_x - x[k-1]) / (x[k] - x[k-1])
    result = (1 - p) * y[j,k-1] + p * y[j,k]
    return result.reshape(nx, ny)

Doesn't help with cubic interpolation, though.

EDIT: made it a function and fixed off-by-one errors. Some timings vs. Cython (500x500x500 grid):

In [58]: %timeit interp3d(x, y, new_x)
10 loops, best of 3: 82.7 ms per loop

In [59]: %timeit cyfile.interp3d(x, y, new_x)
10 loops, best of 3: 86.3 ms per loop

In [60]: abs(interp3d(x, y, new_x) - cyfile.interp3d(x, y, new_x)).max()
Out[60]: 2.2204460492503131e-16

Though, one can argue that the Cython code is easier to read.
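
For readers on a newer NumPy, the same searchsorted idea can be written without flattening, using np.take_along_axis to pick the two bracketing samples. This is a sketch rather than a drop-in replacement: the name interp_last_axis is hypothetical, x is assumed strictly increasing, and values outside [x[0], x[-1]] are extrapolated from the end intervals.

```python
import numpy as np

def interp_last_axis(x, y, new_x):
    """Linearly interpolate y[i, j, :] at new_x[i, j] for every (i, j)."""
    # Index of the right-hand bracketing knot, clipped so k-1 and k are valid.
    k = np.searchsorted(x, new_x).clip(1, len(x) - 1)
    # Fractional position of new_x between x[k-1] and x[k].
    p = (new_x - x[k - 1]) / (x[k] - x[k - 1])
    # Gather the two bracketing samples along the last axis.
    y_lo = np.take_along_axis(y, (k - 1)[..., None], axis=-1)[..., 0]
    y_hi = np.take_along_axis(y, k[..., None], axis=-1)[..., 0]
    return (1 - p) * y_lo + p * y_hi

rng = np.random.default_rng(0)
nx, ny, nz = 3, 4, 10
x = np.sort(rng.uniform(0.0, 5.0, nz))
y = rng.standard_normal((nx, ny, nz))
new_x = rng.uniform(x[0], x[-1], size=(nx, ny))
result = interp_last_axis(x, y, new_x)

# Reference: np.interp applied to one (i, j) column at a time.
expected = np.array([[np.interp(new_x[i, j], x, y[i, j]) for j in range(ny)]
                     for i in range(nx)])
```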

Answer 3:

As the NumPy suggestion above was taking too long, I couldn't wait, so here's the Cython version for future reference. From some loose benchmarks it is about 3000 times faster (granted, it is only linear interpolation and doesn't do as much as interp1d, but it's OK for this purpose).

import numpy as N
cimport numpy as N
cimport cython

DTYPEf = N.float64
ctypedef N.float64_t DTYPEf_t

@cython.boundscheck(False) # turn off bounds checking for the entire function
@cython.wraparound(False)  # turn off negative-index wraparound for the entire function
cpdef interp3d(N.ndarray[DTYPEf_t, ndim=1] x, N.ndarray[DTYPEf_t, ndim=3] y,
               N.ndarray[DTYPEf_t, ndim=2] new_x):
    """
    interp3d(x, y, new_x)

    Performs linear interpolation over the last dimension of a 3D array,
    according to new values from a 2D array new_x. Thus, interpolate
    y[i, j, :] for new_x[i, j].

    Parameters
    ----------
    x : 1-D ndarray (double type)
        Array containing the x (abscissa) values. Must be monotonically
        increasing.
    y : 3-D ndarray (double type)
        Array containing the y values to interpolate.
    new_x : 2-D ndarray (double type)
        Array with new abscissas to interpolate.

    Returns
    -------
    new_y : 2-D ndarray
        Interpolated values.
    """
    cdef int nx = y.shape[0]
    cdef int ny = y.shape[1]
    cdef int nz = y.shape[2]
    cdef int i, j, k
    cdef N.ndarray[DTYPEf_t, ndim=2] new_y = N.zeros((nx, ny), dtype=DTYPEf)

    for i in range(nx):
        for j in range(ny):
            for k in range(1, nz):
                if x[k] > new_x[i, j]:
                    new_y[i, j] = (y[i, j, k] - y[i, j, k - 1]) * \
                        (new_x[i, j] - x[k - 1]) / (x[k] - x[k - 1]) + y[i, j, k - 1]
                    break
    return new_y

Answer 4:

Building on @pv.'s answer, and vectorising the inner loop, the following gives a substantial speedup (EDIT: changed the expensive numpy.tile to using numpy.lib.stride_tricks.as_strided):

import numpy
from scipy import interpolate

nx = 30
ny = 40
nz = 50

y = numpy.random.randn(nx, ny, nz)
x = numpy.float64(numpy.arange(0, nz))

# We select some locations in the range [0.1, nz-1]
new_z = numpy.random.randint(1, (nz-1)*10 + 1, size=(nx, ny))/10.0

# y is a 3D ndarray
# x is a 1D ndarray with the abscissa values
# new_z is a 2D array

def original_interpolation():
    result = numpy.empty(y.shape[:-1])
    for i in range(nx):
        for j in range(ny):
            f = interpolate.interp1d(x, y[i, j], axis=-1, kind='linear')
            result[i, j] = f(new_z[i, j])

    return result

grid_x, grid_y = numpy.mgrid[0:nx, 0:ny]
def faster_interpolation():
    flat_new_z = new_z.ravel()
    k = numpy.searchsorted(x, flat_new_z)
    k = k.reshape(nx, ny)

    # Index with tuples; recent NumPy does not accept a list of arrays here
    lower_index = (grid_x, grid_y, k-1)
    upper_index = (grid_x, grid_y, k)

    tiled_x = numpy.lib.stride_tricks.as_strided(x, shape=(nx, ny, nz), 
        strides=(0, 0, x.itemsize))

    z_upper = tiled_x[upper_index]
    z_lower = tiled_x[lower_index]

    z_step = z_upper - z_lower
    z_delta = new_z - z_lower

    y_lower = y[lower_index]
    result = y_lower + z_delta * (y[upper_index] - y_lower)/z_step

    return result

# both should be the same (giving a small difference)
print(numpy.max(
        numpy.abs(original_interpolation() - faster_interpolation())))

That gives the following times on my machine:

In [8]: timeit foo.original_interpolation()
10 loops, best of 3: 102 ms per loop

In [9]: timeit foo.faster_interpolation()
1000 loops, best of 3: 564 us per loop

Going to nx = 300, ny = 300 and nz = 500, gives a 130x speedup:

In [2]: timeit original_interpolation()
1 loops, best of 3: 8.27 s per loop

In [3]: timeit faster_interpolation()
10 loops, best of 3: 60.1 ms per loop

You'd need to write your own algorithm for cubic interpolation, but it shouldn't be so hard.
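
As one possible starting point for the cubic case, here is a sketch of local 4-point Lagrange interpolation, vectorised over (i, j). Note that this is not the cubic spline that interp1d(kind=3) fits, so results will generally differ from it; it only reproduces cubic polynomials exactly. The name cubic_lagrange_last_axis is hypothetical, and x is assumed strictly increasing with at least 4 points.

```python
import numpy as np

def cubic_lagrange_last_axis(x, y, new_x):
    """Local cubic (4-point Lagrange) interpolation of y[i, j, :] at new_x[i, j]."""
    nz = len(x)
    # Left-most index of a 4-knot stencil around each target, kept inside the array.
    k = np.searchsorted(x, new_x).clip(2, nz - 2) - 2   # stencil is k, k+1, k+2, k+3
    idx = k[..., None] + np.arange(4)                   # shape (nx, ny, 4)
    xs = x[idx]                                         # stencil abscissas
    ys = np.take_along_axis(y, idx, axis=-1)            # stencil ordinates
    result = np.zeros(new_x.shape)
    for m in range(4):
        # Lagrange basis polynomial for stencil point m, evaluated at new_x.
        w = np.ones(new_x.shape)
        for n in range(4):
            if n != m:
                w *= (new_x - xs[..., n]) / (xs[..., m] - xs[..., n])
        result += w * ys[..., m]
    return result

rng = np.random.default_rng(0)
nx, ny, nz = 3, 4, 12
x = np.linspace(0.0, 2.0, nz)
coef = rng.standard_normal((nx, ny, 4))
# Each y[i, j, :] samples a cubic polynomial, which this scheme should reproduce.
y = sum(coef[..., m, None] * x**m for m in range(4))
new_x = rng.uniform(x[0], x[-1], size=(nx, ny))
result = cubic_lagrange_last_axis(x, y, new_x)
exact = sum(coef[..., m] * new_x**m for m in range(4))
```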

Answer 5:

You could use map_coordinates for that:

from numpy import random, meshgrid, arange
from scipy.ndimage import map_coordinates

(nx, ny, nz) = (4, 5, 6)
# some random array
A = random.rand(nx, ny, nz)

# random floating-point indices in [0, nz-1]
Z = random.rand(nx, ny)*(nz-1)

# regular integer indices of shape (nx,ny)
X, Y = meshgrid(arange(nx), arange(ny), indexing='ij')

coords = (X, Y, Z) # X, Y, and Z are of shape (nx, ny)

print(map_coordinates(A, coords, order=1, cval=-999.))
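
map_coordinates works in index space, so when the abscissa x is not simply 0, 1, ..., nz-1 the target values have to be converted to fractional indices first. A minimal sketch of that step, assuming x is increasing and the targets lie inside [x[0], x[-1]] (the variable names here are only illustrative):

```python
import numpy as np
from scipy.ndimage import map_coordinates

rng = np.random.default_rng(0)
nx, ny, nz = 4, 5, 6
x = np.sort(rng.uniform(0.0, 10.0, nz))      # non-uniform abscissa (assumed increasing)
y = rng.random((nx, ny, nz))
new_x = rng.uniform(x[0], x[-1], size=(nx, ny))

# Convert physical coordinates to fractional indices along the last axis.
z_index = np.interp(new_x, x, np.arange(nz))
i_index, j_index = np.meshgrid(np.arange(nx), np.arange(ny), indexing='ij')

# order=1 is linear interpolation; mode='nearest' only matters at the array edges.
result = map_coordinates(y, [i_index, j_index, z_index], order=1, mode='nearest')

# Reference: np.interp applied to one (i, j) column at a time.
expected = np.array([[np.interp(new_x[i, j], x, y[i, j]) for j in range(ny)]
                     for i in range(nx)])
```

With order=1 this agrees with per-column linear interpolation; higher orders would interpolate in index space, which is not the same as a spline in x when the spacing is non-uniform.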

Answer 6:

Although there are several nice answers, they're still doing 250k interpolations in a fixed 500-long array:

j250k = np.searchsorted( X500, X250k )  # indices in [0, 500)

This can be sped up with a LUT, LookUp Table, with say 5k slots:

lut = np.interp( np.arange(5000), X500, np.arange(500) ).round().astype(int)
xscale = (X - X.min()) * (5000 - 1) \
        / (X.max() - X.min()) 
j = lut.take( xscale.astype(int), mode="clip" )  # take(floats) in numpy 1.7 ?

#---------------------------------------------------------------------------
# X     |    |       | |             |
# j     0    1       2 3             4 ...
# LUT   |....|.......|.|.............|....  -> int j (+ offset in [0, 1) )
#---------------------------------------------------------------------------

searchsorted is pretty fast, with time ~ log2(500) per lookup, so this is probably not much faster.
But LUTs are very fast in C, a simple speed / memory tradeoff.
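
One way to read the LUT idea, written out as a runnable sketch: split [x[0], x[-1]) into equal-width slots, store the interval index at each slot's left edge, then correct by at most one step. The helper names build_lut and lut_search are made up here, and the correction step assumes the slots are narrower than the smallest knot spacing, so that no slot contains more than one knot. No speed claim is intended.

```python
import numpy as np

def build_lut(x, nslots):
    """For each equal-width slot over [x[0], x[-1]), store the interval index at its left edge."""
    left_edges = np.linspace(x[0], x[-1], nslots + 1)[:-1]
    # j such that x[j] <= edge < x[j+1], limited to valid intervals 0 .. len(x)-2.
    return (np.searchsorted(x, left_edges, side='right') - 1).clip(0, len(x) - 2)

def lut_search(x, lut, values):
    """Return j with x[j] <= value < x[j+1], assuming at most one knot per slot."""
    nslots = len(lut)
    scale = nslots / (x[-1] - x[0])
    slot = ((values - x[0]) * scale).astype(int).clip(0, nslots - 1)
    j = lut[slot]
    # A slot may contain one knot, so the value can lie one interval to the right.
    return np.minimum(j + (values >= x[j + 1]), len(x) - 2)

rng = np.random.default_rng(0)
x = np.cumsum(rng.uniform(0.5, 1.5, 500))      # 500 knots, spacing at least 0.5
values = rng.uniform(x[0], x[-1], size=(50, 50))
lut = build_lut(x, 5000)                       # slot width below the smallest spacing
j = lut_search(x, lut, values)

# Reference answer from a plain binary search.
reference = (np.searchsorted(x, values, side='right') - 1).clip(0, len(x) - 2)
```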

Source: https://stackoverflow.com/questions/13488631/fast-interpolation-over-3d-array

Tags

python

numpy

scipy

interpolation