Question
I'd like to do quadratic or cubic interpolation of a long series of floats (or vectors) in 1D, where "long" could be 1E+05 or 1E+06 points (or more). For some reason, the time SciPy's handy interp1d() needs to prepare the interpolator scales as almost n^3 for both quadratic and cubic splines, taking over a minute for a few thousand points.
According to comments here (a question I will delete; I'm keeping it there temporarily for comment access), it takes a millisecond on other computers, so something is obviously pathologically wrong here.
My installation is a bit old, but SciPy's .interp1d() has been around for quite a while.
np.__version__ '1.13.0'
scipy.__version__ '0.17.0'
What can I do to try to figure out this incredible slowness for interpolation?
import time
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

times = []
for n in np.logspace(1, 3.5, 6).astype(int):
    x = np.arange(n, dtype=float)
    y = np.vstack((np.cos(x), np.sin(x)))  # shape (2, n)
    start = time.perf_counter()  # time.clock() was removed in Python 3.8
    bob = interp1d(x, y, kind='quadratic', assume_sorted=True)
    times.append((n, time.perf_counter() - start))

# Log-log plot of construction time against the number of points
n, tim = zip(*times)
plt.figure()
plt.plot(n, tim)
plt.xscale('log')
plt.yscale('log')
plt.show()
Answer 1:
Short answer: update your scipy installation.
Longer answer: before 0.19, interp1d was based on splmake, which uses linear algebra with full matrices. In scipy 0.19, it was refactored to use banded linear algebra. As a result, building the interpolator is now fast (the timings below are with scipy 0.19.1):
In [14]: rndm = np.random.RandomState(1234)

In [15]: for n in [100, 1000, 10000, int(1e6)]:
    ...:     x = np.sort(rndm.uniform(size=n))
    ...:     y = np.sin(x)
    ...:     %timeit interp1d(x, y, kind='quadratic', assume_sorted=True)
    ...:
244 µs ± 4.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
422 µs ± 4.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.17 ms ± 50.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
227 ms ± 4.45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [16]: for n in [100, 1000, 10000, int(1e6)]:
    ...:     x = np.sort(rndm.uniform(size=n))
    ...:     y = np.sin(x)
    ...:     %timeit interp1d(x, y, kind='cubic', assume_sorted=True)
    ...:
241 µs ± 4.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
462 µs ± 4.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.64 ms ± 37.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
276 ms ± 1.91 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
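To illustrate why moving from full to banded linear algebra matters so much, here is a minimal sketch. It is not SciPy's actual spline code: the tridiagonal_system helper and its coefficients are made up for illustration, chosen only because spline systems have a similar narrow-band structure. It solves the same system once with a dense solver and once with scipy.linalg.solve_banded.

```python
import numpy as np
from scipy.linalg import solve, solve_banded


def tridiagonal_system(n):
    """Build a diagonally dominant tridiagonal system (hypothetical example)
    in both banded and dense storage."""
    lower = np.ones(n - 1)
    main = 4.0 * np.ones(n)
    upper = np.ones(n - 1)
    rhs = np.arange(n, dtype=float)

    # Banded storage expected by solve_banded for (l, u) = (1, 1):
    # row 0 holds the superdiagonal (first entry unused),
    # row 1 the main diagonal, row 2 the subdiagonal (last entry unused).
    ab = np.zeros((3, n))
    ab[0, 1:] = upper
    ab[1, :] = main
    ab[2, :-1] = lower

    # The same matrix stored in full: n * n entries, almost all of them zero.
    dense = np.diag(main) + np.diag(upper, k=1) + np.diag(lower, k=-1)
    return ab, dense, rhs


ab, dense, rhs = tridiagonal_system(500)
x_banded = solve_banded((1, 1), ab, rhs)  # cost grows roughly linearly with n
x_dense = solve(dense, rhs)               # cost grows roughly as n**3
print(np.allclose(x_banded, x_dense))
```

Both calls return the same solution, but the dense path also needs memory proportional to n**2, which is consistent with the near-cubic scaling reported in the question.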
Your other options are CubicSpline, if you want cubic splines specifically (new in scipy 0.18), or make_interp_spline, if you also want quadratic splines (new in scipy 0.19; this is what interp1d uses under the hood).
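A minimal sketch of both alternatives, assuming scipy 0.19 or newer and reusing the cos/sin test data from the question. The sample size and the x_new evaluation points are arbitrary choices for illustration. Both constructors interpolate along axis 0 by default, so the data is stored with one row per sample.

```python
import numpy as np
from scipy.interpolate import CubicSpline, make_interp_spline

n = 100000
x = np.arange(n, dtype=float)
y = np.column_stack((np.cos(x), np.sin(x)))  # shape (n, 2): one row per sample

cubic = CubicSpline(x, y)                  # cubic spline, needs scipy >= 0.18
quadratic = make_interp_spline(x, y, k=2)  # quadratic B-spline, needs scipy >= 0.19

# Evaluate both interpolants at a few points between the samples.
x_new = np.array([0.5, 1.5, 2.5])
cubic_values = cubic(x_new)          # shape (3, 2)
quadratic_values = quadratic(x_new)  # shape (3, 2)
```

Both objects are callable in the same way as the result of interp1d, so they should be usable as drop-in replacements for evaluation.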
Source: https://stackoverflow.com/questions/49427533/why-would-scipys-interp1d-take-over-a-minute-to-build-an-interpolator