Using scipy.stats.gaussian_kde with 2 dimensional data

独厮守ぢ 2020-12-31 20:13

I'm trying to use the scipy.stats.gaussian_kde class to smooth out some discrete data collected with latitude and longitude information, so it shows up as somewhat similar

4 Answers
  •  情话喂你
    2020-12-31 20:36

    I think you are mixing up kernel density estimation with interpolation or maybe kernel regression. KDE estimates the underlying probability distribution from a larger sample of points.

    I'm not sure which kind of interpolation you want, but either the splines or the rbf functions in scipy.interpolate will be more appropriate.

    If you want one-dimensional kernel regression, then you can find a version in scikits.statsmodels with several different kernels.
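    Since the question is about smoothing scattered latitude/longitude measurements, here is a minimal sketch of the interpolation route mentioned above, using scipy.interpolate.Rbf. The coordinate names and the toy measurement function are made up for illustration:

    ```python
    # Sketch: smooth scattered 2-D samples with radial basis functions.
    # The (lon, lat, z) data here are synthetic stand-ins for real measurements.
    import numpy as np
    from scipy.interpolate import Rbf

    rng = np.random.default_rng(0)
    lon = rng.uniform(-5, 5, 50)
    lat = rng.uniform(-5, 5, 50)
    z = np.sin(lon) * np.cos(lat)          # toy measured values

    # Fit an RBF surface through the scattered points.
    rbf = Rbf(lon, lat, z, function='multiquadric')

    # Evaluate the smooth surface on a regular grid (e.g. for a contour plot).
    grid_lon, grid_lat = np.meshgrid(np.linspace(-5, 5, 40),
                                     np.linspace(-5, 5, 40))
    grid_z = rbf(grid_lon, grid_lat)
    print(grid_z.shape)                    # (40, 40)
    ```

    Unlike KDE, this reproduces the measured values at the sample locations rather than estimating a density.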

    update: here is an example (if this is what you want)

    >>> import numpy as np
    >>> from scipy import stats
    >>> data = 2 + 2*np.random.randn(2, 100)
    >>> kde = stats.gaussian_kde(data)
    >>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
    array([ 0.02573917,  0.02470436,  0.03084282])
    

    gaussian_kde has variables in rows and observations in columns, the reverse of the usual orientation in stats. In your example, all three points lie on a line, so they are perfectly correlated. That, I guess, is the reason for the singular matrix.

    After adjusting the array orientation and adding a little noise, the example works, but the density still looks very concentrated; for example, there is no sample point near (3, 3):

    >>> data = np.array([[1.1, 1.1],
    ...                  [1.2, 1.2],
    ...                  [1.3, 1.3]]).T
    >>> data = data + 0.01*np.random.randn(2,3)
    >>> kde = stats.gaussian_kde(data)
    >>> kde.evaluate(np.array([[1,2,3],[1,2,3]]))
    array([  7.70204299e+000,   1.96813149e-044,   1.45796523e-251])
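    For the original latitude/longitude use case, the same (variables, observations) orientation applies when evaluating the density over a whole map. A minimal sketch, with synthetic coordinates standing in for real data:

    ```python
    # Sketch: evaluate a 2-D gaussian_kde on a regular grid.
    # The sample data are synthetic; real code would use lon/lat columns.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    data = 2 + 2 * rng.standard_normal((2, 100))   # shape (2, n): variables in rows

    kde = stats.gaussian_kde(data)

    # evaluate() expects a (2, n_points) array, so flatten the meshgrid
    # into stacked coordinate rows and reshape the result afterwards.
    xg, yg = np.meshgrid(np.linspace(-4, 8, 50), np.linspace(-4, 8, 50))
    density = kde(np.vstack([xg.ravel(), yg.ravel()])).reshape(xg.shape)
    print(density.shape)                           # (50, 50)
    ```

    The resulting density grid can be passed directly to a contour plot.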
    
