Integrate 2D kernel density estimate

梦如初夏 2020-12-06 02:50

I have an x,y distribution of points for which I obtain the KDE through scipy.stats.gaussian_kde. This is my code and how the output looks (the

3 answers

醉梦人生 (original poster)

2020-12-06 03:44

A direct way is to integrate the KDE numerically:

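import matplotlib.pyplot as plt
import numpy as np
from scipy import integrate
from sklearn.neighbors import KernelDensity

# Draw 5000 points from a 2D normal distribution and plot them
mean = [0, 0]
cov = [[5, 0], [0, 10]]
x, y = np.random.multivariate_normal(mean, cov, 5000).T
plt.plot(x, y, 'o')
plt.show()

# Fit the KDE; scikit-learn expects an array of shape (n_samples, 2)
sample = np.column_stack((x, y))
kde = KernelDensity().fit(sample)

def f_kde(x, y):
    # score_samples returns the log-density, so exponentiate it
    return np.exp(kde.score_samples([[x, y]]))[0]

# Upper integration limits (example values)
x1, y1 = 2.5, 1.5
integrate.nquad(f_kde, [[-np.inf, x1], [-np.inf, y1]])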

The problem is that this is very slow at scale. For example, evaluating the integral for every x between 0 and 100 in order to plot a curve would take a long time.

Note: I used the KDE from sklearn, but I believe other implementations can be substituted as well.
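
One possible way around the slowness of nquad, assuming the KDE comes from scipy.stats.gaussian_kde as in the question, is its integrate_box method, which integrates the estimate over a rectangular box instead of evaluating the density point by point. The sketch below uses synthetic data with the same mean and covariance as the example above; the helper name prob_below and the test point are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from scipy import stats

# Synthetic sample with the same mean and covariance as the example above
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0, 0], [[5, 0], [0, 10]], 2000).T  # shape (2, n)

kernel = stats.gaussian_kde(data)

def prob_below(kde, x1, y1):
    """Integral of the KDE over (-inf, x1] x (-inf, y1] (hypothetical helper)."""
    return kde.integrate_box([-np.inf, -np.inf], [x1, y1])

p = prob_below(kernel, 0.0, 0.0)
print(p)
```

Because the sample is symmetric about the origin, the printed value should land near 0.25, up to sampling noise. Calling prob_below in a loop over many x values is then a plausible way to build the curve mentioned above.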

Using the kernel as defined in the original question:

import numpy as np
from scipy import stats
from scipy import integrate

def integ_func(kde, x1, y1):
    """Integrate the KDE over (-inf, x1] x (-inf, y1]."""

    def f_kde(x, y):
        # gaussian_kde returns a length-1 array for a single point
        return kde((x, y))[0]

    # nquad returns a tuple: (integral value, estimated absolute error)
    integ = integrate.nquad(f_kde, [[-np.inf, x1], [-np.inf, y1]])

    return integ

# Obtain data from file.
data = np.loadtxt('data.dat', unpack=True)
# Perform a kernel density estimate (KDE) on the data
kernel = stats.gaussian_kde(data)

# Define the upper integration limits
x1, y1 = 2.5, 1.5
print(integ_func(kernel, x1, y1))
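
As a rough cross-check on the quadrature result, the same integral can be approximated by Monte Carlo: draw points from the KDE with gaussian_kde.resample and count the fraction that falls below both limits. This is a different technique from the nquad approach above, shown only as a sketch. The synthetic data, the helper name mc_integral, and the number of draws are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the contents of 'data.dat' (assumption)
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0, 0], [[5, 0], [0, 10]], 2000).T  # shape (2, n)
kernel = stats.gaussian_kde(data)

def mc_integral(kde, x1, y1, n_draws=100_000):
    """Monte Carlo estimate of the KDE mass in (-inf, x1] x (-inf, y1]."""
    draws = kde.resample(n_draws)  # array of shape (2, n_draws)
    inside = (draws[0] < x1) & (draws[1] < y1)
    return inside.mean()  # fraction of draws inside the region

estimate = mc_integral(kernel, 2.5, 1.5)
print(estimate)
```

The estimate varies slightly between runs, and its error shrinks roughly as one over the square root of n_draws, so it suits a sanity check more than a high-precision answer.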
