Integrate 2D kernel density estimate

梦如初夏 2020-12-06 02:50

I have an x,y distribution of points for which I obtain the KDE through scipy.stats.gaussian_kde. This is my code and how the output looks (the

3 answers
  •  醉梦人生
    2020-12-06 03:44

    A direct way is to integrate the KDE numerically:

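    import matplotlib.pyplot as plt
    import numpy as np
    from scipy import integrate
    from sklearn.neighbors import KernelDensity

    # Draw 5000 points from a 2D normal distribution.
    mean = [0, 0]
    cov = [[5, 0], [0, 10]]
    x, y = np.random.multivariate_normal(mean, cov, 5000).T
    plt.plot(x, y, 'o')
    plt.show()

    # Fit the KDE; scikit-learn expects an array of shape (n_samples, 2).
    sample = np.column_stack((x, y))
    kde = KernelDensity().fit(sample)

    def f_kde(x, y):
        # score_samples returns the log-density, so exponentiate it.
        return np.exp(kde.score_samples([[x, y]]))[0]

    # Upper integration limits (example values).
    x1, y1 = 2.5, 1.5
    integrate.nquad(f_kde, [[-np.inf, x1], [-np.inf, y1]])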
    

    The problem is that this is very slow if you do it on a large scale. For example, if you want to plot the integral as a function of x over (0, 100), it would take a long time to calculate, because every point on the curve requires its own double integral.

    Note: I used the KDE from sklearn, but I believe you can swap in another implementation as well.
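If the slowness matters, the integral does not have to be computed numerically for this particular setup. The sketch below assumes the sklearn defaults used above (an isotropic Gaussian kernel with a single bandwidth, 1.0 by default). In that case the KDE is an average of axis-aligned Gaussians, so its integral over (-inf, x1] x (-inf, y1] reduces to an average of products of 1D normal CDFs. The helper name gaussian_kde_cdf is hypothetical and not part of sklearn or scipy.

```python
import numpy as np
from scipy.stats import norm


def gaussian_kde_cdf(sample, x1, y1, bandwidth=1.0):
    """Integral of an isotropic Gaussian KDE over (-inf, x1] x (-inf, y1].

    Assumes one Gaussian kernel with standard deviation `bandwidth`
    centred on every row of `sample` (shape (n_samples, 2)).
    """
    sample = np.asarray(sample, dtype=float)
    # Each kernel factorises into independent x and y normal densities,
    # so its box integral is a product of two 1D normal CDFs.
    px = norm.cdf((x1 - sample[:, 0]) / bandwidth)
    py = norm.cdf((y1 - sample[:, 1]) / bandwidth)
    # The KDE is the equally weighted mean of the kernels.
    return float(np.mean(px * py))


# Same kind of synthetic data as in the answer above (seeded for repeatability).
rng = np.random.default_rng(0)
sample = rng.multivariate_normal([0, 0], [[5, 0], [0, 10]], size=5000)

p = gaussian_kde_cdf(sample, 0.0, 0.0)
print(p)
```

Because this is a vectorised sum rather than an adaptive quadrature, evaluating it on a whole grid of (x1, y1) values stays cheap. It would not apply unchanged to a non-Gaussian kernel or to a KDE with a full covariance matrix.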


    Using the kernel as defined in the original question:

    import numpy as np
    from scipy import stats
    from scipy import integrate

    def integ_func(kde, x1, y1):
        """Integrate the KDE over (-inf, x1] x (-inf, y1]."""

        def f_kde(x, y):
            # gaussian_kde returns a length-1 array for a single point.
            return kde((x, y))[0]

        # nquad returns a (value, estimated absolute error) tuple.
        integ = integrate.nquad(f_kde, [[-np.inf, x1], [-np.inf, y1]])

        return integ

    # Obtain data from file.
    data = np.loadtxt('data.dat', unpack=True)
    # Perform a kernel density estimate (KDE) on the data
    kernel = stats.gaussian_kde(data)

    # Define the number that will determine the integration limits
    x1, y1 = 2.5, 1.5
    print(integ_func(kernel, x1, y1))
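As a possible alternative to nquad, scipy.stats.gaussian_kde also provides an integrate_box(low_bounds, high_bounds) method that integrates the estimated density over a rectangular region. The sketch below uses seeded synthetic data in place of data.dat, which is an assumption made only so the example runs on its own. As far as I know infinite bounds are accepted here, but it is worth checking against your SciPy version.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the data file: shape (2, n_points), as gaussian_kde expects.
rng = np.random.default_rng(0)
data = rng.standard_normal((2, 500))

kernel = stats.gaussian_kde(data)

# Upper integration limits, same values as in the answer above.
x1, y1 = 2.5, 1.5

# Integrate the KDE over (-inf, x1] x (-inf, y1].
p = kernel.integrate_box([-np.inf, -np.inf], [x1, y1])
print(p)
```

The result is a single float rather than the (value, error) tuple that nquad returns. If it disagrees noticeably with the nquad result on your data, treat the nquad value as the reference.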
    
