How does the stats.gaussian_kde method calculate the pdf?

Submitted by 人走茶凉 on 2019-12-24 17:55:57

Question


I am using the gaussian_kde class from scipy.stats to generate random samples from my data.

It works fine. I have now found that it also has built-in methods to calculate the probability density function for a given set of points (my data).

I would like to know how it calculates the pdf, given a set of points.

Here is a small example:

import numpy as np
from scipy import stats

def getDistribution1(data):
    kernel = stats.gaussian_kde(data, bw_method=0.06)
    class rv(stats.rv_continuous):
        # Recent SciPy versions pass size and random_state as keyword arguments
        def _rvs(self, *args, size=None, random_state=None):
            shape = (1,) if size is None else size
            return kernel.resample(int(np.prod(shape)), seed=random_state).reshape(shape)  # random variates
        def _cdf(self, x):
            # Integrate the pdf from -inf to each x separately
            return np.array([kernel.integrate_box_1d(-np.inf, xi) for xi in np.atleast_1d(x)])
        def _pdf(self, x):
            return kernel.evaluate(x)  # evaluate the estimated pdf at the given points
    return rv(name='kdedist')

test_data = np.random.random(100)  # random test data
distribution_data = getDistribution1(test_data)
pdf_data = distribution_data.pdf(test_data)  # the pdf evaluated at the data points

The code above defines three methods:

  1. rvs, which generates random samples based on the data
  2. cdf, which is the integral of the pdf from -inf to x
  3. pdf, which is the estimated pdf evaluated at the data points

I need this pdf because I am trying to calculate weights for my data based on probability, so that I can assign each data point a probability and then use it as its weight.

How should I proceed from here to calculate my weights?

P.S. Forgive me for asking the same question on Cross Validated; there has been no response there.


Answer 1:


The online docs have a link to the source code, which for gaussian_kde is here: https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/kde.py#L193
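
For readers who do not want to dig through the source, here is a minimal sketch of what that code amounts to for one-dimensional, unweighted data: the estimated pdf at a point is the average of Gaussian bumps centered on each data point, with a standard deviation equal to the bandwidth factor times the sample standard deviation (ddof=1). The helper name manual_gaussian_kde_pdf and the evaluation grid are made up for illustration and are not part of SciPy; the sketch also assumes a scalar bw_method, as in the question.

```python
import numpy as np
from scipy import stats

def manual_gaussian_kde_pdf(data, points, factor):
    """Hypothetical helper: evaluate a 1-D Gaussian KDE by hand."""
    data = np.asarray(data, dtype=float)
    points = np.asarray(points, dtype=float)
    # Kernel width: bandwidth factor times the sample standard deviation
    sigma = factor * data.std(ddof=1)
    # Standardized distance from every evaluation point to every data point
    z = (points[:, None] - data[None, :]) / sigma
    # One Gaussian density per (point, data point) pair
    kernels = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    # The KDE is the average of the kernels over the data points
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
data = rng.random(100)                      # random test data, as in the question
kde = stats.gaussian_kde(data, bw_method=0.06)

grid = np.linspace(0.0, 1.0, 5)             # arbitrary evaluation points
manual = manual_gaussian_kde_pdf(data, grid, factor=0.06)
reference = kde.evaluate(grid)
matches = np.allclose(manual, reference)    # compare with SciPy's result
```

If the two arrays agree, the hand-written version is doing the same thing as kernel.evaluate for this simple case. Note that the values returned are densities, not probabilities, so they would need to be normalized (for example, divided by their sum) before being used as weights that add up to 1.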



Source: https://stackoverflow.com/questions/30186868/how-does-the-stats-gaussian-kde-method-calcute-the-pdf
