probability-density | 易学教程

confusion on 2 dimension kernel density estimation in R

Question: A kernel density estimator is used to estimate a probability density function (see the mvstat.net and scikit-learn docs for references). My confusion is about what exactly kde2d() does. Does it estimate the joint probability density function f(a,b) of the two random variables in the example below? And what does the color mean? Here is the code example I am referring to.
library(MASS)  # provides kde2d()
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))
density <- kde2d(a, b, n=100)
colour_flow
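
As a rough illustration of what a two-dimensional kernel density estimate computes, here is a minimal NumPy sketch. It is not the kde2d() implementation: the function name `kde2d_gaussian`, the Scott-style bandwidth rule, and the grid padding are assumptions made for this example. It evaluates an estimate of the joint density f(a,b) on a grid, and in a colored plot of such a grid the color would normally encode the height of that estimated density.

```python
import numpy as np

def kde2d_gaussian(x, y, n=100, hx=None, hy=None):
    """Evaluate a product-Gaussian kernel density estimate on an n-by-n grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    # Assumed bandwidth rule (Scott-style for 2 dimensions); kde2d has its own default.
    if hx is None:
        hx = x.std(ddof=1) * m ** (-1 / 6)
    if hy is None:
        hy = y.std(ddof=1) * m ** (-1 / 6)
    # Pad the grid by three bandwidths so the kernels are not cut off at the edges.
    gx = np.linspace(x.min() - 3 * hx, x.max() + 3 * hx, n)
    gy = np.linspace(y.min() - 3 * hy, y.max() + 3 * hy, n)
    # 1-D Gaussian kernels between every grid coordinate and every sample.
    kx = np.exp(-0.5 * ((gx[:, None] - x[None, :]) / hx) ** 2) / (hx * np.sqrt(2 * np.pi))
    ky = np.exp(-0.5 * ((gy[:, None] - y[None, :]) / hy) ** 2) / (hy * np.sqrt(2 * np.pi))
    # z[i, j] = (1/m) * sum_k Kx(gx[i] - x[k]) * Ky(gy[j] - y[k])
    z = kx @ ky.T / m
    return gx, gy, z

rng = np.random.default_rng(0)
b = np.log10(rng.gamma(shape=6, scale=1 / 3, size=1000))   # rgamma(1000, 6, 3): rate 3
a = np.log10(2 * rng.weibull(8, size=1000))                # rweibull(1000, 8, 2): scale 2
gx, gy, z = kde2d_gaussian(a, b, n=100)

# A density should integrate to about 1 over the grid.
cell_area = (gx[1] - gx[0]) * (gy[1] - gy[0])
total_mass = z.sum() * cell_area
```

Because `z` is a density rather than a probability, individual values can exceed 1; only the integral over the plane is constrained to be 1.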

memory error by using rbf with scipy

I want to plot some points with the Rbf function, like here, to get the density distribution of the points. If I run the following code, it works fine:
from scipy.interpolate import Rbf  # radial basis functions
import cv2
import matplotlib.pyplot as plt
import numpy as np
# import data
x = [1, 1, 2, 3, 2, 7, 8, 6, 6, 7, 6.5, 7.5, 9, 8, 9, 8.5]
y = [0, 2, 5, 6, 1, 2, 9, 2, 3, 3, 2.5, 2, 8, 8, 9, 8.5]
d = np.ones(len(x))
print(d)
ti = np.linspace(-1, 10)
xx, yy = np.meshgrid(ti, ti)
rbf = Rbf(x, y, d, function='gaussian')
jet = cm = plt.get_cmap('jet')
zz = rbf(xx, yy)
plt.pcolor(xx, yy, zz,

Sampling from a multivariate probability density function in python

I have a multivariate probability density function P(x,y,z), and I want to sample from it. Normally, I would use numpy.random.choice() for this sort of task, but this function only works for 1-dimensional probability densities. Is there an equivalent function for multivariate pdfs?
There are a few different paths one can follow here. (1) If P(x,y,z) factors as P(x,y,z) = P(x) P(y) P(z) (i.e., x, y, and z are independent), then you can sample each one separately. (2) If P(x,y,z) has a more general factorization, you can reduce the number of variables that need to be sampled to whatever's conditional
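
One further option, not taken from the excerpt above, is to tabulate P on a grid, flatten it, and reuse the 1-dimensional choice routine on the flattened cells. The sketch below assumes this discretized approach; the helper name `sample_from_grid` and the example density are hypothetical.

```python
import numpy as np

def sample_from_grid(p, axes, size, rng):
    """Draw `size` points from a density tabulated on a regular grid."""
    prob = np.asarray(p, dtype=float).ravel()
    prob = prob / prob.sum()                          # normalise to a probability mass function
    flat = rng.choice(prob.size, size=size, p=prob)   # 1-D draw over the flattened cells
    idx = np.unravel_index(flat, np.shape(p))         # back to (i, j, k) grid indices
    return np.column_stack([ax[i] for ax, i in zip(axes, idx)])

# Hypothetical example density: an unnormalised Gaussian bump with correlated x and y.
x = np.linspace(-3, 3, 41)
y = np.linspace(-3, 3, 41)
z = np.linspace(-3, 3, 41)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
P = np.exp(-0.5 * (X**2 + Y**2 + Z**2 - 1.2 * X * Y))

rng = np.random.default_rng(0)
samples = sample_from_grid(P, (x, y, z), size=5000, rng=rng)
```

The samples land only on grid points, so the resolution of the grid limits the resolution of the draw, and memory grows with the number of cells. For densities that cannot be tabulated, rejection sampling or MCMC are the usual alternatives.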

Plot Lognormal Probability Density in R

I am trying to plot the lognormal probability density in R for three different combinations of log-mean (meanlog) and log-standard-deviation (sdlog). I have tried the following, but the resulting graph does not look good at all.
x <- seq(0, 10, length = 100)
a <- dlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)
b <- dlnorm(x, meanlog = 0, sdlog = 1.5, log = FALSE)
g <- dlnorm(x, meanlog = 1.5, sdlog = 0.2, log = FALSE)
plot(x, a, lty=5, col="blue", lwd=3)
lines(x, b, lty=2, col = "red")
lines(x, g, lty=4, col = "green")
I was even trying to add a legend at the top right for each log-mean and log-standard-deviation, but it

How to random sample lognormal data in Python using the inverse CDF and specify target percentiles?

I'm trying to generate random samples from a lognormal distribution in Python; the application is simulating network traffic. I'd like to generate samples such that:
- The modal sample result is 320 (~10^2.5)
- 80% of the samples lie within the range 100 to 1000 (10^2 to 10^3)
My strategy is to use the inverse CDF (or Smirnov transform, I believe):
- Use the PDF for a normal distribution centred around 2.5 to calculate the PDF for 10^x where x ~ N(2.5,sigma).
- Calculate the CDF for the above distribution.
- Generate random uniform data along the interval 0 to 1.
- Use the inverse CDF to transform the
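
The strategy above can be sketched with the standard library alone. This is one possible reading of it, not the asker's code: sigma is chosen so that 80% of log10 values fall between 2 and 3, and uniform draws are pushed through the normal quantile function before exponentiating. The function name `sample` and the seed are illustrative.

```python
import random
from statistics import NormalDist

# Target: 80% of samples between 10^2 and 10^3, centred on 10^2.5 in log10 space.
mu = 2.5
z90 = NormalDist().inv_cdf(0.90)     # 90th percentile of the standard normal, about 1.2816
sigma = (3.0 - mu) / z90             # chosen so that P(2 <= log10 X <= 3) = 0.80

log10_dist = NormalDist(mu, sigma)
rng = random.Random(0)

def sample(n):
    """Inverse-CDF sampling: uniform u -> normal quantile x -> 10**x."""
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:              # inv_cdf is undefined at exactly 0
            u = rng.random()
        out.append(10 ** log10_dist.inv_cdf(u))
    return out

samples = sample(20000)
inside = sum(100 <= s <= 1000 for s in samples) / len(samples)
median = sorted(samples)[len(samples) // 2]
```

One caveat: with this construction 10^2.5 is the median of the samples and the peak of the density of log10 X. The mode of the lognormal density on the linear scale is lower, because the mode of a lognormal is exp(mu_ln - sigma_ln^2) in natural-log parameters, so "modal result is 320" holds only when the histogram is drawn on a logarithmic axis.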

Calculating a 2D joint probability distribution

I have many points inside a square. I want to partition the square into many small rectangles and count how many points fall in each rectangle, i.e. I want to compute the joint probability distribution of the points. Below are a couple of common-sense approaches that use loops and are not very efficient:
% Data
N = 1e5; % number of points
xy = rand(N, 2); % coordinates of points
xy(randi(2*N, 100, 1)) = 0; % add some points on one side
xy(randi(2*N, 100, 1)) = 1; % add some points on the other side
xy(randi(N, 100, 1), :) = 0; % add some points on one corner
xy(randi(N, 100, 1), :) = 1; % add
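
For comparison, the same counting can be done without explicit loops. The sketch below uses NumPy's `histogram2d` rather than MATLAB, and the grid size (20 by 10 rectangles) is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
xy = rng.random((N, 2))                 # points in the unit square

nx, ny = 20, 10                         # number of rectangles along x and y (arbitrary)
counts, xedges, yedges = np.histogram2d(
    xy[:, 0], xy[:, 1], bins=[nx, ny], range=[[0, 1], [0, 1]]
)
joint = counts / counts.sum()           # empirical joint probability per rectangle
marginal_x = joint.sum(axis=1)          # summing over y gives the marginal in x
```

Here `joint[i, j]` is the fraction of points whose x falls in the i-th x-interval and whose y falls in the j-th y-interval. Dividing further by each rectangle's area would turn these probabilities into a density estimate.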

Plotting probability density function with frequency counts

I want to convert a fitted distribution to frequency counts.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
%matplotlib notebook
# sample data generation
np.random.seed(42)
data = sorted(stats.lognorm.rvs(s=0.5, loc=1, scale=1000, size=1000))
# fit lognormal distribution
shape, loc, scale = stats.lognorm.fit(data, loc=0)
pdf_lognorm = stats.lognorm.pdf(data, shape, loc, scale)
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(data, bins='auto', density=True)
ax.plot(data, pdf_lognorm)
ax.set_ylabel('probability')
ax.set_title('Linear Scale')
The above code snippet will generate
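
A density can be put on the same scale as a count histogram by multiplying it by the number of samples and the bin width. The sketch below illustrates that relationship with NumPy only; it is a simplification of the snippet above, assuming a two-parameter lognormal (no location shift) fitted from the moments of log(data), and the helper `lognorm_pdf` is written out for this example.

```python
import numpy as np

def lognorm_pdf(x, mu, sigma):
    """Density of a two-parameter lognormal (no location shift)."""
    return np.exp(-0.5 * ((np.log(x) - mu) / sigma) ** 2) / (x * sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(42)
data = rng.lognormal(mean=np.log(1000), sigma=0.5, size=1000)

# Maximum-likelihood fit of the two-parameter lognormal: moments of log(data).
mu_hat = np.log(data).mean()
sigma_hat = np.log(data).std()

counts, edges = np.histogram(data, bins=30)      # raw frequency counts
width = edges[1] - edges[0]
centers = 0.5 * (edges[:-1] + edges[1:])

# Expected count in a bin is approximately N * bin_width * pdf(bin centre).
expected = data.size * width * lognorm_pdf(centers, mu_hat, sigma_hat)
```

Plotting `expected` against `centers` over a histogram drawn without `density=True` would put the fitted curve and the bars in the same units. The approximation assumes equal-width bins; with unequal bins each bin needs its own width.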

Reproduce a prior density plot in R

Question: I am still getting used to converting formulas to code in R. I am trying to work through the empirical Bayes chapter (chapter 6) in Computer Age Statistical Inference. I would like to reproduce plot 6.4, but in order to do so I need to find the marginal distribution of a function of the data. To obtain the data and plot it I did the following.
library(ggplot2)
library(magrittr)  # provides %>%
nodes <- read.table("https://web.stanford.edu/~hastie/CASI_files/DATA/nodes.txt", header = TRUE)
nodes %>%
  ggplot(aes(x=x/n)) +
  geom_histogram(bins = 30)

Random sampling from a dataset, while preserving original probability distribution

I have a set of more than 2000 numbers gathered from measurement. I want to sample from this data set, about 10 values in each test, while preserving the probability distribution both overall and within each test (to the extent that is possible). For example, in each test I want some small values, some mid-range values, and some large values, with the mean and variance approximately equal to those of the original distribution. Combining all the tests, I also want the total mean and variance of all the samples to be approximately equal to those of the original distribution. As my dataset is a long-tail probability distribution , the amount of
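
One way to get a small, a mid-range, and a large value into every test is stratified sampling over quantiles. The sketch below is only one possible approach, not taken from the excerpt: the helper `stratified_draw` is hypothetical, and a synthetic lognormal sample stands in for the measured data.

```python
import numpy as np

def stratified_draw(sorted_data, k, rng):
    """Pick one value at random from each of k equal-count quantile strata."""
    strata = np.array_split(sorted_data, k)
    return np.array([rng.choice(s) for s in strata])

rng = np.random.default_rng(0)
# Stand-in for the measured values: a long-tailed sample (assumption).
data = np.sort(rng.lognormal(mean=0.0, sigma=1.0, size=2000))

# 200 tests of 10 values each; every test covers all ten deciles of the data.
tests = np.array([stratified_draw(data, 10, rng) for _ in range(200)])
pooled = tests.ravel()
```

Since every stratum holds the same number of points and contributes one draw, the pooled sample matches the original mean in expectation. For a long-tailed dataset the top stratum is still wide, so the mean and variance of any single 10-value test can remain noisy.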

Generate random variables from a distribution function using inverse sampling

Question: I have a specific density function, and I want to generate random variables knowing the expression of the density function. For example, the density function is:
df = function(x) {
  - ((-a1/a2)*exp((x-a3)/a2))/(1+exp((x-a3)/a2))^2
}
From this expression I want to generate 1000 random elements with the same distribution. I know I should use the inverse sampling method. For this, I use the CDF of my PDF, which is calculated as follows:
cdf = function(x) {
  1 - a1/(1+exp((x-a3)/a2))
}
The idea
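
For the CDF written above, the inverse can be found in closed form by solving u = 1 - a1/(1+exp((x-a3)/a2)) for x, which gives x = a3 + a2*log(a1/(1-u) - 1). The sketch below applies that in Python. The parameter values are hypothetical, and a1 = 1 is assumed so that the CDF actually runs from 0 to 1.

```python
import math
import random

# Hypothetical parameter values; a1 = 1 is assumed so the CDF runs from 0 to 1.
a1, a2, a3 = 1.0, 2.0, 5.0

def cdf(x):
    return 1 - a1 / (1 + math.exp((x - a3) / a2))

def inv_cdf(u):
    """Solve u = cdf(x) for x: x = a3 + a2 * log(a1 / (1 - u) - 1)."""
    return a3 + a2 * math.log(a1 / (1 - u) - 1)

rng = random.Random(0)

def draw(n):
    """Inverse transform sampling: push uniform draws through inv_cdf."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:          # u = 0 would give log(0)
            out.append(inv_cdf(u))
    return out

samples = draw(1000)
mean = sum(samples) / len(samples)
```

With a1 = 1 this is the logistic distribution with location a3 and scale a2, so the sample mean should sit near a3. When no closed-form inverse exists, the same idea works with a numerical root finder applied to cdf(x) - u.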