probability-density

Confusion about 2-dimensional kernel density estimation in R

有些话、适合烂在心里 submitted on 2019-12-06 12:37:11
Question: A kernel density estimator is used to estimate a particular probability density function (see the mvstat.net and scikit-learn docs for references). My confusion is about what exactly kde2d() does. Does it estimate the joint probability density function f(a,b) of two random variables in the example below? And what does the color mean? Here is the code example I am referring to:
    library(MASS)  # provides kde2d()
    b <- log10(rgamma(1000, 6, 3))
    a <- log10((rweibull(1000, 8, 2)))
    density <- kde2d(a, b, n=100)
    colour_flow
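A minimal Python sketch of what a 2D kernel density estimate computes, for comparison with kde2d(): it evaluates an estimate of the joint density f(a, b) on an n-by-n grid, and in a typical heat map or filled contour plot of the result the color encodes the estimated density at each grid point. The function name kde2d_sketch, the product-Gaussian kernel, and the Scott's-rule bandwidths are assumptions made for illustration; MASS::kde2d chooses its default bandwidth with its own rule, so the numbers will not match exactly.

```python
import numpy as np

def kde2d_sketch(a, b, n=100):
    """Product-Gaussian 2D kernel density estimate on an n-by-n grid.

    Bandwidths follow Scott's rule of thumb (an assumption for this
    sketch, not the rule MASS::kde2d uses).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m = a.size
    # Scott's rule in d = 2 dimensions: h = sigma * m^(-1/(d+4))
    ha = a.std(ddof=1) * m ** (-1 / 6)
    hb = b.std(ddof=1) * m ** (-1 / 6)
    # Grid padded by three bandwidths so the kernels are not cut off.
    gx = np.linspace(a.min() - 3 * ha, a.max() + 3 * ha, n)
    gy = np.linspace(b.min() - 3 * hb, b.max() + 3 * hb, n)
    # One-dimensional Gaussian kernels, each of shape (n, m).
    kx = np.exp(-0.5 * ((gx[:, None] - a[None, :]) / ha) ** 2) / (ha * np.sqrt(2 * np.pi))
    ky = np.exp(-0.5 * ((gy[:, None] - b[None, :]) / hb) ** 2) / (hb * np.sqrt(2 * np.pi))
    # z[i, j] is the estimated joint density at (gx[i], gy[j]).
    z = kx @ ky.T / m
    return gx, gy, z

rng = np.random.default_rng(0)
# Same distributions as the R example: rgamma(shape=6, rate=3), rweibull(shape=8, scale=2).
b = np.log10(rng.gamma(shape=6, scale=1 / 3, size=1000))
a = np.log10(2 * rng.weibull(8, size=1000))

gx, gy, z = kde2d_sketch(a, b, n=100)
# A density integrates to 1; check with a Riemann sum over the grid.
total = z.sum() * (gx[1] - gx[0]) * (gy[1] - gy[0])
```

Because z is a density rather than a probability, individual values can exceed 1; only the integral over the plane is constrained.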

Memory error when using Rbf with SciPy

萝らか妹 submitted on 2019-12-06 08:42:09
I want to plot some points with the Rbf function, like here, to get the density distribution of the points. If I run the following code, it works fine:
    from scipy.interpolate import Rbf  # radial basis functions
    import cv2
    import matplotlib.pyplot as plt
    import numpy as np
    # import data
    x = [1, 1, 2, 3, 2, 7, 8, 6, 6, 7, 6.5, 7.5, 9, 8, 9, 8.5]
    y = [0, 2, 5, 6, 1, 2, 9, 2, 3, 3, 2.5, 2, 8, 8, 9, 8.5]
    d = np.ones(len(x))
    print(d)
    ti = np.linspace(-1, 10)
    xx, yy = np.meshgrid(ti, ti)
    rbf = Rbf(x, y, d, function='gaussian')
    jet = cm = plt.get_cmap('jet')
    zz = rbf(xx, yy)
    plt.pcolor(xx, yy, zz,

Sampling from a multivariate probability density function in python

不想你离开。 submitted on 2019-12-06 07:20:21
I have a multivariate probability density function P(x,y,z), and I want to sample from it. Normally, I would use numpy.random.choice() for this sort of task, but this function only works for 1-dimensional probability densities. Is there an equivalent function for multivariate pdfs?
There are a few different paths one can follow here.
(1) If P(x,y,z) factors as P(x,y,z) = P(x) P(y) P(z) (i.e., x, y, and z are independent), then you can sample each one separately.
(2) If P(x,y,z) has a more general factorization, you can reduce the number of variables that need to be sampled to whatever's conditional
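One further route, sketched here under the assumption that the density can be tabulated on a finite grid: flatten the 3D table into a 1D probability vector, draw flat indices with a 1D sampler, and map them back to (x, y, z) with np.unravel_index. The grid and the example density P below are hypothetical and exist only to make the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical grid and unnormalised density, for illustration only.
x = np.linspace(-3, 3, 41)
y = np.linspace(-3, 3, 41)
z = np.linspace(-3, 3, 41)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
P = np.exp(-0.5 * (X**2 + (Y - X) ** 2 + Z**2))  # x and y are correlated here

# Flatten to a 1D probability vector that sums to 1.
p_flat = P.ravel() / P.sum()

# Draw flat indices, then recover the three grid indices of each draw.
flat_idx = rng.choice(p_flat.size, size=20_000, p=p_flat)
ix, iy, iz = np.unravel_index(flat_idx, P.shape)
samples = np.column_stack([x[ix], y[iy], z[iz]])
```

The samples are restricted to grid points, so the resolution of the grid limits the resolution of the draws, and memory grows with the number of grid cells.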

Plot Lognormal Probability Density in R

两盒软妹~` submitted on 2019-12-06 04:39:13
I am trying to generate a plot of the lognormal probability density in R, with 3 different log-means and log-standard deviations. I have tried the following, but my graph is ugly and does not look good at all.
    x <- seq(0, 10, length = 100)
    a <- dlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)
    b <- dlnorm(x, meanlog = 0, sdlog = 1.5, log = FALSE)
    g <- dlnorm(x, meanlog = 1.5, sdlog = 0.2, log = FALSE)
    plot(x, a, lty=5, col="blue", lwd=3)
    lines(x, b, lty=2, col = "red")
    lines(x, g, lty=4, col = "green")
I was even trying to add a legend at the top right for each mean log and standard deviation log, but it

How to randomly sample lognormal data in Python using the inverse CDF and specify target percentiles?

南楼画角 submitted on 2019-12-06 03:35:00
I'm trying to generate random samples from a lognormal distribution in Python; the application is simulating network traffic. I'd like to generate samples such that:
- The modal sample result is 320 (~10^2.5)
- 80% of the samples lie within the range 100 to 1000 (10^2 to 10^3)
My strategy is to use the inverse CDF (or Smirnov transform, I believe):
1. Use the PDF for a normal distribution centred around 2.5 to calculate the PDF for 10^x where x ~ N(2.5,sigma).
2. Calculate the CDF for the above distribution.
3. Generate random uniform data along the interval 0 to 1.
4. Use the inverse CDF to transform the
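The strategy above can be sketched as follows, assuming x ~ N(2.5, sigma) in log10 space as the question proposes. The 80% requirement fixes sigma: the interval [2, 3] must span the 10th to 90th percentiles of x, so sigma = 0.5 / z_0.9, where z_0.9 is the 90th percentile of the standard normal. One caveat: centring the normal at 2.5 puts the median of the samples at 10^2.5, while the mode of a lognormal distribution lies below its median. The variable names are illustrative.

```python
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(0)

mu = 2.5                                  # centre in log10 space
z90 = NormalDist().inv_cdf(0.9)           # about 1.2816
sigma = 0.5 / z90                         # puts 80% of x inside [2, 3]

# Inverse-CDF (Smirnov) transform: uniform draws -> normal quantiles -> 10**x.
u = rng.random(20_000)
u = np.clip(u, 1e-12, 1 - 1e-12)          # keep away from the endpoints 0 and 1
dist = NormalDist(mu=mu, sigma=sigma)
x = np.array([dist.inv_cdf(p) for p in u])
samples = 10.0 ** x

frac_in_range = np.mean((samples >= 100) & (samples <= 1000))
```

If the mode rather than the median must sit at 320, mu would need to be shifted upward; that adjustment is not shown here.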

Calculating a 2D joint probability distribution

和自甴很熟 submitted on 2019-12-06 01:30:33
I have many points inside a square. I want to partition the square into many small rectangles and check how many points fall in each rectangle, i.e. I want to compute the joint probability distribution of the points. I am reporting a couple of common-sense approaches, which use loops and are not very efficient:
    % Data
    N = 1e5; % number of points
    xy = rand(N, 2); % coordinates of points
    xy(randi(2*N, 100, 1)) = 0; % add some points on one side
    xy(randi(2*N, 100, 1)) = 1; % add some points on the other side
    xy(randi(N, 100, 1), :) = 0; % add some points on one corner
    xy(randi(N, 100, 1), :) = 1; % add
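For comparison, a loop-free version of the same binning idea, written in Python with NumPy rather than the MATLAB of the question. This is a sketch under the assumption that the points lie in the unit square and that a 10-by-10 partition is wanted; both choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
xy = rng.random((100_000, 2))   # points in the unit square

# Count the points falling in each cell of a 10 x 10 partition.
counts, xedges, yedges = np.histogram2d(
    xy[:, 0], xy[:, 1], bins=[10, 10], range=[[0, 1], [0, 1]]
)

# Normalise counts to an empirical joint probability table.
joint = counts / counts.sum()
```

Here joint[i, j] is the fraction of points with x in [xedges[i], xedges[i+1]) and y in [yedges[j], yedges[j+1]), with the last bin in each direction also including its right edge.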

Plotting probability density function with frequency counts

青春壹個敷衍的年華 submitted on 2019-12-05 23:49:49
I want to convert a fitted distribution to frequency counts.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats
    %matplotlib notebook
    # sample data generation
    np.random.seed(42)
    data = sorted(stats.lognorm.rvs(s=0.5, loc=1, scale=1000, size=1000))
    # fit lognormal distribution
    shape, loc, scale = stats.lognorm.fit(data, loc=0)
    pdf_lognorm = stats.lognorm.pdf(data, shape, loc, scale)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.hist(data, bins='auto', density=True)
    ax.plot(data, pdf_lognorm)
    ax.set_ylabel('probability')
    ax.set_title('Linear Scale')
The above code snippet will generate
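A sketch of the conversion itself: with density=True the histogram's y-axis is a density, and the expected count in a bin is approximately N × bin width × pdf at the bin centre. The example below fixes the location at 0 with floc=0 to keep the fit stable, which is a simplifying assumption and differs from the free-location fit in the question; plotting is omitted.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = stats.lognorm.rvs(s=0.5, loc=1, scale=1000, size=1000, random_state=rng)

# Fit with the location pinned at 0 (assumption made for this sketch).
shape, loc, scale = stats.lognorm.fit(data, floc=0)

# Raw frequency counts, using the same automatic binning as the histogram.
counts, edges = np.histogram(data, bins="auto")
widths = np.diff(edges)
centers = 0.5 * (edges[:-1] + edges[1:])

# Density -> expected count per bin: N * bin width * pdf(bin centre).
expected = data.size * widths * stats.lognorm.pdf(centers, shape, loc, scale)
```

Plotting expected against centers on top of a histogram drawn without density=True puts the fitted curve and the bars on the same frequency scale.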

Reproduce a prior density plot in R

℡╲_俬逩灬. submitted on 2019-12-05 06:22:58
Question: I am still getting used to converting formulas to code in R. I am trying to work through the empirical Bayes chapter (chapter 6) in Computer Age Statistical Inference. I would like to produce plot 6.4, but in order to do so I need to find the marginal distribution of a function of the data. To obtain the data and plot it, I did the following:
    library(dplyr)    # for %>%
    library(ggplot2)
    nodes <- read.table("https://web.stanford.edu/~hastie/CASI_files/DATA/nodes.txt", header = TRUE)
    nodes %>% ggplot(aes(x = x/n)) + geom_histogram(bins = 30)

Random sampling from a dataset, while preserving original probability distribution

不想你离开。 submitted on 2019-12-05 05:56:11
I have a set of >2000 numbers, gathered from measurement. I want to sample from this data set, ~10 times in each test, while preserving the probability distribution overall and in each test (to the extent approximately possible). For example, in each test I want some small values, some mid-range values, and some big values, with the mean and variance approximately close to those of the original distribution. Combining all the tests, I also want the total mean and variance of all the samples to be approximately close to those of the original distribution. As my dataset is a long-tail probability distribution, the amount of
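One possible approach, sketched here as an assumption rather than as the answer given in the original thread, is stratified sampling: sort the data, split it into k equal-sized quantile strata, and draw one value from each stratum per test. Every test then contains small, middle, and large values by construction. The data below is synthetic and the function name stratified_draw is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical long-tailed measurements standing in for the real data set.
data = rng.lognormal(mean=0.0, sigma=1.0, size=2000)

def stratified_draw(values, k, rng):
    """Draw k values, one from each of k equal-sized quantile strata."""
    ordered = np.sort(values)
    strata = np.array_split(ordered, k)
    return np.array([rng.choice(s) for s in strata])

# One test of ~10 samples.
test_sample = stratified_draw(data, 10, rng)

# Pool many tests to compare against the original distribution.
pooled = np.concatenate([stratified_draw(data, 10, rng) for _ in range(500)])
```

With equal-sized strata, each data point is equally likely to be drawn, so the pooled samples follow the original empirical distribution while each individual test is forced to cover its whole range.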

Generate random variables from a distribution function using inverse sampling

狂风中的少年 submitted on 2019-12-04 21:37:44
Question: I have a specific density function and I want to generate random variables knowing the expression of the density function. For example, the density function is:
    df=function(x) { - ((-a1/a2)*exp((x-a3)/a2))/(1+exp((x-a3)/a2))^2 }
From this expression I want to generate 1000 random elements with the same distribution. I know I should use the inverse sampling method. For this, I use the CDF of my PDF, which is calculated as follows:
    cdf=function(x) { 1 - a1/(1+exp((x-a3)/a2)) }
The idea
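A Python sketch of inverse transform sampling for this CDF. Setting u = 1 - a1/(1 + exp((x - a3)/a2)) and solving for x gives x = a3 + a2 * log(a1/(1 - u) - 1). The parameter values are hypothetical: a1 = 1 is assumed so that the CDF runs from 0 to 1 over the real line (the distribution is then logistic with location a3 and scale a2), and a2 = 2, a3 = 5 are chosen only for illustration.

```python
import numpy as np

# Hypothetical parameters; a1 = 1 makes cdf() a proper CDF on the real line.
a1, a2, a3 = 1.0, 2.0, 5.0

def cdf(x):
    """CDF from the question: 1 - a1 / (1 + exp((x - a3) / a2))."""
    return 1.0 - a1 / (1.0 + np.exp((x - a3) / a2))

def inv_cdf(u):
    """Closed-form inverse of cdf(), obtained by solving u = cdf(x) for x."""
    return a3 + a2 * np.log(a1 / (1.0 - u) - 1.0)

rng = np.random.default_rng(0)
u = rng.random(1000)
u = np.clip(u, 1e-12, 1 - 1e-12)   # avoid log(0) at the endpoints
x = inv_cdf(u)                     # 1000 draws with the target distribution
```

When the CDF has no closed-form inverse, the same idea still works with a numerical root finder or interpolation of a tabulated CDF in place of inv_cdf.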