statistics | 易学教程

Create random numbers with left skewed probability distribution

Question: I would like to pick a random number between 1 and 100 such that the probability of getting a number in 60-100 is higher than in 1-59. I would like the probabilities to follow a left-skewed distribution over the numbers 1-100. That is to say, it has a long tail and a peak. Something along these lines:

    pers = np.arange(1, 101, 1)
    prob = <left-skewed distribution>
    number = np.random.choice(pers, 1, p=prob)

I do not know how to generate a left-skewed discrete probability function. Any ideas? Thanks!

Answer 1: Like
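One possible way to fill in the `prob` placeholder from the question is to evaluate a Beta-shaped curve on the grid 1-100 and normalise it. The sketch below is only an illustration: the helper name `left_skewed_probs` and the shape parameters `a=5`, `b=2` are assumptions, not taken from the question or its answers. With `a > b` the peak sits near the high end and the long tail points toward 1.

```python
import numpy as np

def left_skewed_probs(n=100, a=5.0, b=2.0):
    """Discrete probabilities for 1..n shaped like a Beta(a, b) density.

    a and b are illustrative choices; a > b gives a left-skewed shape.
    """
    # Evaluate at bin midpoints inside (0, 1) so no weight is exactly zero.
    x = (np.arange(1, n + 1) - 0.5) / n
    weights = x ** (a - 1) * (1 - x) ** (b - 1)
    # Normalise so the weights sum to 1, as np.random.choice requires.
    return weights / weights.sum()

pers = np.arange(1, 101, 1)
prob = left_skewed_probs(len(pers))

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
number = rng.choice(pers, 1, p=prob)
```

Raising `a` relative to `b` pushes more of the mass toward 100; swapping the two would give a right-skewed shape instead.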

ggplot2: Plotting regression lines with different intercepts but with same slope

Question: I want to plot regression lines with different intercepts but with the same slope. With the following ggplot2 code, I can plot regression lines with different intercepts and different slopes, but I could not figure out how to draw regression lines with different intercepts and the same slope.

    library(ggplot2)
    ggplot(data=df3, mapping=aes(x=Income, y=Consumption, color=Gender)) +
      geom_point() +
      geom_smooth(data=df3, method = "lm", se=FALSE, mapping=aes(x=Income, y=Consumption))
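The model behind "different intercepts, same slope" is a single regression with one shared slope on Income plus a group indicator for Gender (in R this would typically be written `lm(Consumption ~ Income + Gender)`). The sketch below shows that model numerically in Python with NumPy. The data are synthetic stand-ins, since `df3` is not shown in the question, and all variable names other than the three column names are assumptions.

```python
import numpy as np

# Synthetic stand-in for df3 (assumption: the real data are not shown).
rng = np.random.default_rng(1)
income = rng.uniform(20, 100, size=60)
gender = np.repeat(["Female", "Male"], 30)
is_male = (gender == "Male").astype(float)
consumption = 5 + 0.6 * income + 8 * is_male + rng.normal(0, 2, size=60)

# Design matrix: intercept, group indicator, and ONE slope shared by both groups.
X = np.column_stack([np.ones_like(income), is_male, income])
coef, *_ = np.linalg.lstsq(X, consumption, rcond=None)
intercept_female, male_shift, common_slope = coef

# One (intercept, slope) pair per group; the slopes are identical by construction.
lines = {
    "Female": (intercept_female, common_slope),
    "Male": (intercept_female + male_shift, common_slope),
}
```

Each entry of `lines` describes one parallel line, so plotting the fitted values per group gives lines that differ only in height.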

Any alternatives to Google Trends? [closed]

Question: Closed. This question is off-topic. It is not currently accepting answers. Closed 6 years ago. I'm writing a small helper utility for obscure software that is used at a local shop. Basically, I would like to know whether anyone searches for anything associated with that software, and whether publishing my work on the Internet would make any sense. I entered the name of the software into Google Trends, but my terms

mean and variance of image in single pass

Question: I am trying to calculate the mean and variance using a 3x3 window over an image (h x w) in OpenCV. Here is my code. Are there any accuracy issues with it? Or is there a more efficient method to do it in one pass?

    int pi, a, b;
    for (i = 1; i < h - 1; i++) {
        for (j = 1; j < w - 1; j++) {
            int sq = 0, sum = 0;
            double mean = 0;
            double var = 0;
            for (a = -1; a <= 1; a++) {
                for (b = -1; b <= 1; b++) {
                    pi = data[(i + a) * step + (j + b)];
                    sq = pi * pi;
                    sum = sum + sq;
                    mean = mean + pi;
                }
            }
            mean = mean / 9;
            double soa = mean * mean; // square of average
            double aos = sum / 9; // mean of
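The single-pass idea in the question rests on the identity var = E[x^2] - (E[x])^2: accumulate the sum and the sum of squares in one sweep of the window, then divide. A minimal Python sketch of that identity follows; the function name `window_mean_var` and the test image are assumptions for illustration. One detail that may matter for accuracy in the C version is that `sum / 9` with an `int` `sum` is integer division in C, so the sketch divides by `9.0` after accumulating in integers.

```python
import numpy as np

def window_mean_var(img, i, j):
    """Mean and population variance of the 3x3 window centred at (i, j)."""
    total = 0     # running sum of pixel values (exact integer)
    total_sq = 0  # running sum of squared pixel values (exact integer)
    for a in (-1, 0, 1):
        for b in (-1, 0, 1):
            p = int(img[i + a, j + b])  # widen from uint8 to avoid overflow
            total += p
            total_sq += p * p
    mean = total / 9.0
    var = total_sq / 9.0 - mean * mean  # E[x^2] - (E[x])^2
    return mean, var

# Small illustrative image; the window centred at (2, 2) covers rows and columns 1..3.
img = np.arange(25, dtype=np.uint8).reshape(5, 5)
mean, var = window_mean_var(img, 2, 2)
```

For 8-bit pixels and a 9-element window the integer sums are exact, so the only rounding happens in the final divisions and subtraction.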

Smoothing Small Data Set With Second Order Quadratic Curve

Question: I'm doing some specific signal analysis, and I need a method that would smooth out a given bell-shaped distribution curve. A running-average approach isn't producing the results I want. I want to keep the min/max and the general shape of my fitted curve intact, but resolve the inconsistencies in sampling. In short: given a set of data that models a simple quadratic curve, what statistical smoothing method would you recommend? If possible, please reference an implementation, library
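If the data really do follow a simple quadratic, one straightforward option is a least-squares fit of a degree-2 polynomial, which replaces the noisy samples with points on the fitted parabola. The sketch below uses NumPy's `polyfit` and `polyval`; the sample data are synthetic and purely illustrative, and this is one candidate approach rather than the method recommended in the original thread.

```python
import numpy as np

# Synthetic noisy samples around a downward-opening parabola (illustrative only).
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 15)
y_noisy = 10 - 1.5 * x**2 + rng.normal(0, 0.5, size=x.size)

# Least-squares fit of y = c2*x^2 + c1*x + c0; coefficients come highest degree first.
coeffs = np.polyfit(x, y_noisy, deg=2)

# Smoothed values: the fitted parabola evaluated at the original sample positions.
y_smooth = np.polyval(coeffs, x)
```

Because the fit is global, it keeps the overall shape and the location of the extremum, at the cost of assuming the quadratic form holds across the whole range.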

Orthogonal matching pursuit regression - am I using it wrong?

Question: I am trying out this method as a regularized regression, as an alternative to lasso and elastic net. I have 40k data points and 40 features. Lasso selects 5 features, and orthogonal matching pursuit selects only 1. What could be causing this? Am I using OMP the wrong way? Perhaps it is not meant to be used as a regression. Please let me know if you can think of anything else I may be doing wrong.

Answer 1: Orthogonal Matching Pursuit seems a bit broken, or at least very sensitive to input data, as

Density of a Two-Piece Normal (or Split Normal) Distribution

Question: Is there a density function for the two-piece Normal distribution on CRAN? I thought I would check before I code one. I have checked the distribution task view; it is not listed there. I have looked in a couple of likely packages, but to no avail. Update: I have added dsplitnorm, psplitnorm, qsplitnorm and rsplitnorm functions to the fanplot package.

Answer 1: If you choose to construct your own version of the distribution, you might be interested in distr. It (and the related packages distrEx,
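For readers who want to see what such a density looks like, the usual two-piece normal joins two half-normal curves at the mode: a standard deviation sd1 applies to the left of the mode and sd2 to the right, with the common scale factor sqrt(2/pi) / (sd1 + sd2). The Python sketch below implements that formula; the function name `split_normal_pdf` and its argument names are assumptions for illustration and do not mirror the signature of the fanplot functions mentioned above.

```python
import math

def split_normal_pdf(x, mode, sd1, sd2):
    """Two-piece normal density: sd1 applies left of the mode, sd2 right of it."""
    # Common scale factor that makes the two halves join and integrate to 1.
    scale = math.sqrt(2.0 / math.pi) / (sd1 + sd2)
    sd = sd1 if x <= mode else sd2
    return scale * math.exp(-((x - mode) ** 2) / (2.0 * sd * sd))

# With equal standard deviations this reduces to the ordinary normal density.
peak_symmetric = split_normal_pdf(0.0, 0.0, 1.0, 1.0)

# A left-skewed example: wider spread below the mode than above it.
left_value = split_normal_pdf(-1.0, 0.0, 2.0, 1.0)
right_value = split_normal_pdf(1.0, 0.0, 2.0, 1.0)
```

In the last example `left_value` is larger than `right_value`, because the wider left half decays more slowly away from the mode.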

Can we generate contingency table for chisquare test using python?

Question: I am using the scipy.stats.chi2_contingency method to get chi-square statistics. It needs a frequency table, i.e. a contingency table, as its parameter. But I have a feature vector and want to generate the frequency table automatically. Is there a function available for this? I am currently doing it like this:

    def contigency_matrix_categorical(data_series,target_series,target_val,indicator_val):
        observed_freq={}
        for targets in target_val:
            observed_freq[targets]={}
            for indicators in indicator_val:
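A contingency table is just a count of how often each (feature value, target value) pair occurs, so it can be built in a few lines with the standard library. The sketch below uses `collections.Counter`; the function name `contingency_table` and the sample data are assumptions for illustration. If pandas is available, `pandas.crosstab` produces the same kind of table directly.

```python
from collections import Counter

def contingency_table(features, targets):
    """Count co-occurrences of feature values (rows) and target values (columns)."""
    counts = Counter(zip(features, targets))
    row_labels = sorted(set(features))
    col_labels = sorted(set(targets))
    # Counter returns 0 for pairs that never occur, so the table is always full.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

# Illustrative data only.
features = ["a", "b", "a", "a", "b", "c"]
targets = [1, 0, 1, 0, 0, 1]
rows, cols, table = contingency_table(features, targets)
```

The resulting nested list is a plain 2-D table of observed frequencies, which is the shape of input that scipy.stats.chi2_contingency expects.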

R MICE imputation failing

Question: I am really baffled about why my imputation is failing in R's mice 2.22 package. I am attempting a very simple operation with the following data frame:

    > dfn
       a b c  d
    1  0 1 0  1
    2  1 0 0  0
    3  0 0 0  0
    4 NA 0 0  0
    5  0 0 0 NA

I then use mice in the following way to perform a simple mean imputation:

    imp <- mice(dfn, method = "mean", m = 1, maxit =1)
    filled <- complete(imp)

However, my completed data looks like this:

    > fill
         a b c  d
    1 0.00 1 0  1
    2 1.00 0 0  0
    3 0.00 0 0  0
    4 0.25 0 0  0
    5 0.00 0 0 NA

Why

AWStats or Google Analytics? Which is more accurate?

Question: I have AWStats provided by my hosting service provider. I have Google Analytics set up as well. But the two show different statistics. Which should I trust? Which of the two is more accurate? Should I use something else to get accurate statistics?

Answer 1: They measure in different ways. AWStats analyzes server logs, which include crawlers and bots, as well as end users with JavaScript disabled and users who have opted out of Google Analytics, none of which Google Analytics measures. AWStats constructs