statistics

Create random numbers with left skewed probability distribution

风流意气都作罢 submitted on 2019-12-12 14:41:10
Question: I would like to pick a number randomly from 1-100 such that numbers in 60-100 are more likely than numbers in 1-59. The probabilities should follow a left-skewed distribution over 1-100, that is, one with a long left tail and a peak near the upper end. Something along these lines:

    pers = np.arange(1,101,1)
    prob = <left-skewed distribution>
    number = np.random.choice(pers, 1, p=prob)

I do not know how to generate a left-skewed discrete probability function. Any ideas? Thanks!

Answer 1: Like
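One possible way to build the `prob` vector asked for in this question is sketched below with only the Python standard library. It evaluates a Beta-shaped weight x^(a-1) * (1-x)^(b-1) at the midpoint of each value's bin and normalizes the result; choosing a > b puts the peak near the top and the long tail on the left. The helper name `left_skewed_probs` and the parameters a=5, b=2 are illustrative assumptions, not part of the original post.

```python
import random

def left_skewed_probs(n=100, a=5.0, b=2.0):
    """Return n probabilities for the values 1..n with a long left tail.

    The weights follow the shape of a Beta(a, b) density; a > b moves the
    peak toward the upper end. a and b are illustrative, not from the post.
    """
    # Map value k to the midpoint of its bin in (0, 1) so no weight is exactly 0.
    xs = [(k - 0.5) / n for k in range(1, n + 1)]
    weights = [x ** (a - 1) * (1 - x) ** (b - 1) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]

pers = list(range(1, 101))
prob = left_skewed_probs()
# Draw one value; values near the peak (around 80) are the most likely.
number = random.choices(pers, weights=prob, k=1)[0]
```

The same `prob` list sums to 1, so it can also be passed as the `p=` argument of `np.random.choice` from the question.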

ggplot2: Plotting regression lines with different intercepts but with same slope

℡╲_俬逩灬. submitted on 2019-12-12 14:06:52
Question: I want to plot regression lines with different intercepts but the same slope. With the following ggplot2 code I can plot regression lines with different intercepts and different slopes, but I could not figure out how to draw regression lines with different intercepts and the same slope.

    library(ggplot2)
    ggplot(data=df3, mapping=aes(x=Income, y=Consumption, color=Gender)) + geom_point() + geom_smooth(data=df3, method = "lm", se=FALSE, mapping=aes(x=Income, y=Consumption))
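A common-slope model of this kind is usually written in R as `lm(Consumption ~ Income + Gender)`. The arithmetic behind that fit can be sketched in Python as follows: the shared slope pools the within-group cross-products, and each group's intercept follows from its own means. The function name `parallel_slopes` and the sample data are hypothetical and only serve to illustrate the calculation.

```python
from collections import defaultdict

def parallel_slopes(x, y, group):
    """Least-squares fit of y = intercept[group] + slope * x with one shared slope."""
    by_group = defaultdict(list)
    for xi, yi, gi in zip(x, y, group):
        by_group[gi].append((xi, yi))

    sxy = 0.0
    sxx = 0.0
    means = {}
    for g, pts in by_group.items():
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        means[g] = (mx, my)
        # Pool the within-group cross-products and squared deviations.
        sxy += sum((px - mx) * (py - my) for px, py in pts)
        sxx += sum((px - mx) ** 2 for px, _ in pts)

    slope = sxy / sxx
    # Each group's line passes through that group's mean point.
    intercepts = {g: my - slope * mx for g, (mx, my) in means.items()}
    return slope, intercepts

# Hypothetical data standing in for df3 (Income, Consumption, Gender).
income = [10, 20, 30, 10, 20, 30]
consumption = [12, 22, 32, 7, 17, 27]
gender = ["F", "F", "F", "M", "M", "M"]
slope, intercepts = parallel_slopes(income, consumption, gender)
```

Once the slope and per-group intercepts are known, the lines can be drawn in ggplot2 with one `geom_abline` per group instead of `geom_smooth`.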

Any alternatives to Google Trends? [closed]

蓝咒 submitted on 2019-12-12 13:32:39
Question: Closed as off-topic 6 years ago; not currently accepting answers. I'm writing a small helper utility for obscure software that is used at a local shop. Basically, I would like to know whether anyone searches for anything associated with that software and whether publishing my work on the Internet would make any sense. I entered the name of the software into Google Trends, but my terms

mean and variance of image in single pass

孤街醉人 submitted on 2019-12-12 13:16:51
Question: I am trying to calculate the mean and variance over a 3x3 window of an image (h x w) in OpenCV. Here is my code. Are there any accuracy issues with it? Or is there a more efficient method to do it in one pass?

    int pi,a,b;
    for(i=1;i<h-1;i++) {
        for(j=1;j<w-1;j++) {
            int sq=0,sum=0;
            double mean=0;
            double var=0;
            for(a=-1;a<=1;a++) {
                for(b=-1;b<=1;b++) {
                    pi=data[(i+a)*step+(j+b)];
                    sq=pi*pi;
                    sum=sum+sq;
                    mean=mean+pi;
                }
            }
            mean=mean/9;
            double soa=mean*mean;//square of average
            double aos=sum/9;//mean of
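One accuracy issue is visible in the fragment above: `sum` is an `int`, so `sum/9` is an integer division in C and the mean of squares is truncated before it is stored in `aos`. The sketch below shows the same single-pass idea in Python, using var = E[x^2] - (E[x])^2 with floating-point division. It reuses the names `data`, `h`, `w` and `step` from the question, while the function name `window_mean_var` and the test image are assumptions for illustration.

```python
def window_mean_var(data, h, w, step):
    """Mean and population variance of every interior 3x3 window.

    data is a flat row-major buffer of pixel intensities with row stride
    step. Returns two dicts keyed by the window centre (i, j).
    """
    means = {}
    variances = {}
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            total = 0     # running sum of pixels
            total_sq = 0  # running sum of squared pixels
            for a in (-1, 0, 1):
                for b in (-1, 0, 1):
                    p = data[(i + a) * step + (j + b)]
                    total += p
                    total_sq += p * p
            mean = total / 9.0
            # E[x^2] - (E[x])^2; clamp small negative values caused by rounding.
            var = max(total_sq / 9.0 - mean * mean, 0.0)
            means[(i, j)] = mean
            variances[(i, j)] = var
    return means, variances

# Hypothetical 3x3 image stored row by row.
image = [1, 2, 3,
         4, 5, 6,
         7, 8, 9]
means, variances = window_mean_var(image, h=3, w=3, step=3)
```

For 8-bit pixels both running sums stay small (at most 9 * 255 and 9 * 255 * 255), so integer accumulation is exact and only the final divisions need floating point.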

Smoothing Small Data Set With Second Order Quadratic Curve

▼魔方 西西 submitted on 2019-12-12 12:34:57
Question: I'm doing some specific signal analysis, and I need a method that smooths a given bell-shaped distribution curve. A running-average approach isn't producing the results I want. I want to keep the min/max and general shape of my fitted curve intact, but resolve the inconsistencies in sampling. In short: given a set of data that models a simple quadratic curve, what statistical smoothing method would you recommend? If possible, please reference an implementation, library
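One candidate method for data that follows a quadratic curve is an ordinary least-squares fit of a second-order polynomial, evaluated back at the sample positions. The sketch below solves the 3x3 normal equations directly in plain Python; it is an illustration under that assumption, not necessarily what the answers to this question recommend. The names `fit_quadratic` and `smooth` and the sample values are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x**2."""
    # Normal equations: A[r][c] = sum(x**(r+c)), rhs[r] = sum(y * x**r).
    A = [[float(sum(x ** (r + c) for x in xs)) for c in range(3)] for r in range(3)]
    rhs = [float(sum(y * x ** r for x, y in zip(xs, ys))) for r in range(3)]

    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]

    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, 3))
        coeffs[r] = (rhs[r] - s) / A[r][r]
    return tuple(coeffs)

def smooth(xs, ys):
    """Replace each sample by the value of the fitted quadratic at the same x."""
    c0, c1, c2 = fit_quadratic(xs, ys)
    return [c0 + c1 * x + c2 * x * x for x in xs]

# Hypothetical noisy samples of a downward-opening curve.
xs = [0, 1, 2, 3, 4]
ys = [0.1, 2.9, 4.2, 2.8, 0.0]
smoothed = smooth(xs, ys)
```

A global fit like this keeps the overall shape but does not reproduce the exact sampled minimum and maximum, so it only partly meets the requirement stated in the question.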

Orthogonal matching pursuit regression - am I using it wrong?

别等时光非礼了梦想. submitted on 2019-12-12 12:23:52
Question: I am trying out this method as a regularized regression, as an alternative to lasso and elastic net. I have 40k data points and 40 features. Lasso selects 5 features, and orthogonal matching pursuit selects only 1. What could be causing this? Am I using OMP the wrong way? Perhaps it is not meant to be used as a regression. Please let me know if you can think of anything else I may be doing wrong.

Answer 1: Orthogonal Matching Pursuit seems a bit broken, or at least very sensitive to input data, as

Density of a Two-Piece Normal (or Split Normal) Distribution

流过昼夜 submitted on 2019-12-12 12:14:06
Question: Is there a density function for the two-piece Normal distribution on CRAN? I thought I would check before I code one. I have checked the distribution task view; it is not listed there. I have looked in a couple of likely packages, but to no avail. Update: I have added dsplitnorm, psplitnorm, qsplitnorm and rsplitnorm functions to the fanplot package.

Answer 1: If you choose to construct your own version of the distribution, you might be interested in distr. It (and the related packages distrEx ,
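For readers who want to see what such a density looks like in code, here is a minimal Python sketch of the two-piece normal in its common (mode, sigma1, sigma2) parametrization: a normal shape with scale sigma1 to the left of the mode and sigma2 to the right, both scaled by sqrt(2/pi) / (sigma1 + sigma2) so the total area is 1. The function name `split_normal_pdf` is hypothetical, and the argument convention of `dsplitnorm` in fanplot may differ from the one assumed here.

```python
import math

def split_normal_pdf(x, mode, sigma1, sigma2):
    """Density of the two-piece normal distribution at x.

    sigma1 is the scale to the left of the mode, sigma2 the scale to the
    right. This parametrization is an assumption of this sketch.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("sigma1 and sigma2 must be positive")
    # A shared normalizing constant keeps the density continuous at the mode.
    norm = math.sqrt(2.0 / math.pi) / (sigma1 + sigma2)
    sigma = sigma1 if x < mode else sigma2
    return norm * math.exp(-((x - mode) ** 2) / (2.0 * sigma ** 2))
```

With sigma1 equal to sigma2 the expression reduces to the ordinary normal density, which gives a quick sanity check.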

Can we generate contingency table for chisquare test using python?

最后都变了- submitted on 2019-12-12 11:51:48
Question: I am using the scipy.stats.chi2_contingency method to get chi-square statistics. It needs a frequency table, i.e. a contingency table, as its parameter. But I have a feature vector and want to generate the frequency table automatically. Is there a function available for this? I am currently doing it like this:

    def contigency_matrix_categorical(data_series,target_series,target_val,indicator_val):
        observed_freq={}
        for targets in target_val:
            observed_freq[targets]={}
            for indicators in indicator_val:
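A frequency table of this kind can be built from two categorical sequences with a counter over (feature, target) pairs. The sketch below uses only the standard library; the function name `contingency_table` and the sample data are assumptions for illustration. `pandas.crosstab` offers the same cross-tabulation when pandas is available.

```python
from collections import Counter

def contingency_table(feature, target):
    """Cross-tabulate two equal-length categorical sequences.

    Returns (row_labels, col_labels, table), where table[r][c] counts the
    observations with feature == row_labels[r] and target == col_labels[c].
    """
    if len(feature) != len(target):
        raise ValueError("feature and target must have the same length")
    counts = Counter(zip(feature, target))
    rows = sorted(set(feature))
    cols = sorted(set(target))
    # Missing combinations count as 0 because Counter returns 0 for absent keys.
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

# Hypothetical feature vector and target labels.
feature = ["a", "b", "a", "a", "b", "b"]
target = [1, 0, 1, 0, 0, 1]
rows, cols, table = contingency_table(feature, target)
```

The resulting nested list of counts is array-like, so it can be passed directly to `scipy.stats.chi2_contingency`.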

R MICE imputation failing

笑着哭i submitted on 2019-12-12 11:13:15
Question: I am really baffled about why my imputation is failing in R's mice 2.22 package. I am attempting a very simple operation with the following data frame:

    > dfn
       a b c  d
    1  0 1 0  1
    2  1 0 0  0
    3  0 0 0  0
    4 NA 0 0  0
    5  0 0 0 NA

I then use mice in the following way to perform a simple mean imputation:

    imp <- mice(dfn, method = "mean", m = 1, maxit =1)
    filled <- complete(imp)

However, my completed data looks like this:

    > fill
         a b c  d
    1 0.00 1 0  1
    2 1.00 0 0  0
    3 0.00 0 0  0
    4 0.25 0 0  0
    5 0.00 0 0 NA

Why
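For reference, the result that plain column-mean imputation would give on this data can be sketched in Python as below. This only illustrates the expected arithmetic and does not explain or reproduce the behaviour of mice; the function name `impute_column_means` is hypothetical, and `None` stands in for R's `NA`.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if v is None else v for c, v in enumerate(row)] for row in rows]

# The data frame from the question, columns a, b, c, d.
dfn = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [None, 0, 0, 0],
    [0, 0, 0, None],
]
filled = impute_column_means(dfn)
```

Column a has observed values 0, 1, 0, 0, so its missing entry becomes 0.25, matching the output in the question. Column d has observed values 1, 0, 0, 0, so simple mean imputation would also fill its missing entry with 0.25 rather than leaving NA.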

AWStats or Google Analytics? Which is more accurate?

假如想象 submitted on 2019-12-12 10:46:24
Question: I have AWStats, provided by my hosting provider. I also have Google Analytics set up. The two show different statistics. Which should I trust? Which of the two is more accurate? Should I use something else to get accurate statistics?

Answer 1: They measure in different ways. AWStats analyzes server logs, so its numbers include crawlers and bots, as well as end users with JavaScript disabled and users who opted out of Google Analytics, none of which Google Analytics measures. AWStats constructs