statistics

Access or parse elements in summary() in R

ε祈祈猫儿з submitted on 2019-12-23 16:44:50
Question: I run the following R commands to perform Dunnett's test and get the summary. How can I access each row of the Linear Hypotheses table that is part of the summary output? Basically, I do not know the structure of the summary object. I tried names(), but that does not seem to work, as I do not see any named attribute to use.
library("multcomp")
Group <- factor(c("A","A","B","B","B","C","C","C","D","D","D","E","E","F","F","F"))
Value <- c(5,5.09901951359278,4.69041575982343,4.58257569495584,4
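A minimal sketch of one way to inspect that structure, assuming the usual multcomp workflow (aov followed by glht with a Dunnett contrast); the Value vector here is a made-up placeholder, since the original one is cut off above:

library(multcomp)
Group <- factor(c("A","A","B","B","B","C","C","C","D","D","D","E","E","F","F","F"))
Value <- rnorm(16, mean = 5)                       # placeholder data
fit <- aov(Value ~ Group)
dunnett <- summary(glht(fit, linfct = mcp(Group = "Dunnett")))
str(dunnett$test)                                  # the Linear Hypotheses table lives in the $test component
dunnett$test$coefficients                          # one estimate per comparison
dunnett$test$pvalues                               # adjusted p-values, in the same order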

Converting non-stationary to stationary

。_饼干妹妹 submitted on 2019-12-23 16:32:40
Question: I have a series that is not stationary and I'm trying to make it stationary. I tried a log transformation, a Box-Cox transformation, and differencing at lags 1, 2, and 3; none of these transformations or differences helped. I used the ADF test in R to test for stationarity. Can anybody suggest another method to make it stationary? data: 6.668 5.591 4.734 3.493 3.235 3.968 2.64 2.885 3.045 3.579 5.463 5.458 5.758 5.931 5.731 6.799 9.568 9.11 6.571 8.528 15.11 13.956 16.46 19.599 27.281 39.928 56.284 67.565 106.399
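A minimal sketch of the kind of check being described, assuming the tseries package; the vector below contains only the values quoted in the question:

library(tseries)
x <- c(6.668, 5.591, 4.734, 3.493, 3.235, 3.968, 2.64, 2.885, 3.045, 3.579,
       5.463, 5.458, 5.758, 5.931, 5.731, 6.799, 9.568, 9.11, 6.571, 8.528,
       15.11, 13.956, 16.46, 19.599, 27.281, 39.928, 56.284, 67.565, 106.399)
adf.test(x)           # raw series: the ADF test will usually not reject non-stationarity here
y <- diff(log(x))     # log to stabilise the variance, then first-difference to remove the trend
adf.test(y)           # re-test the transformed series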

unrated versus negative-rated entities with Wilson score — how to handle?

人走茶凉 submitted on 2019-12-23 16:22:32
Question: Having read How Not To Sort By Average Rating, I thought I should give it a try.
CREATE FUNCTION `mydb`.`LowerBoundWilson95` (pos FLOAT, neg FLOAT)
RETURNS FLOAT DETERMINISTIC
RETURN IF(
    pos + neg <= 0, 0,
    (
        (pos + 1.9208) / (pos + neg)
        - 1.96 * SQRT( (pos * neg) / (pos + neg) + 0.9604 ) / (pos + neg)
    ) / ( 1 + 3.8416 / (pos + neg) )
);
Running some tests, I discover that objects with pos=0 and neg>0 have very small, but non-negative scores, whereas an object with pos=neg=0 has a score of zero
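For reference, a small R transcription of the same 95% Wilson lower-bound formula (z = 1.96, so z^2 = 3.8416 and z^2/4 = 0.9604 are the constants baked into the SQL above); this is only an illustration of the formula under discussion, not code from the post:

wilson_lower <- function(pos, neg, z = 1.96) {
  n <- pos + neg
  if (n <= 0) return(0)
  phat <- pos / n
  (phat + z^2 / (2 * n) - z * sqrt((phat * (1 - phat) + z^2 / (4 * n)) / n)) /
    (1 + z^2 / n)
}
wilson_lower(10, 2)   # e.g. 10 positive, 2 negative ratings
wilson_lower(0, 5)    # all negative: the lower bound collapses to (essentially) zero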

Multicollinearity for Categorical Variables

ε祈祈猫儿з submitted on 2019-12-23 16:10:54
Question: For numerical/continuous data, to detect collinearity between predictor variables we use Pearson's correlation coefficient and make sure that the predictors are not correlated among themselves but are correlated with the response variable. But how can we detect multicollinearity if we have a dataset where the predictors are all categorical? I am sharing one dataset where I am trying to find out whether the predictor variables are correlated or not:
> A (Response Variable)  B    C    D
> Yes                    Yes  Yes  Yes
> No                     Yes
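One common way to check association between two categorical predictors is a pairwise chi-square test, often summarised as Cramér's V; a minimal sketch in R with made-up factors standing in for two of the predictors (B and C):

B <- factor(c("Yes","Yes","No","No","Yes","No","Yes","No"))
C <- factor(c("Yes","No","No","No","Yes","No","Yes","Yes"))
tab <- table(B, C)                        # contingency table of the two predictors
chi <- chisq.test(tab, correct = FALSE)   # association test (expected counts are tiny here; this is only a sketch)
chi
cramers_v <- sqrt(unname(chi$statistic) / (sum(tab) * (min(dim(tab)) - 1)))
cramers_v                                 # 0 = no association, 1 = perfect association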

Fit a mixture of von Mises distributions in R

℡╲_俬逩灬. submitted on 2019-12-23 15:53:16
Question: I have a set of angular data that I'd like to fit a mixture of two von Mises distributions to. As shown below, the data are clustered at about 0 and ±π, so a periodic boundary is required in this case. I have tried using the movMF package to fit a distribution to these data, but it seems to normalize each row, and since this is a set of 1D data, the result is a vector of ±1. How are others fitting a mixture of distributions like this in R?
Answer 1: The problem lies with using a
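The quoted answer is cut off above. A minimal sketch of one common workaround, under the assumption that movMF expects directional data as unit vectors rather than raw angles, is to map each angle θ to (cos θ, sin θ); the angles below are simulated, since the original data are not shown:

library(movMF)
set.seed(1)
theta <- c(rnorm(100, mean = 0, sd = 0.3), rnorm(100, mean = pi, sd = 0.3))  # clusters near 0 and pi
x <- cbind(cos(theta), sin(theta))                 # represent each angle as a point on the unit circle
fit <- movMF(x, k = 2)                             # two-component von Mises(-Fisher) mixture
atan2(fit$theta[, 2], fit$theta[, 1])              # recover the fitted mean directions (radians)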

How to make a sample from the empirical distribution function

懵懂的女人 submitted on 2019-12-23 15:28:52
Question: I'm trying to implement nonparametric bootstrapping in Python. It requires taking a sample, building an empirical distribution function (EDF) from it, and then generating a bunch of samples from this EDF. How can I do that? In scipy I only found how to build your own distribution when you know the exact formula describing it, but I only have an EDF.
Answer 1: You get the EDF by sorting the samples:
N = samples.size
ss = np.sort(samples)  # these are the x-values of the edf
# the y-values are 1/(2N)
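The question is about Python, but the key observation carries over directly: drawing from the EDF of a sample is the same as resampling that sample with replacement. A minimal sketch of that idea in R (the language used by most entries on this page), with a made-up sample:

set.seed(42)
samples <- rnorm(50)                                                   # placeholder for the observed sample
boot_one <- sample(samples, size = length(samples), replace = TRUE)    # one bootstrap resample from the EDF
boot_means <- replicate(1000, mean(sample(samples, replace = TRUE)))   # bootstrap distribution of the mean
quantile(boot_means, c(0.025, 0.975))                                  # e.g. a 95% percentile interval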

How can I estimate the shape and scale of a gamma dist. with a particular mean and a 95% quantile?

倾然丶 夕夏残阳落幕 submitted on 2019-12-23 12:26:50
Question: Is there any way, in R, to calculate the scale and shape of a gamma distribution, given a particular value of the mean (or median) and a particular quantile (the 95% quantile)? So, for example, I have a mean = 130 and a 95% quantile = 300, with an offset of the distribution at 80. Is there any way to obtain the scale and shape of a gamma that meet these criteria?
Answer 1: Here is one approach:
myfun <- function(shape) {
  scale <- 130/shape
  pgamma(300, shape, scale=scale) - 0.95
}
tmp <- uniroot( myfun,
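The answer's uniroot call is cut off above; a self-contained sketch of the same root-finding idea follows (the bracketing interval c(1, 100) is my own guess, not taken from the original answer, and the offset mentioned in the question is ignored, as in the quoted snippet):

# mean = shape * scale = 130 fixes the scale once the shape is known;
# find the shape whose 95% quantile equals 300
myfun <- function(shape) {
  scale <- 130 / shape
  pgamma(300, shape, scale = scale) - 0.95
}
tmp <- uniroot(myfun, interval = c(1, 100))   # assumed search interval with a sign change
shape <- tmp$root
scale <- 130 / shape
qgamma(0.95, shape, scale = scale)            # sanity check: should be close to 300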

Vectorizing the solution of a linear equation system in MATLAB

为君一笑 submitted on 2019-12-23 12:13:49
Question: Summary: This question deals with improving an algorithm for the computation of linear regression. I have a 3D array (dlMAT) representing monochrome photographs of the same scene taken at different exposure times (the vector IT). Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form DL = R*IT^P, where DL and IT are obtained
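Taking logs turns the model into a linear one, log(DL) = log(R) + P*log(IT), so every pixel's regression shares the same design matrix and can be solved in one shot. The question is about MATLAB; the sketch below shows the same log-linearisation trick in R, where lm() accepts a matrix response and fits all pixels in a single call (all dimensions and data here are made up):

set.seed(1)
IT <- c(1, 2, 4, 8, 16, 32)                                     # made-up exposure times
R_true <- matrix(runif(20, 0.5, 2), 4, 5)                       # per-pixel coefficient R
P_true <- matrix(runif(20, 0.8, 1.2), 4, 5)                     # per-pixel exponent P
dlMAT <- array(0, dim = c(4, 5, length(IT)))
for (k in seq_along(IT)) dlMAT[, , k] <- R_true * IT[k]^P_true  # DL = R * IT^P, noise-free for the demo
Y <- log(t(apply(dlMAT, 3, as.vector)))                         # 6 x 20: one column of log(DL) per pixel
fit <- lm(Y ~ log(IT))                                          # all 20 regressions in one call
P_hat <- matrix(coef(fit)[2, ], 4, 5)                           # slopes -> exponent P per pixel
R_hat <- matrix(exp(coef(fit)[1, ]), 4, 5)                      # exp(intercepts) -> coefficient R per pixel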

Why is the Pearson correlation output NaN?

依然范特西╮ submitted on 2019-12-23 11:11:35
Question: I'm trying to get the Pearson correlation coefficient between two variables in R. This is the scatterplot of the variables:
ggplot(results_summary, aes(x = D_in, y = D_ex)) +
  geom_point(col = ifelse(results_summary$FDR < 0.05,
                          ifelse(results_summary$logF > 0, "red", "green"),
                          "black"))
As you can see, the variables correlate pretty well, so I'm expecting a high correlation coefficient. However, when I try to get the Pearson correlation coefficient, I get NaN!
> cor(results_summary$D_in,
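A NaN from cor() usually points to missing or undefined values in one of the columns (a zero standard deviation is the other common culprit). A tiny reproducible sketch with a made-up data frame standing in for results_summary:

results_summary <- data.frame(D_in = c(1, 2, NaN, 4, 5),              # one undefined value is enough
                              D_ex = c(1.1, 2.2, 3.0, 4.1, 5.2))
cor(results_summary$D_in, results_summary$D_ex)                       # NaN: the bad value propagates
cor(results_summary$D_in, results_summary$D_ex, use = "complete.obs") # drop incomplete pairs first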
