linear-regression

Categorical and ordinal feature data difference in regression analysis?

走远了吗 · Submitted on 2021-02-19 05:18:09
Question: I am trying to completely understand the difference between categorical and ordinal data when doing regression analysis. For now, what is clear:

- Categorical feature and data example — Color: red, white, black. Why categorical: red < white < black is logically incorrect.
- Ordinal feature and data example — Condition: old, renovated, new. Why ordinal: old < renovated < new is logically correct.
- Categorical-to-numeric and ordinal-to-numeric encoding methods: One-Hot encoding for categorical data; Arbitrary …
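
To make the two encodings concrete, here is a minimal pandas sketch; the column names and the rank mapping are illustrative, not taken from the question:

import pandas as pd

df = pd.DataFrame({
    'color': ['red', 'white', 'black'],        # categorical: no natural order
    'condition': ['old', 'renovated', 'new'],  # ordinal: old < renovated < new
})

# One-hot encoding: each color becomes its own 0/1 column,
# so the model never sees a spurious ordering like red < white.
one_hot = pd.get_dummies(df['color'], prefix='color')

# Ordinal encoding: map levels to integers that preserve the logical order.
condition_rank = {'old': 0, 'renovated': 1, 'new': 2}
df['condition_num'] = df['condition'].map(condition_rank)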

Linear regression on raster images - lm complains about NAs

喜你入骨 · Submitted on 2021-02-18 19:13:46
Question: I'm sure this can be fixed with a few bytes, but I've spent hours on this simple thing and can't get out of it. I don't use R often. I have 5 asciigrid files that represent 5 raster images. Some pixels have values, others have NAs. For example, the first image might be something like:

NA NA  NA  NA NA
NA NA  2   3  NA
NA 0.2 0.3 1  NA
NA NA  4   NA NA

and the second might be:

NA NA  NA NA NA
NA NA  5  1  NA
NA 0.1 12 12 NA
NA NA  6  NA NA

As you can see, the NA positions are always the same and I'm 100% sure …
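
The question is about R's lm, but the underlying fix is language-agnostic: mask out the pixels that are NA before fitting. As a rough illustration of that masking idea, here is a hypothetical numpy sketch (the stack shape, the NaN pattern, and the layer-index covariate are all assumptions, not from the question):

import numpy as np

# Five hypothetical rasters stacked into a (5, 4, 5) array; NA becomes np.nan.
stack = np.random.rand(5, 4, 5)
stack[:, 0, :] = np.nan                      # pretend the first row is all NA

t = np.arange(stack.shape[0], dtype=float)   # covariate: layer index (e.g. time)
valid = ~np.isnan(stack).any(axis=0)         # pixels non-NA in every layer
y = stack[:, valid]                          # (5, n_valid) response matrix

# One least-squares solve fits an intercept and slope for every valid pixel.
X = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # coef has shape (2, n_valid)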

Print OLS regression summary to text file

╄→гoц情女王★ · Submitted on 2021-02-18 05:56:41
Question: I am running an OLS regression with pandas.stats.api.ols inside a groupby, using the following code:

from pandas.stats.api import ols

df = pd.read_csv(r'F:\file.csv')
result = df.groupby(['FID']).apply(
    lambda d: ols(y=d.loc[:, 'MEAN'], x=d.loc[:, ['Accum_Prcp', 'Accum_HDD']]))
for i in result:
    x = pd.DataFrame({'FID': i.index, 'delete': i.values})
    frame = pd.concat([x, DataFrame(x['delete'].tolist())], axis=1, join='outer')
    del frame['delete']
    print frame

but this returns the error: AttributeError: 'OLS' …
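
Note that pandas.stats.api.ols was deprecated and later removed from pandas, so current code needs a different route; statsmodels is the usual replacement. A minimal sketch of one way to write a per-group OLS summary to a text file, reusing the column names from the question (the output filename is made up):

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv(r'F:\file.csv')

with open('ols_summaries.txt', 'w') as f:
    for fid, group in df.groupby('FID'):
        X = sm.add_constant(group[['Accum_Prcp', 'Accum_HDD']])
        res = sm.OLS(group['MEAN'], X).fit()
        f.write('FID: {}\n{}\n\n'.format(fid, res.summary()))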

What package in R is used to calculate non-zero null hypothesis p-values on linear models?

a 夏天 · Submitted on 2021-02-17 06:17:06
Question: The standard summary(lm(Height~Weight)) will output results for the hypothesis test H0: beta1 = 0, but if I am interested in testing the hypothesis H0: beta1 = 1, is there a package that will produce that p-value? I know I can calculate it by hand, and I know I can "flip the confidence interval" for a two-tailed test (test a 95% hypothesis by seeing if the 95% confint contains the point of interest), but I am looking for an easy way to generate the p-values for a simulation study.

Answer 1: You can use …
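
The by-hand calculation the question mentions is short: shift the estimated slope by the hypothesized value, divide by its standard error, and compare the t statistic against the residual degrees of freedom. A minimal Python sketch with statsmodels (the simulated Height/Weight data is illustrative):

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
weight = rng.normal(70, 10, size=200)
height = 100 + 1.05 * weight + rng.normal(0, 5, size=200)

fit = sm.OLS(height, sm.add_constant(weight)).fit()
b1, se1 = fit.params[1], fit.bse[1]

t_stat = (b1 - 1.0) / se1                            # test H0: beta1 = 1
p_value = 2 * stats.t.sf(abs(t_stat), fit.df_resid)  # two-sided p-value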

How to instantiate a Scikit-Learn linear model with known coefficients without fitting it

一笑奈何 · Submitted on 2021-02-11 14:48:04
Question: Background: I am testing various saved models as part of an experiment, but one of the models comes from an algorithm I wrote, not from fitting an sklearn model. However, my custom model is still a linear model, so I want to instantiate a LinearRegression instance and set the coef_ and intercept_ attributes to the values from my custom fitting algorithm so I can use it for predictions. What I have tried so far:

from sklearn.linear_model import LinearRegression

my_intercepts = np.ones(2)
my_coefficients = …
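
For context, this approach generally works because LinearRegression.predict only reads the coef_ and intercept_ attributes. A minimal single-output sketch (the coefficient values are made up; the question's two-element intercept suggests a multi-output variant where coef_ has shape (2, n_features)):

import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.coef_ = np.array([2.0, -0.5])  # hypothetical coefficients from a custom fit
model.intercept_ = 1.0

X_new = np.array([[1.0, 2.0], [3.0, 4.0]])
print(model.predict(X_new))          # computes X_new @ coef_ + intercept_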

How to manually calculate Cook's distance

这一生的挚爱 · Submitted on 2021-02-11 14:16:41
Question: I calculated Cook's distance manually and with the function cooks.distance and got two different results. Can someone please help me understand why? Below is how I manually calculate Cook's distance:

j = rnorm(100)
o = rexp(100)
p = runif(100)
model = lm(j ~ o + p)
O = model.matrix(model)
P = O %*% solve(t(O) %*% O) %*% t(O)   # hat matrix
lev = diag(P)                          # leverages h_ii
b <- solve(t(O) %*% O) %*% t(O) %*% j  # OLS coefficients
RSS <- sum((j - O %*% b)^2)
s2 <- RSS / 97  # three estimated parameters (intercept plus two slopes), so df = 100 - 3 = 97
residuals(model)^2 / (4 * s2) * (lev / (1 - lev)^2)

The above …
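
For reference, the textbook formula is D_i = e_i^2 / (k * s^2) * h_ii / (1 - h_ii)^2, where k is the number of estimated parameters, which is 3 here (the intercept plus two slopes). A minimal numpy sketch of the same computation on simulated data mirroring the question's setup:

import numpy as np

rng = np.random.default_rng(0)
n = 100
o = rng.exponential(size=n)
p = rng.uniform(size=n)
j = rng.normal(size=n)

X = np.column_stack([np.ones(n), o, p])  # design matrix with intercept
k = X.shape[1]                           # 3 estimated parameters
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ j                 # OLS coefficients
resid = j - X @ beta
s2 = resid @ resid / (n - k)             # residual variance, df = n - k
lev = np.diag(X @ XtX_inv @ X.T)         # leverages h_ii
cooks = resid**2 / (k * s2) * lev / (1 - lev)**2  # divide by k, not a hard-coded 4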