regression

Estimate the standard deviation of fitted parameters in scipy.odr?

Submitted by ∥☆過路亽.° on 2019-12-23 05:34:05
Question: (Somewhat related to this question, Linear fit including all errors with NumPy/SciPy, and borrowing code from this one, Linear fitting in python with uncertainty in both x and y coordinates.) I fit a linear model (y = a*x + b) using fixed errors in x and y with scipy.odr (code is below), and I get:

    Parameters (a, b):           [ 5.21806759 -4.08019995]
    Standard errors:             [ 0.83897588  2.33472161]
    Squared diagonal covariance: [ 1.06304228  2.9582588 ]

What are the correct standard deviation values for the fitted a
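The answer hinges on how scipy.odr scales its covariance matrix. A minimal sketch with synthetic data (the error magnitudes and starting values are assumptions, not the asker's): the Output object reports sd_beta, which is the square root of the covariance diagonal rescaled by the residual variance res_var.

    # Sketch: where scipy.odr reports parameter standard deviations
    import numpy as np
    from scipy import odr

    def linear(beta, x):
        a, b = beta
        return a * x + b

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 5.2 * x - 4.1 + rng.normal(0, 2, x.size)

    # Fixed 1-sigma errors on both coordinates (illustrative values)
    data = odr.RealData(x, y, sx=0.5, sy=2.0)
    fit = odr.ODR(data, odr.Model(linear), beta0=[1.0, 0.0]).run()

    print("Parameters:", fit.beta)
    print("sd_beta:   ", fit.sd_beta)                     # scaled standard errors
    print("sqrt(diag):", np.sqrt(np.diag(fit.cov_beta)))  # unscaled
    # sd_beta equals sqrt(diag(cov_beta) * res_var), i.e. the covariance
    # diagonal rescaled by the residual variance of the fit:
    print(np.sqrt(np.diag(fit.cov_beta) * fit.res_var))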

Neural Network – Predicting Values of Multiple Variables

Submitted by 痞子三分冷 on 2019-12-23 05:16:04
Question: I have data with columns A, B, C as inputs and columns D, E, F, G as outputs. The table has shape (1000, 7). I would like to train a model, then validate and test it. My data:

    A = [100, 120, 140, 160, 180, 200, 220, 240, 260, 280]
    B = [300, 320, 340, 360, 380, 400, 420, 440, 460, 480]
    C = [500, 520, 540, 560, 580, 600, 620, 640, 660, 680]

My desired outcome: for each combination of A, B, C I get D, E, F, G as outputs, for example:

    D = 2.846485609
    E = 5.06656901
    F = 3.255358183
    G = 5
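Since the question names no framework, here is a minimal sketch using scikit-learn's MLPRegressor (the random data, layer sizes, and iteration count are placeholders for the asker's 1000-row table). MLPRegressor accepts a 2-D target array, so all four outputs D–G are learned by a single network.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.uniform(100, 700, size=(1000, 3))  # stands in for columns A, B, C
    # Synthetic targets standing in for columns D, E, F, G
    Y = X @ rng.normal(size=(3, 4)) / 100 + rng.normal(0, 0.1, size=(1000, 4))

    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.2, random_state=0)

    # Scale inputs, then fit one network with a 4-column target
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
    )
    model.fit(X_train, Y_train)

    print("held-out R^2:", model.score(X_test, Y_test))
    print("D, E, F, G for one new (A, B, C) row:", model.predict([[150, 350, 550]]))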

Categorical and ordinal feature data representation in regression analysis?

Submitted by 孤街醉人 on 2019-12-23 05:11:05
Question: I am trying to fully understand the difference between categorical and ordinal data in regression analysis. So far, this much is clear:

Categorical feature and data example — Color: red, white, black. Why categorical: red < white < black is logically incorrect.

Ordinal feature and data example — Condition: old, renovated, new. Why ordinal: old < renovated < new is logically correct.

Categorical-to-numeric and ordinal-to-numeric encoding methods: One-Hot encoding for categorical data, arbitrary
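A minimal sketch, assuming pandas, of the two encodings applied to the question's own example features: pd.get_dummies() produces unordered indicator columns, while an explicit integer mapping preserves the ordinal ranking.

    import pandas as pd

    df = pd.DataFrame({
        "color": ["red", "white", "black", "white"],
        "condition": ["old", "renovated", "new", "new"],
    })

    # Categorical: one-hot dummies, no order implied among the columns
    one_hot = pd.get_dummies(df["color"], prefix="color")

    # Ordinal: map categories to integers that preserve the logical order
    order = {"old": 0, "renovated": 1, "new": 2}
    df["condition_encoded"] = df["condition"].map(order)

    print(pd.concat([df, one_hot], axis=1))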

Rolling Regression with Data.table - Update?

Submitted by 孤人 on 2019-12-23 04:55:45
Question: I am attempting to run a rolling regression within a data.table. There are a number of questions that get at what I am trying to do, but they are generally 3+ years old and offer inelegant answers (see here, for example). I am wondering whether there has been any update to the data.table package that makes this more intuitive or faster. Here is what I am trying to do; my code looks like this:

    DT <- data.table(
      Date = seq(as.Date("2000/1/1"), by = "day", length.out = 1000),
      x1 = rnorm(1000),
      x2 = rnorm
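One way that stays inside the data.table idiom is to roll lm() over row indices; a minimal sketch assuming the zoo package (the window length, model formula, and column names are illustrative):

    library(data.table)
    library(zoo)

    set.seed(1)
    DT <- data.table(
      Date = seq(as.Date("2000/1/1"), by = "day", length.out = 1000),
      y  = rnorm(1000),
      x1 = rnorm(1000),
      x2 = rnorm(1000)
    )

    window <- 50
    # Fit lm() on each trailing 50-row window and keep the x1 coefficient;
    # fill = NA pads the first window - 1 rows
    DT[, beta_x1 := rollapplyr(
      .I, window,
      function(i) coef(lm(y ~ x1 + x2, data = DT[i]))["x1"],
      fill = NA
    )]
    tail(DT)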

Split data.frame by country, and create linear regression model on each subset [duplicate]

Submitted by 依然范特西╮ on 2019-12-23 03:52:09
Question: This question already has answers here: Linear Regression and group by in R (10 answers). Closed 3 years ago. I have a data.frame of data from the World Bank which looks something like this:

      country date BirthRate     US.
    4   Aruba 2011    10.584 25354.8
    5   Aruba 2010    10.804 24289.1
    6   Aruba 2009    11.060 24639.9
    7   Aruba 2008    11.346 27549.3
    8   Aruba 2007    11.653 25921.3
    9   Aruba 2006    11.977 24015.4

All in all there are 70-something subsets of countries in this data frame that I would like to run a linear
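The canonical base-R pattern from the linked duplicate is split() plus lapply(); a minimal sketch assuming the data frame is named df and that regressing BirthRate on date is the intended model:

    # One lm() per country, collected in a named list
    models <- lapply(split(df, df$country),
                     function(d) lm(BirthRate ~ date, data = d))

    # Inspect one country's fit, or pull every country's slope at once
    coef(models[["Aruba"]])
    sapply(models, function(m) coef(m)[["date"]])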

How to conduct linear hypothesis test on regression coefficients with a clustered covariance matrix?

Submitted by 狂风中的少年 on 2019-12-23 02:05:11
Question: I am interested in calculating estimates and standard errors for linear combinations of coefficients after a linear regression in R. For example, suppose I have the regression and test:

    data(mtcars)
    library(multcomp)
    lm1 <- lm(mpg ~ cyl + hp, data = mtcars)
    summary(glht(lm1, linfct = 'cyl + hp = 0'))

This will estimate the value of the sum of the coefficients on cyl and hp, and provide the standard error based on the covariance matrix produced by lm. But suppose I want to cluster my
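The usual route is to hand glht() a cluster-robust covariance matrix through its vcov. argument; a minimal sketch assuming the sandwich package, with gear as a stand-in clustering variable:

    library(multcomp)
    library(sandwich)

    data(mtcars)
    lm1 <- lm(mpg ~ cyl + hp, data = mtcars)

    # Cluster-robust covariance matrix, then the same linear hypothesis test
    vc <- vcovCL(lm1, cluster = mtcars$gear)
    summary(glht(lm1, linfct = "cyl + hp = 0", vcov. = vc))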

Difference in values between xgb.train and xgb.XGBRegressor in Python for certain cases

Submitted by  ̄綄美尐妖づ on 2019-12-23 01:53:26
Question: I noticed that there are two possible implementations of XGBoost in Python, as discussed here and here. When I ran the same dataset through the two implementations, I noticed that the results were different. Code:

    import xgboost as xgb
    from xgboost.sklearn import XGBRegressor
    import pandas as pd
    import numpy as np
    from sklearn import datasets

    boston_data = datasets.load_boston()
    df = pd.DataFrame(boston_data.data, columns=boston_data.feature_names)
    df['target'] =
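The discrepancy is usually down to defaults: xgb.train() takes its round count from num_boost_round, while XGBRegressor uses n_estimators, and other defaults differ between the two APIs as well. A minimal sketch of pinning both to the same settings (the hyperparameters are illustrative); once aligned, predictions should agree to numerical tolerance:

    import numpy as np
    import xgboost as xgb
    from xgboost import XGBRegressor
    from sklearn.datasets import make_regression

    X, y = make_regression(n_samples=500, n_features=10, random_state=0)
    params = {"max_depth": 3, "eta": 0.1,
              "objective": "reg:squarederror", "seed": 0}

    # Native API: the round count is an argument to train(), not a param
    bst = xgb.train(params, xgb.DMatrix(X, label=y), num_boost_round=100)

    # Sklearn wrapper: the same settings under their wrapper names
    reg = XGBRegressor(max_depth=3, learning_rate=0.1, n_estimators=100,
                       objective="reg:squarederror", random_state=0)
    reg.fit(X, y)

    # With matched settings the two sets of predictions should coincide
    print(np.allclose(bst.predict(xgb.DMatrix(X)), reg.predict(X), atol=1e-6))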

How to set contrasts for my variable in regression analysis with R?

Submitted by 狂风中的少年 on 2019-12-22 19:51:12
Question: During coding, I need to change the dummy value assigned to a factor. However, the following code does not work. Any suggestions?

    test_mx <- data.frame(a = c(T, T, T, F, F, F), b = c(1, 1, 1, 0, 0, 0))
    test_mx
          a b
    1  TRUE 1
    2  TRUE 1
    3  TRUE 1
    4 FALSE 0
    5 FALSE 0
    6 FALSE 0

    model <- glm(b ~ a, data = test_mx, family = "binomial")
    summary(model)
    model <- glm(a ~ b, data = test_mx, family = "binomial")
    summary(model)

Here I get a coefficient of 47 for b. Now if I swap the dummy values, it should be -47. However, this
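The standard fix is to make the variable an explicit factor and control its level order (or its contrasts) before fitting; a minimal sketch:

    test_mx <- data.frame(a = c(T, T, T, F, F, F), b = c(1, 1, 1, 0, 0, 0))

    # Reorder the levels so TRUE becomes the baseline (reference) level
    test_mx$a <- factor(test_mx$a, levels = c(TRUE, FALSE))
    # Alternatively, keep the level order and change the contrast instead:
    # contrasts(test_mx$a) <- contr.treatment(2, base = 2)

    model <- glm(b ~ a, data = test_mx, family = "binomial")
    summary(model)  # the aFALSE coefficient now carries the opposite sign

(The huge ±47 coefficients arise because this toy data is perfectly separated, so the estimates diverge; the sign flip is still visible.)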

Back-transform coefficients from glmer with scaled independent variables for prediction

Submitted by 安稳与你 on 2019-12-22 12:56:09
Question: I've fitted a mixed model using the lme4 package. I transformed my independent variables with the scale() function prior to fitting the model. I now want to display my results on a graph using predict(), so I need the predicted data to be back on the original scale. How do I do this? Simplified example:

    database <- mtcars

    # Scale data
    database$wt <- scale(mtcars$wt)
    database$am <- scale(mtcars$am)

    # Make model
    model.1 <- glmer(vs ~ scale(wt) + scale(am) + (1|carb), database, family =
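One common approach is to store the centering and scaling constants, fit on manually standardized columns, and apply the same constants to any new data, so predict() can be driven by values expressed on the original scale; a minimal sketch (the wt sequence and the binomial family are assumptions consistent with the snippet):

    library(lme4)

    database <- mtcars
    # Remember the constants scale() would use
    wt_c <- mean(mtcars$wt); wt_s <- sd(mtcars$wt)
    am_c <- mean(mtcars$am); am_s <- sd(mtcars$am)
    database$wt_z <- (database$wt - wt_c) / wt_s
    database$am_z <- (database$am - am_c) / am_s

    model.1 <- glmer(vs ~ wt_z + am_z + (1 | carb), database, family = binomial)

    # New data on the ORIGINAL scale, transformed with the stored constants
    newdat <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
                         am = mean(mtcars$am))
    newdat$wt_z <- (newdat$wt - wt_c) / wt_s
    newdat$am_z <- (newdat$am - am_c) / am_s
    newdat$pred <- predict(model.1, newdata = newdat, re.form = NA,
                           type = "response")
    # Plot newdat$pred against newdat$wt, which is back on the original scale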