linear-regression

R Conditional Regression with Multiple Conditions

一笑奈何 submitted on 2019-12-13 07:21:41
Question: I am trying to run a regression in R based on two conditions. My data has binary variables for both year and another classification. I can get the regression to run properly when using only one condition:

    # now time for the millions of OLS
    # format: OLSABCD where ABCD are binary for the values of MSA/UA and years
    # A = 1 if MSA, 0 if UA
    # B = 1 if 2010
    # C = 1 if 2000
    # D = 1 if 1990
    OLS1000 <- summary(lm(lnrank ~ lnpop, data = subset(df, msa == 1)))
    OLS1000

However I cannot figure out how to get

How to apply linear regression of sklearn for some string variable

北城余情 submitted on 2019-12-13 06:40:58
Question: I am going to predict the box office of a movie using linear regression. I have some training data that includes the actors and directors. This is my data:

    Director1|Actor1|300 million
    Director2|Actor2|500 million

I am going to encode the directors and actors as integers:

    1|1|300 million
    2|2|500 million

This means that X = [[1, 1], [2, 2]] and y = [300, 500], followed by fit(X, y). Does that work?

Answer 1: You cannot use categorical variables in linear regression like that. Linear regression treats all variables like
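A common alternative to integer codes is one-hot (dummy) encoding: each director and each actor gets its own 0/1 column, so the model cannot read an order or a distance into the labels. Below is a minimal sketch in plain Python using the two rows from the question. The helper name `one_hot_rows` and the `fieldN=value` column names are hypothetical, and in practice a library encoder would normally do this step.

```python
# Hypothetical helper (not from the question): one-hot encode categorical rows.
def one_hot_rows(rows):
    """Return (column_names, encoded_rows) with one 0/1 column per distinct value."""
    n_fields = len(rows[0])
    # Distinct values seen in each field, in sorted order.
    levels = [sorted({row[j] for row in rows}) for j in range(n_fields)]
    names = [f"field{j}={v}" for j in range(n_fields) for v in levels[j]]
    encoded = [
        [1 if row[j] == v else 0 for j in range(n_fields) for v in levels[j]]
        for row in rows
    ]
    return names, encoded


# The two training rows from the question: (director, actor).
rows = [("Director1", "Actor1"), ("Director2", "Actor2")]
names, X = one_hot_rows(rows)
y = [300, 500]  # box office in millions

print(names)
for features, target in zip(X, y):
    print(features, target)
```

When the model also has an intercept, one column per field is usually dropped so that the remaining columns are not perfectly collinear with it.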

Cross Validating step functions in R

≡放荡痞女 submitted on 2019-12-13 06:34:25
Question: I am trying to get cross-validation errors for step functions, but I get an error:

    library(boot)
    library(ISLR)
    attach(Wage)
    set.seed(5082)
    cv.error <- rep(0, 12)
    for (i in 2:13) {
      step.fit = glm(wage ~ cut(age, i), data = Wage)
      cv.error[i] <- cv.glm(Wage, step.fit, K = 10)$delta[1]
    }

    Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
      cut(age, i) has new levels (17.9,43.5], (43.5,69.1]

I can get the error from cv.glm()$delta[1] if instead of auto generating the cut()
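A likely reading of the "new levels" message is that the bin boundaries are recomputed on each fold's subset, so the labels seen at prediction time differ from those seen during fitting. This is an interpretation and is not confirmed by the excerpt. The sketch below illustrates the idea in plain Python: compute the cut points once on the full data and reuse them for every fold. The ages and helper names are hypothetical, and the binning only approximates R's `cut()`.

```python
from bisect import bisect_left


def equal_width_edges(values, n_bins):
    """Interior cut points for n_bins equal-width bins over the range of values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(1, n_bins)]


def assign_bins(values, edges):
    """Map each value to a bin index using fixed edges (right-closed intervals)."""
    return [bisect_left(edges, v) for v in values]


ages = [18, 25, 33, 41, 50, 62, 70, 80]  # hypothetical data

# Cut points computed once on the full data and reused for every fold.
edges = equal_width_edges(ages, 2)
bins = assign_bins(ages, edges)

# Recomputing cut points on a fold gives different boundaries,
# which is the situation that produces unseen levels.
fold = ages[:4]
refit_edges = equal_width_edges(fold, 2)

print(edges, bins)
print(refit_edges)
```

With fixed edges, every fold uses the same set of bin labels, so a model fitted on one subset can always be applied to another.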

Character causing syntax issue with statsmodels

青春壹個敷衍的年華 submitted on 2019-12-13 05:43:18
Question: I'm trying to fit a linear model to some data using the code below, and I'm getting the error below. I think the error is caused by the '%' in the field name. I have many fields in my data with this naming convention. Does anyone know how to solve this issue with statsmodels?

Code:

    mod = ols('fieldA%'+'~'+'fieldB',data=smp_df).fit()

Error:

    Traceback (most recent call last):
      File "C:\Users\username\AppDataPython\envs\py36\lib\site-packages\IPython\core\interactiveshell.py", line 3267, in run
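The statsmodels formula interface parses terms with patsy, which reads each term as Python code, so a name such as `fieldA%` cannot be used bare. One option is patsy's `Q()` quoting function; another is to rename the columns before fitting. The sketch below only builds the formula string. `quote_term` and `build_formula` are hypothetical helpers, and the commented-out `ols` call assumes the `smp_df` frame from the question.

```python
def quote_term(name):
    """Wrap a column name in patsy's Q("...") so odd characters are taken literally."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'Q("{escaped}")'


def build_formula(response, predictors):
    """Build a 'response ~ x1 + x2' formula with every name quoted."""
    rhs = " + ".join(quote_term(p) for p in predictors)
    return f"{quote_term(response)} ~ {rhs}"


formula = build_formula("fieldA%", ["fieldB"])
print(formula)

# Assumed usage with the question's data frame (not run here):
# from statsmodels.formula.api import ols
# mod = ols(formula, data=smp_df).fit()
```

Quoting every name, rather than only the ones that contain `%`, keeps the helper simple when many columns follow the same naming convention.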

Exporting and formatting regression analysis results in R to excel

佐手、 submitted on 2019-12-13 05:20:57
Question: I am running a regression analysis in R and am unsure how to export the results directly into Excel in standard regression table format (with significance stars, standard errors, p-values, 95% confidence intervals, R-squared, and F-test). In Stata, I would use the outreg2 command, which automatically generates a regression table, and I was wondering whether R has a similar command. For example:

    reg <- lm(imbd_score ~ budget + duration + year + cast_total_facebook_likes, data = imbd)

Contiki: Error if ELF File contains calculation with several unsigned int

被刻印的时光 ゝ submitted on 2019-12-13 05:19:55
Question: I encountered some problems while working with the Contiki ELF loader and hope that someone can provide more insight or some hints to solve them. Below I try to keep the problem description short.

My aim is to:
Execute an ELF file on a T-Mote-Sky. This ELF file contains a Contiki process with a computation (linear regression of data samples over time).
Using "cooja" for simulation

Code specific information:
ELF file size about 2000 bytes quite large

SAS reading a file in long format

怎甘沉沦 submitted on 2019-12-13 05:08:14
Question: I have a file in long format, like so:

    name  weight  month  cal
    bob   80      01     5000
    ben   70      01     4989
    mary  60      01     3000
    bob   81      02     4999
    ben   68      02     6000
    mary  57      02     2800
    ...

I would like to create N linear regressions of weight over cal: one for each of the months. I know how to read the data into a dataset and how to fit a regression model, but I am not sure how to do this in a loop over the N months. Any pointers? Many thanks!

Source: https://stackoverflow.com/questions/22589223/sas-reading-a-file-in-long-format
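The general pattern is to group the rows by month and fit one model per group. In SAS this is commonly expressed with a BY statement on sorted data rather than an explicit loop. The sketch below shows the grouping logic in plain Python on the six rows from the question. It is an illustration of the idea, not SAS code, and `fit_line` is a hypothetical helper for a single-predictor least-squares fit.

```python
from collections import defaultdict


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Rows from the question: (name, weight, month, cal).
rows = [
    ("bob", 80, "01", 5000),
    ("ben", 70, "01", 4989),
    ("mary", 60, "01", 3000),
    ("bob", 81, "02", 4999),
    ("ben", 68, "02", 6000),
    ("mary", 57, "02", 2800),
]

# Group (cal, weight) pairs by month.
by_month = defaultdict(list)
for _name, weight, month, cal in rows:
    by_month[month].append((cal, weight))

# One regression of weight on cal per month.
fits = {}
for month, pairs in sorted(by_month.items()):
    cals = [c for c, _ in pairs]
    weights = [w for _, w in pairs]
    fits[month] = fit_line(cals, weights)
    print(month, fits[month])
```

Keeping the results in a dictionary keyed by month makes it easy to compare slopes across months afterwards.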

How to regularize the intercept with glmnet

南楼画角 submitted on 2019-12-13 03:44:52
Question: I know that glmnet does not regularize the intercept by default, but I would like to do it anyway. I was looking at this question and tried what whuber suggested (adding a constant variable and setting the parameter intercept to FALSE), but as a result glmnet does not fit the added constant either.

    library(dplyr)
    library(glmnet)
    X <- mtcars %>% mutate(intercept = 1) %>% select(-c(mpg)) %>% as.matrix()
    y <- mtcars %>% select(mpg) %>% as.matrix()
    model <- glmnet(X, y, intercept
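For reference, the difference between the default fit and the one asked about can be written out as two objectives. This assumes the Gaussian family and glmnet's documented elastic-net penalty with mixing parameter \alpha. The second objective is simply the first with \beta_0 moved into the penalty; it is a statement of the goal, not something the excerpt shows glmnet computing.

```latex
% Default: the intercept \beta_0 is estimated but not penalized
\[
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\Bigl[\tfrac{1-\alpha}{2}\lVert\beta\rVert_2^2+\alpha\lVert\beta\rVert_1\Bigr]
\]

% Variant in the question: \beta_0 is penalized like any other coefficient
\[
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\Bigl[\tfrac{1-\alpha}{2}\bigl(\beta_0^2+\lVert\beta\rVert_2^2\bigr)
+\alpha\bigl(\lvert\beta_0\rvert+\lVert\beta\rVert_1\bigr)\Bigr]
\]
```

Adding a column of ones and switching the built-in intercept off is the natural way to move from the first form to the second. A constant column has zero variance, though, and glmnet appears to assign a zero coefficient to such columns, which would match the behaviour reported in the question; treat that explanation as an assumption.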

SVM gives bad results on my data. How to fix?

眉间皱痕 submitted on 2019-12-13 01:50:26
Question: I have a dataset that contains 510 samples for training and 127 samples for testing; each sample has 7680 features. I want to design a model that predicts the label, height (cm), from the training data. Currently I use an SVM, but it gives very bad results. Could you look at my code and give me some comments? You can try it on your machine using the dataset and the runnable code:

    import numpy as np
    from sklearn.svm import SVR

    # Training Data
    train_X = np.loadtxt('trainX.txt')  # 510 x 7680
    train_Y =

Toy R function for solving ordinary least squares by singular value decomposition

99封情书 submitted on 2019-12-12 22:17:49
Question: I'm trying to write a function for multiple regression analysis (y = Xb + e) using a singular value decomposition. y and X must be the inputs, and the regression coefficient vector b, the residual vector e, and the variance accounted for R2 the outputs. Below is what I have so far, and I'm totally stuck. The labels part of the weight also gives me an error. What is this labels part? Can anybody give me some tips to help me proceed?

    Test <- function(X, y) {
      x <- t(A) %*% A
      duv <- svd(x)