regression

R Conditional Regression with Multiple Conditions

一笑奈何 submitted on 2019-12-13 07:21:41
Question: I am trying to run a regression in R based on two conditions. My data has binary variables for both year and another classification. I can get the regression to run properly while using only one condition:

    # now time for the millions of OLS
    # format: OLSABCD where ABCD are binary for the values of MSA/UA and years
    # A = 1 if MSA, 0 if UA
    # B = 1 if 2010
    # C = 1 if 2000
    # D = 1 if 1990
    OLS1000 <- summary(lm(lnrank ~ lnpop, data = subset(df, msa == 1)))
    OLS1000

However I cannot figure out how to get
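Restricting a regression to rows that satisfy two conditions at once amounts to combining both tests with a logical AND before fitting. In R this would typically mean joining the conditions with `&` inside `subset()`. The sketch below illustrates the same idea in plain Python. The column name `y2010`, the toy rows, and the helper `ols_simple` are assumptions made for illustration and do not come from the question.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of y = a + b*x fitted by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: msa and y2010 are 0/1 indicator columns.
rows = [
    {"msa": 1, "y2010": 1, "lnpop": 1.0, "lnrank": 5.0},
    {"msa": 1, "y2010": 1, "lnpop": 2.0, "lnrank": 3.0},
    {"msa": 1, "y2010": 1, "lnpop": 3.0, "lnrank": 1.0},
    {"msa": 1, "y2010": 0, "lnpop": 1.5, "lnrank": 9.0},
    {"msa": 0, "y2010": 1, "lnpop": 2.5, "lnrank": 8.0},
]

# Keep only rows where BOTH conditions hold, then fit on that subset.
selected = [r for r in rows if r["msa"] == 1 and r["y2010"] == 1]
intercept, slope = ols_simple([r["lnpop"] for r in selected],
                              [r["lnrank"] for r in selected])
```

Fitting on `selected` uses only the three rows that are both MSA and 2010 observations; the other rows do not influence the coefficients.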

Automate regression with specific dependent and independent variables

。_饼干妹妹 submitted on 2019-12-13 07:06:05
Question: MVE: Let this be the data set:

    data <- data.frame(year = rep(seq(1966,2015,1), 8),
                       county = c(rep('prva', 50), rep('druga', 50), rep('treća', 50), rep('četvrta', 50),
                                  rep('peta', 50), rep('šesta', 50), rep('sedma', 50), rep('osma', 50)),
                       crime1 = runif(400), crime2 = runif(400), crime3 = runif(400),
                       uvar1 = runif(400), uvar2 = runif(400), uvar3 = runif(400),
                       var1 = runif(400), var2 = runif(400), var3 = runif(400), var4 = runif(400), var5 = runif(400))

Let's say crime1, 2 and 3 are specific

Passing the weights argument to a regression function inside an R function

自作多情 submitted on 2019-12-13 06:54:15
Question: I am trying to write an R function to run an (optionally) weighted regression, and I am having difficulty getting the weight variable to work. Here is a simplified version of the function.

    HC <- function(data, FUN, formula, tau = 0.5, weights = NULL){
      if(is.null(weights)){
        est <- FUN(data = data, formula = formula, tau = tau)
        intercept = est$coef[["(Intercept)"]]
        zeroWorker <- exp(intercept)
      } else {
        est <- FUN(data = data, formula = formula, tau = tau, weights = weights)
        intercept = est$coef

Cross Validating step functions in R

≡放荡痞女 submitted on 2019-12-13 06:34:25
Question: I am trying to get cross-validation errors from step functions, but I get an error:

    library(boot)
    library(ISLR)
    attach(Wage)
    set.seed(5082)
    cv.error <- rep (0,12)
    for (i in 2:13){
      step.fit = glm(wage~cut(age,i), data = Wage)
      cv.error[i] <- cv.glm(Wage ,step.fit, K= 10)$delta [1]
    }

    Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
      cut(age, i) has new levels (17.9,43.5], (43.5,69.1]

I can get the error from cv.glm()$delta [1] if instead of auto generating the cut()

Regressing periodic data with sklearn

流过昼夜 submitted on 2019-12-13 05:35:46
Question: I have a dataset with a regression problem. At first I thought it was a linear regression problem, but when I plotted "date_time" against "traffic_volume" it turned out to look something like a sine curve, so I decided to go for "curve fitting". Here's the code:

    import pandas as pd
    from sklearn.model_selection import train_test_split
    import numpy as np
    import datetime as dt
    from sklearn.linear_model import LinearRegression
    from sklearn import linear_model
    from sklearn.model_selection import
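One common way to fit a roughly sinusoidal target while staying within linear regression is to regress on sine and cosine features of the time variable. The sketch below shows the idea using only the standard library, so it runs without extra packages. The 24-hour period, the synthetic hourly data, and all helper names are assumptions made for illustration; none of them come from the question.

```python
import math

def harmonic_features(t, period):
    """Return the feature row [1, sin, cos] for time t and the given period."""
    angle = 2.0 * math.pi * t / period
    return [1.0, math.sin(angle), math.cos(angle)]

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_harmonic(times, values, period):
    """Least-squares fit of values on [1, sin, cos] via the normal equations."""
    rows = [harmonic_features(t, period) for t in times]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, t, period):
    """Evaluate the fitted curve at time t."""
    return sum(c * f for c, f in zip(coefs, harmonic_features(t, period)))

# Synthetic hourly series (hypothetical): mean 3000, daily cycle of amplitude 1500.
PERIOD = 24.0
hours = list(range(24 * 7))
volume = [3000.0 + 1500.0 * math.sin(2.0 * math.pi * h / PERIOD - 1.0) for h in hours]

coefs = fit_harmonic(hours, volume, PERIOD)
amplitude = math.hypot(coefs[1], coefs[2])  # size of the fitted daily swing
```

The sine and cosine coefficients together encode both amplitude and phase, so the phase does not have to be known in advance. With scikit-learn, the same three-column feature rows (or just the sine and cosine columns, leaving the intercept to the model) could be passed to `LinearRegression().fit`.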

Exporting and formatting regression analysis results in R to excel

佐手、 submitted on 2019-12-13 05:20:57
Question: I am running a regression analysis in R and am unsure how to export the results directly into Excel in standard regression table format (with significance level stars, standard errors, p-value, 95% confidence interval, R-squared, F-test). In Stata, I would use the outreg2 command, which automatically generates a regression table, and I was wondering whether R has a similar command. For example:

    reg <- lm(imbd_score ~ budget + duration + year + cast_total_facebook_likes, data = imbd)

Ranking algorithm with missing values and bias

对着背影说爱祢 submitted on 2019-12-13 05:00:12
Question: The problem is: a set of 5 independent users were asked to rate 50 products given to them. All 50 products would have been used by the users at some point in time. Some users are more biased towards certain products. One user did not truly complete the survey and gave random values. The users are not required to rate all the products. Now, given a 4 sample dataset, rank the products based on the ratings.

Dataset:

    product  #user1  #user2  #user3  #user4  #user5
    0        29      -       10      90      12
    1        -       -       -       -       7
    2        -       -
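One simple way to rank under missing ratings and per-user bias is to standardize each user's ratings (subtract that user's mean, divide by that user's standard deviation) and then average the standardized scores that exist for each product. The sketch below assumes this z-score approach; the function name `rank_products` and the small ratings table are hypothetical and are not the question's data. It makes no attempt to detect the user who answered at random.

```python
import statistics

def rank_products(ratings):
    """Rank products by mean per-user z-score.

    ratings maps product -> list of ratings, one slot per user, None if missing.
    Returns (products ordered best first, dict of scores).
    """
    n_users = len(next(iter(ratings.values())))

    # Mean and spread of each user's ratings, over the products they rated.
    user_stats = []
    for u in range(n_users):
        given = [row[u] for row in ratings.values() if row[u] is not None]
        mean = statistics.mean(given) if given else 0.0
        sd = statistics.pstdev(given) if len(given) > 1 else 0.0
        user_stats.append((mean, sd))

    scores = {}
    for product, row in ratings.items():
        z_values = []
        for u, value in enumerate(row):
            if value is None:
                continue  # missing ratings are skipped, not treated as zero
            mean, sd = user_stats[u]
            # A user with no spread carries no ranking information: score 0.
            z_values.append((value - mean) / sd if sd > 0 else 0.0)
        scores[product] = statistics.mean(z_values) if z_values else None

    ranked = sorted((p for p in scores if scores[p] is not None),
                    key=lambda p: scores[p], reverse=True)
    return ranked, scores

# Hypothetical ratings from three users; user 3 rates everything low.
example = {
    "A": [90, None, 20],
    "B": [None, None, 40],
    "C": [70, 50, 10],
    "D": [80, 60, None],
}
ranked, scores = rank_products(example)
```

In this toy table, product B has the lowest raw average but is the harsh third user's top pick, so it moves to first place once each user's scale is taken into account. Products rated by very few users get noisy scores, so a minimum number of ratings per product may be worth requiring in practice.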

Stata: saving regressions coefficients and standard errors in .dta file when there are factor variables

霸气de小男生 submitted on 2019-12-13 04:35:10
Question: I would like to run several regressions and store their results in a DTA file that I could later use for analysis. My constraints are: (1) I cannot install modules (I am writing code for other people and am not sure what modules they have installed); (2) some of the regressors are factor variables; (3) the regressions differ only by the dependent variable, so I would like to store that in the final dataset to keep track of which regression the coefficients/variances correspond to. I am seriously losing sanity

Linear regression with conditional statement in R

我的未来我决定 submitted on 2019-12-13 04:34:32
Question: I have a huge database and I need to run different regressions with conditional statements. I see two options to do it: 1) include the data subset command (industrycodes==12) in the regression, and 2) cut the data beforehand to the values where furniture==12. I don't obtain the same results from the two, and they should be the same. Could somebody help me with the code? I think I have a problem with it. Here is a very basic example to explain it.

    ID  roa  employees  industrycodes
    1   0,5  10         12
    2   0,3  20         11
    3   0,8  15

matlab fminunc not quitting (running indefinitely)

这一生的挚爱 submitted on 2019-12-13 03:59:40
Question: I have been trying to implement logistic regression in MATLAB for a while now. I have done it already, but for reasons unknown to me, I am unable to perform a single iteration using fminunc. When the function is called, the program just goes into wait mode indefinitely. Is there something wrong with the code, or is my data set too large?

    function [theta J] = logisticReg(initial_theta, X, y, lambda, iter)
    % Set Options
    options = optimset('GradObj', 'on', 'MaxIter', iter);
    % Optimize
    [theta, J, exit