regression

NLS Regression in ggplot2, Plotting y=Ax^b Trendline Error

冷暖自知 submitted on 2019-12-11 16:18:47
Question: I'm attempting to fit a basic power trendline to a set of 3 data points, as you can in Excel, to mimic the function y = Ax^b. I have a very simple data set loaded into LCurve.data as follows: MDPT = {4, 10.9, 51.6}, AUC = {287069.4, 272986.0, 172426.1}.

fm0 <- nls(log(LCurve.data$AUC) ~ log(a) + b * log(LCurve.data$MDPT), data = LCurve.data, start = list(a = 1, b = 1))
ggplot(LCurve.data, aes(x = MDPT, y = AUC)) + geom_line() + geom_point() + stat_smooth(method = 'nls', formula = y ~ a * x ^ b
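Since y = Ax^b is linear in the logs, the same parameters can be recovered with an ordinary least-squares fit on log-transformed data. A minimal sketch in Python (NumPy standing in for R's nls, using the three points from the question):

```python
import numpy as np

# The three (MDPT, AUC) points from the question.
x = np.array([4.0, 10.9, 51.6])
y = np.array([287069.4, 272986.0, 172426.1])

# y = A * x^b is linear in the logs: log(y) = log(A) + b*log(x),
# so an ordinary least-squares fit on the logs recovers both parameters.
b, log_a = np.polyfit(np.log(x), np.log(y), 1)
a = np.exp(log_a)

print(a, b)
```

These log-scale estimates are also good starting values for a nonlinear fit on the original scale.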

Forecasting using Multiple Regression in BigQuery

假如想象 submitted on 2019-12-11 15:56:45
Question: It's a pity that Google BigQuery still doesn't have a function such as forecast(), as we see in Spreadsheets -- don't look down on spreadsheets yet; given some statistical know-how, a surprising amount of smoothing and seasonality can be added to forecasting in spreadsheets. BigQuery does let you compute standard deviation, correlation, and intercept metrics. Using those, one can build a prediction model -- refer to this and this. But that uses a linear regression model; so we are not happy with the seasonality
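For reference, the spreadsheet forecast() logic the question wants to reproduce is just slope = correlation * (sd_y / sd_x) and intercept = mean_y - slope * mean_x, which maps directly onto BigQuery's CORR, STDDEV and AVG aggregates. A minimal sketch of that arithmetic in Python (the series and time index are made up for illustration):

```python
import numpy as np

# Hypothetical series: a time index and a metric to forecast.
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([10.0, 12.1, 13.9, 16.2, 18.0])

# Spreadsheet FORECAST(): slope from the correlation and the two sample
# standard deviations, intercept from the means.
slope = np.corrcoef(xs, ys)[0, 1] * ys.std(ddof=1) / xs.std(ddof=1)
intercept = ys.mean() - slope * xs.mean()

def forecast(x):
    # Predicted value of ys at time index x.
    return intercept + slope * x
```

The same two scalars can be computed in one SQL pass, after which the prediction is a plain expression.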

Weekday as dummy / factor variable in a linear regression model using statsmodels

白昼怎懂夜的黑 submitted on 2019-12-11 15:53:22
Question: How can I add a dummy / factor variable to a model using sm.OLS()? The details: below is a reproducible dataframe that you can pick up with Ctrl+C; then run the snippet further down for a reproducible example. Input data:

Date        A      B      weekday
2013-05-04  25.03  88.51  Saturday
2013-05-05  52.98  67.99  Sunday
2013-05-06  39.93  75.19  Monday
2013-05-07  47.31  86.99  Tuesday
2013-05-08  19.61  87.94  Wednesday
2013-05-09  39.51  83.10  Thursday
2013-05-10  21.22  62.16  Friday
2013-05-11  19.04  58
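One way to build what sm.OLS(endog, exog) expects (a sketch, not necessarily the asker's eventual solution): one-hot encode the weekday column with pd.get_dummies, drop one level as the reference category, and prepend an intercept. Here the system is solved directly with NumPy so the snippet stands alone; the same X can be handed to statsmodels:

```python
import numpy as np
import pandas as pd

# Frame mirroring the question's layout (values copied from the table above).
df = pd.DataFrame({
    "A": [25.03, 52.98, 39.93, 47.31, 19.61, 39.51, 21.22],
    "B": [88.51, 67.99, 75.19, 86.99, 87.94, 83.10, 62.16],
    "weekday": ["Saturday", "Sunday", "Monday", "Tuesday",
                "Wednesday", "Thursday", "Friday"],
})

# One-hot encode weekday, dropping one level as the reference category.
dummies = pd.get_dummies(df["weekday"], drop_first=True, dtype=float)
X = pd.concat([df[["B"]], dummies], axis=1)
X.insert(0, "const", 1.0)  # intercept column; sm.add_constant does the same

# sm.OLS(df["A"], X).fit() would consume this X; here we solve it directly.
beta, *_ = np.linalg.lstsq(X.to_numpy(), df["A"].to_numpy(), rcond=None)
```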

Negative Binomial Regression: coefficient interpretation

半城伤御伤魂 submitted on 2019-12-11 15:52:23
Question: How should the coefficients (intercept, categorical variable, continuous variable) in a negative binomial regression model be interpreted? What is the base formula behind the regression (as, for Poisson regression, it is $\ln(\mu)=\beta_0+\beta_1 x_1 + \dots$)? Below is example output from my specific model that I want to interpret, where seizure.rate is a count variable and treatment is categorical (placebo vs. non-placebo).

Call: glm.nb(formula = seizure.rate2 ~ treatment2, data =
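For the interpretation part: the negative binomial mean model uses the same log link as Poisson, log(mu) = beta_0 + beta_1*x_1 + ..., so exponentiated coefficients are incidence-rate ratios. A numeric sketch with made-up coefficients (these are not the question's actual fitted values):

```python
import math

# Hypothetical fitted values for seizure.rate2 ~ treatment2.
intercept = 2.0        # log expected count at the reference level (placebo)
beta_treatment = -0.3  # log of the treated-vs-placebo rate ratio

mu_placebo = math.exp(intercept)
mu_treated = math.exp(intercept + beta_treatment)
rate_ratio = math.exp(beta_treatment)  # multiplier on the expected count
```

So a treatment coefficient of -0.3 means the treated group's expected seizure count is exp(-0.3) ≈ 0.74 times the placebo group's.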

ValueError: X.shape[1] = 1 should be equal to 14, the number of features at training time

风流意气都作罢 submitted on 2019-12-11 15:38:02
Question: I am trying to predict the Boston housing data using SVR, and I am using the following code but getting an error.

# -*- coding: utf-8 -*-
import sys
import pandas as pd
columns = ['ID', 'crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat', 'medv']
dataset_train = pd.read_csv('train.csv')  #, names=columns)
train_y = pd.DataFrame(dataset_train.medv)
dataset_train = dataset_train.drop('medv', axis=1)
columns_test = ['ID', 'crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad',
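The error itself is usually a shape problem rather than an SVR problem: predict() received an array whose second axis (the feature axis) has length 1 while the model was trained on 14 features, which typically comes from reshaping a single sample the wrong way. A small NumPy illustration (14 is the feature count from the error message; the values are arbitrary):

```python
import numpy as np

sample = np.arange(14.0)       # one observation with 14 features

wrong = sample.reshape(-1, 1)  # (14, 1): looks like 14 samples of 1 feature
right = sample.reshape(1, -1)  # (1, 14): 1 sample of 14 features

# scikit-learn's model.predict(wrong) raises the ValueError in the title;
# model.predict(right) is what a single-row prediction should look like.
print(wrong.shape, right.shape)
```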

How to run a regression row by row

北战南征 submitted on 2019-12-11 15:27:32
Question: I just started using R for statistical purposes, and I appreciate any kind of help. As a first step, I ran a time-series regression over my columns. The Y values are dependent and X is explanatory.

# example
Y1 <- runif(100, 5.0, 17.5)
Y2 <- runif(100, 4.0, 27.5)
Y3 <- runif(100, 3.0, 14.5)
Y4 <- runif(100, 2.0, 12.5)
Y5 <- runif(100, 5.0, 17.5)
X <- runif(100, 5.0, 7.5)
df1 <- data.frame(X, Y1, Y2, Y3, Y4, Y5)
# calculating log returns to provide data for the first regression
n <- nrow(df1)
X
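The same column-by-column setup can be sketched in Python (hypothetical data drawn the same way as the R example; np.polyfit stands in for lm):

```python
import numpy as np

rng = np.random.default_rng(0)

# One explanatory X and several dependent Y columns, as in the R example.
X = rng.uniform(5.0, 7.5, 100)
Ys = {f"Y{i}": rng.uniform(2.0, 27.5, 100) for i in range(1, 6)}

# One simple linear regression per Y column; each fit is (slope, intercept).
fits = {name: np.polyfit(X, y, 1) for name, y in Ys.items()}
```

Collecting the fits in a dictionary keyed by column name keeps the loop trivially extensible to more columns.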

Type mismatch and expected values missing when using the LinEst function in VBA

落花浮王杯 submitted on 2019-12-11 15:26:47
Question: This is a follow-up to this question. I'm working on producing a quadratic fit for a plot of data using Excel VBA. As it stands, when I call LinEst I get the error "Type mismatch". The one time it did work, with the quadratic formula Ax^2 + Bx + C, I only got my A and C values into quadSlope and quadB respectively. I have no idea what made it work that one time, so I can't provide much on attempted solutions aside from the code posted below.

Dim quad() As
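For comparison outside VBA, the quadratic LinEst call is just an ordinary degree-2 least-squares fit. A sketch on exact hypothetical data shows the coefficient order (highest power first), a common source of the A/B/C mix-up:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x**2 - 3.0 * x + 1.0  # exact quadratic: A = 2, B = -3, C = 1

# np.polyfit (like LinEst with an added x^2 column) returns [A, B, C].
A, B, C = np.polyfit(x, y, 2)
```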

Plot multiple polynomial regression curve

人走茶凉 submitted on 2019-12-11 14:27:02
Question: I am trying to plot only a few regression lines and not any of the points (no fitted points, because I have over 7 thousand of them). I know how to do this with linear regressions, but not with polynomial regression. My data is here. With a few linear regressions:

plot_data = read.csv("plot_data.csv")  # read data
# linear regressions
Off_linear = lm(Z_Salary ~ OBPM, data = plot_data)
Def_linear = lm(Z_Salary ~ DBPM, data = plot_data)
Tot_linear = lm(Z_Salary ~ BPM, data = plot_data)
# try to plot. This works. Not sure how
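The usual trick for curves-without-points is to fit the polynomial, evaluate it on a smooth grid, and plot only the (grid, curve) pair. A Python sketch with simulated stand-ins for the BPM and Z_Salary columns (the question's real data is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins for plot_data$BPM and plot_data$Z_Salary.
bpm = rng.normal(0.0, 5.0, 200)
salary = 0.02 * bpm**2 + 0.3 * bpm + rng.normal(0.0, 0.1, 200)

# Fit a degree-2 polynomial, then evaluate it on a dense grid; plotting
# (grid, curve) draws only the regression curve, none of the raw points.
coefs = np.polyfit(bpm, salary, 2)
grid = np.linspace(bpm.min(), bpm.max(), 100)
curve = np.polyval(coefs, grid)
```

The same idea in R is predict() on a newdata grid followed by lines(), so none of the 7 thousand observations are drawn.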

Get number of data in each factor level (as well as interaction) from a fitted lm or glm [R]

社会主义新天地 submitted on 2019-12-11 14:09:09
Question: I have a logistic regression model in R where all of the predictor variables are categorical rather than continuous (as is the response variable, which is obviously categorical/binary). When calling summary(model_name), is there a way to include a column giving the number of observations within each factor level?

Answer 1: "I have a logistic regression model in R, where all of the predictor variables are categorical rather than continuous." If all your covariates are factors
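The per-level counts themselves are easy to tabulate from the raw data frame and can be reported alongside the model summary. A pandas sketch with hypothetical factors (R's table() and xtabs() do the same):

```python
import pandas as pd

# Hypothetical categorical predictors.
df = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F", "M", "F"],
    "prior":  ["N", "N", "P", "P", "N", "N", "P"],
})

# Observations per level of one factor, and per interaction cell of two.
level_counts = df["gender"].value_counts()
interaction_counts = df.groupby(["gender", "prior"]).size()
```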

Dummy variables for Logistic regression in R

霸气de小男生 submitted on 2019-12-11 13:50:56
Question: I am running a logistic regression on three factors that are all binary. My data:

table1 <- expand.grid(Crime = factor(c("Shoplifting", "Other Theft Acts")), Gender = factor(c("Men", "Women")), Priorconv = factor(c("N", "P")))
table1 <- data.frame(table1, Yes = c(24, 52, 48, 22, 17, 60, 15, 4), No = c(1, 9, 3, 2, 6, 34, 6, 3))

and the model:

fit4 <- glm(cbind(Yes, No) ~ Priorconv + Crime + Priorconv:Crime, data = table1, family = binomial)
summary(fit4)

R seems to take 1 for prior conviction P and 1 for crime Shoplifting. As a result the
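The coding R applies here is treatment (reference-level) coding: the alphabetically first level of each factor becomes the baseline, which is why Priorconv P and Crime Shoplifting are the levels coded 1. The same coding can be reproduced in Python with pandas (a sketch; the frame is a stripped-down stand-in for table1):

```python
import pandas as pd

df = pd.DataFrame({
    "Crime": ["Shoplifting", "Other Theft Acts", "Shoplifting", "Other Theft Acts"],
    "Priorconv": ["N", "N", "P", "P"],
})

# drop_first=True drops the alphabetically first level of each factor,
# making it the reference category, exactly as R's glm does by default.
coded = pd.get_dummies(df, columns=["Crime", "Priorconv"], drop_first=True, dtype=int)
```

In R the baseline can be changed with relevel(), which is usually the cleanest fix when the default reference level makes the coefficients awkward to read.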