regression

Problems displaying LOESS regression line and confidence interval

假装没事ソ submitted on 2019-12-01 12:24:30
I am having some issues trying to complete a LOESS regression with a data set. I have been able to create the line properly, but I am unable to get it to plot correctly. I ran through the data like this:
    animals.lo <- loess(X15p5 ~ Period, animals, weights = n.15p5)
    animals.lo
    summary(animals.lo)
    plot(X15p5 ~ Period, animals)
    lines(animals$X15p5, animals.lo, col="red")
At this point I received the error "Error in xy.coords(x, y) : 'x' and 'y' lengths differ". I searched around and read that this could be because the points need to be ordered, so I proceeded:
    a <- order(animals$Period)

Print or capturing multiple objects in R

你说的曾经没有我的故事 submitted on 2019-12-01 11:47:55
I have multiple regressions in an R script and want to append the regression summaries to a single text file. I know I can use the following code to do this for one regression summary, but how would I do it for several?
    rpt1 <- summary(fit)
    capture.output(rpt1, file = "results.txt")
I would prefer not to have to use this multiple times in the same script (for rpt1, rpt2, etc.) and end up with a separate text file for each result. I'm sure this is easy, but I'm still learning the R ropes. Any ideas?
You can store the results in a list and then use capture.output:
    fit1<-lm(mpg~cyl,data

How to do two-dimensional regression analysis in Python?

喜欢而已 submitted on 2019-12-01 11:45:17
First, I am not familiar with Python and still barely understand how Python code works, but I need to do some statistical analysis in Python. I have tried many ways to figure this out, but I failed. Basically, I have 3 arrays of data (call them X, Y, Z). I did some analysis with (X, Y) and (Z, Y) by making scatter plots and adding a best-fit line to see the correlation. №1 and №2 are easy enough. Now I need to see the edge-on view of the graph, the one that combines X and Z. So, I made the equation (see below).
    import pylab as
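One way to look at Y against X and Z together is to fit a single plane, Y ≈ a + b·X + c·Z, by least squares. The sketch below is only an illustration under that assumption (the question's own equation is cut off above); `fit_plane` and the sample arrays are hypothetical, and it uses only the standard library.

```python
def fit_plane(xs, zs, ys):
    """Least-squares fit of y = a + b*x + c*z; returns (a, b, c)."""
    rows = [(1.0, float(x), float(z)) for x, z in zip(xs, zs)]
    # Augmented normal equations [D^T D | D^T y], where D is the design matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - known) / m[i][i]
    return tuple(coef)


# Hypothetical sample data generated from y = 1 + 2*x - 0.5*z (no noise).
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Z = [1.0, 0.0, 2.0, 1.0, 3.0]
Y = [1 + 2 * x - 0.5 * z for x, z in zip(X, Z)]

a, b, c = fit_plane(X, Z, Y)
print(f"Y = {a:.3f} + {b:.3f}*X + {c:.3f}*Z")
```

If NumPy is available, `numpy.linalg.lstsq` applied to the same three-column design matrix should give the same coefficients with less code.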

How to do two-dimensional regression analysis in Python?

被刻印的时光 ゝ submitted on 2019-12-01 11:18:19
Question: First, I am not familiar with Python and still barely understand how Python code works, but I need to do some statistical analysis in Python. I have tried many ways to figure this out, but I failed. Basically, I have 3 arrays of data (call them X, Y, Z). I did some analysis with (X, Y) and (Z, Y) by making scatter plots and adding a best-fit line to see the correlation. №1 and №2 are easy enough. Now I need to see the edge on view from

Use lapply for multiple regression with formula changing, not the dataset

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-01 10:53:43
I have seen an example of list apply (lapply) that works nicely: it takes a list of data objects and returns a list of regression output, which we can pass to stargazer for nicely formatted output. See "Using stargazer with a list of lm objects created by lapply-ing over a split data.frame":
    library(MASS)
    library(stargazer)
    data(Boston)
    by.river <- split(Boston, Boston$chas)
    class(by.river)
    fit <- lapply(by.river, function(dd) lm(crim ~ indus, data=dd))
    stargazer(fit, type = "text")
What I would like to do is, instead of passing a list of datasets to do the same regression on each data set (as above),

Seaborn barplot with regression line

点点圈 submitted on 2019-12-01 09:38:31
Is there a way to add a regression line to a barplot in seaborn where the x axis contains pandas.Timestamps? For example, overlay a trendline on the bar plot below. I am looking for the most efficient way to do this:
    seaborn.set(style="white", context="talk")
    a = pandas.DataFrame.from_dict({'Attendees': {pandas.Timestamp('2016-12-01'): 10, pandas.Timestamp('2017-01-01'): 12, pandas.Timestamp('2017-02-01'): 15, pandas.Timestamp('2017-03-01'): 16, pandas.Timestamp('2017-04-01'): 20}})
    ax = seaborn.barplot(data=a, x=a.index, y=a.Attendees, color='lightblue', )
    seaborn.despine(offset=10, trim=False
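A trend line can be overlaid by fitting against the bar positions rather than the timestamps, since seaborn draws the bars of a categorical axis at integer positions 0, 1, 2, and so on. The sketch below computes an ordinary least-squares line for the attendee counts in the question; `trend_line` is a hypothetical helper, it treats the months as equally spaced, and the plotting call is left as a comment so the snippet runs without seaborn installed.

```python
def trend_line(values):
    """Fit y = intercept + slope * position by ordinary least squares,
    where position is the 0-based index of each bar."""
    n = len(values)
    positions = list(range(n))
    mean_x = sum(positions) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, values))
    sxx = sum((x - mean_x) ** 2 for x in positions)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, [intercept + slope * x for x in positions]


# Attendee counts from the question, in date order (2016-12 to 2017-04).
attendees = [10, 12, 15, 16, 20]
slope, intercept, trend = trend_line(attendees)
print(slope, intercept)

# With the axes returned by the question's barplot call, the line could be drawn as:
# ax.plot(range(len(trend)), trend, color="red")
```

Because the line is plotted at the same integer positions as the bars, it should sit on top of them without any date conversion.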

Rolling regression and prediction with lm() and predict()

好久不见. submitted on 2019-12-01 09:25:38
I need to apply lm() to a growing subset of my data frame dat, while making a prediction for the next observation. For example, I am doing:
    fit model     predict
    ----------    -------
    dat[1:3, ]    dat[4, ]
    dat[1:4, ]    dat[5, ]
    .             .
    .             .
    dat[-1, ]     dat[nrow(dat), ]
I know what I should do for a particular subset (related to this question: "predict() and newdata - How does this work?"). For example, to predict the last row, I do:
    dat1 = dat[1:(nrow(dat)-1), ]
    dat2 = dat[nrow(dat), ]
    fit = lm(log(clicks) ~ log(v1) + log(v12), data=dat1)
    predict.fit = predict(fit, newdata=dat2, se.fit=TRUE)
How can I do this
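The scheme described above amounts to a loop: fit on the first k rows, predict row k + 1, then grow the window by one. The following is a rough Python sketch of that loop with a single predictor, not the asker's R model; `fit_line`, `expanding_forecasts`, and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expanding_forecasts(xs, ys, min_rows=3):
    """For each k >= min_rows, fit on rows 0..k-1 and predict row k (0-based)."""
    forecasts = []
    for k in range(min_rows, len(xs)):
        intercept, slope = fit_line(xs[:k], ys[:k])
        forecasts.append((k, intercept + slope * xs[k]))
    return forecasts


# Hypothetical data: y = 3 + 2*x exactly, so every forecast should match y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3 + 2 * x for x in xs]

for row, predicted in expanding_forecasts(xs, ys):
    print(f"row {row}: predicted {predicted:.2f}, actual {ys[row]:.2f}")
```

In R the same structure would presumably be a loop (or lapply) over k that calls lm() on dat[1:k, ] and predict() with newdata = dat[k + 1, ].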

Non linear Regression: Why isn't the model learning?

痴心易碎 submitted on 2019-12-01 09:23:00
I just started learning Keras. I am trying to train a non-linear regression model in Keras, but the model doesn't seem to learn much.
    #datapoints
    X = np.arange(0.0, 5.0, 0.1, dtype='float32').reshape(-1,1)
    y = 5 * np.power(X,2) + np.power(np.random.randn(50).reshape(-1,1),3)
    #model
    model = Sequential()
    model.add(Dense(50, activation='relu', input_dim=1))
    model.add(Dense(30, activation='relu', init='uniform'))
    model.add(Dense(output_dim=1, activation='linear'))
    #training
    sgd = SGD(lr=0.1);
    model.compile(loss='mse', optimizer=sgd, metrics=['accuracy'])
    model.fit(X, y, nb_epoch=1000)
    #predictions

Use lapply for multiple regression with formula changing, not the dataset

天涯浪子 submitted on 2019-12-01 09:07:29
Question: I have seen an example of list apply (lapply) that works nicely: it takes a list of data objects and returns a list of regression output, which we can pass to stargazer for nicely formatted output. See "Using stargazer with a list of lm objects created by lapply-ing over a split data.frame":
    library(MASS)
    library(stargazer)
    data(Boston)
    by.river <- split(Boston, Boston$chas)
    class(by.river)
    fit <- lapply(by.river, function(dd) lm(crim ~ indus, data=dd))
    stargazer(fit, type = "text")
What I would like to

Output Regression statistics for each variable one at a time in R

拜拜、爱过 submitted on 2019-12-01 08:51:44
Question: I have a data frame that looks like this. The names and number of columns will NOT be consistent (sometimes 'C' will not be present, other times 'D', 'E', 'F' may be present, etc.). The only consistent variable will always be Y, and I want to regress against Y.
    # name and number of columns varies...so need flexible process
    Y <- c(4, 4, 3, 4, 3, 2, 3, 2, 2, 3, 4, 4, 3, 4, 8, 6, 5, 4, 3, 6)
    A <- c(1, 2, 1, 2, 3, 2, 1, 1, 1, 2, 1, 4, 3, 1, 2, 2, 1, 2, 4, 8)
    B <- c(5, 6, 6, 5, 3, 7, 2, 1, 1, 2, 7, 4,