regression

How to plot the data for a linear model with 3 variables in MATLAB?

爷,独闯天下 submitted on 2019-12-08 11:35:25
Question: To plot the data in a 3D plane for this model: y = a + a1*x1 + a2*x2, I do the following. The figure is shown on this page (http://kr.mathworks.com/help/stats/regress.html); x1, x2, and y denote the vectors X, Y, and Z, respectively.
    scatter3(x1,x2,y,'filled')
    hold on
    x1fit = min(x1):100:max(x1);
    x2fit = min(x2):10:max(x2);
    [X1FIT,X2FIT] = meshgrid(x1fit,x2fit);
    YFIT = b(1) + b(2)*X1FIT + b(3)*X2FIT + b(4)*X1FIT.*X2FIT;
    mesh(X1FIT,X2FIT,YFIT)
    xlabel('Weight')
    ylabel('Horsepower')
    zlabel('MPG')
    view
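A minimal sketch of the grid evaluation for the stated model y = a + a1*x1 + a2*x2, written in Python rather than MATLAB. The coefficient values and grid ranges are made up for illustration, and the interaction term b(4) from the snippet is left out because the stated model has none. The resulting grid is what a surface-plotting routine would receive.

```python
def frange(start, stop, step):
    """Grid of values like MATLAB's start:step:stop."""
    values = []
    k = 0
    while start + k * step <= stop:
        values.append(start + k * step)
        k += 1
    return values


def plane_on_grid(b, x1fit, x2fit):
    """Evaluate y = b[0] + b[1]*x1 + b[2]*x2 on the grid.

    Rows follow x2fit and columns follow x1fit, matching the layout
    produced by meshgrid(x1fit, x2fit).
    """
    return [[b[0] + b[1] * x1 + b[2] * x2 for x1 in x1fit] for x2 in x2fit]


# Hypothetical coefficients and ranges, for illustration only.
b = [47.0, -0.006, -0.04]
x1fit = frange(1500, 5000, 500)   # e.g. weight
x2fit = frange(50, 250, 50)       # e.g. horsepower
yfit = plane_on_grid(b, x1fit, x2fit)
```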

Regress function for Excel VBA

本秂侑毒 submitted on 2019-12-08 11:08:55
Question: I am using the Regress function in Excel and am having trouble getting the output to land where I want it. Excel keeps creating a new workbook in which to place the results. What am I doing wrong? How do I target a range in my current workbook to receive the regression output?
    Application.Run "ATPVBAEN.XLAM!Regress", Worksheets("statistics Database").Range("$C$7:$C$" & (7 + I - 1)), Worksheets("statistics Database").Range("$D$7:$F$" & (7 + I - 1)), False, True, , , True, , True, ,

Data Transformation in R for Panel Regression

萝らか妹 submitted on 2019-12-08 07:43:00
Question: I really need your help with a problem that may seem easy to solve for you. I am currently working on a project that involves some panel regressions. I have several large CSV files (up to 12 million entries per sheet), formatted as in the attached picture, where the columns (V1, V2) are individuals and the rows (1, 2, 3) are time identifiers. In order to use the plm() function, I need to convert all these files to the following data structure:
    ID  Time  X1  X2
    1   1     x1  x2
    1   2     x1  x2
    1   .
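The wide-to-long reshape described above can be sketched as follows, in Python for illustration rather than R. It assumes each wide table holds one variable (X1, X2, ...), with rows as time periods and columns as individuals; the function name and toy values are hypothetical, and files with millions of entries would likely need a streaming or chunked variant.

```python
def wide_to_long(wide_tables):
    """Stack wide tables (rows = time, columns = individuals) into long rows.

    wide_tables maps a variable name (e.g. "X1") to a list of rows, where
    row t holds that variable's value for each individual at time t + 1.
    Returns a list of dicts with keys ID, Time and one key per variable.
    """
    names = list(wide_tables)
    first = wide_tables[names[0]]
    n_times, n_ids = len(first), len(first[0])
    long_rows = []
    for ind in range(n_ids):        # individuals vary slowest, as in the target layout
        for t in range(n_times):
            row = {"ID": ind + 1, "Time": t + 1}
            for name in names:
                row[name] = wide_tables[name][t][ind]
            long_rows.append(row)
    return long_rows


# Toy data: 3 time periods (rows) by 2 individuals (columns V1, V2).
wide = {
    "X1": [[10, 20], [11, 21], [12, 22]],
    "X2": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
}
long_rows = wide_to_long(wide)
```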

Problems with within and random models in plm package

牧云@^-^@ submitted on 2019-12-08 07:41:02
Question: I am working with the plm package and have a problem with the random and within models: they fail with an error that says "empty model". However, the model is not empty. The source code for plm.fit, where the error originates, says something like this (writing off the top of my head...):
    X <- model.matrix(formula, data, lhs=1, ...)
    if (ncol(X) == 0) stop("empty model")
However, if I try to replicate this behaviour with the commands I am inputting into the original function, it gives ncol(X) is 17 or

Reading csv to array, performing linear regression on array and writing to csv in Python depending on gradient

房东的猫 submitted on 2019-12-08 07:40:53
Question: I have to tackle a problem that far exceeds my current Python programming skill. I am having difficulty combining different modules (csv reader, numpy, etc.) into a single script. My data contains a large list of weather variables over time (at minute resolution) for many days. My objective is to determine the trend of the wind speed between 9am and 12pm on every day in the list. If the gradient of the wind speed is positive, I wish to write the date on which this occurred to a
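One way the per-day trend check could look, as a sketch using only the standard library. The record layout (a timestamp string plus a wind speed), the "%Y-%m-%d %H:%M" format, and the function names are assumptions, since the question's actual CSV columns are not shown. In practice the records would come from csv.reader and the returned dates would be written out with csv.writer.

```python
from collections import defaultdict
from datetime import datetime


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def days_with_rising_wind(records, start_hour=9, end_hour=12):
    """Return dates whose wind-speed trend in [start_hour, end_hour) is positive.

    records is an iterable of (timestamp string, wind speed) pairs; the
    timestamp format used here is an assumption.
    """
    by_day = defaultdict(list)
    for stamp, speed in records:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if start_hour <= t.hour < end_hour:
            minutes = t.hour * 60 + t.minute
            by_day[t.date()].append((minutes, speed))
    rising = []
    for day in sorted(by_day):
        xs, ys = zip(*by_day[day])
        if len(set(xs)) < 2:
            continue  # a slope needs at least two distinct time points
        if slope(xs, ys) > 0:
            rising.append(day.isoformat())
    return rising


# Toy records, for illustration only.
records = [
    ("2019-01-01 09:00", 3.0), ("2019-01-01 10:00", 4.0), ("2019-01-01 11:00", 5.0),
    ("2019-01-02 09:00", 6.0), ("2019-01-02 10:30", 5.0), ("2019-01-02 11:59", 2.0),
    ("2019-01-02 15:00", 9.0),  # outside the window, ignored
]
rising_days = days_with_rising_wind(records)
```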

R: Hide dummies output

我怕爱的太早我们不能终老 submitted on 2019-12-08 07:21:01
Question: I'm new to running regressions with R. Learning by doing and looking at different online tutorials, here's what I'm doing at the moment to regress y on x1 with dummies for x2 and x3 (but no interacted dummies):
    myDataTable[, x2.f := factor(x2)]
    myDataTable[, x3.f := factor(x3)]
    ols <- myDataTable[, lm(y ~ x1 + x2.f + x3.f)]
Now I would like to look at my regression output, but it is very long: since there are many (think thousands of) values for x3, summary(ols) is unreadable. How can I look at

Statsmodels.formula.api OLS does not show statistical values of intercept

断了今生、忘了曾经 submitted on 2019-12-08 07:10:43
Question: I am running the following source code:
    import numpy as np
    import statsmodels.formula.api as sm
    # Add one column of ones for the intercept term
    X = np.append(arr=np.ones((50, 1)).astype(int), values=X, axis=1)
    regressor_OLS = sm.OLS(endog=y, exog=X).fit()
    print(regressor_OLS.summary())
where X is a 50x5 (before adding the intercept term) numpy array which looks like this:
    [[0 1 165349.20 136897.80 471784.10]
     [0 0 162597.70 151377.59 443898.53]...]
and y is a 50x1 numpy array with float values for the

Out of memory when using `outer` in solving my big normal equation for least squares estimation

主宰稳场 submitted on 2019-12-08 06:11:16
Question: Consider the following example in R:
    x1 <- rnorm(100000)
    x2 <- rnorm(100000)
    g <- cbind(x1, x2, x1^2, x2^2)
    gg <- t(g) %*% g
    gginv <- solve(gg)
    bigmatrix <- outer(x1, x2, "<=")
    Gw <- t(g) %*% bigmatrix
    beta <- gginv %*% Gw
    w1 <- bigmatrix - g %*% beta
If I try to run this on my computer, it throws a memory error (because bigmatrix is too big). Do you know how I can achieve the same result without running into this problem?
Answer 1: This is a least squares problem with 100,000 responses.
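One way to avoid materializing bigmatrix when forming Gw, sketched in Python for illustration. It is not necessarily the approach the answer goes on to describe. Column j of t(g) %*% outer(x1, x2, "<=") is the sum of the rows g[i] with x1[i] <= x2[j], so sorting by x1 reduces each column to a prefix sum. The function name is hypothetical. Note that w1 is itself n-by-n, so it would still have to be consumed in column blocks rather than stored whole.

```python
from bisect import bisect_right


def gt_times_indicator(g, x1, x2):
    """Compute t(g) %*% outer(x1, x2, "<=") without building the n-by-n matrix.

    Column j of the result is the sum of the rows g[i] with x1[i] <= x2[j].
    Sorting by x1 turns each column into a prefix sum, so memory stays O(n * p).
    Returns a list of n columns, each of length p.
    """
    p = len(g[0])
    order = sorted(range(len(x1)), key=lambda i: x1[i])
    sorted_x1 = [x1[i] for i in order]
    prefix = [[0.0] * p]  # prefix[k] = sum of the k rows of g with the smallest x1
    for i in order:
        prefix.append([acc + v for acc, v in zip(prefix[-1], g[i])])
    return [list(prefix[bisect_right(sorted_x1, t)]) for t in x2]


# Tiny toy input standing in for the 100,000-element vectors.
x1 = [0.5, -1.0, 2.0, 0.0]
x2 = [0.0, 1.0, -2.0, 3.0]
g = [[a, b, a * a, b * b] for a, b in zip(x1, x2)]
cols = gt_times_indicator(g, x1, x2)
```

With Gw available as a p-by-n array (4 by 100,000 in the question), beta = gginv %*% Gw is small enough to form directly.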

designing classification problem of weather data

梦想与她 submitted on 2019-12-08 05:53:25
Question: In a normal two-class or multi-class classification problem, we can use any well-known machine learning algorithm, such as Naive Bayes or SVM, to train and test the model. My problem is that I have been given weather data where the label variable is in a format like "20 % rain, 80 % dry" or "30% cloudy, 70% rain". How should I approach this problem? Will I need to convert the problem into regression somehow? In that case, if there are three labels (rain, dry, cloudy) in the data, what may be the right approach
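One possible reading of such labels is as soft targets: each label becomes a probability vector over the classes, and a model that outputs class probabilities is scored with cross-entropy against that vector. The sketch below shows only the label parsing and the loss. The class order, function names, and regular expression are assumptions for illustration, not something the question specifies.

```python
import math
import re

CLASSES = ("rain", "dry", "cloudy")


def parse_soft_label(text, classes=CLASSES):
    """Turn a string like "20 % rain, 80 % dry" into a probability vector."""
    probs = dict.fromkeys(classes, 0.0)
    for percent, name in re.findall(r"(\d+(?:\.\d+)?)\s*%\s*([A-Za-z]+)", text):
        name = name.lower()
        if name not in probs:
            raise ValueError(f"unknown class: {name}")
        probs[name] = float(percent) / 100.0
    total = sum(probs.values())
    if total == 0:
        raise ValueError("no class percentages found")
    return [probs[c] / total for c in classes]  # normalised to sum to 1


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


target = parse_soft_label("20 % rain, 80 % dry")
uniform = [1 / 3, 1 / 3, 1 / 3]
loss_perfect = cross_entropy(target, target)
loss_uniform = cross_entropy(target, uniform)
```

A prediction that matches the soft label gives the lowest achievable loss for that label, which is what makes cross-entropy usable as a training objective here.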

m-estimate for continuous values

徘徊边缘 submitted on 2019-12-08 05:19:08
Question: I'm building a custom regression tree and want to use the m-estimate for pruning. Does anyone know how to calculate it? http://www.ailab.si/blaz/predavanja/UISP/slides/uisp07-RegTrees.ppt might help (slide 12; what should Em look like?)
Answer 1: There are a lot of m-estimates. They all boil down to recasting your estimation problem as a minimization problem. If you use squared error as the function you're minimizing, you just get the sample mean. If you use the absolute value of the error, you get the
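The "estimation as minimization" idea in the answer can be checked numerically with a small sketch. It searches a grid of candidate centers for the one that minimizes the total loss, once with squared error and once with absolute error, and compares the results with the sample mean and the sample median. The data values and the grid are made up for illustration, and this demonstrates M-estimation of a location parameter in general, not necessarily the pruning formula on the linked slide.

```python
import statistics


def total_loss(data, center, loss):
    """Sum of loss(residual) over the data for a candidate center."""
    return sum(loss(x - center) for x in data)


data = [1.0, 2.0, 2.5, 4.0, 10.0]                # toy sample with one large value
candidates = [i / 100 for i in range(0, 1201)]   # grid 0.00 .. 12.00

# Minimizer of the summed squared error and of the summed absolute error.
best_sq = min(candidates, key=lambda c: total_loss(data, c, lambda r: r * r))
best_abs = min(candidates, key=lambda c: total_loss(data, c, abs))

sample_mean = statistics.mean(data)
sample_median = statistics.median(data)
```

On this sample the squared-error minimizer lands on the mean and the absolute-error minimizer lands on the median, which is less affected by the large value.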