regression

Error with using mlogit R function: “The two indexes don't define unique observations”

Submitted by 扶醉桌前 on 2020-07-10 09:00:08
Question: My dataset looks like this:

ID    choice_situation  Alternative  Attr1  Attr2  Attr3  choice
ID_1  1                 1            0      0      0      0
ID_1  1                 2            1      1      0      1
ID_1  2                 1            1      1      0      0
ID_1  2                 2            1      1      1      1
ID_1  3                 1            2      1      0      1
ID_1  3                 2            3      1      0      0
ID_2  1                 1            3      0      1      1
ID_2  1                 2            0      0      0      0
ID_2  2                 1            2      1      1      0
ID_2  2                 2            2      1      1      1
ID_2  3                 1            0      0      0      1
ID_2  3                 2            0      0      1      0
.....

Every time I run the mlogit code:

DCE_data <- mlogit.data(data = dataset, choice = "choice", shape = "long", alt.var = "Alternative", id.var = "ID")  # ok
model <- mlogit(choice ~ Attr1 + Attr2 +

Why does the neural network tend to output the 'mean value'?

Submitted by 梦想的初衷 on 2020-07-09 02:28:30
Question: I am using Keras to build a simple neural network for a regression task, but the output always tends toward the 'mean value' of the ground-truth y data. See the first figure: blue is the ground truth, red is the predicted value (very close to the constant mean of the ground truth). The model also stops learning very early even though I set epochs=100. Does anyone have ideas about the conditions under which a neural network stops learning early, and why the regression output tends toward 'the mean' of
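The post's model code is not included in the excerpt, so the sketch below is only a point of comparison, with assumed data shapes, layer sizes, and learning rate. Two things that commonly make a regression network collapse to the target mean are unscaled targets and a too-aggressive learning rate, so here the targets are standardised and Adam is given a small step size:

# Minimal sketch (assumed shapes and hyperparameters, not the poster's actual model).
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 10).astype("float32")           # placeholder features
y = X @ np.arange(1, 11, dtype="float32") + 5.0          # placeholder targets
y_mean, y_std = y.mean(), y.std()
y_scaled = (y - y_mean) / y_std                           # standardise targets

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1)                              # linear output for regression
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.fit(X, y_scaled, epochs=100, batch_size=32, validation_split=0.2, verbose=0)

pred = model.predict(X) * y_std + y_mean                  # undo the target scaling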

Weighted linear regression with Scikit-learn

Submitted by 北战南征 on 2020-07-05 04:22:50
Question: My data:

State    N   Var1  Var2
Alabama  23  54    42
Alaska   4   53    53
Arizona  53  75    65

Var1 and Var2 are aggregated percentage values at the state level, and N is the number of participants in each state. I would like to run a linear regression between Var1 and Var2, using N as the weight, with sklearn in Python 2.7. The general signature is:

fit(X, y[, sample_weight])

Say the data is loaded into df using pandas and N becomes df["N"]; do I simply fit the data into the following line or do I
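The excerpt is cut off, but the usual pattern is simply to pass the weights through fit's sample_weight argument. A minimal sketch, assuming df holds the State/N/Var1/Var2 table above and Var2 is regressed on Var1:

# Sketch: weighted linear regression with N as per-observation weight.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "State": ["Alabama", "Alaska", "Arizona"],
    "N": [23, 4, 53],
    "Var1": [54, 53, 75],
    "Var2": [42, 53, 65],
})

X = df[["Var1"]]                           # predictors must be 2-D for scikit-learn
y = df["Var2"]

model = LinearRegression()
model.fit(X, y, sample_weight=df["N"])     # N weights each state's observation

print(model.coef_, model.intercept_)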

Multivariate Linear Regression in C++ [closed]

Submitted by 谁说胖子不能爱 on 2020-06-27 04:49:14
Question: I have vectors A[a1, a2, a3] and B[b1, b2, b3]. I want to find a "correlation" matrix X (3x3) that can take new incoming data A' and produce output predictions B'; basically, in the end, A'*X gives B'. I have lots of recorded data of A and B
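The question asks for C++, but the underlying fit is a plain least-squares problem: stack the recorded A vectors as rows of a matrix and solve for the X that minimises ||A·X − B||. A NumPy sketch of that idea with synthetic data (illustrative only; a C++ port could use a linear-algebra library's least-squares solver):

# Least-squares sketch; rows of A_data/B_data stand in for the recorded A and B vectors.
import numpy as np

rng = np.random.default_rng(0)
A_data = rng.normal(size=(100, 3))               # 100 recorded A observations
X_true = rng.normal(size=(3, 3))                 # unknown mapping to recover
B_data = A_data @ X_true + 0.01 * rng.normal(size=(100, 3))

# Solve min_X || A_data @ X - B_data ||_F
X_hat, residuals, rank, sv = np.linalg.lstsq(A_data, B_data, rcond=None)

A_new = rng.normal(size=(1, 3))                  # new incoming A'
B_pred = A_new @ X_hat                           # predicted B'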

Error in summary quantreg backsolve

Submitted by 陌路散爱 on 2020-06-25 10:23:34
Question: When I run a quantile regression in R using the quantreg package and then call summary(quantregObject), I get this error message:

Error in base::backsolve(r, x, k = k, upper.tri = upper.tri, transpose = transpose, :
  singular matrix in 'backsolve'. First zero in diagonal [1]

Any suggestions on how I could fix this problem?

Answer 1: In short, try summary(quantregObject, se = "iid"), which puts a strong assumption on your residuals. Or, if you need accuracy, use a bootstrap to get the standard errors

p-values from ridge regression in Python

Submitted by 南笙酒味 on 2020-06-25 05:29:29
Question: I'm using ridge regression (RidgeCV), imported with:

from sklearn.linear_model import LinearRegression, RidgeCV, LarsCV, Ridge, Lasso, LassoCV

How do I extract the p-values? I checked, but Ridge has no summary attribute, and I couldn't find any page that discusses this for Python (I found one for R).

alphas = np.linspace(.00001, 2, 1)
rr_scaled = RidgeCV(alphas = alphas, cv = 5, normalize = True)
rr_scaled.fit(X_train, Y_train)

Answer 1: You can use the regressors package to output p
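The answer is truncated at the mention of the regressors package, and its exact API is not reproduced here. As a library-free alternative sketch, one can bootstrap the RidgeCV coefficients and read an approximate two-sided "p-value" off the resampled distribution (synthetic data stands in for the question's X_train and Y_train):

# Bootstrap sketch for rough coefficient significance with RidgeCV (illustrative only).
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
Y_train = X_train @ np.array([1.5, 0.0, -2.0, 0.0]) + rng.normal(size=200)

alphas = np.linspace(0.00001, 2, 50)
boot_coefs = []
for _ in range(200):
    idx = rng.integers(0, len(X_train), size=len(X_train))   # resample rows with replacement
    model = RidgeCV(alphas=alphas, cv=5).fit(X_train[idx], Y_train[idx])
    boot_coefs.append(model.coef_)
boot_coefs = np.array(boot_coefs)

# How often a resampled coefficient crosses zero, doubled for a two-sided estimate.
p_approx = 2 * np.minimum((boot_coefs > 0).mean(axis=0), (boot_coefs < 0).mean(axis=0))
print(p_approx)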

ggplot2: add regression equations and R2 and adjust their positions on the plot

Submitted by 大兔子大兔子 on 2020-06-25 03:26:12
Question: Using df and the code below:

library(dplyr)
library(ggplot2)
library(devtools)

df <- diamonds %>%
  dplyr::filter(cut %in% c("Fair", "Ideal")) %>%
  dplyr::filter(clarity %in% c("I1", "SI2", "SI1", "VS2", "VS1", "VVS2")) %>%
  dplyr::mutate(new_price = ifelse(cut == "Fair", price * 0.5, price * 1.1))

ggplot(df, aes(x = new_price, y = carat, color = cut)) +
  geom_point(alpha = 0.3) +
  facet_wrap(~clarity, scales = "free_y") +
  geom_smooth(method = "lm", se = F)

I got this plot. Thanks to @kdauria's answer to

Optimise custom Gaussian process kernel in scikit-learn using gridsearch

Submitted by 人走茶凉 on 2020-06-17 09:51:11
Question: I'm working with Gaussian processes, and when I use the scikit-learn GP modules I struggle to create and optimise custom kernels using GridSearchCV. The best way to describe this problem is with the classic Mauna Loa example, where the appropriate kernel is constructed from a combination of already defined kernels such as RBF and RationalQuadratic. In that example the parameters of the custom kernel are not optimised but treated as given. What if I wanted to run a more general case where I
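The excerpt stops before the poster's own attempt, so the following is only a sketch of one common pattern, with a hypothetical parameter grid and toy data rather than the Mauna Loa series: pass fully built candidate kernels to GridSearchCV as values of the kernel parameter of GaussianProcessRegressor.

# Sketch: grid-searching over candidate kernel structures (toy data, assumed grid).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, RationalQuadratic, WhiteKernel
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(80, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=80)

# Each grid entry is a complete kernel object; GridSearchCV picks the structure with the
# best CV score, and the GP's internal optimiser still tunes that kernel's continuous
# hyperparameters on every fit.
param_grid = {
    "kernel": [
        1.0 * RBF(length_scale=1.0) + WhiteKernel(),
        1.0 * RationalQuadratic(length_scale=1.0, alpha=1.0) + WhiteKernel(),
        1.0 * RBF(length_scale=1.0) * RationalQuadratic(length_scale=1.0, alpha=1.0) + WhiteKernel(),
    ]
}

search = GridSearchCV(GaussianProcessRegressor(normalize_y=True), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)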

The output of my regression NN with LSTMs is wrong even with low val_loss

Submitted by 亡梦爱人 on 2020-06-17 09:41:47
Question: The Model: I am currently working on a stack of LSTMs, trying to solve a regression problem. The architecture of the model is as below:

comp_lstm = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(64, return_sequences = True),
    tf.keras.layers.LSTM(64, return_sequences = True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(units
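The excerpt cuts off inside the final Dense layer, so the sizes below are assumptions rather than the poster's actual code. A self-contained sketch of a comparable stacked-LSTM regressor on synthetic sequences, showing a single linear output unit and a mean-squared-error loss for regression:

# Sketch of a stacked-LSTM regressor on synthetic sequences (all shapes and sizes are assumptions).
import numpy as np
import tensorflow as tf

# 500 sequences of 20 timesteps with 1 feature; the target is the sum of each sequence.
X = np.random.rand(500, 20, 1).astype("float32")
y = X.sum(axis=(1, 2))

comp_lstm = tf.keras.models.Sequential([
    tf.keras.Input(shape=(20, 1)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(units=1)           # single linear unit for a scalar regression target
])
comp_lstm.compile(optimizer="adam", loss="mse")
comp_lstm.fit(X, y, epochs=10, batch_size=32, validation_split=0.2, verbose=0)

print(comp_lstm.predict(X[:3]))              # predictions should track the continuous targets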