logistic-regression

R - Getting Column of Dataframe from String [duplicate]

橙三吉。 submitted on 2019-12-09 04:02:41
Question: This question already has answers here: Dynamically select data frame columns using $ and a vector of column names (8 answers). Closed 3 years ago. I am trying to create a function that converts selected columns of a data frame to the categorical data type (factor) before running a regression analysis. The question is: how do I slice a particular column from a data frame using a string (character)? Example: strColumnNames <- "Admit,Rank" strDelimiter <- "," strSplittedColumnNames <-
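The usual answer is that [[ ]] (or subsetting with a character vector) accepts a string where $ does not. A minimal R sketch, assuming a hypothetical data frame df containing the Admit and Rank columns from the example:

strColumnNames <- "Admit,Rank"
strDelimiter <- ","
colNames <- strsplit(strColumnNames, strDelimiter)[[1]]  # "Admit" "Rank"
for (nm in colNames) {
  df[[nm]] <- as.factor(df[[nm]])  # [[ ]] takes a character name; $ does not
}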

Why are most of the predicted results 0 when I use a Caffe BP regression model?

做~自己de王妃 submitted on 2019-12-08 08:50:31
Question: I converted my input data into HDF5 format. Each input sample has 309 dimensions and a label; part of the input data looks like this. My net structure is as follows: name: "RegressionNet" layer { name: "framert" type: "HDF5Data" top: "data" top: "label" include { phase: TRAIN } hdf5_data_param { source: "train_data_list.txt" batch_size: 100 } } layer { name: "framert" type: "HDF5Data" top: "data" top: "label" include { phase: TEST } hdf5_data_param { source: "test
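The question is truncated, but a common first check when a Caffe regression net's predictions collapse toward a constant is the scaling of the inputs and labels. A hedged Python sketch of writing standardized HDF5 data for the layers above; the arrays are illustrative stand-ins for the question's 309-dim features:

import h5py
import numpy as np

X = np.random.randn(1000, 309).astype(np.float32)  # stand-in features
y = np.random.randn(1000, 1).astype(np.float32)    # stand-in labels

# Standardize features so no single dimension dominates the loss
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

with h5py.File("train.h5", "w") as f:
    f.create_dataset("data", data=X)   # names must match the "data"/"label"
    f.create_dataset("label", data=y)  # top blobs of the HDF5Data layers
# train_data_list.txt then lists the path to train.h5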

How to print the probability of prediction in LogisticRegressionWithLBFGS for pyspark

眉间皱痕 submitted on 2019-12-08 08:01:57
Question: I am using Spark 1.5.1. In pyspark, after I fit the model using model = LogisticRegressionWithLBFGS.train(parsedData), I can print the prediction using model.predict(p.features). Is there a function to print the probability score along with the prediction? Answer 1: You have to clear the threshold first, and this works only for binary classification: from pyspark.mllib.classification import LogisticRegressionWithLBFGS, LogisticRegressionModel from pyspark.mllib.regression import
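A minimal sketch of the answer's approach, reusing parsedData and p from the question: after clearThreshold(), predict returns the raw probability instead of the 0/1 label.

from pyspark.mllib.classification import LogisticRegressionWithLBFGS

model = LogisticRegressionWithLBFGS.train(parsedData)
model.clearThreshold()            # predictions become probabilities
prob = model.predict(p.features)  # e.g. 0.83 rather than 1

Comparing prob against the original threshold (0.5 by default) recovers the hard prediction, so both values are available.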

How to get the log-likelihood for a logistic regression model in sklearn?

旧时模样 submitted on 2019-12-08 07:51:01
Question: I'm using a logistic regression model in sklearn and I am interested in retrieving the log-likelihood of such a model, in order to perform an ordinary likelihood ratio test as suggested here. The model uses log loss as the scoring rule. In the documentation, log loss is defined "as the negative log-likelihood of the true labels given a probabilistic classifier's predictions". However, the value is always positive, whereas the log-likelihood should be negative. As an example: from sklearn
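The resolution is that sklearn.metrics.log_loss reports the average negative log-likelihood, which is positive by construction; multiplying by -n recovers the (negative) total log-likelihood needed for a likelihood ratio test. A self-contained sketch with illustrative synthetic data:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=200, random_state=0)  # illustrative data
clf = LogisticRegression().fit(X, y)

avg_nll = log_loss(y, clf.predict_proba(X))  # positive: mean negative log-likelihood
log_lik = -len(y) * avg_nll                  # total log-likelihood, negative
print(avg_nll, log_lik)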

Cost Function and Gradient Seem to be Working, but scipy.optimize functions are not

不打扰是莪最后的温柔 submitted on 2019-12-08 02:17:18
Question: I'm working through my MATLAB code from the Andrew Ng Coursera course and porting it to Python. I am working on non-regularized logistic regression, and after writing my gradient and cost functions I needed something similar to fminunc; after some googling, I found a couple of options. They both return the same results, but those results do not match Andrew Ng's expected output. Others seem to get this working correctly, but I'm wondering why my specific code does not
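For comparison, a minimal sketch of the fminunc-style workflow with scipy.optimize.minimize on illustrative data. A frequent cause of mismatches with the course's expected results is array shape: theta must be 1-D, and X needs the leading column of ones.

import numpy as np
from scipy import optimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # intercept column
y = (rng.random(100) > 0.5).astype(float)

res = optimize.minimize(cost, np.zeros(X.shape[1]), args=(X, y),
                        jac=grad, method="TNC")
print(res.x)  # fitted theta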

Stata's xtlogit (fe, re) equivalent in R?

假如想象 submitted on 2019-12-07 19:05:06
Question: Stata allows fixed-effects and random-effects specifications of logistic regression through the xtlogit, fe and xtlogit, re commands respectively. I was wondering what the equivalent commands for these specifications are in R. The only similar specification I am aware of is the mixed-effects logistic regression mymixedlogit <- glmer(y ~ x1 + x2 + x3 + (1 | x4), data = d, family = binomial), but I am not sure whether this maps to either of the aforementioned commands. Answer 1: The glmer command is
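The answer is cut off, but the mapping usually suggested is sketched below, reusing y, x1–x4, and d from the question: survival::clogit with strata() fits the conditional (fixed-effects) logit, the counterpart of xtlogit, fe, while the glmer call from the question is the random-effects counterpart.

library(survival)
library(lme4)

# ~ xtlogit, fe: conditional fixed-effects logit, grouped by x4
fe_fit <- clogit(y ~ x1 + x2 + x3 + strata(x4), data = d)

# ~ xtlogit, re: random-intercept logit
re_fit <- glmer(y ~ x1 + x2 + x3 + (1 | x4), data = d, family = binomial)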

R Error in solve.default(V) : 'a' is 0-diml in regTermTest function

这一生的挚爱 submitted on 2019-12-07 16:07:03
Question: I'm trying to use the regTermTest function in the R survey package to test the significance of each variable in a logistic regression. However, I get a solver error for one of my variables, "fun". The error is: Error in solve.default(V) : 'a' is 0-diml. My code for the logistic regression is model2 <- glm(decision~samerace+race_o+field+goal+attr+sinc+intel+fun+amb+shar+like+prob, data=trg2, family=binomial) regTermTest(model2, "fun") I also encountered a p = NA result for another variable, "amb".
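A 0-dimensional V usually means the tested term contributed no estimable coefficients, for example a predictor that is constant or perfectly collinear, which would also explain the p = NA. A hedged R diagnostic sketch, using model2 and trg2 from the question:

coef(model2)                      # NA estimates flag inestimable terms ("fun", "amb")
alias(model2)                     # lists perfectly collinear predictors
table(trg2$fun, useNA = "ifany")  # confirm the variable actually varies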

Logistic regression results different in scikit-learn (Python) and R?

淺唱寂寞╮ submitted on 2019-12-07 13:04:40
Question: I was running logistic regression on the iris dataset in both R and Python, but the two give different results (coefficients, intercept, and scores). # Python code In[23]: iris_df.head(5) Out[23]: Sepal.Length Sepal.Width Petal.Length Petal.Width Species 0 5.1 3.5 1.4 0.2 0 1 4.9 3.0 1.4 0.2 0 2 4.7 3.2 1.3 0.2 0 3 4.6 3.1 1.5 0.2 0 In[35]: iris_df.shape Out[35]: (100, 5) # looking at the levels of the Species dependent variable In[25]: iris_df['Species'].unique() Out[25]: array([0, 1], dtype
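The question is cut off, but the usual explanation is regularization: scikit-learn's LogisticRegression applies an L2 penalty by default (C=1.0), while R's glm(..., family=binomial) is unpenalized. A hedged sketch, assuming iris_df as shown above; a very large C approximately disables the penalty:

from sklearn.linear_model import LogisticRegression

X = iris_df.drop(columns="Species")
y = iris_df["Species"]

clf = LogisticRegression(C=1e9, solver="lbfgs", max_iter=1000).fit(X, y)
print(clf.intercept_, clf.coef_)  # should now approach R's glm coefficients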

Reproducing drc::plot.drc with ggplot2

ⅰ亾dé卋堺 submitted on 2019-12-07 09:19:43
Question: I want to reproduce the following drc::plot.drc graphs with ggplot2. df1 <- structure(list(TempV = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 4L,
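The data dump is truncated, so the dose and response column names below (Dose, Response) are hypothetical; the general recipe is to fit with drc::drm, predict over a grid, and draw the fitted curves with geom_line:

library(drc)
library(ggplot2)

fit <- drm(Response ~ Dose, curveid = TempV, data = df1, fct = LL.4())

grid <- expand.grid(Dose = seq(min(df1$Dose), max(df1$Dose), length.out = 100),
                    TempV = unique(df1$TempV))
grid$Pred <- predict(fit, newdata = grid)

ggplot(df1, aes(Dose, Response, colour = TempV)) +
  geom_point() +
  geom_line(data = grid, aes(y = Pred))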

`warm_start` Parameter And Its Impact On Computational Time

梦想的初衷 submitted on 2019-12-07 08:23:48
Question: I have a logistic regression model with a defined set of parameters (warm_start=True). As usual, I call LogisticRegression.fit(X_train, y_train) and afterwards use the model to predict new outcomes. Suppose I alter some parameters, say C=100, and call the .fit method again using the same training data. Theoretically, I think the second .fit should take less computational time than a model with warm_start=False. However, empirically this is not actually true. Please help me
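A self-contained timing sketch on synthetic data (all names and values illustrative). Note that changing C changes the optimization problem, so the warm-started coefficients are only a starting point; when C=100 moves the optimum far away, the second fit may be barely faster.

import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_train, y_train = make_classification(n_samples=20000, n_features=50,
                                       random_state=0)
clf = LogisticRegression(warm_start=True, solver="lbfgs", max_iter=1000)

t0 = time.time(); clf.fit(X_train, y_train)
print("first fit :", time.time() - t0)

clf.set_params(C=100)  # new problem; old coef_ is just the initial guess
t0 = time.time(); clf.fit(X_train, y_train)
print("second fit:", time.time() - t0)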