logistic-regression

How can I use the predict function in R in a logistic regression fitted years ago?

Posted by 点点圈 on 2019-12-07 07:57:15
Question: I have a problem that I am trying to resolve with no success. I have spent more than two days searching without finding a single clue. Sorry if the answer is out there and I didn't find it. Suppose that you have a logistic regression (binary model) from an old model that you estimated some years ago. You therefore know the parameters βk (k = 1, 2, ..., p) because they were estimated in the past, but you don't have the data that were used to fit the model. My question is: can I introduce this old…
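In principle the question above doesn't need the original data at all: a logistic prediction only needs the coefficients, since p = 1 / (1 + exp(-(β0 + Σ βk·xk))). A minimal pure-Python sketch (the coefficient values here are made up for illustration; in R the same thing is `plogis(beta0 + X %*% beta)`):

```python
import math

def predict_logistic(beta0, betas, x):
    """Predicted probability from a fitted logistic model:
    p = 1 / (1 + exp(-(beta0 + sum_k beta_k * x_k)))."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients "estimated years ago"
beta0 = -1.5
betas = [0.8, -0.3]

# Probability for a new observation x = (2.0, 1.0)
p = predict_logistic(beta0, betas, [2.0, 1.0])
```

R's `predict.glm` is just this computation plus bookkeeping, so without the original data one can skip the `glm` object entirely and apply the linear predictor by hand.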

No zeros predicted from zeroinfl object in R?

Posted by 眉间皱痕 on 2019-12-07 07:22:17
Question: I created a zero-inflated negative binomial model and want to investigate how many of the zeros were partitioned out to sampling versus structural zeros. How do I implement this in R? The example code on the zeroinfl page is not clear to me.

data("bioChemists", package = "pscl")
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")

table(round(predict(fm_zinb2, type = "zero"))) returns

      0    1
    891   24

while table(round(bioChemists$art)) returns

      0    1    2    3    4    5    6    7    8    9   10   11   12   16   19
    275  246  178   84   67   27…
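If I read the pscl documentation correctly, `predict(fm_zinb2, type = "zero")` returns π, the per-observation probability of a structural zero from the zero-inflation component; splitting an observed zero into structural versus sampling then only needs the count component's probability of zero as well. A pure-Python sketch of the partitioning arithmetic (the probability values are made up for illustration):

```python
def structural_zero_share(pi, p0_count):
    """Posterior probability that an observed zero is structural,
    given pi = P(structural zero) from the zero component and
    p0_count = P(count == 0) under the count (negbin) component."""
    p_zero = pi + (1.0 - pi) * p0_count   # total P(y == 0)
    return pi / p_zero

# Made-up per-observation probabilities for illustration
share = structural_zero_share(pi=0.3, p0_count=0.2)
```

Summing this share over the observations with y = 0 gives the expected number of structural zeros; the remainder are sampling zeros from the count process.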

Vowpal Wabbit Logistic Regression

Posted by 前提是你 on 2019-12-07 01:23:23
Question: I am performing logistic regression using Vowpal Wabbit on a dataset with 25 features and 48 million instances. I have a question about the current predict values: should they be within 0 and 1?

    average    since       example  example  current  current  current
    loss       last        counter  weight   label    predict  features
    0.693147   0.693147          1      1.0  -1.0000   0.0000       24
    0.419189   0.145231          2      2.0  -1.0000  -1.8559       24
    0.235457   0.051725          4      4.0  -1.0000  -2.7588       23
    6.371911   12.508365         8      8.0  -1.0000  -3.7784       24
    3.485084   0.598258         16     16.0  -1.0000  -2…
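As far as I understand VW's output, with logistic loss and {-1, +1} labels the `current predict` column is the raw margin w·x, which is unbounded; a probability in (0, 1) comes from passing it through the sigmoid (VW can also do this itself via `--link=logistic`, if my reading of its options is right). A small sketch applying the sigmoid to the raw scores printed above:

```python
import math

def vw_prob(raw):
    """Map a raw logistic score (VW's unbounded 'current predict'
    column) to P(label = +1) via the sigmoid."""
    return 1.0 / (1.0 + math.exp(-raw))

# Raw predicts taken from the console output in the question
probs = [vw_prob(r) for r in (0.0, -1.8559, -2.7588)]
```

A raw score of 0 maps to 0.5, and increasingly negative scores map to probabilities approaching 0, consistent with the -1 labels shown.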

classification: PCA and logistic regression using sklearn

Posted by 浪子不回头ぞ on 2019-12-07 00:48:36
Question: Step 0: Problem description. I have a classification problem, i.e. I want to predict a binary target from a collection of numerical features, using logistic regression after running a Principal Component Analysis (PCA). I have two datasets, df_train and df_valid (the training and validation sets respectively), as pandas data frames containing the features and the target. As a first step, I used the get_dummies pandas function to transform all the categorical variables into booleans. For…
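The usual way to chain PCA into logistic regression in scikit-learn is a `Pipeline`, which keeps the PCA fitted on the training data only and reapplies the same transform to the validation data. A minimal sketch on synthetic data (the dataset shapes and component count are arbitrary stand-ins for df_train / df_valid):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for df_train / df_valid
X_train, y_train = make_classification(n_samples=200, n_features=10,
                                       random_state=0)
X_valid, _ = make_classification(n_samples=50, n_features=10,
                                 random_state=1)

pipe = Pipeline([
    ("scale", StandardScaler()),   # PCA is scale-sensitive
    ("pca", PCA(n_components=5)),
    ("clf", LogisticRegression()),
])
pipe.fit(X_train, y_train)
preds = pipe.predict(X_valid)
```

Calling `pipe.predict` on df_valid then runs scaling, projection, and classification in one step, which avoids the common mistake of refitting PCA on the validation set.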

Python multinomial logit with statsmodels module: Change base value of mlogit regression

Posted by 一个人想着一个人 on 2019-12-06 09:17:32
Question: I have a little problem that I am stuck with. I am building a multinomial logit model with Python statsmodels and wish to reproduce an example given in a textbook. So far so good, but I am struggling with setting a different target value as the base category for the regression. Can somebody help?

import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# import data
df = pd.read_excel('C:/.../diabetes.xlsx')

# split the data into dependent and independent…
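If I recall correctly, statsmodels' MNLogit takes the lowest-coded outcome as the base category, so a common workaround is to recode the target so the desired reference level gets code 0. A pure-Python sketch of that recoding (the outcome labels here are hypothetical, chosen for illustration):

```python
# MNLogit (as I understand it) treats the lowest-coded outcome as the
# base category, so recoding the target changes the reference level.
# Hypothetical outcome labels for illustration:
y = ["normal", "overt", "chemical", "normal", "overt"]

base = "overt"                       # desired reference category
levels = sorted(set(y))
order = [base] + [lv for lv in levels if lv != base]
codes = {lv: i for i, lv in enumerate(order)}

y_recoded = [codes[lv] for lv in y]  # feed this as endog to MNLogit
```

With pandas, `pd.Categorical(y, categories=order).codes` achieves the same reordering; either way, the coefficients are then reported relative to the chosen base.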

Stata's xtlogit (fe, re) equivalent in R?

Posted by 我们两清 on 2019-12-06 09:08:18
Stata allows for fixed-effects and random-effects specifications of logistic regression through the xtlogit, fe and xtlogit, re commands respectively. I was wondering what the equivalent commands for these specifications are in R. The only similar specification I am aware of is the mixed-effects logistic regression mymixedlogit <- glmer(y ~ x1 + x2 + x3 + (1 | x4), data = d, family = binomial), but I am not sure whether this maps to either of the aforementioned commands. The glmer command is used to quickly fit logistic regression models with varying intercepts and varying slopes (or, equivalently…

What is the Search/Prediction Time Complexity of Logistic Regression?

Posted by ∥☆過路亽.° on 2019-12-06 06:04:21
I am looking into the time complexities of machine learning algorithms and I cannot find the time complexity of logistic regression for predicting a new input. I have read that for classification it is O(c·d), c being the number of classes and d being the number of dimensions, and I know that for linear regression the search/prediction time complexity is O(d). Could you explain the search/predict time complexity of logistic regression? Thank you in advance. Example for the other machine learning problems: https://www.thekerneltrip.com/machine/learning/computational…
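The O(c·d) figure becomes concrete if you write out the prediction step: multiclass (softmax) logistic regression computes one d-term dot product per class, and binary logistic regression is the single-dot-product case, hence O(d), the same as linear regression plus a constant-time sigmoid. A pure-Python sketch with made-up weights:

```python
import math

def predict_scores(W, b, x):
    """Multiclass logistic (softmax) scoring: one dot product per
    class, so c * d multiply-adds for c classes and d features.
    Binary logistic regression is the single-score special case,
    i.e. O(d) prediction time."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) + bi
              for w, bi in zip(W, b)]
    z = [math.exp(s) for s in scores]
    total = sum(z)
    return [zi / total for zi in z]

# c = 3 classes, d = 4 features (made-up weights for illustration)
W = [[ 0.1,  0.2, 0.0, -0.1],
     [ 0.0, -0.2, 0.3,  0.1],
     [-0.1,  0.0, 0.1,  0.2]]
b = [0.0, 0.1, -0.1]
probs = predict_scores(W, b, [1.0, 2.0, 3.0, 4.0])
```

There is no "search" over training data at prediction time (unlike k-NN): the cost is fixed by the number of parameters, c·d weights plus c intercepts.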

Comparison of R and scikit-learn for a classification task with logistic regression

Posted by China☆狼群 on 2019-12-06 05:54:19
Question: I am doing a logistic regression described in the book 'An Introduction to Statistical Learning with Applications in R' by James, Witten, Hastie, and Tibshirani (2013). More specifically, I am fitting the binary classification model to the 'Wage' dataset from the R package 'ISLR', described in §7.8.1. The predictor 'age' (transformed to a degree-4 polynomial) is fitted against the binary outcome wage>250. Then age is plotted against the predicted probabilities of the 'True' value. The model…
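A frequent source of R-vs-sklearn discrepancies in this kind of comparison is that scikit-learn's LogisticRegression applies L2 regularization by default (C=1.0), while R's glm(..., family = binomial) fits an unpenalized model. A sketch of the usual workaround, setting C very large to effectively disable the penalty (synthetic data stands in for the Wage dataset):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the polynomial-age design matrix
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# sklearn penalizes by default (L2, C=1.0); R's glm does not.
# A huge C effectively removes the penalty, so the coefficients
# should then track glm's much more closely.
unpenalized = LogisticRegression(C=1e9, max_iter=1000).fit(X, y)
default = LogisticRegression().fit(X, y)
```

Differences in how the two libraries construct the polynomial basis (R's `poly` uses orthogonal polynomials by default) can also shift coefficients while leaving fitted probabilities identical, so comparing predicted probabilities rather than raw coefficients is safer.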

Multinomial/conditional logit regression: why does statsmodels fail on the mlogit package example?

Posted by 不打扰是莪最后的温柔 on 2019-12-06 05:51:46
Question: I am trying to reproduce an example of a multinomial logit regression from the mlogit package in R:

data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
# a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))

To reproduce this example with the statsmodels function MNLogit, I export the Fishing data set as a CSV file and do the following:

import pandas
import statsmodels.api as st
# load data
df = pandas.read_csv(…
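Part of the mismatch, as I understand the two libraries, is that `mlogit(mode ~ price + catch, ...)` is a conditional (McFadden) logit: one shared coefficient vector applied to regressors that vary across alternatives, whereas statsmodels' MNLogit estimates a separate coefficient vector per outcome from case-specific regressors. A pure-Python sketch of the conditional-logit probability (the price/catch numbers are made up for illustration):

```python
import math

def conditional_logit_probs(beta, X_alt):
    """Conditional (McFadden) logit: a single shared coefficient
    vector beta, with regressors X_alt[j] varying by alternative j.
    P(j) = exp(x_j . beta) / sum_k exp(x_k . beta)."""
    utilities = [sum(b * x for b, x in zip(beta, xj)) for xj in X_alt]
    z = [math.exp(u) for u in utilities]
    total = sum(z)
    return [zi / total for zi in z]

# Made-up (price, catch) values for three fishing modes,
# with hypothetical beta = (beta_price, beta_catch)
probs = conditional_logit_probs([-0.02, 0.5],
                                [[100.0, 0.2],
                                 [ 60.0, 0.1],
                                 [ 30.0, 0.05]])
```

Because MNLogit's specification multiplies case-specific regressors by outcome-specific coefficients, it cannot express this shared-beta, alternative-varying structure directly, which is consistent with the reproduction failing.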

Statsmodels logistic regression convergence problems

Posted by 谁都会走 on 2019-12-06 05:21:55
I'm trying to run a logistic regression in statsmodels on a large design matrix (~200 columns). The features include a number of interactions, categorical features, and semi-sparse (70%) integer features. Although my design matrix is not actually ill-conditioned, it seems to be somewhat close (according to numpy.linalg.matrix_rank, it is full-rank with tol=1e-3 but not with tol=1e-2). As a result, I'm struggling to get the logistic regression to converge with any of the methods in statsmodels. Here's what I've tried so far: method='newton': did not converge after 1000 iterations; raised a…
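With a near-ill-conditioned design like this, one low-tech step that often helps Newton-type optimizers (before reaching for `fit_regularized` or a different `method=`) is standardizing the columns, since interaction and semi-sparse integer columns on very different scales inflate the condition number. A pure-Python sketch of the rescaling (the toy matrix is made up for illustration; in practice one would use `sklearn.preprocessing.StandardScaler` or numpy):

```python
def standardize_columns(X):
    """Center and scale each column to mean 0, sd 1 (population sd).
    Columns on wildly different scales are a common cause of
    near-singular Hessians in Newton-type logistic fits."""
    n = len(X)
    d = len(X[0])
    out = [row[:] for row in X]
    for j in range(d):
        col = [row[j] for row in X]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        sd = var ** 0.5
        if sd == 0.0:
            sd = 1.0          # leave constant columns unscaled
        for i in range(n):
            out[i][j] = (X[i][j] - mean) / sd
    return out

# Two columns differing by two orders of magnitude (made-up data)
Z = standardize_columns([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
```

After rescaling, both columns have identical spread, and the coefficients can be transformed back to the original scale after fitting.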