regression

Choosing the Right ML Algorithm

送分小仙女 Submitted on 2020-02-28 01:59:52
This article shows you how to choose a machine learning algorithm suited to your problem.

Classification

Logistic regression

Logistic regression is a discriminative model with many regularization options (L0, L1, L2, etc.), and you do not have to worry about whether your features are correlated the way you do with naive Bayes. Compared with decision trees and SVMs, you also get a useful probabilistic interpretation, and you can easily update the model with new data (using online gradient descent). Use it if you need a probabilistic framework (for example, to adjust the classification threshold easily, quantify uncertainty, or obtain confidence intervals), or if you expect to fold more training data into the model quickly later on.

Pros:
- Simple to implement and widely used on industrial problems
- Very cheap to compute at classification time, fast, and light on memory
- Conveniently produces probability scores for observations
- Multicollinearity is not a blocker; it can be addressed with L2 regularization
- Performs best when the features are uncorrelated, the decision boundary is linear, and the feature dimension is much smaller than the number of samples

Cons:
- Performance suffers when the feature space is very large
- Prone to underfitting; accuracy is generally not high
- Does not handle large numbers of multi-class features or variables well
- Natively handles only binary classification (the softmax generalization extends it to multi-class) and requires the classes to be linearly separable
- Nonlinear features must be transformed first
- Performs worst when the features are strongly correlated

Links:
- Machine learning: predicting benign/malignant breast cancer tumors
- A roundup of machine learning algorithms, from Bayes to deep learning, with their pros and cons
- Naive Bayes
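The points above can be sketched with scikit-learn, which exposes both the L2 penalty and the probability scores the article mentions (synthetic data; the dataset and parameter values are illustrative, not from the article):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data standing in for a real problem.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# penalty="l2" applies the L2 regularization mentioned above;
# C is the inverse regularization strength.
clf = LogisticRegression(penalty="l2", C=1.0)
clf.fit(X, y)

# The "probabilistic interpretation": per-class probability scores.
proba = clf.predict_proba(X[:3])
print(proba.shape)  # (3, 2) -- one row per sample, one column per class
```

The same `C` knob is what you would tune (e.g. via cross-validation) when multicollinearity or overfitting becomes a concern.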

why Keras 2D regression network has constant output

眉间皱痕 Submitted on 2020-02-25 07:12:21
Question: I am working on a 2D regression deep network with Keras, but the network produces a constant output for every dataset, even when I test with a handmade dataset. In this code I feed the network constant 2D values where the output is a linear function of X (2*X/100), but the output is still constant. import resource import glob import gc rsrc = resource.RLIMIT_DATA soft, hard = resource.getrlimit(rsrc) print ('Soft limit starts as :', soft) resource.setrlimit(rsrc, (4 * 1024 * 1024 * 1024,
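Common causes of a constant prediction in a setup like this are unscaled inputs/targets or a saturating activation on the output layer. As a library-free illustration (plain NumPy gradient descent, not the asker's Keras code), the target mapping y = 2*X/100 is easily learnable once the inputs are scaled:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=200)
y = 2 * x / 100          # the target mapping from the question

# Scale inputs to roughly [0, 1]; unscaled features are a common reason
# training stalls at a constant (mean) prediction.
xs = x / 100.0

# Plain gradient descent on mean squared error for y = w*xs + b.
w, b = 0.0, 0.0
lr = 0.5
for _ in range(2000):
    err = w * xs + b - y
    w -= lr * (err * xs).mean()
    b -= lr * err.mean()

print(round(w, 3), round(b, 3))  # w approaches 2.0, b approaches 0.0
```

If the same scaling is applied before the Keras model (and the output layer uses a linear activation rather than a sigmoid), the constant-output symptom typically disappears.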

How to do 2SLS IV regression using statsmodels python?

孤者浪人 Submitted on 2020-02-22 08:45:25
Question: I'm trying to do two-stage least squares (2SLS) regression in Python using the statsmodels library. from statsmodels.sandbox.regression.gmm import IV2SLS resultIV = IV2SLS(dietdummy['Log Income'], dietdummy.drop(['Log Income', 'Diabetes']), dietdummy.drop(['Log Income', 'Reads Nutri') Reads Nutri is my endogenous variable, my instrument is Diabetes, and my dependent variable is Log Income. Did I do this right? It's much different from the way I would do it in Stata. Also, when I do resultIV.summary() I
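For reference, IV2SLS takes (endog, exog, instrument): exog holds all regressors including the endogenous one, and instrument holds the exogenous regressors plus the excluded instrument. A minimal sketch with synthetic data (variable names are made up, not the asker's dataset):

```python
import numpy as np
from statsmodels.sandbox.regression.gmm import IV2SLS

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)             # excluded instrument
u = rng.normal(size=n)             # unobserved confounder
x = z + u + rng.normal(size=n)     # endogenous regressor
y = 2 * x + u + rng.normal(size=n)  # true coefficient on x is 2

const = np.ones(n)
exog = np.column_stack([const, x])       # constant + endogenous regressor
instr = np.column_stack([const, z])      # constant + instrument

res = IV2SLS(y, exog, instrument=instr).fit()
print(res.params)  # second element should be near 2
```

Note that because x is correlated with u, plain OLS would be biased upward here; the instrument recovers the causal coefficient.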

Cluster standard errors for ordered logit R polr - values deleted in estimation

此生再无相见时 Submitted on 2020-02-05 05:01:05
Question: I am quite new to R and used to pretty basic applications. Now I have encountered a problem I need help with: I am looking for a way to cluster standard errors for an ordered logistic regression (my estimation is similar to this example). I already tried robcov and vcovCL, and they give me similar error messages: Error in meatCL(x, cluster = cluster, type = type, ...) : number of observations in 'cluster' and 'estfun()' do not match Error in u[, ii] <- ui : number of items to replace is not a

Iterating over multiple regression models and data subsets in R

北慕城南 Submitted on 2020-02-03 12:16:08
Question: I am trying to learn how to automate running 3 or more regression models over subsets of a dataset using the purrr and broom packages in R. I have the nest %>% mutate(map()) %>% unnest() flow in mind. I am able to replicate online examples when only one regression model is applied to several data subsets. However, I run into problems when I have more than one regression model in my function. What I tried to do: library(tidyverse) library(broom) estimate_model
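The nest/map/unnest pattern has a straightforward Python analog: iterate a dictionary of model formulas over pandas groupby subsets with statsmodels (a sketch; the formulas and column names are illustrative, not the asker's data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "grp": np.repeat(["a", "b"], 50),
    "x": rng.normal(size=100),
})
df["y"] = 3 * df["x"] + rng.normal(size=100)

# Several candidate models, keyed by name -- the analog of mapping a
# list of model functions over nested data.
formulas = {"linear": "y ~ x", "quadratic": "y ~ x + I(x**2)"}

rows = []
for grp, sub in df.groupby("grp"):
    for name, f in formulas.items():
        fit = smf.ols(f, data=sub).fit()
        rows.append({"grp": grp, "model": name, "r2": fit.rsquared})

results = pd.DataFrame(rows)
print(results)  # one row per (subset, model) pair, like unnest()'s output
```

The key point in either language is that the model specification is data, so adding a fourth model is one more dictionary (or list) entry rather than new plumbing.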

Polynomial Regression nonsense Predictions

佐手、 Submitted on 2020-02-01 05:39:27
Question: Suppose I want to fit a linear regression model with a degree-two (orthogonal) polynomial and then predict the response. Here is the code for the first model (m1): x=1:100 y=-2+3*x-5*x^2+rnorm(100) m1=lm(y~poly(x,2)) prd.1=predict(m1,newdata=data.frame(x=105:110)) Now let's try the same model, but instead of using poly(x,2), I will use its columns, like: m2=lm(y~poly(x,2)[,1]+poly(x,2)[,2]) prd.2=predict(m2,newdata=data.frame(x=105:110)) Let's look at the summaries of m1 and m2. > summary(m1)
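The pitfall in m2 is that splitting poly(x, 2) into its columns prevents predict() from rebuilding the same orthogonal basis for newdata, so the predictions become nonsense. The safe pattern keeps the basis expansion and the model together; a Python analog with a scikit-learn pipeline (illustrative data mirroring the question's simulation):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
x = np.arange(1, 101, dtype=float).reshape(-1, 1)
y = -2 + 3 * x.ravel() - 5 * x.ravel() ** 2 + rng.normal(size=100)

# The pipeline stores the degree-2 expansion, so new x values are
# transformed exactly as the training data was -- the same guarantee
# that lm(y ~ poly(x, 2)) gives predict() in R.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)

x_new = np.arange(105, 111, dtype=float).reshape(-1, 1)
pred = model.predict(x_new)
print(pred[0])  # close to the true value -2 + 3*105 - 5*105**2
```

Manually expanding the polynomial columns outside the model (the m2 approach) discards the stored transformation, which is exactly what goes wrong in the R code.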