regression | 易学教程

What is causing this error? Coefficients not defined because of singularities


Question: I'm trying to find a model for my data, but I get the message "Coefficients: (3 not defined because of singularities)". These occur for winter, large and high_flow. I found this: https://stats.stackexchange.com/questions/13465/how-to-deal-with-an-error-such-as-coefficients-14-not-defined-because-of-singu which said it may be caused by incorrect dummy variables, but I've checked that none of my columns are duplicates. When I use the function alias() I get: Model : S ~ A + B + C + D + E + F + G + spring +
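The question's data is not shown, so the following is only an illustration on made-up data, written in Python with NumPy rather than R. It sketches one common way a design matrix becomes singular even though no two columns are duplicates: an intercept combined with a dummy column for every level of a category. The season names and the data layout are assumptions.

```python
import numpy as np

# Hypothetical data: 8 observations, each season appearing twice.
levels = ["spring", "summer", "autumn", "winter"]
seasons = np.array(levels * 2)

intercept = np.ones(len(seasons))
# One 0/1 dummy column per season (all four levels are kept on purpose).
dummies = np.column_stack([(seasons == s).astype(float) for s in levels])

# Intercept + all four dummies: the dummies sum to the intercept column.
X_full = np.column_stack([intercept, dummies])
# Intercept + three dummies: "winter" becomes the reference level.
X_reduced = np.column_stack([intercept, dummies[:, :3]])

rank_full = np.linalg.matrix_rank(X_full)
rank_reduced = np.linalg.matrix_rank(X_reduced)

print(X_full.shape[1], rank_full)        # more columns than rank: singular
print(X_reduced.shape[1], rank_reduced)  # columns equal rank: full rank
```

Comparing the number of columns with the matrix rank is a quick check for this kind of linear dependence; whether it explains the error in the question cannot be confirmed from the excerpt.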

Python: What coef_ and intercept_ Mean in sklearn's Logistic Regression


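What coef_ and intercept_ mean in sklearn's Logistic Regression

The sklearn library makes it easy to implement all kinds of basic machine learning algorithms, such as Logistic Regression, the topic of this post. After finishing my implementation, perhaps because I had been buried in code for too long and had forgotten the underlying algorithm, I suddenly could not recall what **coef_ and intercept_** actually stand for, that is, which symbols in the formula they correspond to, even though I knew in general that they are the model parameters.

Main text

We use an official sklearn example for illustration; its source code is available for download. Below is a short excerpt that I have slightly modified:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_blobs
    from sklearn.linear_model import LogisticRegression

    # Construct some data points
    centers = [[-5, 0], [0, 1.5], [5, -1]]
    X, y = make_blobs(n_samples=1000, centers=centers, random_state=40)
    transformation = [[0.4, 0.2], [-0.4, 1.2]]
    X = np.dot(X,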
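As a hedged illustration of the question raised above, the sketch below fits a model on the same `make_blobs` data (without the truncated transformation step, which is omitted here) and checks that `coef_` plays the role of the weight matrix and `intercept_` the role of the bias vector in the linear score z = X·Wᵀ + b. The variable names `clf`, `manual_scores` and `same` are chosen for this example only.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

centers = [[-5, 0], [0, 1.5], [5, -1]]
X, y = make_blobs(n_samples=1000, centers=centers, random_state=40)

clf = LogisticRegression().fit(X, y)

# One row of weights per class, one column per feature.
print(clf.coef_.shape)       # (3, 2)
# One bias term per class.
print(clf.intercept_.shape)  # (3,)

# Recompute the linear scores by hand: z = X W^T + b
manual_scores = X @ clf.coef_.T + clf.intercept_
same = np.allclose(manual_scores, clf.decision_function(X))
print(same)
```

If `same` is true, the hand-computed scores match `decision_function`, which is one way to convince yourself which symbols the two attributes correspond to.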

Andrew Ng Deep Learning Study Notes: C1W2, Neural Network Basics, Assignment 2: Implementing Logistic Regression with a Neural Network Mindset


To be clear, unless you step through the assignment code yourself, the assignment is hard to follow. This post outlines the main content and ideas of the assignment. The complete assignment files are at: http://localhost:8888/tree/Andrew-Ng-Deep-Learning-notes/assignments/C1W2 For full screenshots of the assignment, see "Full assignment screenshots" at the end of this post.

Assignment instructions and goals

Logistic Regression with a Neural Network mindset

Welcome to your first (required) programming assignment! You will build a logistic regression classifier to recognize cats. This assignment will step you through how to do this with a Neural Network mindset, and so will also hone your intuitions about deep learning. Instructions: Do not use loops (for/while) in your code, unless the instructions explicitly ask
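The assignment code itself is not included in the excerpt. As a rough sketch of what "no loops over examples" can look like, the following NumPy code computes the logistic regression cost and gradients for all samples at once and runs plain gradient descent on synthetic data. The function names, the data, the learning rate and the column-per-example layout are all assumptions made for this illustration, not the assignment's actual code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Vectorized cost and gradients.

    X: (n_features, m), Y: (1, m), w: (n_features, 1), b: float.
    """
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)                                  # (1, m) activations
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))  # cross-entropy
    dw = (X @ (A - Y).T) / m                                  # (n_features, 1)
    db = np.sum(A - Y) / m
    return cost, dw, db

# Synthetic, linearly separable data (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
Y = (X[0:1] + X[1:2] > 0).astype(float)

w = np.zeros((2, 1))
b = 0.0
costs = []
for _ in range(200):  # loop over iterations only, never over examples
    cost, dw, db = propagate(w, b, X, Y)
    costs.append(cost)
    w -= 0.5 * dw
    b -= 0.5 * db

predictions = (sigmoid(w.T @ X + b) > 0.5).astype(float)
accuracy = float(np.mean(predictions == Y))
print(costs[0], costs[-1], accuracy)
```

The only explicit loop is over gradient descent steps; every per-example computation is expressed as a matrix operation.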

How to do a Leave One Out cross validation by group / subset?


Question: This question is the second part of a previous question (Linear Regression prediction in R using Leave One out Approach). I'm trying to build models for each country and generate linear regression predictions using the leave-one-out approach. In other words, in the code below, when building model1 and model2, the "data" used should not be the entire data set. Instead it should be a subset of the dataset (country). Each country data should be evaluated using a model built with data specific to
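The question's R code and data are not part of the excerpt. The sketch below shows the general idea in Python with NumPy instead of R: for each country, every row is held out in turn, a line is fitted on the remaining rows of the same country only, and the held-out row is predicted. The countries, values and single-predictor model are invented for illustration.

```python
import numpy as np

# Hypothetical data: country A follows y = 2x, country B follows y = 6 - x.
countries = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 5.0, 4.0, 3.0, 2.0])

predictions = np.empty_like(y)
for country in np.unique(countries):
    idx = np.flatnonzero(countries == country)   # rows of this country only
    for held_out in idx:
        train = idx[idx != held_out]             # leave one row out
        slope, intercept = np.polyfit(x[train], y[train], 1)
        predictions[held_out] = slope * x[held_out] + intercept

print(predictions)
```

Because each model only ever sees rows from its own country, a prediction for one country is never influenced by another country's data.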

Clustered standard errors in statsmodels with categorical variables (Python)


Question: I want to run a regression in statsmodels that uses categorical variables and clustered standard errors. I have a dataset with columns institution, treatment, year, and enrollment. Treatment is a dummy, institution is a string, and the others are numbers. I've made sure to drop any null values.

    import statsmodels.formula.api as smf

    df.dropna()
    reg_model = smf.ols("enroll ~ treatment + C(year) + C(institution)", df).fit(
        cov_type='cluster', cov_kwds={'groups': df['institution']})

I'm getting the following: ValueError: The weights
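The error message is cut off, so its cause cannot be determined from the excerpt. One thing that can be checked independently is whether the null values were really dropped: `DataFrame.dropna()` returns a new frame and leaves the original untouched unless the result is kept. The small frame below is invented for illustration, and converting the group labels to integer codes is an optional precaution, not something the excerpt shows to be required.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing enrollment value.
df = pd.DataFrame({
    "institution": ["a", "a", "b", "b", "c"],
    "treatment": [0, 1, 0, 1, 1],
    "year": [2010, 2011, 2010, 2011, 2010],
    "enrollment": [100.0, 120.0, np.nan, 90.0, 80.0],
})

df.dropna()                  # result is discarded; df itself is unchanged
rows_before = len(df)

df_clean = df.dropna()       # keep the cleaned frame
rows_after = len(df_clean)

# Group labels taken from the same cleaned frame, so their length
# matches the rows that would actually enter the regression.
groups = df_clean["institution"].astype("category").cat.codes

print(rows_before, rows_after, len(groups))
```

Taking both the model data and the `groups` argument from the same cleaned frame keeps their lengths aligned.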

Linear Regression in Plain Language (2): Linear Regression in Practice


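Previously: Linear Regression in Plain Language (1): A First Introduction to Linear Regression

1. sklearn linear regression in detail

1.1 Linear regression parameters

Having introduced linear regression, let's look at how to use the sklearn linear regression model for training and prediction.

    LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)

- fit_intercept: defaults to True. Controls whether the model's intercept is computed. If set to False, no intercept is used in the calculation.
- normalize: whether to normalize the features before fitting (this is normalization, not regularization). Defaults to False. This parameter was deprecated in scikit-learn 1.0 and removed in 1.2.
- copy_X: defaults to True, in which case a copy of X is made; otherwise the original X may be overwritten.
- n_jobs: the number of CPUs used for the computation. Defaults to None, which means 1. Setting it to -1 uses all CPUs.

1.2 Linear regression example

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    # y = 1 * x_0 + 2 * x_1 + 3
    y = np.dot(X, np.array([1, 2])) + 3
    reg = LinearRegression().fit(X, y)
    reg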
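The excerpt stops right after the model is fitted. As a minimal sketch of how the fitted model could be inspected (not necessarily how the original post continues), the code below reads back the learned weights and intercept and predicts one new point. Since y was generated as 1 * x_0 + 2 * x_1 + 3 without noise, the fit should recover those values up to floating-point error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3   # y = 1 * x_0 + 2 * x_1 + 3

reg = LinearRegression().fit(X, y)

print(reg.coef_)        # close to [1. 2.]
print(reg.intercept_)   # close to 3.0

# Predict a new point: 1 * 3 + 2 * 5 + 3 = 16
prediction = reg.predict(np.array([[3, 5]]))
print(prediction)       # close to [16.]
```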

Plot best fit line with plotly


Question: I am using plotly's Python library to plot a scatter graph of time series data. Example data:

    2015-11-11 1
    2015-11-12 2
    2015-11-14 4
    2015-11-15 2
    2015-11-21 3
    2015-11-22 2
    2015-11-23 3

Code in Python:

    df = pandas.read_csv('~/Data.csv', parse_dates=["date"], header=0)
    df = df.sort_values(by=['date'], ascending=[True])
    trace = go.Scatter(
        x=df['date'],
        y=df['score'],
        mode='markers'
    )
    fig.append_trace(trace, 2, 2)  # It is a subplot
    iplot(fig)

Once the scatter plot is plotted, I want to plot a best
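The accepted answer is not part of the excerpt. One possible approach, sketched here under the assumption that a straight least-squares line is what is wanted, is to convert the dates to a numeric day offset, fit with `np.polyfit`, and plot the fitted values as a second trace. The code uses the sample data from the question and only NumPy; the plotting step is left as a comment.

```python
import numpy as np

# Sample data from the question.
dates = np.array(
    ["2015-11-11", "2015-11-12", "2015-11-14", "2015-11-15",
     "2015-11-21", "2015-11-22", "2015-11-23"],
    dtype="datetime64[D]",
)
scores = np.array([1.0, 2.0, 4.0, 2.0, 3.0, 2.0, 3.0])

# Dates are not numbers, so fit against "days since the first date".
days = (dates - dates[0]).astype(int)
slope, intercept = np.polyfit(days, scores, 1)
fitted = slope * days + intercept

print(slope, intercept)

# The line could then be added as a second trace, for example:
#   go.Scatter(x=dates, y=fitted, mode='lines')
```

Fitting against a day offset rather than raw timestamps keeps the numbers small and the slope interpretable as "score change per day".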

A Hands-On Guide to Linear Regression with TensorFlow


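Linear regression in practice with TensorFlow

Linear regression is a regression analysis technique that models the relationship between one or more independent variables and a dependent variable using a least-squares function called the linear regression equation. It is a widely used statistical method for determining the quantitative relationship between two or more interdependent variables. Linear regression is also entry-level machine learning material, so let's learn how to implement it with Python + TensorFlow.

1. The linear regression equation

A single-variable linear regression equation can be written as: y = w*x + b

In this example we generate an artificial dataset in code. The points are sampled around a line with w = 2.0 and b = 1, with added Gaussian noise scaled by 0.4 (its standard deviation, not a hard maximum amplitude). The underlying equation is: y = 2.0*x + 1

2. Generating the artificial dataset

    %matplotlib inline
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf

    # Set the random seed
    np.random.seed(5)
    # Use NumPy to generate an arithmetic sequence of 100 points between -1 and 1
    x_data = np.linspace(-1, 1, 100)
    # y = 2x + 1 plus noise; the noise has the same shape as x_data
    y_data = 2*x_data + 1.0 + np.random.randn(*x_data.shape)*0.4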
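The TensorFlow training code is not included in the excerpt. As a sanity check that does not depend on it, the sketch below regenerates the same dataset and solves for w and b with NumPy's closed-form least squares instead of TensorFlow. A trained model would be expected to land near the same values, close to w = 2.0 and b = 1 but not exactly equal because of the noise. The names `A`, `w_hat` and `b_hat` are chosen for this example.

```python
import numpy as np

np.random.seed(5)
x_data = np.linspace(-1, 1, 100)
y_data = 2 * x_data + 1.0 + np.random.randn(*x_data.shape) * 0.4

# Design matrix: one column for x, one column of ones for the intercept b.
A = np.column_stack([x_data, np.ones_like(x_data)])

# Least-squares solution of A @ [w, b] ~= y_data.
(w_hat, b_hat), *_ = np.linalg.lstsq(A, y_data, rcond=None)
print(w_hat, b_hat)
```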

Logistic Regression


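Logistic Regression is a machine learning method for binary classification (0 or 1) problems, used to estimate how likely something is: for example, how likely a user is to buy a product, how likely a patient is to have a disease, or how likely an ad is to be clicked. Note that the word used here is "likelihood" rather than the mathematical term "probability": according to this post, the output of logistic regression is not a probability in the strict mathematical sense and should not be used directly as one. The result is often combined with other feature values in a weighted sum rather than multiplied directly.

Logistic regression assumes the dependent variable y follows a Bernoulli (0-1) distribution, whereas linear regression assumes y follows a normal (Gaussian) distribution.

A machine learning model in effect restricts the decision function to a certain set of conditions, and this set of conditions determines the model's hypothesis space. Naturally, we also want these conditions to be simple and reasonable. The assumption made by the logistic regression model is P(y = 1 | x; θ) = g(θ^T x), where g is the sigmoid function mentioned above, g(z) = 1 / (1 + e^(-z)). The corresponding decision function is: y* = 1 if P(y = 1 | x) > 0.5, otherwise y* = 0.

Decision Boundary

The decision boundary, also called the decision surface, is the plane or curved surface that separates samples of different classes in N-dimensional space. First, consider two figures from Andrew Ng's course. Linear decision boundary:

In logistic regression, the hypothesis function (h = g(z)) computes how likely a sample is to belong to a class; the decision function (h = 1(g(z) > 0.5)) gives the sample's class; and the decision boundary (θ^T x = 0) is an equation that marks the classification boundary of the model.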
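The three pieces described above (hypothesis function, decision function, decision boundary) can be sketched in a few lines of NumPy. The parameter vector `theta` and the sample points are invented for illustration. The last two lines show why θ^T x = 0 is the boundary: since g(0) = 0.5 and g is increasing, g(z) > 0.5 holds exactly when z > 0.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters: [bias, w1, w2].
theta = np.array([-1.0, 2.0, 1.0])

# Each row is one sample; the leading 1 is the bias feature.
X = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
    [1.0, -1.0, 1.0],
])

z = X @ theta                         # theta^T x for every sample
h = sigmoid(z)                        # hypothesis function: value in (0, 1)
labels = (h > 0.5).astype(int)        # decision function: predicted class
labels_from_z = (z > 0).astype(int)   # same rule, stated via the boundary

print(z, h, labels, labels_from_z)
```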

Finding coefficients for logistic regression in python


Question: I'm working on a classification problem and need the coefficients of the logistic regression equation. I can find the coefficients in R, but I need to submit the project in Python. I couldn't find the code for learning the coefficients of logistic regression in Python. How do I get the coefficient values in Python?

Answer 1: sklearn.linear_model.LogisticRegression is what you need. See this example:

    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import load_iris
    X, y = load_iris
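The answer's example is cut off in the excerpt. The following is a separate, minimal sketch (not a reconstruction of the original answer) of reading coefficients from a fitted model on the iris data and pairing each one with its feature name. `max_iter=1000` is an arbitrary choice made here to give the solver room to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

model = LogisticRegression(max_iter=1000).fit(X, y)

# coef_ has one row per class and one column per feature;
# intercept_ has one entry per class.
for class_name, weights, bias in zip(iris.target_names, model.coef_, model.intercept_):
    print(class_name, "intercept:", bias)
    for feature, weight in zip(iris.feature_names, weights):
        print("  ", feature, weight)
```

With three classes and four features, `model.coef_` has shape (3, 4), so each class gets its own equation.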
