[Keras Study Notes] 2: Multiple Linear Regression

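Anonymous (unverified), submitted 2019-12-02 23:26:52
Copyright notice: this article is the author's original study notes; please credit the source when reposting. https://blog.csdn.net/SHU15121856/article/details/89286220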

Multiple Linear Regression

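Multiple linear regression is equivalent to a single layer of a neural network: y = w·x + b, where w and x are vectors of the same dimension (greater than 1). In other words, the input features x1, x2, … are combined with the parameters w and b to predict the value y.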
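As a small worked illustration of the formula above, the sketch below evaluates y = w·x + b with NumPy, first for one sample and then for a batch. The numbers and the names `w`, `b`, `x`, and `X` are made up for illustration and do not come from the house-price data.

```python
import numpy as np

# Hypothetical parameters for a model with 3 input features
w = np.array([2.0, -1.0, 0.5])   # one weight per feature
b = 4.0                          # scalar bias

# One sample: y is the dot product of w and x, plus b
x = np.array([1.0, 3.0, 2.0])
y_single = np.dot(w, x) + b      # 2*1 - 1*3 + 0.5*2 + 4

# A batch of samples: one row per sample, one column per feature
X = np.array([[1.0, 3.0, 2.0],
              [0.0, 0.0, 0.0],
              [2.0, 1.0, 4.0]])
y_batch = X @ w + b              # one prediction per row, shape (3,)

print(y_single)
print(y_batch)
```

The batch form is what a single `Dense(1)` layer computes for every row of the input at once.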

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# Training data from the Kaggle house-price competition
df = pd.read_csv("./data/houseprice.csv")
df.head()
Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities ... PoolArea PoolQC Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition SalePrice
0 1 60 RL 65.0 8450 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 2 2008 WD Normal 208500
1 2 20 RL 80.0 9600 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 5 2007 WD Normal 181500
2 3 60 RL 68.0 11250 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 9 2008 WD Normal 223500
3 4 70 RL 60.0 9550 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 2 2006 WD Abnorml 140000
4 5 60 RL 84.0 14260 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 12 2008 WD Normal 250000

5 rows × 81 columns

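import numpy as np
# Handle outliers: keep only the rows with GarageArea below 1200
train = df[df['GarageArea'] < 1200]
# Handle missing values: keep the numeric columns, fill gaps with the default interpolate(), then drop any rows that still contain NaN
train = train.select_dtypes(include=[np.number]).interpolate().dropna()
train.head()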
Id MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt YearRemodAdd MasVnrArea BsmtFinSF1 ... WoodDeckSF OpenPorchSF EnclosedPorch 3SsnPorch ScreenPorch PoolArea MiscVal MoSold YrSold SalePrice
0 1 60 65.0 8450 7 5 2003 2003 196.0 706 ... 0 61 0 0 0 0 0 2 2008 208500
1 2 20 80.0 9600 6 8 1976 1976 0.0 978 ... 298 0 0 0 0 0 0 5 2007 181500
2 3 60 68.0 11250 7 5 2001 2002 162.0 486 ... 0 42 0 0 0 0 0 9 2008 223500
3 4 70 60.0 9550 7 5 1915 1970 0.0 216 ... 0 35 272 0 0 0 0 2 2006 140000
4 5 60 84.0 14260 8 5 2000 2000 350.0 655 ... 192 84 0 0 0 0 0 12 2008 250000

5 rows × 38 columns
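To make the cleaning step above easier to follow, here is a minimal sketch of the same `select_dtypes(...).interpolate().dropna()` chain on a toy frame. The frame `toy` and its values are hypothetical. With the default settings, `interpolate()` fills gaps linearly in the forward direction, so a NaN at the very start of a column has nothing before it to interpolate from and is left for `dropna()` to remove.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values) mixing a text column with numeric columns
toy = pd.DataFrame({
    "zone": ["RL", "RM", "RL", "RL"],   # non-numeric, will be dropped
    "a": [np.nan, 1.0, np.nan, 3.0],    # leading NaN plus an interior gap
    "b": [10.0, np.nan, 30.0, 40.0],    # interior gap only
})

numeric = toy.select_dtypes(include=[np.number])  # keeps only "a" and "b"
filled = numeric.interpolate()                    # default: linear, filling forward
clean = filled.dropna()                           # drops the row whose "a" is still NaN

print(filled)
print(clean)
```

This also explains why the column count drops from 81 to 38 in the note: every non-numeric column is discarded before interpolation.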

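# Split into features and target: columns 1 to 36 are the features (column 0 is Id), the last column is SalePrice
x = train.iloc[:, 1:37]
y = train.iloc[:, -1]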
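The positional slice above depends on `Id` being the first column and `SalePrice` the last. As a hedged alternative, the sketch below selects the same columns by name on a toy frame; `toy` is hypothetical and borrows a few values from the rows shown earlier.

```python
import pandas as pd

# Toy frame with the same layout idea: Id first, target last
toy = pd.DataFrame({
    "Id": [1, 2, 3],
    "LotArea": [8450, 9600, 11250],
    "OverallQual": [7, 6, 7],
    "SalePrice": [208500, 181500, 223500],
})

# Positional selection, as in the note: skip column 0, stop before the last column
x_pos = toy.iloc[:, 1:3]
# Name-based selection: drop the identifier and the target explicitly
x_name = toy.drop(columns=["Id", "SalePrice"])
y_toy = toy["SalePrice"]

print(x_pos.equals(x_name))
```

Selecting by name keeps working if the column order changes, at the cost of hard-coding two column names.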
import keras
# Initialize the model
model = keras.Sequential()
Using TensorFlow backend. 
# Add a fully connected layer (output dimension 1, input dimension 36)
from keras import layers
model.add(layers.Dense(1, input_dim=36))
WARNING:tensorflow:From E:\MyProgram\Anaconda\envs\krs\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. 
model.summary() 
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 1)                 37
=================================================================
Total params: 37
Trainable params: 37
Non-trainable params: 0
_________________________________________________________________
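The 37 parameters in the summary are the 36 weights (one per input feature) plus 1 bias. The helper below is a small sketch of that count; `dense_param_count` is a hypothetical name, not a Keras function.

```python
def dense_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Number of trainable parameters in a fully connected layer."""
    weights = input_dim * units          # one weight per (input, unit) pair
    biases = units if use_bias else 0    # one bias per unit
    return weights + biases

print(dense_param_count(36, 1))  # 36 weights + 1 bias
```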
# Compile the model, specifying the optimizer and the loss to minimize
model.compile(
    optimizer='adam',
    loss='mse'
)
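The `'mse'` loss is the mean squared error between targets and predictions. A minimal NumPy sketch of that quantity follows; the function name `mse` and the sample values are chosen here for illustration only.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(mse(y_true, y_pred))  # squared errors are 1, 0 and 4, averaged over 3 samples
```

Because the errors are squared, a target measured in hundreds of thousands (such as SalePrice) produces very large loss values, which is expected rather than a sign of a bug.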
# Train the model
model.fit(x, y, epochs=3000, verbose=0)
WARNING:tensorflow:From E:\MyProgram\Anaconda\envs\krs\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead.
<keras.callbacks.History at 0x13bf6780>
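The training above fits w and b iteratively with Adam. For a plain linear model the same least-squares objective can also be solved directly, which gives a useful baseline to compare against. The sketch below uses `np.linalg.lstsq` on synthetic data; this is a swapped-in technique, not what the note itself does, and every name and value in it is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data generated from known parameters plus a little noise
n_samples, n_features = 200, 3
true_w = np.array([1.5, -2.0, 0.7])
true_b = 4.0
X = rng.normal(size=(n_samples, n_features))
y = X @ true_w + true_b + rng.normal(scale=0.01, size=n_samples)

# Append a column of ones so the bias is estimated together with the weights
X_aug = np.hstack([X, np.ones((n_samples, 1))])
params, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w_hat, b_hat = params[:-1], params[-1]

print(w_hat, b_hat)
```

If a trained `Dense(1)` layer ends up far from the least-squares solution on the same data, that usually points to too few epochs or poorly scaled features.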
model.predict(x) 
array([[200148.9  ],
       [188514.89 ],
       [205523.38 ],
       ...,
       [218126.67 ],
       [115857.586],
       [191505.69 ]], dtype=float32)
y.head() 
0    208500
1    181500
2    223500
3    140000
4    250000
Name: SalePrice, dtype: int64
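To put the predictions and the true prices side by side, the sketch below computes the absolute and relative error for the first three rows, using only the values visible in the two outputs above. These are training rows, so the numbers say nothing about how the model would do on unseen houses.

```python
import numpy as np

# First three predictions and targets, copied from the outputs shown above
y_pred = np.array([200148.9, 188514.89, 205523.38])
y_true = np.array([208500.0, 181500.0, 223500.0])

abs_err = np.abs(y_pred - y_true)   # absolute error in price units
rel_err = abs_err / y_true          # error as a fraction of the true price
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))

for p, t, r in zip(y_pred, y_true, rel_err):
    print(f"predicted {p:.0f}, actual {t:.0f}, relative error {r:.1%}")
print(f"RMSE over these three rows: {rmse:.0f}")
```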