1. The softmax algorithm
(1). Definition of softmax
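Softmax turns a vector of raw scores into a probability distribution: softmax(x_i) = exp(x_i) / Σ_j exp(x_j). Every output lies between 0 and 1, the outputs of one sample sum to 1, and the largest logit receives the largest probability.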
(2). softmax in TensorFlow
# Compute softmax
tf.nn.softmax(logits, name=None)

# Take the log of softmax
tf.nn.log_softmax(logits, name=None)
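As a quick sanity check, here is a minimal sketch (the constants are only illustrative) that runs both functions in a TF 1.x session and confirms that the softmax outputs of each sample sum to 1:

import tensorflow as tf

logits = tf.constant([[2.0, 0.5, 6.0]])
probs = tf.nn.softmax(logits)          # probabilities, each row sums to 1
log_probs = tf.nn.log_softmax(logits)  # log of those probabilities

with tf.Session() as sess:
    p, lp = sess.run([probs, log_probs])
    print(p)              # roughly [[0.018 0.004 0.978]]
    print(p.sum(axis=1))  # [1.]
    print(lp)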
2. Loss functions
A loss function describes how far the model's predictions are from the true values. Two algorithms are commonly used: mean squared error (MSE) and cross entropy.
(1). Mean squared error (MSE)
# Three equivalent ways of writing MSE
MSE = tf.reduce_mean(tf.pow(tf.sub(logits, outputs), 2.0))
MSE = tf.reduce_mean(tf.square(tf.sub(logits, outputs)))
MSE = tf.reduce_mean(tf.square(logits - outputs))
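Note that tf.sub was renamed tf.subtract in TensorFlow 1.x. A minimal sketch with toy values (the constants are only illustrative) showing that the three forms produce the same number:

import tensorflow as tf

logits = tf.constant([[0.8], [0.2]])
outputs = tf.constant([[1.0], [0.0]])

mse1 = tf.reduce_mean(tf.pow(tf.subtract(logits, outputs), 2.0))
mse2 = tf.reduce_mean(tf.square(tf.subtract(logits, outputs)))
mse3 = tf.reduce_mean(tf.square(logits - outputs))

with tf.Session() as sess:
    print(sess.run([mse1, mse2, mse3]))  # all three print 0.04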
(2). Cross entropy
# Cross entropy between the input logits and targets
tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)

# Softmax cross entropy between logits and labels. logits and labels must have the same shape and dtype
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

# Same functionality as the function above, except that the true labels do not need to be one-hot encoded.
# The class indices must start from 0: with 2 classes, the labels can only take the values 0 and 1.
tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels, name=None)

# Multiplies the first (positive) term of the cross entropy by a coefficient (pos_weight),
# increasing or decreasing the loss contributed by positive samples
tf.nn.weighted_cross_entropy_with_logits(logits, targets, pos_weight, name=None)
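To make the difference between the dense and the sparse variant concrete, here is a minimal sketch (toy values only): the same loss comes out whether the true class is given as a one-hot vector or as a class index starting from 0.

import tensorflow as tf

logits = tf.constant([[2.0, 0.5, 6.0], [0.1, 0.0, 3.0]])
onehot_labels = tf.constant([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
index_labels = tf.constant([2, 1])  # the same classes written as indices

dense = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=index_labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(dense))   # roughly [0.022 3.100]
    print(sess.run(sparse))  # same values as the dense version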
(3). MSE experiment
# -*- coding:utf-8 -*-
# Loss function loss: the gap between the prediction (y) and the known answer (y_)
# loss_mse = tf.reduce_mean(tf.square(y_ - y))
# Cross entropy ce (Cross Entropy): the distance between two probability distributions
# H(y_, y) = -∑ y_ * log y
import tensorflow as tf
import numpy as np

BATCH_SIZE = 8
seed = 23455

# A fixed random seed makes the generated data reproducible
rdm = np.random.RandomState(seed)
X = rdm.rand(32, 2)
Y_ = [[x1 + x2 + (rdm.rand() / 10.0 - 0.05)] for (x1, x2) in X]

# Define the network input, parameters and output, and the forward pass
x = tf.placeholder(tf.float32, shape=(None, 2))
y_ = tf.placeholder(tf.float32, shape=(None, 1))
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1))
y = tf.matmul(x, w1)

# Define the loss function and the back-propagation method:
# the loss is MSE, back-propagation uses gradient descent
loss_mse = tf.reduce_mean(tf.square(y_ - y))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)

# Create a session and train for STEPS rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 2000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 32
        end = (i * BATCH_SIZE) % 32 + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 500 == 0:
            print('After %d training steps, w1 is:' % (i))
            print(sess.run(w1), '\n')
    print('Final w1 is:\n', sess.run(w1))

The output is:

After 0 training steps, w1 is:
[[-0.80974597]
 [ 1.4852903 ]]

After 500 training steps, w1 is:
[[-0.46074435]
 [ 1.641878  ]]

After 1000 training steps, w1 is:
[[-0.21939856]
 [ 1.6984766 ]]

After 1500 training steps, w1 is:
[[-0.04415594]
 [ 1.7003176 ]]

Final w1 is:
 [[0.08883245]
 [1.6731207 ]]
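Since Y_ is generated as x1 + x2 plus a small amount of zero-mean noise, the ideal weights are roughly [1, 1]. After only 2000 steps w1 is still on its way there; increasing STEPS should move it closer to [1, 1].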
(4). Cross-entropy experiment
# -*- coding:utf-8 -*-
import tensorflow as tf

labels = [[0, 0, 1], [0, 1, 0]]
logits = [[2, 0.5, 6], [0.1, 0, 3]]

logits_scaled = tf.nn.softmax(logits)
logits_scaled2 = tf.nn.softmax(logits_scaled)

result1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
result2 = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits_scaled)
result3 = -tf.reduce_sum(labels * tf.log(logits_scaled), 1)

with tf.Session() as sess:
    print('scaled=', sess.run(logits_scaled))
    # After a second softmax the probability distribution changes again
    print('scaled2=', sess.run(logits_scaled2))
    print('rel1=', sess.run(result1), '\n')  # the correct way
    print('rel2=', sess.run(result2), '\n')
    print('rel3=', sess.run(result3), '\n')

The output is:

scaled= [[0.01791432 0.00399722 0.97808844]
 [0.04980332 0.04506391 0.90513283]]
scaled2= [[0.21747023 0.21446465 0.56806517]
 [0.2300214  0.22893383 0.5410447 ]]
rel1= [0.02215516 3.0996735 ]

rel2= [0.56551915 1.4743223 ]

rel3= [0.02215518 3.0996735 ]
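rel1 and rel3 agree because tf.nn.softmax_cross_entropy_with_logits applies softmax to its logits internally and then computes -Σ labels * log(softmax(logits)), which is exactly what result3 does by hand. rel2 is wrong: logits_scaled has already been through softmax, so the function effectively applies softmax twice. Pass raw logits to the function, or, if you already have probabilities, compute the cross entropy manually as in result3.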