softmax

How best to deal with “None of the above” in Image Classification?

南笙酒味 submitted on 2019-12-09 10:58:52
Question: This seems to be a fundamental question which some of you out there must have an opinion on. I have an image classifier implemented in CNTK with 48 classes. If the image does not match any of the 48 classes very well, then I'd like to be able to conclude that it was not among these 48 image types. My original idea was simply that if the highest output of the final Softmax layer was low, I would be able to conclude that the test image matched none well. While I occasionally see this occur, in…
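A common starting point (a sketch, not taken from the post, which itself suggests that this kind of thresholding is not always reliable): reject the prediction as "none of the above" when the top softmax probability falls below a cut-off. The threshold value here is an assumption to be tuned on held-out data.

import numpy as np

def classify_with_reject(probs, threshold=0.5):
    # probs: softmax output over the known classes
    top = int(np.argmax(probs))
    if probs[top] < threshold:
        return None  # no known class matches well enough
    return top

print(classify_with_reject(np.array([0.30, 0.25, 0.45])))  # None
print(classify_with_reject(np.array([0.05, 0.90, 0.05])))  # 1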

Argmax on a tensor and ceiling in Tensorflow

帅比萌擦擦* submitted on 2019-12-08 12:20:45
Question: Suppose I have a tensor in Tensorflow whose values are like:

A = [[0.7, 0.2, 0.1], [0.1, 0.4, 0.5]]

How can I change this tensor into the following:

B = [[1, 0, 0], [0, 0, 1]]

In other words, I want to keep only the maximum and replace it with 1. Any help would be appreciated.

Answer 1: I think that you can solve it with a one-liner:

import tensorflow as tf
import numpy as np

x_data = [[0.7, 0.2, 0.1], [0.1, 0.4, 0.5]]
# I am using hard-coded dimensions for simplicity
x = tf.placeholder(dtype=tf…
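The excerpt cuts off mid-snippet; below is a minimal self-contained sketch of the usual one-liner (argmax per row, then one-hot encode), which is not necessarily the answer's exact code:

import tensorflow as tf  # TF 1.x style

x = tf.constant([[0.7, 0.2, 0.1], [0.1, 0.4, 0.5]])
# index of the row-wise maximum, re-expanded to a 0/1 matrix
b = tf.one_hot(tf.argmax(x, axis=1), depth=3)

with tf.Session() as sess:
    print(sess.run(b))  # [[1. 0. 0.]
                        #  [0. 0. 1.]]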

TensorFlow Error Notes

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-08 10:40:19
ValueError: No gradients provided for any variable
Explanation: there is no path in the graph connecting the trainable variables to the loss function. Cause: very likely sess.run() or x.eval() was called before sess.run(train_step). Fix: do not run anything before training; restructure the code so that all ops are executed in the final session.

Training produces nan outputs
The exact cause is unclear; I fixed my case by changing the earlier tf.nn.softmax(x) to tf.nn.log_softmax(x).

ValueError: setting an array element with a sequence
Usually because an array is expected where a list was passed, or a list is expected where an array was passed; start debugging from there.

Optimizer errors: GradientDescentOptimizer runs fine, but RMSPropOptimizer and AdamOptimizer raise errors
AdamOptimizer and RMSPropOptimizer create new internal variables (slots), so tf.initialize_all_variables() must be run after the optimizer is defined. See the sketch below.
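A minimal sketch of the last point (a hypothetical graph; the variable names are illustrative, not from the notes):

import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(tf.zeros([1, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

# Adam creates extra slot variables here, so the initializer must be
# built *after* minimize(); otherwise the slots stay uninitialized.
train_step = tf.train.AdamOptimizer(0.01).minimize(loss)
init = tf.global_variables_initializer()  # newer name for tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    sess.run(train_step, feed_dict={x: [[1.0]], y: [[2.0]]})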

Neural Network using Softmax with strange outputs

こ雲淡風輕ζ submitted on 2019-12-07 16:33:15
Question: I'm trying to build a tensorflow neural network using a sigmoid activation hidden layer and a softmax output layer with 3 classes. The outputs are mostly very bad, and I believe it is because I am making a mistake in my model construction, because I've built a similar model in Matlab and the results were good. The data is normalized. The results look like this:

[9.2164397e-01 1.6932052e-03 7.6662831e-02]
[3.4100169e-01 2.2419590e-01 4.3480241e-01]
[2.3466848e-06 1.3276369e-04 9…

Softmax function of a numpy array by row

穿精又带淫゛_ submitted on 2019-12-07 12:43:13
Question: I am trying to apply a softmax function to a numpy array, but I am not getting the desired results. This is the code I have tried:

import numpy as np
x = np.array([[1001, 1002], [3, 4]])
softmax = np.exp(x - np.max(x)) / np.sum(np.exp(x - np.max(x)))
print softmax

I think the x - np.max(x) code is not subtracting the max of each row. The max needs to be subtracted from x to prevent very large numbers. This is supposed to output

np.array([[0.26894142, 0.73105858], [0.26894142, 0.73105858]])

But I…
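The excerpt is truncated; the usual fix (a sketch, not necessarily the accepted answer) is to take both the max and the sum per row with axis=1 and keepdims=True:

import numpy as np

def softmax_rows(x):
    # subtract each row's own max for numerical stability
    z = x - np.max(x, axis=1, keepdims=True)
    e = np.exp(z)
    # normalize within each row
    return e / np.sum(e, axis=1, keepdims=True)

x = np.array([[1001, 1002], [3, 4]])
print(softmax_rows(x))
# [[0.26894142 0.73105858]
#  [0.26894142 0.73105858]]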

Notes on the ArcFace Algorithm

回眸只為那壹抹淺笑 submitted on 2019-12-06 21:06:38
Paper: ArcFace: Additive Angular Margin Loss for Deep Face Recognition
Paper link: https://arxiv.org/abs/1801.07698
Code link: https://github.com/deepinsight/insightface

This paper proposes a new loss function for face recognition, the additive angular margin loss, and the face recognition algorithm trained with it is called ArcFace (the open-source code names the algorithm insightface; the two mean the same thing, and ArcFace is used below). The idea behind ArcFace (additive angular margin) shares some common ground with SphereFace and the more recent CosineFace (additive cosine margin). The key point is that ArcFace maximizes the classification margin directly in angular space, whereas CosineFace maximizes it in cosine space; this is also why the paper is called ArcFace, since "arc" carries the same meaning as "angular". Beyond the loss function, the authors also cleaned the public MS-Celeb-1M dataset and emphasized the effect of clean data on the experimental results, and they additionally tuned the network architecture and parameters. Overall, the ArcFace paper runs many experiments to validate the additive angular…
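A minimal numpy sketch of the additive angular margin idea (illustrative only; the function name and the scale s / margin m defaults are assumptions, not the authors' implementation):

import numpy as np

def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    # L2-normalize features and class weights so each logit equals cos(theta)
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos_theta = x @ w                                  # (batch, num_classes)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    # add the angular margin m only on the ground-truth class
    margin = np.zeros_like(cos_theta)
    margin[np.arange(len(labels)), labels] = m
    return s * np.cos(theta + margin)  # feed into softmax cross-entropy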

softmax

帅比萌擦擦* submitted on 2019-12-06 12:36:23
1. What? The softmax function is an activation function that turns numbers, a.k.a. logits, into probabilities that sum to one. It outputs a vector that represents the probability distribution over a list of potential outcomes.

2. How? Two components: the special number e and a sum.

3. Why not just divide each logit by the sum of the logits? Why do we need exponents? When some logits are negative, simply adding them together does not give a correct normalization; exponentiating the logits makes them all positive. See the supplementary note on logits.

4. Python implementation:

import numpy as np
def softmax(logits):
    ## base e; list…
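The snippet above is cut off; here is a hedged completion under the same definition (the stability shift by the max is a standard addition, not necessarily in the original):

import numpy as np

def softmax(logits):
    # exponentiate with base e, then normalize so the outputs sum to one
    exps = np.exp(logits - np.max(logits))  # shift by the max for stability
    return exps / np.sum(exps)

print(softmax([2.0, 1.0, 0.1]))  # [0.65900114 0.24243297 0.09856589]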

Model Distillation (Distil) in Practice on MNIST

假如想象 submitted on 2019-12-05 23:44:21
Conclusion: distillation is a good method.

Model compression/distillation is introduced in the papers "Model Compression" and "Distilling the Knowledge in a Neural Network"; below, the latter is described and tested on the MNIST dataset with Keras.

Distillation: use a small model to imitate the generalization behavior of a large model. Normally, when training on MNIST, the target is the class label; in distillation, the teacher model's output probability distribution is used as the "soft target" instead. That is, the loss is the cross-entropy between the student network's and the teacher network's outputs (this follows the strategy in the DistilBERT paper, which differs from the original one). Once the teacher network is trained, the class labels are no longer needed; only the output distributions of the two networks are compared. Of course, the student's own classification loss can be added to the total as well; the paper mentions this as a further optimization.

As shown in the paper's figure, the softmax formula is slightly modified (the logits are scaled down) so that the outputs are smaller and the post-softmax distribution is smoother. (Figure: the paper's loss definition.) The loss used in the code here is the cross-entropy between p and q; see the sketch below.

Code and testing:

1. Teacher network, test accuracy 99.46%, already quite good, with 858,618 trainable parameters.

# Teacher network
inputs = Input((28, 28, 1))
x = Conv2D(64, 3)(inputs)
x = BatchNormalization(center=True, scale=False)(x)
x = Activation('relu')(x)
x = Conv2D(64, 3, strides=2)(x)
x…
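A minimal numpy sketch of the temperature-softened cross-entropy described above (the temperature value T and the helper names are assumptions, not the post's Keras code):

import numpy as np

def soft_targets(logits, T=5.0):
    # divide the logits by the temperature T before softmax,
    # which smooths the resulting distribution
    z = logits / T
    e = np.exp(z - np.max(z, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, T=5.0):
    p = soft_targets(teacher_logits, T)  # teacher distribution (soft target)
    q = soft_targets(student_logits, T)  # student distribution
    return -np.mean(np.sum(p * np.log(q + 1e-12), axis=1))  # H(p, q)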

[Repost] MNIST Machine Learning for Beginners

亡梦爱人 submitted on 2019-12-05 09:05:04
MNIST Machine Learning for Beginners

Reposted from: http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_beginners.html?plg_nld=1&plg_uin=1&plg_auth=1&plg_nld=1&plg_usr=1&plg_vkey=1&plg_dev=1

This tutorial is aimed at readers who are new to both machine learning and TensorFlow. If you already know about MNIST and softmax regression, you can read the quick-start tutorial instead.

When we start learning to program, the first thing we do is often print "Hello World". Just as programming has Hello World, machine learning has MNIST.

MNIST is an entry-level computer vision dataset consisting of images of handwritten digits. It also includes a label for each image telling us which digit it is; for example, the labels of the four example images shown in the original tutorial are 5, 0, 4, and 1.

In this tutorial we will train a machine learning model to predict the digit in an image. The goal is not to design a world-class, complex model (although source code for a first-rate prediction model is given later) but to introduce how to use TensorFlow. So we start with a very simple mathematical model called Softmax Regression. The implementation code for this tutorial is very short…
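A minimal sketch of the tutorial's Softmax Regression model (TF 1.x style, reconstructed from the standard MNIST beginner tutorial rather than copied from the repost):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 images
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)        # softmax regression
y_ = tf.placeholder(tf.float32, [None, 10])   # one-hot labels

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})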