tensorflow

How to add report_tensor_allocations_upon_oom to RunOptions in Keras

☆樱花仙子☆ submitted on 2021-02-06 15:13:29
Question: I'm trying to train a neural net on a GPU using Keras and am getting a "Resource exhausted: OOM when allocating tensor" error. The specific tensor it's trying to allocate isn't very big, so I assume some previous tensor consumed almost all the VRAM. The error message comes with a hint that suggests this: "Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info." That sounds good, but how do I do it?

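Secrets handed down by the nunchaku "master" Yulao (鱼佬): how do you run a data mining algorithm competition?

自作多情 submitted on 2021-02-06 15:11:16
Once we have a grounding in the basic theory of machine learning and data mining, entering a data algorithm competition exposes us to real business problems and real data, carries theoretical knowledge over into engineering practice, and lets us deepen our understanding of the theory through repeated reflection during the competition. In this talk, I will draw on my own competition experience and the general state of the community to discuss how to run a data mining algorithm competition and what needs to be done before, during, and after it. I will close with a case study showing how I worked through one competition. Note: the full video of this talk is shared on Alibaba Tianchi at 7 pm, and a replay is available at https://tianchi.aliyun.com/course/live?liveId=41153 Outline: Why take part in data mining competitions, and what can they bring? What background knowledge and skills do you need? How do you choose a competition that suits you? The main modules of a competition. The most important things during a competition. A good post-competition write-up matters more than the competition itself. Case study (Tianchi "National Urban Computing AI Challenge"). Why take part in data mining competitions? Moving from theory to engineering application; real data and more project experience. A plus when job hunting, valued by employers; companies run competitions to select talent. Prize money as an incentive (generous). Making friends, learning, and competing against top players. What background knowledge and skills do you need? Theory: evaluation metrics, data analysis, feature engineering, common models. Tools. Language: Python. Visualization tools: Matplotlib, Seaborn. Data processing tools: Pandas, NumPy. Machine learning libraries: Sklearn, XGBoost, LightGBM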

How to use the Tensorflow Dataset Pipeline for Variable Length Inputs?

风流意气都作罢 submitted on 2021-02-06 12:49:53
Question: I am training a recurrent neural network in TensorFlow on a dataset of number sequences of varying lengths and have been trying to use the tf.data API to create an efficient pipeline. However, I can't seem to get it to work. My approach: My dataset is a NumPy array of shape [10000, ?, 32, 2], which is saved on my disk as a file in the .npy format. Here the ? denotes that elements have variable length in the second dimension. 10000 denotes the number of minibatches in the dataset and

python tensorflow import dll load failed

佐手、 submitted on 2021-02-06 12:02:15
Question: I installed the latest Python 3.6.4 x64 version and then installed CPU-only TensorFlow with pip3 (C:\>pip3 install tensorflow). However, when I tried to import tensorflow in Python, it showed me the error below. I am sure that I have installed the Microsoft Visual C++ 2015 Redistributable (x64), so a missing msvcp140.dll should not be the problem. It says "DLL load failed with error code -1073741795". So what exactly is the problem here? I cannot find any other information about this error code. my os

python tensorflow import dll load failed

邮差的信 submitted on 2021-02-06 11:56:39
Question: I installed the latest Python 3.6.4 x64 version and then installed CPU-only TensorFlow with pip3 (C:\>pip3 install tensorflow). However, when I tried to import tensorflow in Python, it showed me the error below. I am sure that I have installed the Microsoft Visual C++ 2015 Redistributable (x64), so a missing msvcp140.dll should not be the problem. It says "DLL load failed with error code -1073741795". So what exactly is the problem here? I cannot find any other information about this error code. my os

How to set parameters of the Adadelta Algorithm in Tensorflow correctly?

邮差的信 submitted on 2021-02-06 10:51:41
Question: I've been using TensorFlow for regression. My neural net is very small, with 10 input neurons, 12 hidden neurons in a single layer, and 5 output neurons. The activation function is ReLU, and the cost is the squared distance between the output and the real value. My neural net trains correctly with other optimizers such as GradientDescent, Adam, and Adagrad. However, when I try to use Adadelta, the neural net simply won't train: the variables stay the same at every step. I have tried with every initial learning_rate

What is the difference between Loss, accuracy, validation loss, Validation accuracy?

南楼画角 submitted on 2021-02-06 10:12:21
Question: At the end of each epoch, I am getting, for example, the following output: Epoch 1/25 2018-08-06 14:54:12.555511: 2/2 [==============================] - 86s 43s/step - loss: 6.0767 - acc: 0.0469 - val_loss: 4.1037 - val_acc: 0.2000 Epoch 2/25 2/2 [==============================] - 26s 13s/step - loss: 3.6901 - acc: 0.0938 - val_loss: 2.5610 - val_acc: 0.0000e+00 Epoch 3/25 2/2 [==============================] - 66s 33s/step - loss: 3.1491 - acc: 0.1406 - val_loss: 2.4793 - val_acc: 0.0500 Epoch

What is the difference between Loss, accuracy, validation loss, Validation accuracy?

好久不见. submitted on 2021-02-06 10:11:45
Question: At the end of each epoch, I am getting, for example, the following output: Epoch 1/25 2018-08-06 14:54:12.555511: 2/2 [==============================] - 86s 43s/step - loss: 6.0767 - acc: 0.0469 - val_loss: 4.1037 - val_acc: 0.2000 Epoch 2/25 2/2 [==============================] - 26s 13s/step - loss: 3.6901 - acc: 0.0938 - val_loss: 2.5610 - val_acc: 0.0000e+00 Epoch 3/25 2/2 [==============================] - 66s 33s/step - loss: 3.1491 - acc: 0.1406 - val_loss: 2.4793 - val_acc: 0.0500 Epoch

Training a tf.keras model with a basic low-level TensorFlow training loop doesn't work

冷暖自知 submitted on 2021-02-06 09:55:28
Question: Note: All code for a self-contained example to reproduce my problem can be found below. I have a tf.keras.models.Model instance and need to train it with a training loop written in the low-level TensorFlow API. The problem: Training the exact same tf.keras model once with a basic, standard low-level TensorFlow training loop and once with Keras' own model.fit() method produces very different results. I would like to find out what I'm doing wrong in my low-level TF training loop. The model is a

Keras, TensorFlow : “TypeError: Cannot interpret feed_dict key as Tensor”

余生颓废 submitted on 2021-02-06 09:22:14
Question: I am trying to use Keras fine-tuning to develop an image classification application. I deployed the application to a web server, and image classification succeeds. However, when the application is used from two or more computers at the same time, the following error message appears and the application doesn't work. TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(3, 3, 3, 64), dtype=float32) is not an element of this graph. Here is my code for image