Tensorflow: ran out of memory trying to allocate 3.90GiB. The caller indicates that this is not a failure

Submitted by 江枫思渺然 on 2019-12-05 17:18:33

Question


There is a warning message that I don't understand:

Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.90GiB. 
The caller indicates that this is not a failure, 
but may mean that there could be performance gains if more memory is available.

What does this message mean?

I have read the source code, but I could not understand it.
The GPU has 6 GB of memory, but the memory usage reported by my tfprof analysis is about 14 GB, which exceeds the GPU's memory. Does this message mean that TensorFlow is allocating CPU memory instead, or that it is using some smarter algorithm to manage GPU memory?

I am using TensorFlow 1.2.

The GPU information is as follows:

  • name: GeForce GTX TITAN Z
  • major: 3 minor: 5 memoryClockRate (GHz) 0.8755
  • Total memory: 5.94GiB
  • Free memory: 5.87GiB

My Code:

#!/usr/bin/python3.4

import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data
import os
os.environ["CUDA_VISIBLE_DEVICES"] = '0' 


mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
sess = tf.InteractiveSession(config=tf.ConfigProto(log_device_placement=True))
#sess = tf.InteractiveSession()
def weight_variable(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def bias_variable(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
x_image = tf.reshape(x, [-1, 28, 28, 1])


W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)


W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)


W_f1 = weight_variable([7*7*64, 1024])
b_f1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_f1) + b_f1)


keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_f2 = weight_variable([1024, 10])
b_f2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_f2) + b_f2)

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


test_images = tf.placeholder(tf.float32, [None, 784])
test_labels = tf.placeholder(tf.float32, [None, 10])


tf.global_variables_initializer().run()

run_metadata = tf.RunMetadata()


for i in range(100):
    batch = mnist.train.next_batch(10000)
    if i % 10 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob : 0.5}, options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE), run_metadata=run_metadata)

tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    run_meta=run_metadata,
    tfprof_options=tf.contrib.tfprof.model_analyzer.PRINT_ALL_TIMING_MEMORY)

test_images = mnist.test.images[0:300, :]
test_labels = mnist.test.labels[0:300, :]
print("test accuracy %g" % accuracy.eval({x: test_images, y_: test_labels, keep_prob: 1.0}))

The warning:

2017-08-10 21:37:44.589635: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.90GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-08-10 21:37:46.208897: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.61GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

The result of tfprof:

==================Model Analysis Report======================
_TFProfRoot (0B/14854.97MB, 0us/7.00ms)
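
For a rough sense of scale, the float32 activations produced by the forward pass of the graph above can be counted by hand. This is a minimal sketch, not a reproduction of the tfprof figure: the helper names (tensor_bytes, forward_activation_shapes) are made up for illustration, and the estimate ignores gradients, intermediate pre-activation tensors, optimizer state, and any convolution workspace, all of which add to the real footprint.

```python
BYTES_PER_FLOAT32 = 4


def tensor_bytes(shape):
    """Return the size in bytes of a dense float32 tensor with this shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_FLOAT32


def forward_activation_shapes(batch_size):
    """Shapes of the named layer outputs in the question's graph.

    'SAME' padding with stride 1 keeps 28x28 through each convolution,
    and each 2x2 max-pool with stride 2 halves the spatial size.
    """
    return {
        "h_conv1": (batch_size, 28, 28, 32),
        "h_pool1": (batch_size, 14, 14, 32),
        "h_conv2": (batch_size, 14, 14, 64),
        "h_pool2": (batch_size, 7, 7, 64),
        "h_fc1": (batch_size, 1024),
    }


def total_forward_bytes(batch_size):
    """Sum the sizes of the named forward activations (a partial estimate)."""
    shapes = forward_activation_shapes(batch_size)
    return sum(tensor_bytes(shape) for shape in shapes.values())


for batch_size in (10000, 100):
    gib = total_forward_bytes(batch_size) / float(2 ** 30)
    print("batch %5d: about %.3f GiB of named forward activations" % (batch_size, gib))
```

With a batch of 10000, the first convolution output alone is roughly 0.93 GiB, and the total grows linearly with the batch size.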

Answer 1:


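You are running on the GPU with a batch size of 10000, which is a lot for a 10-class problem. Use a much smaller batch size, such as 10 to 20, and increase the loop range to 10e3 or even 10e4 to compensate. This problem is well known. If you definitely want to keep 10000 as the batch size, tell TensorFlow to use the CPU by building the graph inside a tf.device scope:

with tf.device('/cpu:0'):
    # build the model here (variables, layers, loss, optimizer)
    ...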
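
The smaller-batch advice applies to evaluation as well as training: a metric can be computed on small slices and averaged instead of feeding 10000 rows at once. The following is a minimal sketch in plain Python; iter_minibatches, mean_over_minibatches, and metric_fn are hypothetical names introduced here, not part of TensorFlow.

```python
def iter_minibatches(images, labels, batch_size):
    """Yield successive (images, labels) slices of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        yield images[start:stop], labels[start:stop]


def mean_over_minibatches(metric_fn, images, labels, batch_size):
    """Average a per-batch metric, weighting each batch by its row count.

    metric_fn is a caller-supplied function (hypothetical) that takes one
    slice of images and labels and returns the mean metric for that slice.
    """
    total = 0.0
    count = 0
    for batch_images, batch_labels in iter_minibatches(images, labels, batch_size):
        total += metric_fn(batch_images, batch_labels) * len(batch_images)
        count += len(batch_images)
    if count == 0:
        raise ValueError("no data to evaluate")
    return total / count
```

With the graph from the question, metric_fn could be something like lambda xs, ys: accuracy.eval(feed_dict={x: xs, y_: ys, keep_prob: 1.0}), so that each call only pushes one small slice through the network.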


Source: https://stackoverflow.com/questions/45625691/tensorflow-ran-out-of-memory-trying-to-allocate-3-90gib-the-caller-indicates-t
