Is there a way of determining how much GPU memory is in use by TensorFlow?

醉梦人生 2020-11-30 04:59

TensorFlow tends to preallocate all of the available memory on its GPUs. For debugging, is there a way of telling how much of that memory is actually in use?

4 Answers
  •  日久生厌
    2020-11-30 05:16

    (1) There is some limited support in Timeline for logging memory allocations. Here is an example of its usage:

        # Timeline lives in a private module; this import path works in TF 1.x.
        from tensorflow.python.client import timeline

        # Request a full trace (timings and memory) for this session run.
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
        run_metadata = tf.RunMetadata()
        summary, _ = sess.run([merged, train_step],
                              feed_dict=feed_dict(True),
                              options=run_options,
                              run_metadata=run_metadata)
        train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
        train_writer.add_summary(summary, i)
        print('Adding run metadata for', i)

        # Convert the collected step stats into a Chrome trace, including
        # memory allocation events (show_memory=True), and write it to disk.
        tl = timeline.Timeline(run_metadata.step_stats)
        trace = tl.generate_chrome_trace_format(show_memory=True)
        print(trace)
        trace_file = tf.gfile.Open(name='timeline', mode='w')
        trace_file.write(trace)
        trace_file.close()
    

    You can try this code with the MNIST example (the mnist_with_summaries tutorial).

    This will generate a tracing file named timeline, which you can open in Chrome at chrome://tracing. Note that this gives only approximate GPU memory usage statistics: it essentially simulates the GPU execution, but does not have access to the full graph metadata, and it cannot know how many variables have been assigned to the GPU.

    (2) For a very coarse measure of GPU memory usage, nvidia-smi will show the total device memory usage at the time you run the command.
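    Because TensorFlow preallocates most of the device memory (as noted in the question), the nvidia-smi number mostly reflects the size of the preallocated pool rather than what is actually in use. A minimal sketch, assuming the TF 1.x ConfigProto API, that disables preallocation so the nvidia-smi reading tracks real allocations more closely:

        import tensorflow as tf

        # With allow_growth=True, TensorFlow starts with a small pool and
        # grows it on demand instead of grabbing (almost) all device memory
        # up front, so nvidia-smi readings roughly track real usage.
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        sess = tf.Session(config=config)

    Keep in mind that the pool only grows and is not returned to the device, so the reading is a high-water mark rather than instantaneous usage.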

    nvprof can show the on-chip shared memory usage and register usage at the CUDA kernel level, but doesn't show the global/device memory usage.

    Here is an example command:

        nvprof --print-gpu-trace matrixMul

    And more details here: http://docs.nvidia.com/cuda/profiler-users-guide/#abstract
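    If you want to log device memory over the course of a training run, a small polling sketch like the following can help. It uses nvidia-smi's query interface (--query-gpu=memory.used); the one-second interval and sample count are arbitrary choices for illustration:

        import subprocess
        import time

        def gpu_memory_used_mib():
            """Return per-GPU used memory in MiB, as reported by nvidia-smi."""
            out = subprocess.check_output(
                ['nvidia-smi', '--query-gpu=memory.used',
                 '--format=csv,noheader,nounits'])
            return [int(x) for x in out.decode().strip().split('\n')]

        # Example: sample once a second while training runs elsewhere.
        for _ in range(5):
            print(gpu_memory_used_mib())
            time.sleep(1)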
