Is there a way of determining how much GPU memory is in use by TensorFlow?

醉梦人生 2020-11-30 04:59

TensorFlow tends to preallocate all of the available memory on its GPUs. For debugging, is there a way of telling how much of that memory is actually in use?

4 Answers
  •  日久生厌
    2020-11-30 05:16

    (1) There is some limited support in Timeline for logging memory allocations. Here is an example of its usage:

        # Timeline lives in a private module; this import path works in TF 1.x.
        from tensorflow.python.client import timeline

        # Request a full trace (timings and memory) for this session run.
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
        run_metadata = tf.RunMetadata()
        summary, _ = sess.run([merged, train_step],
                              feed_dict=feed_dict(True),
                              options=run_options,
                              run_metadata=run_metadata)
        train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
        train_writer.add_summary(summary, i)
        print('Adding run metadata for', i)

        # Convert the collected step stats into a Chrome trace, including
        # memory allocation events (show_memory=True), and write it to disk.
        tl = timeline.Timeline(run_metadata.step_stats)
        trace = tl.generate_chrome_trace_format(show_memory=True)
        print(trace)
        trace_file = tf.gfile.Open(name='timeline', mode='w')
        trace_file.write(trace)
        trace_file.close()
    

    You can try this code with the MNIST example (the mnist_with_summaries tutorial).

    This will generate a tracing file named timeline, which you can open in Chrome at chrome://tracing. Note that this gives only approximate GPU memory usage statistics: it essentially simulates the GPU execution, but does not have access to the full graph metadata, and it cannot know how many variables have been assigned to the GPU.

    (2) For a very coarse measure of GPU memory usage, nvidia-smi will show the total device memory usage at the time you run the command.
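    Because TensorFlow preallocates most of the device memory (as noted in the question), the nvidia-smi number mostly reflects the size of the preallocated pool rather than what is actually in use. A minimal sketch, assuming the TF 1.x ConfigProto API, that disables preallocation so the nvidia-smi reading tracks real allocations more closely:

        import tensorflow as tf

        # With allow_growth=True, TensorFlow starts with a small pool and
        # grows it on demand instead of grabbing (almost) all device memory
        # up front, so nvidia-smi readings roughly track real usage.
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        sess = tf.Session(config=config)

    Keep in mind that the pool only grows and is not returned to the device, so the reading is a high-water mark rather than instantaneous usage.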

    nvprof can show the on-chip shared memory usage and register usage at the CUDA kernel level, but doesn't show the global/device memory usage.

    Here is an example command:

        nvprof --print-gpu-trace matrixMul

    And more details here: http://docs.nvidia.com/cuda/profiler-users-guide/#abstract
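    If you want to log device memory over the course of a training run, a small polling sketch like the following can help. It uses nvidia-smi's query interface (--query-gpu=memory.used); the one-second interval and sample count are arbitrary choices for illustration:

        import subprocess
        import time

        def gpu_memory_used_mib():
            """Return per-GPU used memory in MiB, as reported by nvidia-smi."""
            out = subprocess.check_output(
                ['nvidia-smi', '--query-gpu=memory.used',
                 '--format=csv,noheader,nounits'])
            return [int(x) for x in out.decode().strip().split('\n')]

        # Example: sample once a second while training runs elsewhere.
        for _ in range(5):
            print(gpu_memory_used_mib())
            time.sleep(1)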
