Multi-GPU profiling (Several CPUs, MPI/CUDA Hybrid)

日久生厌 2021-01-03 02:30

I had a quick look on the forums and I don't think this question has been asked already.

I am currently working with an MPI/CUDA hybrid code, made by somebody else

4 Answers
  •  日久生厌
    2021-01-03 03:10

    Apparently, since 2015 it has been possible to auto-annotate MPI calls with NVTX by using an mpi_interceptions.so library together with the nvprof profiler:

    https://devblogs.nvidia.com/gpu-pro-tip-track-mpi-calls-nvidia-visual-profiler/

    http://on-demand.gputechconf.com/gtc/2017/presentation/s7495-jain-optimizing-application-performance-cuda-profiling-tools.pdf
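
To give an idea of what such an interception library does, here is a minimal sketch for a single call, `MPI_Barrier`. It relies on the standard MPI profiling interface (every `MPI_X` has a `PMPI_X` twin) and on the NVTX calls `nvtxRangePushA` / `nvtxRangePop`. The macro `USE_REAL_MPI_NVTX` and the stand-in definitions in the `#else` branch are my own assumptions, there only so the file compiles on a machine without MPI or the CUDA toolkit. This is not the code from the linked blog post, just an illustration of the technique.

```c
/* Sketch: intercept one MPI call and mark it as an NVTX range.
 * USE_REAL_MPI_NVTX is a hypothetical macro chosen for this example. */
#ifdef USE_REAL_MPI_NVTX
#include <mpi.h>
#include <nvToolsExt.h>
#else
/* Stand-ins (illustration only, NOT the real APIs) so that the sketch
 * compiles without MPI or NVTX installed. */
typedef int MPI_Comm;
#define MPI_COMM_WORLD 0
#define MPI_SUCCESS 0
static int open_ranges = 0; /* number of ranges currently open */
static int nvtxRangePushA(const char *name) { (void)name; return open_ranges++; }
static int nvtxRangePop(void) { return --open_ranges; }
static int PMPI_Barrier(MPI_Comm comm) { (void)comm; return MPI_SUCCESS; }
#endif

/* Same name and signature as the MPI function. When this is built as a
 * shared library and preloaded, the application's MPI_Barrier calls land
 * here first; the real work is forwarded to PMPI_Barrier. */
int MPI_Barrier(MPI_Comm comm)
{
    nvtxRangePushA("MPI_Barrier"); /* open a named range on the timeline */
    int rc = PMPI_Barrier(comm);   /* call the actual MPI implementation */
    nvtxRangePop();                /* close the range */
    return rc;
}
```

With real MPI and NVTX, I would expect something like this to be compiled into a shared library with `-DUSE_REAL_MPI_NVTX`, linked against `nvToolsExt`, and loaded through `LD_PRELOAD` when the application runs under nvprof. A full library would repeat the same wrapper for each MPI call of interest.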

    TAU still does not support distributed deep learning, according to this presentation:

    http://on-demand.gputechconf.com/gtc/2017/presentation/s7684-allen-malony-performance-analysis-of-cuda-deep-learning-networks-using-tau.pdf
