Multi-GPU profiling (Several CPUs, MPI/CUDA Hybrid)
Question: I had a quick look through the forums and I don't think this question has been asked already. I am currently working with an MPI/CUDA hybrid code, written by somebody else during his PhD. Each CPU has its own GPU. My task is to gather data by running the (already working) code and to implement additional features. Turning this code into a single-CPU / multi-GPU one is not an option at the moment (possibly later). I would like to use performance profiling tools to analyse the whole thing. For now an