Completely disable optimizations on NVCC

Posted by 拜拜、爱过 on 2019-12-05 02:55:46

Question


I'm trying to measure peak single-precision FLOPS on my GPU. To do that, I'm modifying a PTX file to perform successive MAD instructions on registers. Unfortunately, the compiler removes all of the code because it does nothing useful: I never load or store the data. Is there a compiler flag or pragma I can add so that the compiler leaves the code untouched?

Thanks.


Answer 1:


I don't think there is any way to turn off such optimization in the compiler. You can work around this by adding code to store your values and wrapping that code in a conditional statement that is always false. To make a conditional that the compiler can't determine to always be false, use at least one variable (not just constants).
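As a rough illustration of this workaround, here is a minimal CUDA sketch. The kernel name `mad_chain`, the helper `time_mad_chain`, the `sentinel` parameter, and the constants are all hypothetical and not from the question. One choice here goes slightly beyond the answer: the condition compares the computed result against a runtime argument, so the compiler can neither prove the store dead nor skip the loop that feeds it. With the inputs used below the comparison is never true in practice.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical benchmark kernel; names and constants are illustrative only.
// `sentinel` is only known at run time, so the compiler cannot prove that
// the guarded store is dead and has to keep the FMA chains that feed it.
__global__ void mad_chain(float *out, float a, float b, int iters, float sentinel)
{
    float x = threadIdx.x * 1.0e-3f;
    float y = x + 1.0f;
    for (int i = 0; i < iters; ++i) {
        x = fmaf(x, a, b);  // two independent dependency chains
        y = fmaf(y, a, b);
    }
    // With a = 0.999f and b = 0.001f both chains stay positive, so comparing
    // against a negative sentinel is false in practice and nothing is stored.
    if (x + y == sentinel) {
        out[blockIdx.x * blockDim.x + threadIdx.x] = x + y;
    }
}

// Launches the kernel once and returns the elapsed time in milliseconds,
// or -1.0f if allocation or the launch failed.
float time_mad_chain(int blocks, int threads, int iters)
{
    assert(blocks > 0 && threads > 0 && iters >= 0);

    float *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(float) * blocks * threads) != cudaSuccess) {
        return -1.0f;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    mad_chain<<<blocks, threads>>>(d_out, 0.999f, 0.001f, iters, -1.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = -1.0f;
    if (cudaGetLastError() == cudaSuccess) {
        cudaEventElapsedTime(&ms, start, stop);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return ms;
}
```

Calling `time_mad_chain(blocks, threads, iters)` from a `main` function gives a time from which throughput can be estimated as `4.0 * iters * blocks * threads / (ms * 1e6)` GFLOP/s, since each iteration issues two FMAs per thread. Two dependent chains per thread are unlikely to saturate the hardware on their own, so treat the number as a lower bound, and consider checking the generated machine code (for example with `cuobjdump -sass`) to confirm the FMA instructions survived.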




Answer 2:


To completely disable optimizations with nvcc, you can use the following:

nvcc -O0 -Xopencc -O0 -Xptxas -O0  // sm_1x targets using Open64 frontend
nvcc -O0 -Xcicc -O0 -Xptxas -O0 // sm_2x and sm_3x targets using NVVM frontend

Note that the resulting code may be extremely slow. The -O0 flag is passed to the host compiler to disable host code optimization. The -Xopencc -O0 and -Xcicc -O0 flags control the compiler frontend (the part that produces PTX) and turn off optimizations there. The -Xptxas -O0 flag controls the compiler backend (the part that converts PTX to machine code) and turns off optimizations in that part. Note that -Xopencc, -Xcicc, and -Xptxas flags are component-level flags, and unless documented in the nvcc manual, should be considered unsupported.




Answer 3:


(I am still on CUDA 4.0; this may have changed in newer versions.)

To disable optimizations in ptxas (the tool that converts PTX into cubin), you need to pass the option --opt-level 0 (the default is --opt-level 3). To pass this option through nvcc, prefix it with --ptxas-options.

Note, however, that ptxas performs many useful optimizations that, when disabled, may make your code much slower, if not outright incorrect. For example, it does register allocation and tries to work out which accesses go to shared memory and which go to global memory.




Answer 4:


These worked for me:

-g -G -Xcompiler -O0 -Xptxas -O0 -lineinfo -O0




Answer 5:


As far as I know, there is no compiler flag or pragma for that. However, you can restructure the code to compute more and store less.



Source: https://stackoverflow.com/questions/11821605/completely-disable-optimizations-on-nvcc
