perf: strange relation between software events

Anonymous (unverified), posted 2019-12-03 00:44:02

Question:

Okay, so this really bugs me.

I'm using perf to record the cpu-clock event (a software event):

$ > perf record -e cpu-clock srun -n 1 ./stream 

... and the table produced by perf report is empty.

I'm using perf to record all available software events listed in perf list:

$ > perf record -e alignment-faults,context-switches,cpu-clock,cpu-migrations,dummy,emulation-faults,major-faults,minor-faults,page-faults,task-clock srun -n 1 ./stream

... the table gives me a list of available samples:

  0 alignment-faults
125 context-switches
255 cpu-clock
 21 cpu-migrations
  0 dummy
  0 emulation-faults
  0 major-faults
128 minor-faults
132 page-faults
254 task-clock

I can look at the samples collected in cpu-clock and it gives me information. Why?! Why does it not work if I only measure cpu-clock? Why were there no samples collected in four events?

This is a follow-up to this question: error: perf.data file has no samples

Answer 1:

srun probably doesn't start the target process with a direct fork. It may use some variant of a remote shell (like ssh) or a daemon to start processes.

perf record (without the -a option) tracks only the sub-processes forked from the command it launches, not processes started (forked) by sshd or another daemon. And if srun dispatches the job to a remote machine, perf record ... srun will never profile that machine: this form profiles the srun application itself and everything it forks.
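One way to check whether the target really was forked from srun is to walk its parent chain. The following is only a sketch: it assumes Linux with /proc mounted, and the helper names parent_pid and is_descendant are made up for illustration, they are not part of perf or Slurm.

```shell
# Print the parent PID of process $1 by reading /proc (Linux-only assumption).
parent_pid() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value _; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Succeed if process $1 is $2 itself or was forked (directly or
# indirectly) from $2; fail once the chain reaches the top without a match.
is_descendant() {
    pid=$1
    ancestor=$2
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        [ "$pid" -eq "$ancestor" ] && return 0
        pid=$(parent_pid "$pid") || return 1
    done
    return 1
}

# Hypothetical usage, with both PIDs looked up by hand while the job runs:
#   is_descendant "$stream_pid" "$srun_pid" && echo "perf record ... srun can see it"
```

If the check fails for the ./stream process, it was started by something outside srun's process tree, which would match the empty report.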

Try perf stat first to get the total (raw) performance counters, and pass perf as an argument to srun; this is the correct usage with tools that use a remote shell or daemons (you may need the full path to perf):

 srun -n 1 perf stat ./stream
 srun -n 1 /usr/bin/perf stat ./stream

perf stat will print the running time of the target task. Then select an event with a high raw count (perf record usually tunes the sample rate to around several kHz, so thousands of samples will be generated if there are enough raw event counts):

 srun -n 1 perf record -e cpu-clock ./stream
 srun -n 1 /usr/bin/perf record -e cpu-clock ./stream
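The remark about the sample rate can be turned into a rough sanity check: with a time-based event such as cpu-clock, the number of samples is roughly the sampling frequency times the CPU time the profiled process used. The sketch below uses integer shell arithmetic only. The function names are made up for illustration, and the 4000 Hz figure is an assumption about the default frequency; it depends on the perf version and can be set explicitly with -F.

```shell
# Rough number of samples to expect.
# $1: sampling frequency in Hz, $2: CPU time of the target in whole seconds
expected_samples() {
    echo $(( $1 * $2 ))
}

# Rough CPU time in milliseconds (rounded down) behind a given sample count.
# $1: number of samples, $2: sampling frequency in Hz
cpu_ms_from_samples() {
    echo $(( $1 * 1000 / $2 ))
}

expected_samples 4000 2        # assumed 4000 Hz, 2 s of CPU time
cpu_ms_from_samples 255 4000   # the 255 cpu-clock samples from the question
```

Under that assumed frequency, the 255 cpu-clock samples in the question would correspond to only a few tens of milliseconds of CPU time, which fits a launcher like srun being profiled rather than the benchmark itself.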

