Linux perf: how to interpret results and find hotspots

说谎 2020-11-28 21:14

I tried out Linux's perf utility today and am having trouble interpreting its results. I'm used to valgrind's callgrind, which is of course a totally different approach.

5 answers
  •  难免孤独
    2020-11-28 21:32

    With Linux 3.7, perf is finally able to use DWARF information to generate the call graph:

    perf record --call-graph dwarf -- yourapp
    perf report -g graph --no-children
    

    Neat, but the curses GUI is horrible compared to VTune, KCacheGrind or similar tools. I recommend trying out FlameGraphs instead, which are a pretty neat visualization: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

    Note: In the report step, -g graph makes the output easier to understand by showing percentages "relative to total" rather than "relative to parent". --no-children shows only self cost rather than inclusive cost, a feature that I also find invaluable.
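
    The FlameGraph workflow mentioned above can be sketched roughly as follows. This is only a minimal sketch: it assumes the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) have been downloaded into ./FlameGraph, and both the FLAMEGRAPH_DIR variable and the make_flamegraph helper are hypothetical names used for illustration.

```shell
# Assumption: the FlameGraph scripts live in ./FlameGraph unless
# FLAMEGRAPH_DIR is already set (hypothetical variable name).
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"

# make_flamegraph (hypothetical helper): render a perf recording as an SVG.
#   $1: recording written by perf record (e.g. perf.data)
#   $2: path of the SVG file to create
make_flamegraph() {
    # perf script dumps the recorded samples with their call stacks,
    # stackcollapse-perf.pl folds each stack into a single line,
    # flamegraph.pl renders the folded stacks as an SVG.
    perf script -i "$1" \
        | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
        | "$FLAMEGRAPH_DIR/flamegraph.pl" > "$2"
}
```

    After a perf record run like the one above, make_flamegraph perf.data out.svg should produce a file that can be opened in a browser.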

    If you have a recent perf and an Intel CPU, also try out the LBR unwinder, which has much better performance and produces far smaller result files:

    perf record --call-graph lbr -- yourapp
    

    The downside here is that the call stack depth is more limited compared to the default DWARF unwinder configuration.
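
    One way to choose between the two unwinders in a script is to probe for LBR support first and fall back to DWARF otherwise. This is only a sketch: pick_callgraph_mode is a hypothetical helper, and it assumes that perf record exits with a non-zero status when the CPU or kernel cannot provide LBR call stacks.

```shell
# pick_callgraph_mode (hypothetical helper): print "lbr" if a trivial
# LBR recording succeeds, otherwise print "dwarf".
pick_callgraph_mode() {
    # Record the no-op command "true" into a throwaway file.
    # Assumption: perf record fails here when LBR is unavailable.
    probe=$(mktemp)
    if perf record --call-graph lbr -o "$probe" -- true >/dev/null 2>&1; then
        echo lbr
    else
        echo dwarf
    fi
    rm -f "$probe"
}
```

    It could then be used as perf record --call-graph "$(pick_callgraph_mode)" -- yourapp.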
