perf

How to measure mispredictions for a single branch on Linux?

余生长醉 submitted on 2019-12-05 00:29:12
Question: I know that I can get the total percentage of branch mispredictions during the execution of a program with perf stat. But how can I get the statistics for a specific branch (an if or switch statement in C code)?
Answer 1: You can sample on the branch-misses event: sudo perf record -e branch-misses <yourapp> and then report it (optionally selecting the function you're interested in): sudo perf report -n --symbols=<yourfunction> There you can access the annotated code and get some statistics for a

Adding dynamic tracepoint through perf in Linux for function that is not listed

£可爱£侵袭症+ submitted on 2019-12-04 20:18:43
I am trying to trace the function zap_pte_range from mm/memory.c using perf, but the function is not listed in the output of perf probe -F. So is there a way to trace this function dynamically, for example by explicitly adding a tracepoint and recompiling the kernel? perf probe -a zap_pte_range gives:
[kernel.kallsyms] with build id 33b15ec444475ee7806331034772f61666fa6719 not found, continuing without symbols
Failed to find symbol zap_pte_range in kernel
Error: Failed to add events.
There is no such trace point. So apparently you cannot trace it the easy way. It seems that this function was inlined by compiler
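One quick way to test the inlining hypothesis is to look for the symbol in /proc/kallsyms. The Python sketch below is not part of the original thread. The helper name find_symbol is made up for illustration. It assumes the usual kallsyms layout (address, type, name, optional module) and also matches GCC's suffixed clones such as .isra.N, .constprop.N or .part.N.

```python
def find_symbol(kallsyms_text, name):
    """Return (address, type, symbol) tuples for `name` or its suffixed clones.

    `kallsyms_text` is the text of /proc/kallsyms, one symbol per line:
    "<address> <type> <symbol> [module]".
    """
    hits = []
    for line in kallsyms_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbol = parts[2]
        # Exact match, or a compiler-generated clone like "name.isra.0"
        # (assumption: clones keep the original name as a dotted prefix).
        if symbol == name or symbol.startswith(name + "."):
            hits.append((parts[0], parts[1], symbol))
    return hits
```

Feeding it the contents of /proc/kallsyms (reading real addresses usually needs root) and getting back an empty list would be consistent with the function having been inlined, since an inlined function has no symbol of its own to probe.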

How to catch the L3-cache hits and misses by perf tool in Linux

被刻印的时光 ゝ submitted on 2019-12-04 15:58:36
Question: Is there any way to catch L3-cache hits and misses with the perf tool on Linux? According to the output of perf list cache, the L1 and LLC caches are supported. According to the definition of the perf_evsel__hw_cache array in perf's source code: const char *perf_evsel__hw_cache[PERF_COUNT_HW_CACHE_MAX] [PERF_EVSEL__MAX_ALIASES] = { { "L1-dcache", "l1-d", "l1d", "L1-data", }, { "L1-icache", "l1-i", "l1i", "L1-instruction", }, { "LLC", "L2", }, { "dTLB", "d-tlb", "Data-TLB", }, { "iTLB", "i-tlb",
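For context, the generic cache events behind these names are encoded in perf_event_attr.config as cache id | (op id << 8) | (result id << 16) when the event type is PERF_TYPE_HW_CACHE, as described in perf_event_open(2). The Python sketch below only reproduces that arithmetic. The function name hw_cache_config and the dictionaries are illustrative, and whether the "LL" (last-level) cache is actually L3 depends on the CPU.

```python
# Numeric ids follow the enums in linux/perf_event.h
# (perf_hw_cache_id, perf_hw_cache_op_id, perf_hw_cache_op_result_id).
CACHE_ID = {"L1D": 0, "L1I": 1, "LL": 2, "DTLB": 3, "ITLB": 4, "BPU": 5, "NODE": 6}
OP_ID = {"READ": 0, "WRITE": 1, "PREFETCH": 2}
RESULT_ID = {"ACCESS": 0, "MISS": 1}


def hw_cache_config(cache, op, result):
    """Build the config value for a PERF_TYPE_HW_CACHE event."""
    return CACHE_ID[cache] | (OP_ID[op] << 8) | (RESULT_ID[result] << 16)


# Last-level cache load misses and load accesses, printed in hex.
print(hex(hw_cache_config("LL", "READ", "MISS")))
print(hex(hw_cache_config("LL", "READ", "ACCESS")))
```

The first five cache ids line up with the row order of the alias array quoted in the question (L1-dcache, L1-icache, LLC, dTLB, iTLB).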

Some optimizations for a manually compiled and installed Ceph Jewel

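别说谁变了你拦得住时间么 submitted on 2019-12-04 15:09:04
1. Installing Ceph manually. Following http://my.oschina.net/linuxhunter/blog/682013, install the Jewel release of Ceph manually on a hardware server. 2. How to test the Ceph cluster. Use the rados bench command that ships with Ceph for a simple performance test of the manually built cluster, and use the perf command to observe system performance. Because the default Ubuntu installation does not include perf and its dependencies, perf has to be installed manually: # apt-get install perf linux-tools-4.4.0-21-generic. After installation, open two terminals and run # perf top in one and # rados bench -p test_rbd 60 write --no-cleanup in the other. 3. The problem found. In the terminal running # perf top, ceph-osd was spending 35% of its CPU time in the function ceph_crc32_sctp, even though the cluster was not under heavy load at the time, so I decided to look in the source code for why ceph_crc32_sctp takes so much CPU time. ceph_crc32_sctp is located in src/common/sctp_crc32.c, and only the ceph_choose_crc32 function calls it. Analyzing ceph_choose_crc32 makes it clear that this function chooses the method for computing crc32 according to the current CPU architecture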

Logging Memory Access Footprint

老子叫甜甜 submitted on 2019-12-04 13:32:01
I found mtrace by Dr. Clements. Although it is useful, it does not work properly in the situation I need. I intend to use the recorded trace to understand memory access patterns in different scenarios. Can someone share related experience? Any suggestion will be appreciated. 0313 update: I'm trying to use qemu-mtrace to boot Ubuntu 16.04 with linux-mtrace (3.8.0), but it only shows several error messages and terminates. I hope some tool is able to log every access. $ ./qemu-system-x86_64 -mtrace-enable -mtrace-file mtrace.out -hda ubuntu.img -m 1024 Error: mtrace_entry_ascope (exit, syscall:xx) with no

perf mem -D report

北城余情 submitted on 2019-12-04 13:27:46
I used perf mem -t load record "commands" to profile system memory access latency. Afterwards I ran perf mem -D report and got the following results:
[root@mdtm-server wenji]# perf mem -D report
# PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL
2054 2054 0xffffffff811186bf 0x016ffffe8fbffc804b0 49 0x68100842 /lib/modules/3.12.23/build/vmlinux:perf_event_aux_ctx
2054 2054 0xffffffff81321d6e 0xffff880c7fc87d44 7 0x68100142 /lib/modules/3.12.23/build/vmlinux:ghes_copy_tofrom_phys
What do "ADDR", "DSRC" and "SYMBOL" mean? IP - PC of the load/store instruction; SYMBOL - name of function,
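To make the columns concrete, here is a hedged Python sketch that splits one dump line into its seven fields and decodes the two lowest bit fields of DSRC. It is not from the original thread. The names MemSample, parse_line and decode_dsrc are invented for illustration, and the bit layout (mem_op in bits 0-4, mem_lvl in bits 5-18) is assumed from union perf_mem_data_src in linux/perf_event.h, so it should be checked against the headers of the kernel in use.

```python
from typing import NamedTuple


class MemSample(NamedTuple):
    pid: int
    tid: int
    ip: int       # address of the load/store instruction
    addr: int     # data address that was accessed
    weight: int   # LOCAL WEIGHT column (access cost)
    dsrc: int     # raw data-source bit field
    dso: str      # binary the IP belongs to
    symbol: str   # function containing the IP


# Assumed flag values (PERF_MEM_OP_* and PERF_MEM_LVL_* in linux/perf_event.h).
MEM_OP = {0x01: "N/A", 0x02: "load", 0x04: "store", 0x08: "prefetch", 0x10: "exec"}
MEM_LVL = {0x01: "N/A", 0x02: "hit", 0x04: "miss", 0x08: "L1", 0x10: "LFB",
           0x20: "L2", 0x40: "L3", 0x80: "local RAM"}


def parse_line(line):
    """Parse one data line of `perf mem -D report` output."""
    pid, tid, ip, addr, weight, dsrc, location = line.split(None, 6)
    dso, _, symbol = location.strip().rpartition(":")
    return MemSample(int(pid), int(tid), int(ip, 16), int(addr, 16),
                     int(weight), int(dsrc, 16), dso, symbol)


def decode_dsrc(dsrc):
    """Return (operation names, memory-level names) for a raw DSRC value."""
    op = dsrc & 0x1F             # bits 0-4: mem_op
    lvl = (dsrc >> 5) & 0x3FFF   # bits 5-18: mem_lvl
    ops = [name for bit, name in MEM_OP.items() if op & bit]
    lvls = [name for bit, name in MEM_LVL.items() if lvl & bit]
    return ops, lvls
```

Under that assumed layout, the first sample line in the question (DSRC 0x68100842) decodes as a load that hit in L3, and the second (0x68100142) as a load that hit in L1.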

Can't sample hardware cache events with linux perf

二次信任 submitted on 2019-12-04 12:03:59
For some reason, I can't sample (perf record) hardware cache events:
# perf record -e L1-dcache-stores -a -c 100 -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.607 MB perf.data (~26517 samples) ]
# perf script
but I can count them (perf stat):
# perf stat -e L1-dcache-stores -a -- sleep 5
Performance counter stats for 'sleep 5':
711,781 L1-dcache-stores
5.000842990 seconds time elapsed
I tried different CPUs, OS versions (and kernel versions), and perf versions, but the result is the same. Is this expected behaviour? What is the reason? Can

Which perf events can use PEBS?

天大地大妈咪最大 submitted on 2019-12-04 11:50:25
I want to understand which events can take the precise modifier on my CPU (Sandy Bridge). The Intel Software Developer's Manual (Table 18-32. PEBS Performance Events for Intel Microarchitecture Code Name Sandy Bridge) contains only the following events: INST_RETIRED, UOPS_RETIRED, BR_INST_RETIRED, BR_MISP_RETIRED, MEM_UOPS_RETIRED, MEM_LOAD_UOPS_RETIRED, MEM_LOAD_UOPS_LLC_HIT_RETIRED. And SandyBridge_core_V15.json lists the same events with PEBS > 0. However, there are some examples of using perf that add :p to the cycles event. And I can successfully run perf record -e cycles:p on my

How do you get debugging symbols working in linux perf tool inside Docker containers?

拈花ヽ惹草 submitted on 2019-12-04 11:09:27
Question: I am using Docker containers based on the "ubuntu" tag and cannot get the Linux perf tool to display debugging symbols. Here is what I'm doing to demonstrate the problem. First I start a container, here with an interactive shell. $ docker run -t -i ubuntu:14.04 /bin/bash Then from the container prompt I install the Linux perf tool. $ apt-get update $ apt-get install -y linux-tools-common linux-tools-generic linux-tools-`uname -r` I can now use the perf tool. My kernel is 3.16.0-77-generic. Now I'll

How to measure mispredictions for a single branch on Linux?

匆匆过客 submitted on 2019-12-04 10:18:36
I know that I can get the total percentage of branch mispredictions during the execution of a program with perf stat. But how can I get the statistics for a specific branch (an if or switch statement in C code)? You can sample on the branch-misses event: sudo perf record -e branch-misses <yourapp> and then report it (optionally selecting the function you're interested in): sudo perf report -n --symbols=<yourfunction> There you can access the annotated code and get some statistics for a given branch. Or annotate it directly with perf annotate using the --symbol option. Source: https://stackoverflow.com
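The record-then-annotate workflow from the answer can be scripted. The Python sketch below only builds the argument lists and runs nothing itself. The helper names record_cmd and annotate_cmd and the default file name are illustrative, and it assumes a perf build that supports the -o, -i, --stdio and -s/--symbol options.

```python
def record_cmd(app_argv, event="branch-misses", output="perf.data"):
    """Argument list for sampling `event` while running the given program."""
    return ["perf", "record", "-e", event, "-o", output, "--", *app_argv]


def annotate_cmd(symbol, data_file="perf.data"):
    """Argument list for a text-mode annotation of a single function."""
    return ["perf", "annotate", "--stdio", "-i", data_file, "-s", symbol]


# Hypothetical program and function names, shown only as an example.
print(" ".join(record_cmd(["./yourapp", "--some-arg"])))
print(" ".join(annotate_cmd("yourfunction")))
```

Each list can be passed to subprocess.run (with sudo prepended if the system restricts perf events). In the annotated output, the sample percentages next to the conditional jump instructions of the function are the per-branch statistics the question asks for.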