perf

Detailed introduction to how softlockup/hardlockup works

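眉间皱痕 submitted on 2020-01-16 00:50:31
Reposted from https://blog.csdn.net/hzj_001/article/details/100054659
Three mechanisms are involved: the kernel watchdog thread, a high-resolution timer (timer interrupt), and an NMI (non-maskable interrupt) driven by a PMU hardware perf event.
Basic idea:
1) Soft lockup: preemption stays disabled for a long time, so other processes cannot be scheduled.
2) Hard lockup: interrupts stay disabled for a long time.
How soft lockup detection works:
1) A kernel thread named watchdog is first registered for every CPU core: [watchdog/0], [watchdog/1], [watchdog/2], …
2) The system also has a high-resolution timer (hrtimer) that periodically raises a timer interrupt. Its callback is watchdog_timer_fn(), which does three main things:
a. watchdog_interrupt_count() updates the hrtimer_interrupts variable (used for hard lockup detection).
b. wake_up_process() wakes the watchdog thread (which updates the timestamp).
c. is_softlockup() checks whether a soft lockup has occurred.
The soft lockup detector checks the timestamp; if it has not been updated for longer than the soft lockup threshold, this means …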
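The timer-callback steps listed above can be sketched as a toy model in Python. This is not kernel code: the class CpuWatchdog, the helper first_lockup_time, the 4-second tick, and the 20-second threshold are assumptions made for illustration. Only watchdog_timer_fn, hrtimer_interrupts, and is_softlockup echo names from the text.

```python
from dataclasses import dataclass


@dataclass
class CpuWatchdog:
    """Toy model of the per-CPU soft lockup check (illustrative, not kernel code)."""

    softlockup_thresh: float = 20.0  # seconds; assumed value for the example
    touch_ts: float = 0.0            # last time the watchdog thread ran
    hrtimer_interrupts: int = 0      # incremented on every timer tick

    def watchdog_thread_runs(self, now):
        # The watchdog thread only runs if the scheduler can preempt the
        # current task; when it runs, it refreshes the timestamp.
        self.touch_ts = now

    def is_softlockup(self, now):
        return now - self.touch_ts > self.softlockup_thresh

    def watchdog_timer_fn(self, now, can_schedule):
        """Timer callback: count the tick, wake the thread, check the timestamp."""
        self.hrtimer_interrupts += 1          # step a
        stuck = self.is_softlockup(now)       # step c, evaluated in the callback
        if can_schedule:                      # step b: the woken thread runs
            self.watchdog_thread_runs(now)    # only after the callback returns
        return stuck


def first_lockup_time(tick=4, end=40, preempt_off_from=8):
    """Return the first tick at which a soft lockup is reported, or None."""
    wd = CpuWatchdog()
    for now in range(0, end + 1, tick):
        if wd.watchdog_timer_fn(now, can_schedule=now < preempt_off_from):
            return now
    return None
```

With the default arguments the watchdog thread last runs at t=4, so the first tick at which the gap exceeds 20 seconds is t=28, and first_lockup_time() returns 28. If preemption is never disabled, the function returns None.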

A roundup of Linux performance analysis tools

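爷,独闯天下 submitted on 2020-01-11 23:56:44
I put this article together out of interest in the Linux operating system and a strong desire to understand the layers underneath. It can also serve as a check on your fundamentals, and it touches every part of a system. Without solid knowledge of computer systems, networking, and operating systems, you cannot fully master the tools described here; system performance analysis and optimization is also a long-term effort. This document is mainly based on the blog posts on Linux performance tuning tools by Brendan Gregg, Linux expert and senior performance architect at Netflix, combined with other collected articles on Linux performance optimization. It explains the principles involved and the performance testing tools those posts cover. Background knowledge: analyzing performance problems requires some background, for example hardware caches and the operating system kernel. Application behavior is often entangled with these low-level details, which can affect application performance in unexpected ways. Some programs cannot make full use of the cache, which degrades performance; others make more system calls than necessary, causing frequent kernel/user mode switches. This is only groundwork for the rest of the article. There is much more to tuning; what I do not know far exceeds what I do know, and I hope we can all learn and improve together. [Performance analysis tools] First, look at a figure: The figure above comes from a performance analysis talk by Brendan Gregg. Every tool in it has a help document available through man. The following briefly introduces common usage: ▲ vmstat -- virtual memory statistics vmstat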

Perf tool stat output: multiplex and scaling of “cycles”

前提是你 submitted on 2020-01-04 05:24:06
Question
I am trying to understand the multiplexing and scaling of the "cycles" event in the "perf" output. The following is the output of the perf tool:
144094.487583 task-clock (msec) # 1.017 CPUs utilized
539912613776 instructions # 1.09 insn per cycle (83.42%)
496622866196 cycles # 3.447 GHz (83.48%)
340952514 cache-misses # 10.354 % of all cache refs (83.32%)
3292972064 cache-references # 22.854 M/sec (83.26%)
144081.898558 cpu-clock (msec) # 1.017 CPUs utilized
4189372 page-faults # 0.029 M/sec
0 major
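The percentage in parentheses at the end of a perf stat line reports what share of the run the event actually spent on a hardware counter. When more events are requested than there are counters, perf time-multiplexes them and scales each raw count by time_enabled / time_running. A minimal sketch of that arithmetic follows; the function name scale_count and the sample numbers are assumptions for illustration and are not taken from the output above.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Estimate the full-run count of a multiplexed event.

    raw_count    -- value read from the counter while the event was scheduled
    time_enabled -- time the event was enabled (same unit as time_running)
    time_running -- time the event was actually on a hardware counter
    Returns (scaled_count, percent_of_time_running).
    """
    if time_running == 0:
        # The event never got a counter, so there is nothing to extrapolate.
        raise ValueError("event was never scheduled on a counter")
    scaled = round(raw_count * time_enabled / time_running)
    percent = 100.0 * time_running / time_enabled
    return scaled, percent


# Hypothetical numbers: the event ran for 834,200 of 1,000,000 time units.
scaled, percent = scale_count(834_200, 1_000_000, 834_200)
print(scaled, percent)
```

The scaled value is an estimate: it assumes the event rate while the counter was not scheduled matches the rate while it was.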

Understanding the perf report

≯℡__Kan透↙ submitted on 2020-01-04 01:56:05
Question
I have been working on a time-sensitive project. Because of some undesired spikes in the timing, I had to dig a bit deeper. Scenario: I have a kernel module, which is pinned to a CPU core. This CPU core is also listed in isolcpus in the kernel boot parameters. Here's what I have added to the kernel boot parameters in cmdline:
intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_idle.max_cstate=0 processor.max_cstate=0 nohz_full=7-11 isolcpus=7-11 mce=off rcu_nocbs=7-11

perf: strange relation between software events

大憨熊 submitted on 2020-01-03 04:50:33
Question
Okay, so this really bugs me. I'm using perf to record the cpu-clock event (a software event):
$ > perf record -e cpu-clock srun -n 1 ./stream
... and the table produced by perf report is empty. I'm using perf to record all available software events listed in perf list:
$ > perf record -e alignment-faults,context-switches,cpu-clock,cpu-migrations,\
dummy,emulation-faults,major-faults,minor-faults,page-faults,task-clock \
srun -n 1 ./stream
... the table gives me a list of available samples: 0

system call hardware performance counters ubuntu

断了今生、忘了曾经 submitted on 2020-01-02 20:19:07
Question
I am working on a project and I would like to obtain the performance counter values (cache, TLB, etc.) of a system call (e.g. read()) before and after the execution of a file. I tried doing this using perf on Ubuntu but was not able to get any results. Is there a way to do it using perf or maybe some other tool? Thanks for the help.
3.329057 task-clock (msec) # 0.714 CPUs utilized
16 context-switches # 0.005 M/sec
0 cpu-migrations # 0.000 K/sec
257 page-faults # 0.077 M/sec
1,983,212 cycles # 0

How to get perf_event results for 2nd Nexus7 with Krait CPU

心不动则不痛 submitted on 2020-01-02 18:57:29
Question
Hi all. I am trying to get PMU information such as instructions, cycles, cache misses, etc. on the 2nd Nexus7 with a Krait CPU. The perf tool is not working correctly, so I am following the sample source code from the perf_event tutorials.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>
static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long

Detailed code walkthrough of instructor Song Tian's Monte Carlo method for computing pi in Python

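↘锁芯ラ submitted on 2020-01-02 16:08:31
from time import perf_counter
from random import random
start = perf_counter()
x = 10                     # number of runs to average
total = 0                  # running sum of the per-run estimates
while x > 0:
    zheng = 10000 * 10000  # number of sample points; stands for the area of the square
    yuan = 0               # points that land inside the circle; stands for the area of the circle
    for i in range(zheng):  # draw exactly zheng points
        a, b = random(), random()
        r = pow(a**2 + b**2, 0.5)  # distance to the circle's center, by the Pythagorean theorem
        if r <= 1:
            yuan = yuan + 1
    pi = 4 * (yuan / zheng)
    x = x - 1
    total = total + pi
print('Average of 10 runs: π={:.7f}'.format(total / 10))
print('Time for the 100-million-iteration loops: {:.2f} s'.format(perf_counter() - start))
Source: CSDN Author: qqfushi Link: https://blog.csdn.net/qqfushi/article/details/103805000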

Can I get the python call stack with the linux perf?

谁说我不能喝 submitted on 2020-01-01 07:42:33
Question
For example,
def test():
    print "test"
I used perf record -g -p $pid, but the result was all about PyEval_EvalFrameEx. How can I get the real name "test", or is that not possible with perf?
Answer 1:
As of 2018, perf simply doesn't have support for reading Python stack frames (cf. a 2014 Python mailing list discussion). Python 3.6 has some support for DTrace and SystemTap. An alternative is Pyflame, a stochastic profiler for Python that samples Python call stacks via ptrace(). In
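To illustrate what sampling Python call stacks means, here is a minimal in-process sketch. It is not how Pyflame works (Pyflame attaches from outside the process via ptrace()); this sketch swaps in the CPython-specific sys._current_frames() and a background thread instead. The names sample_stacks, profile, and work, and the 1 ms interval, are assumptions for illustration; work() stands in for the question's test().

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, stop_event, counts, interval=0.001):
    """Periodically record the function names on the target thread's stack."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        if stack:
            # Outermost caller first, innermost frame last.
            counts[";".join(reversed(stack))] += 1
        time.sleep(interval)


def work(duration=0.2):
    """Busy loop standing in for the code being profiled."""
    end = time.perf_counter() + duration
    n = 0
    while time.perf_counter() < end:
        n += 1
    return n


def profile(func):
    """Run func() in the calling thread while a sampler thread watches it."""
    counts = collections.Counter()
    stop = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), stop, counts),
        daemon=True,
    )
    sampler.start()
    try:
        func()
    finally:
        stop.set()
        sampler.join()
    return counts


if __name__ == "__main__":
    for stack, hits in profile(work).most_common(3):
        print(hits, stack)
```

Each key in the returned counter is a semicolon-joined stack, so the Python function name appears directly, which is what the perf output in the question lacks.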

Logging all memory accesses of any executable/process in Linux

喜欢而已 submitted on 2019-12-30 11:27:07
Question
I have been looking for a way to log all memory accesses of a process/execution in Linux. I know similar questions have been asked here before, such as "Logging memory access footprint of whole system in Linux", but I wanted to know if there is any non-instrumentation tool that does this. I am not looking at QEMU/Valgrind for this purpose since they would be a bit slow and I want as little overhead as possible. I looked at perf mem and PEBS events like cpu/mem-loads