Question
I'm currently trying to track down some phantom I/O in a PostgreSQL build I'm testing. It's a multi-process server and it isn't simple to associate disk I/O back to a particular back-end and query.
I thought Linux's perf tool would be ideal for this, but I'm struggling to capture block I/O performance counter metrics and associate them with user-space activity.
It's easy to record block I/O requests and completions with, eg:
sudo perf record -g -T -u postgres -e 'block:block_rq_*'
and the user-space pid is recorded, but there's no kernel or user-space stack captured, or ability to snapshot bits of the user-space process's heap (say, query text) etc. So while you have the pid, you don't know what the process was doing at that point. Just perf script output like:
postgres 7462 [002] 301125.113632: block:block_rq_issue: 8,0 W 0 () 208078848 + 1024 [postgres]
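Even without stacks, output in this shape can at least be totalled per process. A minimal sketch, assuming the block:block_rq_issue layout shown above, a command name without spaces (so the pid is the second field), and that the number after the "+" is the request size in sectors; sectors_by_pid is a hypothetical helper name, not part of perf:

```shell
# Sum the size of issued block requests per pid from `perf script` output on stdin.
# Assumptions: the pid is field 2 and the size is the field right after "+".
sectors_by_pid() {
    awk '/block:block_rq_issue:/ {
        for (i = 1; i < NF; i++)
            if ($i == "+") { total[$2] += $(i + 1); break }
    }
    END { for (pid in total) print pid, total[pid] }'
}
# Usage:
#   sudo perf script | sectors_by_pid
```

This only narrows things down to a back-end pid; it says nothing about what that process was doing at the time.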
If I add the -g flag to perf record it'll take snapshots of the kernel stack, but doesn't capture user-space state for perf events captured in the kernel. The user-space stack only goes up to the entry-point from userspace, like LWLockRelease, LWLockAcquire, memcpy (mmap'd IO), __GI___libc_write, etc.
So. Any tips? Being able to capture a snapshot of the user-space stack in response to kernel events would be ideal.
I'm on Fedora 19, 3.11.3-201.fc19.x86_64, Schrödinger’s Cat, with perf version 3.10.9-200.fc19.x86_64.
Answer 1:
OK, looks like there are several parts to this:
- I'm on x86_64, where most distros build with -fomit-frame-pointer by default, and perf can't follow the stack without frame pointers;
- .... unless it's a newer version built with libunwind support, in which case it supports perf record -g dwarf.
See:
- the patch adding libunwind support to Perf
- Debian bug 725075.
- linux perf: how to interpret and find hotspots
I'm on Fedora 19, but the same issue applies. So if you're profiling code you're working on (as is likely on Stack Overflow), rebuild with -fno-omit-frame-pointer and -ggdb.
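For a project built with an autoconf-style configure script, the rebuild usually comes down to passing those two flags through CFLAGS. A small sketch; fp_cflags is a hypothetical helper, and the -O2 in the usage line is only an example of existing flags:

```shell
# Append the frame-pointer and debug flags to whatever CFLAGS are passed in.
fp_cflags() {
    printf '%s\n' "${1:+$1 }-fno-omit-frame-pointer -ggdb"
}
# Usage (assumes a configure script that honours CFLAGS):
#   CFLAGS="$(fp_cflags "-O2")" ./configure && make
```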
I landed up rebuilding perf because I wanted to be able to compare to the stock RPMs:
- sudo yum build-dep perf
- sudo yum install yum-utils rpmdevtools libunwind-devel
- yumdownloader --source perf or download the appropriate kernel-.....src.rpm srpm
- rpmdev-setuptree
- rpm -Uvh kernel-*.src.rpm
- cd $HOME/rpmbuild/SPECS
- rpmbuild -bp --target=$(uname -m) kernel.spec
At this point you can just build a new perf if you want:
cd $HOME/rpmbuild/BUILD/kernel-*/linux-*/tools/perf
make
... which I did and tested that the updated perf does in fact capture a useful stack if built with libunwind available.
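One quick way to confirm that the freshly built binary picked up libunwind is to look at what it links against. A rough sketch, assuming perf is dynamically linked (a static build would not show up this way); links_libunwind is a hypothetical helper name:

```shell
# Succeeds if the ldd output on stdin mentions libunwind.
links_libunwind() {
    grep -q 'libunwind'
}
# Usage (the path is an assumption; point it at the perf binary you just built):
#   ldd ./perf | links_libunwind && echo "libunwind is linked in"
```

Capturing a real stack is still the better test; this only shows that the library was found at build time.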
You can also build a new rpm:
- edit kernel.spec, uncomment the line %define buildid ..., change buildid to something like .perfunwind. Note it's %define not % define.
- In the same spec file, find:
  %global perf_make \
  make %{?_smp_mflags} -C tools/perf -s V=1 WERROR=0 NO_LIBUNWIND=1 HAVE_CPLUS_DEMANGLE=1 NO_GTK2=1 NO_LIBNUMA=1 NO_STRLCPY=1 prefix=%{_prefix}
  and delete NO_LIBUNWIND=1
- rpmbuild -bb --without up --without mp --without pae --without debug --without doc --without headers --without debuginfo --without bootwrapper --without with_vdso_install --with perf kernel.spec
  to produce new perf RPMs without building the whole kernel. Or if you want, omit the --without for the kernel flavour you want, in which case you'll also want to build headers, debuginfo, etc.
- sudo rpm -Uvh $HOME/rpmbuild/RPMS/x86_64/perf-*.fc19.x86_64.rpm
See the fedora project guide on building a custom kernel.
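The NO_LIBUNWIND=1 removal can also be scripted rather than done by hand. A minimal sketch, assuming the flag appears in the spec only as a whitespace-separated make argument; drop_no_libunwind is a hypothetical helper name:

```shell
# Filter: remove the NO_LIBUNWIND=1 make argument from spec text on stdin.
drop_no_libunwind() {
    sed 's/[[:space:]]NO_LIBUNWIND=1//g'
}
# Usage (writes a new file instead of editing in place; the output name is arbitrary):
#   drop_no_libunwind < kernel.spec > kernel.spec.new
#   diff kernel.spec kernel.spec.new
```

Checking the diff before replacing the spec file is worthwhile, since the pattern removes every occurrence, not just the one in perf_make.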
I've reported the issue to Fedora; they shouldn't be using NO_LIBUNWIND=1. See bug 1025603.
Once you have a rebuilt perf you can use perf record -g dwarf to get full stacks. (Newer perf releases spell this perf record --call-graph dwarf.)
Source: https://stackoverflow.com/questions/19719911/getting-user-space-stack-information-from-perf