Quick way to count number of instructions executed in a C program

泪湿孤枕 提交于 2020-11-24 17:45:31

问题


Is there an easy way to quickly count the number of instructions executed (x86 instructions - which and how many each) while executing a C program ?

I use gcc version 4.7.1 (GCC) on a x86_64 GNU/Linux machine.


回答1:


Probably a duplicate of this question

I say probably because you asked for the assembler instructions, but that question handles the C-level profiling of code.

My question to you would be, however: why would you want to profile the actual machine instructions executed? As a very first issue, this would differ between various compilers, and their optimization settings. As a more practical issue, what could you actually DO with that information? If you are in the process of searching for/optimizing bottlenecks, the code profiler is what you are looking for.

I might miss something important here, though.




回答2:


You can easily count the number of executed instruction using Hardware Performance Counter (HPC). In order to access the HPC, you need an interface to it. I recommended you to use PAPI Performance API.




回答3:


Linux perf_event_open system call with config = PERF_COUNT_HW_INSTRUCTIONS

This Linux system call appears to be a cross architecture wrapper for performance events.

Here's an example adapted from the man perf_event_open page:

perf_event_open.c

#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <inttypes.h>

static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                int cpu, int group_fd, unsigned long flags)
{
    int ret;

    ret = syscall(__NR_perf_event_open, hw_event, pid, cpu,
                    group_fd, flags);
    return ret;
}

int
main(int argc, char **argv)
{
    struct perf_event_attr pe;
    long long count;
    int fd;

    uint64_t n;
    if (argc > 1) {
        n = strtoll(argv[1], NULL, 0);
    } else {
        n = 10000;
    }

    memset(&pe, 0, sizeof(struct perf_event_attr));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(struct perf_event_attr);
    pe.config = PERF_COUNT_HW_INSTRUCTIONS;
    pe.disabled = 1;
    pe.exclude_kernel = 1;
    // Don't count hypervisor events.
    pe.exclude_hv = 1;

    fd = perf_event_open(&pe, 0, -1, -1, 0);
    if (fd == -1) {
        fprintf(stderr, "Error opening leader %llx\n", pe.config);
        exit(EXIT_FAILURE);
    }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* Loop n times, should be good enough for -O0. */
    __asm__ (
        "1:;\n"
        "sub $1, %[n];\n"
        "jne 1b;\n"
        : [n] "+r" (n)
        :
        :
    );

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    read(fd, &count, sizeof(long long));

    printf("Used %lld instructions\n", count);

    close(fd);
}

Compile and run:

g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o perf_event_open.out perf_event_open.c
./perf_event_open.out

Output:

Used 20016 instructions

So we see that the result is pretty close to the expected value of 20000: 10k * two instructions per loop in the __asm__ block (sub, jne).

If I vary the argument, even to low values such as 100:

./perf_event_open.out 100

it gives:

Used 216 instructions

maintaining that constant + 16 instructions, so it seems that accuracy is pretty high, those 16 must be just the ioctl setup instructions after our little loop.

Now you might also be interested in:

  • prevent reordering of the syscalls: Enforcing statement order in C++
  • prevent the test loop from being optimized out: How to prevent GCC from optimizing out a busy wait loop?

Other events of interest that can be measured by this system call:

  • cycle counts: How to get the CPU cycle count in x86_64 from C++?

Tested on Ubuntu 20.04 amd64, GCC 9.3.0, Linux kernel 5.4.0, Intel Core i7-7820HQ CPU.




回答4:


Although not "quick" depending on the program, this may have been answered in this question. Here, Mark Plotnick suggests to use gdb to watch your program counter register changes:

# instructioncount.gdb
set pagination off
set $count=0
while ($pc != 0xyourstoppingaddress)
    stepi
    set $count++
end
print $count
quit

Then, start gdb on your program:

gdb --batch --command instructioncount.gdb --args ./yourexecutable with its arguments

To get the end address 0xyourstoppingaddress, you can use the following script:

# stopaddress.gdb
break main
run
info frame
quit

which puts a breakpoint on the function main, and gives:

$ gdb --batch --command stopaddress.gdb --args ./yourexecutable with its arguments
...
Stack level 0, frame at 0x7fffffffdf70:
 rip = 0x40089d in main (main_aes.c:33); saved rip 0x7ffff7a66d20
 source language c.
 Arglist at 0x7fffffffdf60, args: argc=3, argv=0x7fffffffe048
...

Here what is important is the saved rip 0x7ffff7a66d20 part. On my CPU, rip is the instruction pointer, and the saved rip is the "return address", as stated by pepero in this answer.

So in this case, the stopping address is 0x7ffff7a66d20, which is the return address of the main function. That is, the end of the program execution.




回答5:


Intel Pin's instcount

You can use the Binary Instrumentation tool 'Pin' by Intel. I would avoid using a simulator for such a trivial task (simulators are often extremely slow). Pin does most of the stuff you can do with a simulator without pre-modifying the binary and at a normal execution like speed (depending on the pin tool you are using).

To count the number of instructions with Pin:

  1. Download the latest (or 3.10 if this answer gets old) pin kit from here.
  2. Extract everything and go to the directory: cd pin-root/source/tools/ManualExample/
  3. Make all the tools in the directory: make all
  4. Run the tool called inscount0.so using the command: ../../../pin -t obj-intel64/inscount0.so -- your-binary-here
  5. Get the instruction count in the file inscount.out, cat inscount.out.

The output would be something like:

➜ ../../../pin -t obj-intel64/inscount0.so -- /bin/ls
buffer_linux.cpp       itrace.cpp
buffer_windows.cpp     little_malloc.c
countreps.cpp          makefile
detach.cpp         makefile.rules
divide_by_zero_unix.c  malloc_mt.cpp
isampling.cpp          w_malloctrace.cpp
➜ cat inscount.out
Count 716372



来源:https://stackoverflow.com/questions/13313510/quick-way-to-count-number-of-instructions-executed-in-a-c-program

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!