Get the address that caused segmentation fault from core dump using C

Posted by 血红的双手。 on 2020-01-01 04:16:10

Question


I am trying to write a C program that can parse core dump files. My question is, how can I get the address that caused the core dump in C? I know one can get the address using gdb from this answer:

How can I get GDB to tell me what address caused a segfault?

But I would like to directly retrieve the address in C. Any information would be highly appreciated. Thanks!

Note: I know how to parse a core dump as an ELF file, but I don't know how to get the address that caused the segfault.


Answer 1:


My question is, how can I get the address that caused the core dump in C?

Short answer:

There are two ways to interpret this question.

  1. What was the address of the faulting instruction?

  2. What was the address that was out of bounds?

ELF core dumps keep all of their metadata in notes, which are stored in a note segment. The notes come in different types.

To answer #1, we need to grab the registers. Look at the ELF header to find the program header table. Walk the program header table to find the note segment (type PT_NOTE). Walk the notes in that segment to find one of type NT_PRSTATUS. The payload of this note is a struct elf_prstatus, which can be found in linux/elfcore.h. One of the fields of this struct holds all of the general purpose registers. Grab %rip and you are done.
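The last step can be sketched as follows. This is a minimal sketch, assuming an x86_64 Linux system with glibc, where sys/procfs.h exposes struct elf_prstatus and sys/user.h exposes struct user_regs_struct with the same layout the kernel writes into the note. The function name prstatus_rip is hypothetical, and payload is assumed to point at the NT_PRSTATUS payload.

```c
#include <stdint.h>
#include <string.h>
#include <sys/procfs.h>   /* struct elf_prstatus (glibc, assumed x86_64) */
#include <sys/user.h>     /* struct user_regs_struct */

/* Hypothetical helper: return the saved instruction pointer from an
 * NT_PRSTATUS payload. memcpy avoids alignment and aliasing problems,
 * because the payload sits at an arbitrary offset inside the file. */
uint64_t prstatus_rip(const void *payload)
{
    struct elf_prstatus status;
    struct user_regs_struct regs;

    memcpy(&status, payload, sizeof status);
    /* pr_reg is an array of registers laid out like user_regs_struct. */
    memcpy(&regs, &status.pr_reg, sizeof regs);
    return regs.rip;
}
```

The other general purpose registers can be read from the same regs copy, for example regs.rsp or regs.rbp.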

For #2, we do something similar. This time we are looking for a note of type NT_SIGINFO. The payload of this note is a siginfo_t structure defined in signal.h. For signals that report an address (SIGILL, SIGFPE, SIGSEGV, SIGBUS), the si_addr field is filled in. For SIGSEGV and SIGBUS it contains the address you tried to access but couldn't; for SIGILL and SIGFPE it contains the address of the faulting instruction.
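Reading si_addr from that payload might look like the sketch below. It assumes the core was produced on the same architecture as the program reading it, so that the siginfo_t in signal.h matches the layout in the note. The function name siginfo_fault_addr is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for siginfo_t in strict C modes */
#endif
#include <signal.h>
#include <string.h>

/* Hypothetical helper: copy si_addr out of an NT_SIGINFO payload.
 * Returns 1 and stores the address if the signal is one that carries
 * an address, otherwise returns 0 and leaves *addr_out untouched. */
int siginfo_fault_addr(const void *payload, void **addr_out)
{
    siginfo_t info;

    memcpy(&info, payload, sizeof info);
    switch (info.si_signo) {
    case SIGILL:
    case SIGFPE:
    case SIGSEGV:
    case SIGBUS:
        *addr_out = info.si_addr;
        return 1;
    default:
        return 0;
    }
}
```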

More information is below. In the example core dump, rip is 0x400560, the address of the instruction that attempted the illegal access. It is displayed with the rest of the general purpose registers.

The memory the program tried to access is at 0x3. It is displayed with the rest of the signal information.

The long answer:

I think BFD has 25 years of cruft on it, so I wouldn't use it just to dump the contents of a core file on a Linux box. It might make sense if you had to write general purpose code that works with a bunch of formats, but even then I'm not sure that's how I would go today.

The ELF spec is pretty well written, and it is not hard to walk through the tables of program headers or section headers as needed. All of the process metadata in a core file is contained in a set of notes in a PT_NOTE program segment, which can be parsed in just a few lines of straight C code.
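Before walking any tables, the file has to be in memory and it is worth checking that it really is a 64-bit core. The following is a minimal sketch of that setup on Linux; map_file and is_elf64_core are hypothetical helper names, not part of any library.

```c
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
 * Returns NULL on failure; on success stores the size in *size_out. */
void *map_file(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *image = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (image == MAP_FAILED)
        return NULL;

    *size_out = (size_t)st.st_size;
    return image;
}

/* Hypothetical helper: 1 if the image looks like a 64-bit ELF core file. */
int is_elf64_core(const void *image, size_t size)
{
    Elf64_Ehdr eh;

    if (size < sizeof eh)
        return 0;
    memcpy(&eh, image, sizeof eh);
    return memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
           eh.e_ident[EI_CLASS] == ELFCLASS64 &&
           eh.e_type == ET_CORE;
}
```

The pointer returned by map_file can then be handed to code that walks the program headers, after is_elf64_core has accepted it.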

I wrote a little program to grab the registers out of an x86_64 core file and print some of the metadata. I put it on GitHub. The logic for getting a note payload is in this function:

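#include <elf.h>

/* Return a pointer to the payload of the first note of type nt_type,
 * or 0 if the core file has no such note. vp points to the start of
 * the ELF image in memory. roundup8() is a helper defined elsewhere
 * in the program; it rounds its argument up to a multiple of 8. */
void *get_note(void *vp, int nt_type){
    char *base=vp;      /* char pointer for portable byte arithmetic */
    Elf64_Ehdr *eh=vp;
    for(int i=0; i<eh->e_phnum; ++i){
        Elf64_Phdr *ph=(Elf64_Phdr *)(base+eh->e_phoff+i*eh->e_phentsize);
        if(ph->p_type!=PT_NOTE){
            continue;
        }
        char *note_table=base+ph->p_offset;
        char *note_table_end=note_table+ph->p_filesz;
        char *current=note_table;
        while(current<note_table_end){
            Elf64_Nhdr *note=(Elf64_Nhdr *)current;
            /* Skip the three-word note header and the padded name. */
            char *payload=current+3*sizeof(Elf64_Word);
            payload += roundup8(note->n_namesz);
            if(note->n_type==nt_type){
                return payload;
            }
            /* Skip the padded payload to reach the next note. */
            current=payload+roundup8(note->n_descsz);
        }
    }
    return 0;
}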

The function is handed a pointer to the ELF file and a note type, and returns a pointer to the payload of the associated note, if it exists. The various possible note types are in elf.h. The note types I actually see in core files on my machine are:

#define NT_PRSTATUS 1       /* Contains copy of prstatus struct */
#define NT_FPREGSET 2       /* Contains copy of fpregset struct */
#define NT_PRPSINFO 3       /* Contains copy of prpsinfo struct */
#define NT_AUXV     6       /* Contains copy of auxv array */
#define NT_X86_XSTATE   0x202       /* x86 extended state using xsave */
#define NT_SIGINFO  0x53494749  /* Contains copy of siginfo_t,
                                   size might increase */
#define NT_FILE     0x46494c45  /* Contains information about mapped
                                   files */

Most of these structures are in headers under /usr/include/linux. The xsave structure is a couple of KB of floating point information described in Ch. 13 of the Intel manual. It holds the SSE, AVX, and MPX registers.

The NT_FILE payload doesn't seem to have an associated struct in a header, but it is described in a kernel comment (fs/binfmt_elf.c):

/*
 * Format of NT_FILE note:
 *
 * long count     -- how many files are mapped
 * long page_size -- units for file_ofs
 * array of [COUNT] elements of
 *   long start
 *   long end
 *   long file_ofs
 * followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...
 */
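The layout in that comment can be walked with a few lines of C. The sketch below assumes a 64-bit core, so every long in the note is 8 bytes, and it uses a hypothetical function name, nt_file_lookup, to find which mapped file covers a given address. The page_size and file_ofs fields are skipped because the lookup does not need them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the name of the mapped file whose
 * [start, end) range contains addr, or NULL if none does or if the
 * payload is truncated. The returned pointer refers into payload. */
const char *nt_file_lookup(const void *payload, size_t size, uint64_t addr)
{
    const unsigned char *base = payload;
    uint64_t header[2];               /* count, page_size */

    if (size < sizeof header)
        return NULL;
    memcpy(header, base, sizeof header);

    uint64_t count = header[0];
    if (count > (size - sizeof header) / (3 * sizeof(uint64_t)))
        return NULL;                  /* entry array would not fit */

    const unsigned char *entries = base + sizeof header;
    const char *name = (const char *)(entries + count * 3 * sizeof(uint64_t));
    const char *end = (const char *)base + size;

    for (uint64_t i = 0; i < count; ++i) {
        /* Each filename must be NUL-terminated inside the payload. */
        const char *nul = memchr(name, '\0', (size_t)(end - name));
        if (nul == NULL)
            return NULL;

        uint64_t entry[3];            /* start, end, file_ofs */
        memcpy(entry, entries + i * sizeof entry, sizeof entry);
        if (addr >= entry[0] && addr < entry[1])
            return name;
        name = nul + 1;               /* advance to the next filename */
    }
    return NULL;
}
```

Passing the saved rip to such a lookup would name the binary or shared library that contains the faulting instruction.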

The changes for parsing the ELF file on a 32-bit system are pretty trivial. Use the corresponding Elf32_XXX structures and round up by 4 instead of 8 for the variable-sized fields.
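The rounding helper itself is not shown in the answer. A minimal sketch of what roundup8 and a 4-byte counterpart could look like is given below; round_up and roundup4 are hypothetical names. One caveat worth checking against your kernel version: as far as I can tell, writenote() in fs/binfmt_elf.c pads note names and payloads to 4 bytes even in 64-bit cores, so rounding to 8 only works when the sizes happen to agree, and rounding to 4 may be the safer choice for both ELF classes.

```c
#include <stddef.h>

/* Round n up to the next multiple of align.
 * align must be a power of two for the mask trick to be valid. */
static inline size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Rounding used by the 64-bit note walker in the answer. */
static inline size_t roundup8(size_t n) { return round_up(n, 8); }

/* Rounding for 32-bit cores (hypothetical name). */
static inline size_t roundup4(size_t n) { return round_up(n, 4); }
```

For the common note names the two agree: "CORE" has n_namesz 5 and "LINUX" has n_namesz 6, and both round to 8 either way.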

I've been adding stuff to this little program over the last couple of days. Currently it does the file header, segment headers, general registers, program status, program info and a backtrace. I'll add support for the rest of the notes as I get time. Here is the current output:

 $ ./read_pc -biprst core
General Registers: 
r15     0x000000000000000000  r14     0x000000000000000000  
r13     0x0000007ffc20d36a50  r12     0x000000000000400430  
rbp     0x0000007ffc20d36950  rbx     0x000000000000000000  
r11     0x000000000000000246  r10     0x000000000000000000  
r9      0x000000000000000002  r8      0x000000000000000000  
rax     0x000000000000000003  rcx     0x00000000007ffffffe  
rdx     0x0000007f5817523780  rsi     0x000000000000000001  
rdi     0x000000000000000001  ss      0x00000000000000002b  
rip     0x000000000000400560  cs      0x000000000000000033  
eflags  0x000000000000010246  rsp     0x0000007ffc20d36950  
fs_base 0x0000007f5817723700  gs_base 0x000000000000000000  
ds      0x000000000000000000  es      0x000000000000000000  
fs      0x000000000000000000  gs      0x000000000000000000  
orig_rax 0x00ffffffffffffffff  

Program status: 
signo 11 signal code 0 errno 0
cursig 11 sigpend 000000000000000000 sigheld 000000000000000000
pid 27547 ppid 26600 pgrp 27547 sid 26600
utime: 0.000000 stime 0.000000
cutime: 0.000000 cstime 0.000000
fpvalid: 1


Signal Information: 
signo: 11 errno 0 code 1
addr 0x3 addr_lsb 0 addr_bnd ((nil), (nil))


Process Information:
state 0 (R) zombie 0 nice 0 flags 0x400600
uid 1000 gid 1000 pid 27547 ppid 26600 pgrp 27547 sid 26600
fname: foo
args: ./foo 


Backtrace: 
rip = 0x000000000000400560
rip = 0x000000000000400591
rip = 0x0000000000004005a1


Program Headers:
   Type      Offset             Virt Addr          PhysAddr          
             FileSiz            MemSize              Flags  Align    
 NOTE      0x00000000000004a0 0x0000000000000000 0000000000000000
           0x0000000000000b98 0x0000000000000000         0x000000
 LOAD      0x0000000000002000 0x0000000000400000 0000000000000000
           0x0000000000001000 0x0000000000001000 R X     0x001000
 LOAD      0x0000000000003000 0x0000000000600000 0000000000000000
           0x0000000000001000 0x0000000000001000   X     0x001000
 LOAD      0x0000000000004000 0x0000000000601000 0000000000000000
           0x0000000000001000 0x0000000000001000  WX     0x001000
 LOAD      0x0000000000005000 0x00000000018bf000 0000000000000000
           0x0000000000021000 0x0000000000021000  WX     0x001000
 LOAD      0x0000000000026000 0x00007f581715e000 0000000000000000
           0x0000000000001000 0x00000000001c0000 R X     0x001000
 LOAD      0x0000000000027000 0x00007f581731e000 0000000000000000
           0x0000000000000000 0x00000000001ff000         0x001000
 LOAD      0x0000000000027000 0x00007f581751d000 0000000000000000
           0x0000000000004000 0x0000000000004000   X     0x001000
 LOAD      0x000000000002b000 0x00007f5817521000 0000000000000000
           0x0000000000002000 0x0000000000002000  WX     0x001000
 LOAD      0x000000000002d000 0x00007f5817523000 0000000000000000
           0x0000000000004000 0x0000000000004000  WX     0x001000
 LOAD      0x0000000000031000 0x00007f5817527000 0000000000000000
           0x0000000000001000 0x0000000000026000 R X     0x001000
 LOAD      0x0000000000032000 0x00007f5817722000 0000000000000000
           0x0000000000003000 0x0000000000003000  WX     0x001000
 LOAD      0x0000000000035000 0x00007f581774a000 0000000000000000
           0x0000000000002000 0x0000000000002000  WX     0x001000
 LOAD      0x0000000000037000 0x00007f581774c000 0000000000000000
           0x0000000000001000 0x0000000000001000   X     0x001000
 LOAD      0x0000000000038000 0x00007f581774d000 0000000000000000
           0x0000000000001000 0x0000000000001000  WX     0x001000
 LOAD      0x0000000000039000 0x00007f581774e000 0000000000000000
           0x0000000000001000 0x0000000000001000  WX     0x001000
 LOAD      0x000000000003a000 0x00007ffc20d16000 0000000000000000
           0x0000000000022000 0x0000000000022000  WX     0x001000
 LOAD      0x000000000005c000 0x00007ffc20d9c000 0000000000000000
           0x0000000000002000 0x0000000000002000   X     0x001000
 LOAD      0x000000000005e000 0x00007ffc20d9e000 0000000000000000
           0x0000000000002000 0x0000000000002000 R X     0x001000
 LOAD      0x0000000000060000 0xffffffffff600000 0000000000000000
           0x0000000000001000 0x0000000000001000 R X     0x001000
All worked



Answer 2:


There is an ELF parser provided by the BFD (Binary File Descriptor) library, which is part of binutils and is used by gdb, readelf and others. However it is apparently quite old and crufty, so it may be more straightforward to write your own ELF parser directly from the spec.

The runtime library will normally install a signal handler to trap faults (e.g. SIGSEGV, SIGBUS) and abort. To get the address of the fault, you will most likely need to unwind the stack to make a backtrace. You would also need to have the symbol table available to look up the addresses to match with function names. This is available either as part of the binary (in a debug build) or as a separate symbol table file. The faulting address you're after is _siginfo._sifields._sigfault.si_addr.
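For comparison, in a live process the same si_addr value is delivered to a handler installed with SA_SIGINFO. The sketch below is only an illustration of that field, not a way to read core files: probe_read is a hypothetical helper that reads one byte and recovers with siglongjmp if the read faults.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction and sigsetjmp */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf fault_jmp;
static void *volatile fault_addr;

/* Record the faulting address and jump back out of the handler. */
static void on_fault(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    fault_addr = info->si_addr;
    siglongjmp(fault_jmp, 1);
}

/* Hypothetical helper: try to read one byte at p.
 * Returns 0 if the read succeeded, or 1 if it faulted, in which case
 * *addr_out receives the si_addr reported by the kernel. */
int probe_read(const volatile char *p, void **addr_out)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* Saving the signal mask lets siglongjmp unblock the signal again. */
    if (sigsetjmp(fault_jmp, 1) == 0) {
        char c = *p;              /* volatile read, may fault */
        (void)c;
    } else {
        faulted = 1;
        *addr_out = fault_addr;
    }

    /* Restore whatever handlers were installed before. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}
```

Probing an unmapped address such as 0x3 should report that same address back, which mirrors the addr field in the signal information of a core file.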

It seems that the siginfo object is not stored in the core files. The kernel source for do_coredump() is worth a look. But saving siginfo seems to be something people are working on.

@evaitl gives a great answer above, so my vote goes there. :)

Further reading:

  • ELF core file format
  • Anatomy of an ELF core file
  • A brief look into core dumps
  • Binutils bfd source (git)
  • How can I get GDB to tell me what address caused a segfault?
  • How to generate a stacktrace when my gcc C++ app crashes


Source: https://stackoverflow.com/questions/38330622/get-the-address-that-caused-segmentation-fault-from-core-dump-using-c
