How does reverse debugging work?

谎友^ 2020-12-04 07:21

GDB has a new version out that supports reverse debug (see http://www.gnu.org/software/gdb/news/reversible.html). I got to wondering how that works.

To get reverse

8 answers
  •  予麋鹿 (original poster)
    2020-12-04 08:19

    Mozilla rr is a more robust alternative to GDB's built-in reverse debugging

    https://github.com/mozilla/rr

    GDB's built-in record and replay has severe limitations, e.g. no support for AVX instructions. See the question: gdb reverse debugging fails with "Process record does not support instruction 0xf0d at address"

    Upsides of rr:

    • much more reliable currently. I have tested it on relatively long runs of several complex programs.
    • also offers a GDB interface via the gdbserver protocol, making it a great replacement
    • small performance drop for most programs. I haven't noticed it myself, although I haven't done any measurements
    • the generated traces are small on disk because only very few non-deterministic events are recorded. I've never had to worry about their size so far

    rr achieves this by first running the program in a way that records what happened on every single non-deterministic event such as a thread switch.

    Then during the second replay run, it uses that trace file, which is surprisingly small, to reconstruct exactly what happened on the original non-deterministic run but in a deterministic way, either forwards or backwards.

    rr was originally developed by Mozilla to help them reproduce timing bugs that showed up on their nightly testing the following day. But the reverse debugging aspect is also fundamental when you have a bug that only happens hours into execution, since you often want to step back to examine what previous state led to the later failure.

    The following example showcases some of rr's features, notably the reverse-next, reverse-step and reverse-continue commands.

    Install on Ubuntu 18.04:

    sudo apt-get install rr linux-tools-common linux-tools-generic linux-cloud-tools-generic
    sudo cpupower frequency-set -g performance
    # Overcome "rr needs /proc/sys/kernel/perf_event_paranoid <= 1, but it is 3."
    echo 'kernel.perf_event_paranoid=1' | sudo tee -a /etc/sysctl.conf
    sudo sysctl -p
    

    Test program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    int f() {
        int i;
        i = 0;
        i = 1;
        i = 2;
        return i;
    }
    
    int main(void) {
        int i;
    
        i = 0;
        i = 1;
        i = 2;
    
        /* Local call. */
        f();
    
        printf("i = %d\n", i);
    
        /* Is randomness completely removed?
         * Recently fixed: https://github.com/mozilla/rr/issues/2088 */
        i = time(NULL);
        printf("time(NULL) = %d\n", i);
    
        return EXIT_SUCCESS;
    }
    

    Compile and run:

    gcc -O0 -ggdb3 -o reverse.out -std=c89 -Wextra reverse.c
    rr record ./reverse.out
    rr replay
    

    Now you are left inside a GDB session, and you can properly reverse debug (in the session below, the source file was named a.c rather than reverse.c):

    (rr) break main
    Breakpoint 1 at 0x55da250e96b0: file a.c, line 16.
    (rr) continue
    Continuing.
    
    Breakpoint 1, main () at a.c:16
    16          i = 0;
    (rr) next
    17          i = 1;
    (rr) print i
    $1 = 0
    (rr) next
    18          i = 2;
    (rr) print i
    $2 = 1
    (rr) reverse-next
    17          i = 1;
    (rr) print i
    $3 = 0
    (rr) next
    18          i = 2;
    (rr) print i
    $4 = 1
    (rr) next
    21          f();
    (rr) step
    f () at a.c:7
    7           i = 0;
    (rr) reverse-step
    main () at a.c:21
    21          f();
    (rr) next
    23          printf("i = %d\n", i);
    (rr) next
    i = 2
    27          i = time(NULL);
    (rr) reverse-next
    23          printf("i = %d\n", i);
    (rr) next
    i = 2
    27          i = time(NULL);
    (rr) next
    28          printf("time(NULL) = %d\n", i);
    (rr) print i
    $5 = 1509245372
    (rr) reverse-next
    27          i = time(NULL);
    (rr) next
    28          printf("time(NULL) = %d\n", i);
    (rr) print i
    $6 = 1509245372
    (rr) reverse-continue
    Continuing.
    
    Breakpoint 1, main () at a.c:16
    16          i = 0;
    

    When debugging complex software, you will likely run up to a crash point, and then fall inside a deep frame. In that case, don't forget that to reverse-next on higher frames, you must first:

    reverse-finish
    

    up to that frame; just doing the usual up is not enough.

    The most serious limitations of rr in my opinion are:

    • https://github.com/mozilla/rr/issues/2089 you have to do a second replay from scratch, which can be costly if the crash you are trying to debug happens, say, hours into execution
    • https://github.com/mozilla/rr/issues/1373 x86 only

    UndoDB is a commercial alternative to rr: https://undo.io. Both are trace / replay based, but I'm not sure how they compare in terms of features and performance.
