I\'ll explain:
Let\'s say I\'m interested in replacing the rand()
function used by a certain application.
So I attach gdb to this process and ma
Several of the answers here and the code injection article you linked to in your answer cover chunks of what I consider the optimal gdb
-oriented solution, but none of them pull it all together or cover all the points. The code-expression of the solution is a bit long, so here's a summary of the important steps:
dlopen()
in the inferior process to link in a shared library containing the injected code. In the article you linked to the author instead loaded a relocatable object file and hand-linked it against the inferior. This is quite frankly insane -- relocatable objects are not "ready-to-run" and include relocations even for internal references. And hand-linking is tedious and error-prone -- far simpler to let the real runtime dynamic linker do the work. This does mean getting libdl
into the process in the first place, but there are many options for doing that.dlsym
-- can circumvent the GOT and provide direct access to the function of interest. The only way to be certain of intercepting all calls to a particular function is overwrite the initial instructions of that function's code in-memory to create a "detour" redirecting execution to your injected function.gdb
has included the ability to write new commands in Python. This support can be used to implement a turn-key solution for injecting and detouring code in/to the inferior process.Here's an example. I have the same a
and b
executables as before and an inject2.so
created from the following code:
#include <unistd.h>
#include <stdio.h>
int (*rand__)(void) = NULL;
int
rand(void)
{
int result = rand__();
printf("rand invoked! result = %d\n", result);
return result % 47;
}
I can then place my Python detour
command in detour.py
and have the following gdb
session:
(gdb) source detour.py (gdb) exec-file a (gdb) set follow-fork-mode child (gdb) catch exec Catchpoint 1 (exec) (gdb) run Starting program: /home/llasram/ws/detour/a a: 1933263113 a: 831502921 [New process 8500] b: 918844931 process 8500 is executing new program: /home/llasram/ws/detour/b [Switching to process 8500] Catchpoint 1 (exec'd /home/llasram/ws/detour/b), 0x00007ffff7ddfaf0 in _start () from /lib64/ld-linux-x86-64.so.2 (gdb) break main Breakpoint 2 at 0x4005d0: file b.c, line 7. (gdb) cont Continuing. Breakpoint 2, main (argc=1, argv=0x7fffffffdd68) at b.c:7 7 { (gdb) detour libc.so.6:rand inject2.so:rand inject2.so:rand__ (gdb) cont Continuing. rand invoked! result = 392103444 b: 22 Program exited normally.
In the child process, I create a detour from the rand()
function in libc.so.6
to the rand()
function in inject2.so
and store a pointer to a trampoline for the original rand()
in the rand__
variable of inject2.so
. And as expected, the injected code calls the original, displays the full result, and returns that result modulo 47.
Due to length, I'm just linking to a pastie containing the code for my detour command. This is a fairly superficial implementation (especially in terms of the trampoline generation), but it should work well in a large percentage of cases. I've tested it with gdb
7.2 (most recently released version) on Linux with both 32-bit and 64-bit executables. I haven't tested it on OS X, but any differences should be relatively minor.
For executables you can easily find the address where the function pointer is stored by using objdump. For example:
objdump -R /bin/bash | grep write
00000000006db558 R_X86_64_JUMP_SLOT fwrite
00000000006db5a0 R_X86_64_JUMP_SLOT write
Therefore, 0x6db5a0 is the adress of the pointer for write
. If you change it, calls to write will be redirected to your chosen function. Loading new libraries in gdb and getting function pointers has been covered in earlier posts. The executable and every library have their own pointers. Replacing affects only the module whose pointer was changed.
For libraries, you need to find the base address of the library and add it to the address given by objdump. In Linux, /proc/<pid>/maps
gives it out. I don't know whether position-independent executables with address randomization would work. maps
-information might be unavailable in such cases.
This question intrigued me, so I did a little research. What you are looking for is a 'dll injection'. You write a function to replace some library function, put it in a .so, and tell ld to preload your dll. I just tried it out and it worked great! I realize this doesn't really answer your question in relation to gdb, but I think it offers a viable workaround.
For a gdb-only solution, see my other solution.
// -*- compile-command: "gcc -Wall -ggdb -o test test.c"; -*-
// test.c
#include "stdio.h"
#include "stdlib.h"
int main(int argc, char** argv)
{
//should print a fairly random number...
printf("Super random number: %d\n", rand());
return 0;
}
/ -*- compile-command: "gcc -Wall -fPIC -shared my_rand.c -o my_rand.so"; -*-
//my_rand.c
int rand(void)
{
return 42;
}
compile both files, then run:
LD_PRELOAD="./my_rand.so" ./test
Super random number: 42
I found this tutorial incredibly useful, and so far its the only way I managed to achieve what I was looking with GDB: Code Injection into Running Linux Application: http://www.codeproject.com/KB/DLL/code_injection.aspx
There is also a good Q&A on code injection for Mac here: http://www.mikeash.com/pyblog/friday-qa-2009-01-30-code-injection.html
As long as the function you want to replace is in a shared library, you can redirect calls to that function at runtime (during debugging) by poking at the PLT. Here is an article that might be helpful:
Shared library call redirection using ELF PLT infection
It's written from the standpoint of malware modifying a program, but a much easier procedure is adaptable to live use in the debugger. Basically you just need to find the function's entry in the PLT and overwrite the address with the address of the function you want to replace it with.
Googling for "PLT" along with terms like "ELF", "shared library", "dynamic linking", "PIC", etc. might find you more details on the subject.
I frequently use code injection as a method of mocking for automated testing of C code. If that's the sort of situation you're in -- if your use of GDB is simply because you're not interested in the parent processes, and not because you want to interactively select the processes which are of interest -- then you can still use LD_PRELOAD
to achieve your solution. Your injected code just needs to determine whether it is in the parent or child processes. There are several ways you could do this, but on Linux, since your child processes exec()
, the simplest is probably to look at the active executable image.
I produced two executables, one named a
and the other b
. Executable a
prints the result of calling rand()
twice, then fork()
s and exec()
s b
twice. Executable b
print the result of calling rand()
once. I use LD_PRELOAD
to inject the result of compiling the following code into the executables:
// -*- compile-command: "gcc -D_GNU_SOURCE=1 -Wall -std=gnu99 -O2 -pipe -fPIC -shared -o inject.so inject.c"; -*-
#include <sys/types.h>
#include <unistd.h>
#include <limits.h>
#include <stdio.h>
#include <dlfcn.h>
#define constructor __attribute__((__constructor__))
typedef int (*rand_t)(void);
typedef enum {
UNKNOWN,
PARENT,
CHILD
} state_t;
state_t state = UNKNOWN;
rand_t rand__ = NULL;
state_t
determine_state(void)
{
pid_t pid = getpid();
char linkpath[PATH_MAX] = { 0, };
char exepath[PATH_MAX] = { 0, };
ssize_t exesz = 0;
snprintf(linkpath, PATH_MAX, "/proc/%d/exe", pid);
exesz = readlink(linkpath, exepath, PATH_MAX);
if (exesz < 0)
return UNKNOWN;
switch (exepath[exesz - 1]) {
case 'a':
return PARENT;
case 'b':
return CHILD;
}
return UNKNOWN;
}
int
rand(void)
{
if (state == CHILD)
return 47;
return rand__();
}
constructor static void
inject_init(void)
{
rand__ = dlsym(RTLD_NEXT, "rand");
state = determine_state();
}
The result of running a
with and without injection:
$ ./a
a: 644034683
a: 2011954203
b: 375870504
b: 1222326746
$ LD_PRELOAD=$PWD/inject.so ./a
a: 1023059566
a: 986551064
b: 47
b: 47
I'll post a gdb-oriented solution later.