I have the assembly code of some code that will be executed at a point in the program. I don't know the address of the code in memory.
Is it possible to make gdb br
As others said, it is likely impossible to do it efficiently because there is no hardware support.
But if you really want to do it, this Python command can serve as a starting point:
import gdb


class ContinueI(gdb.Command):
    """
    Continue until instruction with given opcode.

        ci OPCODE

    Example:

        ci callq
        ci mov
    """

    def __init__(self):
        super().__init__(
            'ci',
            gdb.COMMAND_BREAKPOINTS,
            gdb.COMPLETE_NONE,
            False
        )

    def invoke(self, arg, from_tty):
        if arg == '':
            gdb.write('Argument missing.\n')
        else:
            thread = gdb.inferiors()[0].threads()[0]
            # Single-step until the instruction at the new PC starts with the opcode.
            while thread.is_valid():
                gdb.execute('si', to_string=True)
                frame = gdb.selected_frame()
                arch = frame.architecture()
                pc = frame.pc()
                instruction = arch.disassemble(pc)[0]['asm']
                if instruction.startswith(arg + ' '):
                    gdb.write(instruction + '\n')
                    break


# Register the command with gdb.
ContinueI()
Just source it with:
source gdb.py
and use the command as:
ci mov
ci callq
and you will be left on the first instruction executed with the given opcode.
TODO: this will ignore your other breakpoints.
For the particular common case of syscall, you can use catch syscall: https://reverseengineering.stackexchange.com/questions/6835/setting-a-breakpoint-at-system-call
I don't know the address of the code in memory.
What prevents you from finding that address? Run objdump -d, find the instruction of interest, note its address. Problem solved? (This is trivially extended to shared libraries as well.)
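The objdump approach can be scripted. The following is a minimal sketch, not a definitive tool: the helper names find_instructions and disassemble are hypothetical, and the parser assumes the usual tab-separated objdump -d layout (address and colon, raw bytes, then mnemonic and operands).

```python
import subprocess


def find_instructions(disassembly, mnemonic):
    """Return (address, text) for each instruction line whose first word is `mnemonic`.

    Assumes `objdump -d` style lines: "<address>:<TAB><hex bytes><TAB><mnemonic> <operands>".
    """
    hits = []
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # section headers, symbol labels, byte-only continuation lines
        address_field = fields[0].strip()
        words = fields[2].split()
        if not address_field.endswith(":") or not words:
            continue
        if words[0] == mnemonic:
            hits.append((int(address_field[:-1], 16), " ".join(words)))
    return hits


def disassemble(path):
    """Run `objdump -d` on a binary and return its output as text."""
    result = subprocess.run(
        ["objdump", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout
```

Each address found this way can then be passed to gdb with break *ADDRESS. For a position-independent executable, the addresses objdump prints are relative to the load base, so they would need adjusting once the program is running.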
No, this is not possible and it would also be very inefficient to implement.
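Debuggers typically support two kinds of breakpoints: hardware breakpoints, where the CPU itself raises an exception on a configured event, and software breakpoints, where the debugger replaces the instruction at a known address with a trap instruction (int 3 / 0xcc on the x86 architecture). Matching the current instruction's opcode would either require CPU support to insert a hardware breakpoint, or the debugger would need to know the address to use a software breakpoint.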
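To make it concrete why a software breakpoint needs an address, here is a toy model of the mechanism. It is only a sketch: a bytearray stands in for the program's code memory, and the class name SoftwareBreakpoints is hypothetical, not anything gdb exposes.

```python
INT3 = 0xCC  # one-byte x86 trap instruction used for software breakpoints


class SoftwareBreakpoints:
    """Toy model: patch a trap byte into 'memory' at a known address."""

    def __init__(self, memory):
        self.memory = memory  # bytearray standing in for code memory
        self.saved = {}       # address -> original byte replaced by the trap

    def insert(self, address):
        # Remember the original byte, then overwrite it with the trap.
        if address not in self.saved:
            self.saved[address] = self.memory[address]
            self.memory[address] = INT3

    def remove(self, address):
        # Restore the original byte so the instruction can execute normally.
        self.memory[address] = self.saved.pop(address)


code = bytearray(b"\x55\x48\x89\xe5\xc3")  # push rbp; mov rbp, rsp; ret
breakpoints = SoftwareBreakpoints(code)
breakpoints.insert(1)   # break on the mov at offset 1
patched = bytes(code)   # the mov's first byte is now 0xcc
breakpoints.remove(1)   # code is back to its original bytes
```

Nothing in this scheme can say "stop at the next mov": the trap has to be written at a specific location before execution reaches it.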
In theory, the debugger could just search the entire memory for the instruction's byte sequence, but since the byte sequence could also occur in the middle of an instruction or in data, it may get false positives.
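A small illustration of the false-positive problem, as a sketch with a hand-assembled six-byte x86 snippet (the helper name find_byte_pattern is hypothetical). The byte 0xc3 encodes ret, but it also appears inside the immediate operand of the preceding mov:

```python
def find_byte_pattern(memory, pattern):
    """Return every offset at which `pattern` occurs in `memory`."""
    offsets = []
    start = memory.find(pattern)
    while start != -1:
        offsets.append(start)
        start = memory.find(pattern, start + 1)
    return offsets


# mov eax, 0xc3 ; ret   (x86 encoding: b8 c3 00 00 00 | c3)
CODE = bytes.fromhex("b8c3000000c3")
RET = bytes.fromhex("c3")

candidates = find_byte_pattern(CODE, RET)
print(candidates)  # [1, 5]: offset 1 lies inside the mov immediate, only 5 is a real ret
```

A trap written at offset 1 would corrupt the mov's operand rather than stop at a ret.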
Since instructions can be variable-length, control can jump to any arbitrary address, and code can modify itself, it's also not trivial to disassemble an entire region of memory to find some particular instruction.
So basically, the only way of reliably finding the instruction in arbitrary assembly code would be by single-stepping on the instruction level. This would be extremely expensive: even a trivial library call such as printf() could take minutes on today's hardware if you single-step every instruction.