Keywords: Linux disassembly | objdump tool | GDB debugging | binary analysis | assembly code
Abstract: This technical paper provides an in-depth exploration of binary executable disassembly techniques in Linux systems, focusing on the objdump tool and its output analysis while comparing GDB's disassembly capabilities. Through detailed code examples and step-by-step explanations, readers will gain practical understanding of disassembly processes and their applications in program analysis and reverse engineering.
Fundamentals of Binary Disassembly
In Linux environments, disassembly is the process of converting a compiled binary executable back into assembly code. This technique is crucial for program analysis, debugging, and reverse engineering. Unlike source code, binary files contain machine instructions, and disassembly restores them to a human-readable assembly language representation. It does not recover the original source: variable names, comments, and high-level structure are generally lost during compilation.
Using objdump for Disassembly
The objdump utility from the GNU toolchain is a standard tool for disassembly. As part of the binutils package, it displays information about object files and executables. The -d or --disassemble option requests disassembly:
$ objdump -d /path/to/binary
This command disassembles the sections that are expected to contain instructions, such as .text; the -D (--disassemble-all) option disassembles every section instead. Each output line typically includes an address, the opcode bytes, and the corresponding assembly instruction:
080483b4 <main>:
 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483b8:   83 e4 f0                and    $0xfffffff0,%esp
 80483bb:   ff 71 fc                pushl  -0x4(%ecx)
 80483be:   55                      push   %ebp
 80483bf:   89 e5                   mov    %esp,%ebp
 80483c1:   51                      push   %ecx
 80483c2:   b8 00 00 00 00          mov    $0x0,%eax
 80483c7:   59                      pop    %ecx
 80483c8:   5d                      pop    %ebp
 80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483cc:   c3                      ret
 80483cd:   90                      nop
 80483ce:   90                      nop
 80483cf:   90                      nop
This listing shows the complete assembly code for the main function in AT&T syntax, followed by nop padding after the ret. Each line gives the virtual address of the instruction, its machine code bytes, and the corresponding assembly mnemonic with operands. For instance, the instruction at address 0x80483b4, lea 0x4(%esp),%ecx, computes ESP + 4 and stores the result in the ECX register; despite the memory-operand notation, lea performs no memory access.
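The three fields described above can be pulled apart programmatically. The following is a minimal Python sketch, assuming the tab-separated layout that objdump -d normally emits (address, tab, opcode bytes, tab, instruction text). The names Insn and parse_insn are hypothetical helpers introduced for illustration, not part of any tool.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    address: int   # virtual address of the instruction
    raw: bytes     # opcode bytes as they appear in the binary
    text: str      # mnemonic and operands


def parse_insn(line: str) -> Optional[Insn]:
    """Parse one instruction line of `objdump -d` output.

    Assumes the fields are separated by tabs. Returns None for lines
    that are not instruction lines (symbol headers, blank lines, ...).
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 2 or not parts[0].strip().endswith(":"):
        return None
    try:
        address = int(parts[0].strip()[:-1], 16)
        raw = bytes.fromhex(parts[1])
    except ValueError:
        return None
    # Collapse the column padding between mnemonic and operands.
    text = " ".join(parts[2].split()) if len(parts) > 2 else ""
    return Insn(address, raw, text)


# First instruction of the listing, with the tabs restored.
SAMPLE = " 80483b4:\t8d 4c 24 04          \tlea    0x4(%esp),%ecx"

insn = parse_insn(SAMPLE)
print(hex(insn.address), insn.raw.hex(), insn.text)
# prints: 0x80483b4 8d4c2404 lea 0x4(%esp),%ecx
```

Symbol header lines such as "080483b4 <main>:" contain no tab, so the sketch returns None for them and they can be filtered out.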
GDB as a Disassembly Tool
Besides objdump, the GNU Debugger (GDB) also provides powerful disassembly capabilities. GDB's advantage lies in its ability to inspect program state interactively at runtime:
$ gdb -q ./a.out
(gdb) info functions
(gdb) disassemble main
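The info functions command lists the functions known to GDB: with debugging information it shows their signatures grouped by source file, and for symbols without debugging information it shows names and addresses. The disassemble command displays the assembly code for a specified function. When the program contains debugging information, the /m modifier mixes source lines with the assembly code (newer GDB releases recommend /s, which handles inlined and reordered code better):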
(gdb) disassemble /m main
Dump of assembler code for function main:
9       {
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp
10        int x = fce ();
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
11        return x;
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
12      }
   0x000000000040050e <+19>:    leaveq
   0x000000000040050f <+20>:    retq
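In this listing, each <+N> value is the byte offset of the instruction from the start of the function, so subtracting the offset from the address always yields the same function start. The Python sketch below checks this on the listing above; parse_gdb_line and function_start are hypothetical helper names, and the regular expression is an assumption about the line layout rather than a documented GDB format.

```python
import re

# Matches lines such as "   0x00000000004004fb <+0>:  push   %rbp".
# An optional "=>" marker (current instruction) is tolerated.
_GDB_LINE = re.compile(r"^\s*(?:=>\s*)?(0x[0-9a-f]+)\s+<\+(\d+)>:\s*(.*)$")


def parse_gdb_line(line):
    """Return (address, offset, text) or None for source/header lines."""
    m = _GDB_LINE.match(line)
    if m is None:
        return None
    text = " ".join(m.group(3).split())
    return int(m.group(1), 16), int(m.group(2)), text


def function_start(lines):
    """Derive the function's start address as address - offset."""
    starts = set()
    for line in lines:
        parsed = parse_gdb_line(line)
        if parsed is not None:
            address, offset, _ = parsed
            starts.add(address - offset)
    if len(starts) != 1:
        raise ValueError("offsets are inconsistent or listing is empty")
    return starts.pop()


LISTING = """\
Dump of assembler code for function main:
9 {
0x00000000004004fb <+0>: push %rbp
0x00000000004004fc <+1>: mov %rsp,%rbp
0x00000000004004ff <+4>: sub $0x10,%rsp
10 int x = fce ();
0x0000000000400503 <+8>: callq 0x4004f0 <fce>
0x0000000000400508 <+13>: mov %eax,-0x4(%rbp)
11 return x;
0x000000000040050b <+16>: mov -0x4(%rbp),%eax
12 }
0x000000000040050e <+19>: leaveq
0x000000000040050f <+20>: retq
""".splitlines()

print(hex(function_start(LISTING)))  # prints: 0x4004fb
```

Source lines such as "10 int x = fce ();" do not match the pattern and are skipped, which is what lets the same loop handle both plain and mixed listings.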
Tool Comparison and Selection Guidelines
objdump and GDB each have distinct advantages in disassembly functionality. objdump is suitable for quickly obtaining complete disassembly output, while GDB is better for interactive analysis and debugging scenarios. For most static analysis needs, objdump -d is the most straightforward and effective choice. When runtime information or source code context is required, GDB's mixed disassembly mode proves more practical.
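Both tools can also be driven non-interactively, which is convenient when disassembly is one step in a larger script. The following Python sketch builds the two command lines and runs them if the tool is installed. The helper names objdump_cmd, gdb_cmd, and run_tool are hypothetical; the GDB options -batch and -ex are its standard command-line options for running commands without an interactive session.

```python
import shutil
import subprocess


def objdump_cmd(binary):
    """Static disassembly of the code sections of `binary`."""
    return ["objdump", "-d", binary]


def gdb_cmd(binary, function="main"):
    """Mixed source/assembly listing of one function via GDB batch mode."""
    return ["gdb", "-q", "-batch", "-ex", f"disassemble /m {function}", binary]


def run_tool(cmd):
    """Run `cmd` and return its stdout, or None if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

For example, run_tool(objdump_cmd("./a.out")) would return the full static listing as a string, while run_tool(gdb_cmd("./a.out")) would return only the listing for main.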
Key Points in Disassembly Result Analysis
Understanding disassembly output requires attention to several key elements: instruction addresses reflect the program's memory layout; opcode bytes show actual machine instructions; function calls and return instructions reveal the program's control flow. By analyzing these components, one can reconstruct the program's logical structure and execution flow.
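Two of these elements can be checked mechanically. In a linear listing, each address should equal the previous address plus the number of opcode bytes, and control flow can be located by looking for call, return, and jump mnemonics. The Python sketch below applies both checks to the main function from the objdump example; is_contiguous and control_flow are hypothetical helpers, and the prefix-based mnemonic test is a rough heuristic for x86 AT&T syntax, not a complete classifier.

```python
# (address, opcode bytes, instruction text) taken from the objdump listing.
MAIN = [
    (0x80483B4, "8d 4c 24 04", "lea 0x4(%esp),%ecx"),
    (0x80483B8, "83 e4 f0", "and $0xfffffff0,%esp"),
    (0x80483BB, "ff 71 fc", "pushl -0x4(%ecx)"),
    (0x80483BE, "55", "push %ebp"),
    (0x80483BF, "89 e5", "mov %esp,%ebp"),
    (0x80483C1, "51", "push %ecx"),
    (0x80483C2, "b8 00 00 00 00", "mov $0x0,%eax"),
    (0x80483C7, "59", "pop %ecx"),
    (0x80483C8, "5d", "pop %ebp"),
    (0x80483C9, "8d 61 fc", "lea -0x4(%ecx),%esp"),
    (0x80483CC, "c3", "ret"),
]


def is_contiguous(insns):
    """True if every address equals the previous address plus its length."""
    for (addr, raw, _), (next_addr, _, _) in zip(insns, insns[1:]):
        if addr + len(bytes.fromhex(raw)) != next_addr:
            return False
    return True


def control_flow(insns):
    """Return (address, text) for calls, returns, and jumps.

    Heuristic: on x86, mnemonics starting with "call", "ret", or "j"
    transfer control. Far variants such as lcall/ljmp are not covered.
    """
    found = []
    for addr, _, text in insns:
        words = text.split()
        if words and words[0].startswith(("call", "ret", "j")):
            found.append((addr, text))
    return found


print(is_contiguous(MAIN))  # prints: True
print([(hex(a), t) for a, t in control_flow(MAIN)])
# prints: [('0x80483cc', 'ret')]
```

The only control-flow instruction in this function is the final ret, which matches the source: main calls nothing and simply returns 0.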