Comprehensive Guide to Binary Executable Disassembly in Linux

Keywords: Linux disassembly | objdump tool | GDB debugging | binary analysis | assembly code

Abstract: This technical paper provides an in-depth exploration of binary executable disassembly techniques in Linux systems, focusing on the objdump tool and its output analysis while comparing GDB's disassembly capabilities. Through detailed code examples and step-by-step explanations, readers will gain practical understanding of disassembly processes and their applications in program analysis and reverse engineering.

Fundamentals of Binary Disassembly

In Linux environments, disassembly is the process of converting compiled binary executable files back into assembly code. This technique is crucial for program analysis, debugging, and reverse engineering. Unlike source code, binary files contain machine instructions, and disassembly restores them to human-readable assembly language representations.

Using objdump for Disassembly

The objdump utility from the GNU toolchain is the primary tool for disassembly. As part of the binutils package, it specializes in analyzing object files and executables. The -d or --disassemble option initiates the disassembly process:

$ objdump -d /path/to/binary

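This command disassembles every section that is expected to contain instructions (typically .text, .init, .fini, and .plt) and skips the data sections; the -D or --disassemble-all option disassembles all sections instead. The output format typically includes addresses, opcode bytes, and corresponding assembly instructions: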

080483b4 <main>:
 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483b8:   83 e4 f0                and    $0xfffffff0,%esp
 80483bb:   ff 71 fc                pushl  -0x4(%ecx)
 80483be:   55                      push   %ebp
 80483bf:   89 e5                   mov    %esp,%ebp
 80483c1:   51                      push   %ecx
 80483c2:   b8 00 00 00 00          mov    $0x0,%eax
 80483c7:   59                      pop    %ecx
 80483c8:   5d                      pop    %ebp
 80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483cc:   c3                      ret    
 80483cd:   90                      nop
 80483ce:   90                      nop
 80483cf:   90                      nop

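In this example, we can see the assembly code for the main function of a 32-bit x86 program, printed in AT&T syntax (source operand first, destination operand last). Each line displays the virtual address of the instruction, its machine code bytes, and the corresponding mnemonic and operands. For instance, the instruction at address 0x80483b4, lea 0x4(%esp),%ecx, computes the stack pointer plus 4 and stores that address in the ECX register without reading memory. The nop instructions that follow ret are most likely alignment padding rather than part of the function's logic.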
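
The three-column layout described above lends itself to simple scripting. The following is a minimal sketch in Python, assuming the default objdump -d line layout shown in the listing; the names Instruction and parse_objdump_line are hypothetical helpers introduced here for illustration and are not part of objdump or binutils.

```python
import re
from typing import NamedTuple, Optional


class Instruction(NamedTuple):
    """One disassembled instruction (hypothetical helper type)."""
    address: int   # virtual address of the instruction
    raw: bytes     # encoded machine code bytes
    text: str      # mnemonic and operands, whitespace-normalized


# Assumed layout: hex address, colon, space-separated hex byte pairs,
# then the instruction text.
_LINE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)\s*(.*)$")


def parse_objdump_line(line: str) -> Optional[Instruction]:
    """Parse one instruction line; return None for labels and other lines."""
    match = _LINE.match(line)
    if match is None:
        return None
    address, raw, text = match.groups()
    return Instruction(
        address=int(address, 16),
        raw=bytes.fromhex(raw),
        text=" ".join(text.split()),
    )


if __name__ == "__main__":
    sample = " 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx"
    print(parse_objdump_line(sample))
```

Function labels such as 080483b4 <main>: do not match the pattern and yield None, so a caller can filter them out or handle them separately.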

GDB as a Disassembly Tool

Besides objdump, the GNU Debugger (GDB) also provides powerful disassembly capabilities. GDB's advantage lies in its ability to inspect program state interactively at run time:

$ gdb -q ./a.out 
(gdb) info functions 
(gdb) disassemble main

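The info functions command lists the functions known to GDB: signatures grouped by source file for functions with debugging information, and addresses and names for the remaining symbols. The disassemble command displays the assembly code for a specified function. When the program contains debugging information, the disassemble /m option mixes source lines with assembly code (newer GDB releases also provide /s, which the GDB manual recommends over /m):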

(gdb) disassemble /m main
Dump of assembler code for function main:
9       {
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp

10        int x = fce ();
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)

11        return x;
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax

12      }
   0x000000000040050e <+19>:    leaveq 
   0x000000000040050f <+20>:    retq
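
The <+N> markers in this listing are byte offsets from the start of the function, so the distance between consecutive offsets gives the encoded length of each instruction. The following is a minimal sketch in Python, assuming the GDB output layout shown above; parse_gdb_line and instruction_sizes are hypothetical helper names, not part of GDB or its Python API.

```python
import re

# Assumed layout: optional "=>" current-instruction marker, address,
# "<+offset>:", then the instruction text.
_GDB_LINE = re.compile(r"^\s*(?:=>\s*)?(0x[0-9a-f]+)\s+<\+(\d+)>:\s+(.*)$")


def parse_gdb_line(line):
    """Return (address, offset, text), or None for source and header lines."""
    match = _GDB_LINE.match(line)
    if match is None:
        return None
    address, offset, text = match.groups()
    return int(address, 16), int(offset), " ".join(text.split())


def instruction_sizes(lines):
    """Derive instruction lengths in bytes from consecutive <+N> offsets.

    The last instruction has no successor, so its size is not reported.
    Offsets are sorted first because source-centric output is not
    guaranteed to be in address order.
    """
    parsed = [p for p in map(parse_gdb_line, lines) if p is not None]
    offsets = sorted(offset for _, offset, _ in parsed)
    return [b - a for a, b in zip(offsets, offsets[1:])]


if __name__ == "__main__":
    listing = [
        "9       {",
        "   0x00000000004004fb <+0>:     push   %rbp",
        "   0x00000000004004fc <+1>:     mov    %rsp,%rbp",
        "   0x00000000004004ff <+4>:     sub    $0x10,%rsp",
    ]
    print(instruction_sizes(listing))
```

Source lines such as 10        int x = fce (); are skipped because they do not begin with an address.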

Tool Comparison and Selection Guidelines

objdump and GDB each have distinct advantages in disassembly functionality. objdump is suitable for quickly obtaining complete disassembly output, while GDB is better for interactive analysis and debugging scenarios. For most static analysis needs, objdump -d is the most straightforward and effective choice. When runtime information or source code context is required, GDB's mixed disassembly mode proves more practical.
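
Both tools can also be driven from a script when the same analysis must be repeated over many binaries. The following is a minimal sketch in Python, assuming objdump and gdb are installed and on the PATH; the function names are hypothetical. It relies on objdump's -M intel option (x86 only) and on GDB's -batch and -ex options to run a single command non-interactively.

```python
import subprocess


def objdump_command(path, intel_syntax=False):
    """Build the argument list for a static disassembly with objdump."""
    cmd = ["objdump", "-d"]
    if intel_syntax:
        # -M intel switches x86 output from AT&T to Intel syntax.
        cmd += ["-M", "intel"]
    cmd.append(path)
    return cmd


def gdb_disassemble_command(path, function="main"):
    """Build the argument list for a one-shot, non-interactive GDB run."""
    return ["gdb", "-q", "-batch", "-ex", f"disassemble {function}", path]


def run(cmd):
    """Run a command and return its standard output as text.

    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the commands; pass them to run() on a system that has
    # the tools and the binary available.
    print(objdump_command("/path/to/binary"))
    print(gdb_disassemble_command("./a.out"))
```

Building the argument list separately from running it keeps the commands easy to inspect and avoids shell quoting issues.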

Key Points in Disassembly Result Analysis

Understanding disassembly output requires attention to several key elements: instruction addresses reflect the program's memory layout; opcode bytes show actual machine instructions; function calls and return instructions reveal the program's control flow. By analyzing these components, one can reconstruct the program's logical structure and execution flow.
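
As an illustration of reading control flow from a listing, the following minimal Python sketch classifies x86 instructions by mnemonic and collects the symbolic targets of direct calls. It assumes AT&T-syntax instruction text as printed by objdump or GDB; classify and call_targets are hypothetical helper names, and the prefix test is a simplification that ignores less common branch instructions such as loop.

```python
import re

# Symbol annotation printed after a direct call target,
# e.g. "<fce>" or "<main+0x10>".
_SYMBOL = re.compile(r"<([^>+]+)(?:\+0x[0-9a-f]+)?>")


def classify(text):
    """Classify one x86 instruction by its mnemonic (simplified)."""
    parts = text.split()
    mnemonic = parts[0] if parts else ""
    if mnemonic.startswith("call"):
        return "call"
    if mnemonic.startswith("ret"):
        return "return"
    if mnemonic.startswith("j"):
        # jmp and all conditional jumps (je, jne, ...) start with "j".
        return "jump"
    return "other"


def call_targets(instructions):
    """Return the symbol names of direct calls, in listing order.

    Indirect calls (for example "call *%eax") carry no symbol annotation
    and are skipped.
    """
    targets = []
    for text in instructions:
        if classify(text) != "call":
            continue
        match = _SYMBOL.search(text)
        if match is not None:
            targets.append(match.group(1))
    return targets


if __name__ == "__main__":
    body = ["push %rbp", "callq 0x4004f0 <fce>", "leaveq", "retq"]
    print([classify(t) for t in body])
    print(call_targets(body))
```

Applied to every function in a binary, the collected call targets give a rough static call graph, which is often a useful first step in reconstructing a program's structure.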

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.
