Keywords: GCC optimization | symbol removal | embedded development
Abstract: This paper analyzes techniques for removing unused C/C++ symbols in ARM embedded development using the GCC compiler and the ld linker. It first examines why unused symbols are not stripped by the default compile and link steps, then explains how the -fdata-sections and -ffunction-sections compiler options work together with the --gc-sections linker option. Code examples and build commands show how to integrate these options into an existing workflow, and the paper discusses the additional effect of the -Os optimization level on code size. Finally, it compares these options with supplementary strategies, offering practical guidance for embedded developers who need smaller executables.
Problem Context and Challenges
In ARM embedded system development, executable file size directly impacts device loading performance and storage resource utilization. A common challenge developers face is that standard GCC compilation and ld linking processes do not automatically remove redundant function and data symbols, even when they are never referenced in the code. This results in executables containing unnecessary code and data sections, wasting valuable storage space.
Technical Principle Analysis
The primary reason GCC and ld retain all symbols by default lies in the traditional compilation model. Within a translation unit, all functions and data are normally grouped into a few large sections, such as .text for code, .data for initialized data, and .bss for zero-initialized data. The linker treats a section as its smallest unit and cannot discard part of one, so if anything in a large section is referenced, the entire section is kept, including every unused function or variable it contains.
To address this issue, the code organization must be modified so that each function and data object resides in its own independent section. This enables the linker to perform precise garbage collection based on reference relationships, removing sections that are never referenced.
Core Optimization Techniques
Compiler Option Configuration
GCC provides two key options for fine-grained section separation:
- -ffunction-sections: Places each function in its own section, following naming conventions like .text.function_name.
- -fdata-sections: Places each global or static data object in its own section, following conventions like .data.variable_name or .bss.variable_name.
Consider the following example code:
// example.cpp
void used_function() {
    // Actually called function
}
void unused_function() {
    // Never called function
}
int used_variable = 42;
int unused_variable = 100;
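When compiled with -ffunction-sections -fdata-sections, the compiler emits one section per symbol, four in total: two code sections for used_function and unused_function, plus .data.used_variable and .data.unused_variable for the variables. Because example.cpp is compiled as C++, the function sections carry the mangled names (.text._Z13used_functionv and .text._Z15unused_functionv with GCC); in a C file they would be named .text.used_function and .text.unused_function.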
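To make the section layout easier to check by hand, here is a slightly fuller variant of the example. It is only a sketch: the file name sections_demo.cpp and the function app_entry are hypothetical, and the functions return values (unlike the original) so that their behavior can be observed. The section names in the comments are what GCC is expected to produce for C++ code, not verified output.

```cpp
// sections_demo.cpp (hypothetical file name)
// Compile only: g++ -Os -ffunction-sections -fdata-sections -c sections_demo.cpp
// List the resulting sections with: objdump -h sections_demo.o

int used_variable = 42;     // expected section: .data.used_variable
int unused_variable = 100;  // expected section: .data.unused_variable

// Expected section: .text._Z13used_functionv (C++ mangled name).
int used_function() {
    return used_variable + 1;
}

// Expected section: .text._Z15unused_functionv. Nothing references it,
// so the linker can drop it together with unused_variable.
int unused_function() {
    return unused_variable * 2;
}

// Hypothetical stand-in for the code that main() would reach.
int app_entry() {
    return used_function();
}
```

In a real project, main (or the startup code) would call app_entry, which keeps used_function and used_variable reachable while the other two sections remain unreferenced.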
Linker Garbage Collection Mechanism
The --gc-sections option of the ld linker enables section-level garbage collection. The linker builds a reference graph from the relocations in each section and marks every section that is reachable, directly or indirectly, from its roots: the entry symbol (set with ENTRY in the linker script or -e on the command line, typically a startup routine that eventually calls main) and any sections the linker script marks with KEEP(). Sections that are never reached are dropped from the final executable.
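The reachability idea can be sketched as a small graph traversal. This is a toy model, not how ld is implemented: the type SectionGraph, the functions live_sections and example_graph, and the .text.main node are all assumptions made for illustration, and the section names are simplified (unmangled).

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Toy model: each section maps to the sections it references.
using SectionGraph = std::map<std::string, std::vector<std::string>>;

// Returns every section reachable from the given root sections.
std::set<std::string> live_sections(const SectionGraph& graph,
                                    const std::vector<std::string>& roots) {
    std::set<std::string> live;
    std::queue<std::string> pending;
    for (const auto& root : roots) {
        if (live.insert(root).second) pending.push(root);
    }
    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();
        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // unknown section: no outgoing edges
        for (const auto& target : it->second) {
            // insert() reports whether the section was newly marked live
            if (live.insert(target).second) pending.push(target);
        }
    }
    return live;
}

// Graph mirroring example.cpp, plus a hypothetical main that calls
// used_function, which in turn reads used_variable.
SectionGraph example_graph() {
    return {
        {".text.main", {".text.used_function"}},
        {".text.used_function", {".data.used_variable"}},
        {".text.unused_function", {".data.unused_variable"}},
        {".data.used_variable", {}},
        {".data.unused_variable", {}},
    };
}
```

Calling live_sections(example_graph(), {".text.main"}) marks three sections as live. Note that .data.unused_variable is discarded even though .text.unused_function references it, because the referencing section is itself unreachable.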
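The link step requires -Wl,--gc-sections. The -Wl, prefix tells the GCC driver to pass the comma-separated option that follows it to the linker. Linking this single file into an executable also requires a main function, which example.cpp omits for brevity. The complete build command is: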
g++ -Os -fdata-sections -ffunction-sections example.cpp -o example -Wl,--gc-sections
Synergistic Effect of Optimization Levels
The -Os option directs GCC to optimize for code size. It enables the -O2 optimizations except those that often increase code size. The two mechanisms complement each other: -Os shrinks the code inside each section that is kept, while section garbage collection removes whole sections that nothing references. Optimization can also eliminate references (for example, by removing dead code that called a function), which gives the linker more sections to discard.
Practical Application and Effectiveness Evaluation
In real-world embedded projects, this technique often yields a noticeable size reduction, with the gain depending mainly on how much unused library code would otherwise be linked in. For example, in a medium-sized ARM project involving multiple libraries, applying these techniques reduced the executable from 2 MB to 1.5 MB, a 25% saving that also shortens load time.
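The saving quoted above is simply the size difference relative to the original size. A minimal helper for this arithmetic, with the hypothetical name percent_saved, might look like this:

```cpp
// Returns the size reduction as a percentage of the original size.
// Assumes before_bytes is greater than zero.
double percent_saved(double before_bytes, double after_bytes) {
    return (before_bytes - after_bytes) / before_bytes * 100.0;
}
```

For the figures in the example, percent_saved(2.0 * 1024 * 1024, 1.5 * 1024 * 1024) evaluates to 25.0. The same helper can be applied to the sizes of an image built with and without the options when evaluating them on a given project.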
It should be noted that this optimization may slightly increase compilation time, as the compiler needs to generate independent sections for each symbol, and the linker must process more section information. However, in storage-constrained embedded environments, this trade-off is usually worthwhile.
Technical Limitations and Alternative Approaches
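While the combination of -ffunction-sections, -fdata-sections, and --gc-sections is highly effective in most cases, certain scenarios need care. Functions called through function pointers are normally safe, because taking a function's address creates a reference the linker can see. The risk lies with code and data that nothing references symbolically, such as interrupt vector tables or sections located only through linker-script symbols; these are discarded unless the linker script protects them with KEEP(). Conversely, some unused code may still be retained, for example compiler support routines or everything that shares a section with a referenced symbol in libraries built without these options. Finally, the options are not supported on every target, and older GCC versions may not fully support them.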
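As a note on the function-pointer case, the following sketch uses a hypothetical handler table (the names Handler, kHandlers, dispatch, and the handle_* functions are all made up for illustration). Storing a function's address anywhere in reachable code or data creates a reference the linker can follow, so handlers in the table are kept; a function whose address is never taken and that is never called remains collectible.

```cpp
#include <cstddef>

using Handler = int (*)();

int handle_start() { return 1; }
int handle_stop()  { return 2; }
int handle_debug() { return 3; }  // not in the table, never called: collectible

// Taking the addresses here makes handle_start and handle_stop referenced,
// so --gc-sections keeps their sections as long as dispatch is reachable.
constexpr Handler kHandlers[] = { handle_start, handle_stop };
constexpr std::size_t kHandlerCount = sizeof(kHandlers) / sizeof(kHandlers[0]);

// Indirect call through the table; returns -1 for an out-of-range index.
int dispatch(std::size_t index) {
    return index < kHandlerCount ? kHandlers[index]() : -1;
}
```

If a symbol must survive even though no code refers to it, marking its input section with KEEP() in the linker script is the usual approach.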
As supplementary measures, developers can consider the following optimization strategies:
- Using arm-strip --strip-unneeded for post-processing to remove debugging information and partial symbol tables.
- Manually analyzing code dependencies and refactoring modules to reduce unnecessary dependencies.
- Considering link-time optimization (LTO) techniques for cross-module optimizations during the linking phase.
Conclusion
By properly configuring GCC's -fdata-sections and -ffunction-sections options and combining them with ld's --gc-sections garbage collection mechanism, developers can effectively remove unused symbols from C/C++ projects, significantly reducing executable size in ARM embedded systems. This optimization technique not only enhances device loading performance but also optimizes storage resource utilization, representing an essential skill for embedded developers.