Keywords: rsync | file filtering | include option
Abstract: This paper thoroughly examines the working principle of the --include option in rsync commands, revealing its collaborative filtering mechanism with the --exclude option. By analyzing common error cases, it explains how to correctly combine include/exclude patterns to copy only specific file types (e.g., *.sh script files), providing optimized solutions for different rsync versions and directory handling techniques.
Fundamental Principles of rsync Filtering
rsync, a widely used file synchronization tool, filters files by matching them against include/exclude patterns. Beginners often treat --include as a standalone selector, but it only acts as an exception to exclusion rules. Used on its own, --include="*.sh" changes nothing: rsync still transfers every file, because files that match no exclude rule are included by default.
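The default-include behavior is easy to confirm in a throwaway directory. The following is a minimal sketch, assuming rsync and mktemp are available; the tree under $work and its file names are made up for illustration.

```shell
#!/bin/sh
# Illustrative sandbox; the directory layout and file names are hypothetical.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping demo" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/sub" "$work/dst"
touch "$work/src/run.sh" "$work/src/notes.txt" \
      "$work/src/sub/build.sh" "$work/src/sub/data.csv"

# --include on its own excludes nothing, so all four files are transferred.
rsync -a --include="*.sh" "$work/src/" "$work/dst/"

copied=$(cd "$work/dst" && find . -type f | sort)
printf '%s\n' "$copied"
```

The listing should contain notes.txt and data.csv alongside the two scripts, which shows that the include rule alone selected nothing.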
Error Case Analysis and Correction
The issue with the original script rsync -zarv --include="*.sh" $from $to lies in incomplete logic: it names an exception without any exclusion for it to be an exception to. Because rsync acts on the first rule that matches, the correct approach is three rules in this order:
- Include all directories so rsync can descend into them: --include="*/"
- Include the target files: --include="*.sh"
- Exclude everything else: --exclude="*"
The complete command is: rsync -zarv --include="*/" --include="*.sh" --exclude="*" "$from" "$to"
Version Adaptation and Optimization
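Although this ordering is often described as a requirement of rsync 3.0.6 and later, it follows from first-match-wins evaluation: the catch-all --exclude="*" has to come after every include rule, because no rule placed after it can ever match. The command therefore reads: rsync -zarv --include="*/" --include="*.sh" --exclude="*" "$from" "$to". Adding the -m flag (short for --prune-empty-dirs) prevents creating empty directories: rsync -zarvm --include="*/" --include="*.sh" --exclude="*" "$from" "$to".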
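To see why the position of --exclude="*" matters, the sketch below runs both orderings against the same throwaway tree and counts the files that arrive. It assumes rsync and mktemp are installed; the directory names wrong and right and the sample files are hypothetical.

```shell
#!/bin/sh
# Illustrative sandbox; the directory layout and file names are hypothetical.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping demo" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/sub" "$work/wrong" "$work/right"
touch "$work/src/run.sh" "$work/src/notes.txt" "$work/src/sub/build.sh"

# Catch-all exclude placed before the *.sh rule: every file hits "*" first.
rsync -a --include="*/" --exclude="*" --include="*.sh" "$work/src/" "$work/wrong/"

# Catch-all exclude placed last: *.sh files reach their include rule first.
rsync -a --include="*/" --include="*.sh" --exclude="*" "$work/src/" "$work/right/"

wrong_count=$(find "$work/wrong" -type f | wc -l | tr -d '[:space:]')
right_count=$(find "$work/right" -type f | wc -l | tr -d '[:space:]')
echo "exclude first: $wrong_count files; exclude last: $right_count files"
```

With the exclude rule in the middle, only the directory skeleton is copied; with it last, the two scripts arrive and notes.txt is left behind.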
Advanced Directory Handling
To drop directories that contain no target files, use --prune-empty-dirs, the long form of -m: rsync -zarv --prune-empty-dirs --include="*/" --include="*.sh" --exclude="*" "$from" "$to". Only directories that contain at least one *.sh file, directly or in a subdirectory, are then created at the destination.
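The effect of pruning can be checked by copying the same tree with and without the option. This is a sketch under the assumption that rsync and mktemp are available; the plain and pruned destinations and the docs directory are invented for the example.

```shell
#!/bin/sh
# Illustrative sandbox; the directory layout and file names are hypothetical.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping demo" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/sub" "$work/src/docs" "$work/plain" "$work/pruned"
touch "$work/src/sub/build.sh" "$work/src/docs/readme.md"

# Without pruning, docs/ is created even though nothing inside it is copied.
rsync -a --include="*/" --include="*.sh" --exclude="*" \
      "$work/src/" "$work/plain/"

# With --prune-empty-dirs (-m), docs/ is dropped from the transfer.
rsync -a --prune-empty-dirs --include="*/" --include="*.sh" --exclude="*" \
      "$work/src/" "$work/pruned/"

echo "without pruning:"; (cd "$work/plain" && find . -type d | sort)
echo "with pruning:";    (cd "$work/pruned" && find . -type d | sort)
```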
Pattern Matching Rules Explained
rsync's filtering patterns follow a specific matching order:
- Patterns are evaluated sequentially as they appear in the command line
- The first matching pattern determines the file's fate
- Files not matching any pattern are included by default
Therefore, --include="*/" and --include="*.sh" must both precede --exclude="*": the first keeps directories traversable, and the second lets the target files match before the catch-all exclusion does.
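The three rules above can be modeled in a few lines of shell. The function below is a deliberately simplified sketch written for this article, not rsync's actual matcher: the name decide and the "+PATTERN" / "-PATTERN" rule notation are assumptions, and only basename globs and a trailing "/" for directory-only rules are handled.

```shell
#!/bin/sh
# Simplified model of first-match-wins filtering. This is NOT rsync's matcher:
# it only understands basename globs plus a trailing "/" for directory-only rules.
# Usage: decide NAME RULE...  (RULE is "+PATTERN" to include, "-PATTERN" to
# exclude; a NAME ending in "/" denotes a directory). The name is hypothetical.
decide() {
  name=$1; shift
  case $name in
    */) is_dir=1; name=${name%/} ;;
    *)  is_dir=0 ;;
  esac
  base=${name##*/}
  for rule in "$@"; do
    case $rule in
      +*) action=include ;;
      *)  action=exclude ;;
    esac
    pattern=${rule#?}
    case $pattern in
      */) [ "$is_dir" -eq 1 ] || continue   # directory-only rule, name is a file
          pattern=${pattern%/} ;;
    esac
    # $pattern is left unquoted on purpose so that it is treated as a glob.
    case $base in
      $pattern) echo "$action"; return 0 ;;
    esac
  done
  echo include   # no rule matched: included by default
}

decide "sub/"       "+*/" "+*.sh" "-*"   # include
decide "sub/run.sh" "+*/" "+*.sh" "-*"   # include
decide "notes.txt"  "+*/" "+*.sh" "-*"   # exclude
decide "sub/run.sh" "+*/" "-*" "+*.sh"   # exclude (catch-all came first)
decide "notes.txt"  "+*.sh"              # include (nothing excludes it)
```

The last two calls reproduce the two common mistakes: a catch-all exclusion placed too early, and an include rule with no exclusion behind it.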
Practical Recommendations and Debugging Techniques
It's recommended to use the --dry-run (-n) option to test the filter rules before transferring anything: rsync -zarvn --include="*/" --include="*.sh" --exclude="*" "$from" "$to". With -v, rsync lists the files it would transfer; raising verbosity to -vv additionally reports which files are hidden or shown by which pattern, which helps when debugging the filter logic.
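A dry run can be rehearsed safely in a scratch directory before pointing the command at real data. The sketch below assumes rsync and mktemp are installed; the sample tree is hypothetical, and the variable names report and written are chosen only for this example.

```shell
#!/bin/sh
# Illustrative sandbox; the directory layout and file names are hypothetical.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping demo" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/sub" "$work/dst"
touch "$work/src/run.sh" "$work/src/notes.txt" "$work/src/sub/build.sh"

# -n (--dry-run) reports what would be transferred without writing anything.
report=$(rsync -avn --include="*/" --include="*.sh" --exclude="*" \
               "$work/src/" "$work/dst/")
printf '%s\n' "$report"

written=$(find "$work/dst" -type f | wc -l | tr -d '[:space:]')
echo "files actually written: $written"
```

The report should name the two scripts but not notes.txt, while the destination stays empty.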