Keywords: rsync | file synchronization | command-line arguments
Abstract: This paper explores two primary methods for syncing specific file lists using rsync: direct command-line arguments and the --files-from option. By analyzing real-world user issues, it explains the workings, implicit behaviors, and best practices of --files-from. The article compares the pros and cons of both approaches, provides code examples and configuration tips, and helps readers choose the optimal sync strategy based on their needs. Key technical details such as file list formatting, path handling, and performance optimization are discussed, offering practical guidance for system administrators and developers.
Introduction
In file synchronization and backup operations, rsync is a powerful tool widely used in Unix-like systems. Users often need to sync specific file lists rather than entire directory structures, which is common in deployment, backup, and configuration management. This paper is based on a typical scenario: a user attempted to sync about 50 files across subdirectories using the --include-from option but encountered unexpected sync behavior. Through in-depth analysis, we find that the --files-from option provides a more direct and efficient solution.
Problem Background and Initial Attempt
The user initially tried to sync files with the following command:
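rsync -avP -e ssh --include-from=deploy/rsync_include.txt --exclude=* ./ root@0.0.0.0:/var/www/ --dry-run
Here, rsync_include.txt contains a newline-separated list of relative file paths. The user observed that without --exclude="*", all files were synced; with it, no files were synced. Both results follow from how rsync filter rules work. Include rules on their own restrict nothing, because anything that matches no rule is transferred by default. Once --exclude="*" is added, it also excludes the top-level directories, so rsync never descends into them and the include rules for files inside subdirectories are never tested. This shows how subtle the interaction between --include-from and --exclude can be.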
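One way to make the filter-based approach behave as intended is sketched below. It assumes rsync is installed and works on throwaway local directories; the directory and file names are hypothetical and stand in for the user's project. The additions are --include='*/', which lets rsync descend into directories before the catch-all exclude applies, and --prune-empty-dirs, which drops directories that end up empty.

```shell
# Throwaway source and destination trees (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/app/conf" "$work/src/docs" "$work/dst"
echo one   > "$work/src/app/conf/site.ini"
echo two   > "$work/src/app/index.php"
echo three > "$work/src/docs/readme.txt"

# The include list: newline-separated paths relative to the source directory.
printf '%s\n' 'app/conf/site.ini' 'docs/readme.txt' > "$work/include.txt"

# Rules are tested in order, first match wins:
#   1. --include='*/'   lets rsync descend into every directory
#   2. --include-from   admits the listed files
#   3. --exclude='*'    rejects everything else (quoted to avoid shell globbing)
# --prune-empty-dirs removes directories that end up with no files in them.
rsync -a --prune-empty-dirs \
      --include='*/' \
      --include-from="$work/include.txt" \
      --exclude='*' \
      "$work/src/" "$work/dst/"

# Show which files arrived at the destination, then clean up.
filtered=$(cd "$work/dst" && find . -type f | sort)
printf '%s\n' "$filtered"
rm -rf "$work"
```

Only the two listed files should be printed; app/index.php is rejected by the final exclude rule. The amount of filter bookkeeping needed here is the main argument for the simpler option discussed next.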
Core Solution: The --files-from Option
According to the rsync manual, the --files-from option is designed specifically to specify the exact list of files to transfer. Its workings include:
- Reading filenames from a specified file, with each path relative to the source directory.
- Implicitly enabling the --relative option to preserve path information from the file list.
- Implicitly enabling the --dirs option to automatically create necessary directories on the destination.
- The --archive option does not imply --recursive, which must be specified explicitly.
Example command:
rsync -a --files-from=/tmp/foo /usr remote:/backup
If /tmp/foo contains the string "bin", the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (with a trailing slash), the immediate contents of the directory will also be sent. In both cases, the directory's entire hierarchy is transferred only if -r is also given.
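The behavior described above can be tried locally with a short sketch. It assumes rsync is installed; the temporary directories and file names are hypothetical. The list names one file and one directory, and the two runs differ only in whether -r is given explicitly.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/app/conf" "$work/src/docs" "$work/dst1" "$work/dst2"
echo one   > "$work/src/app/conf/site.ini"
echo two   > "$work/src/app/index.php"
echo three > "$work/src/docs/readme.txt"

# One entry per line, relative to the source directory on the command line.
# "docs" names a directory, not a file.
printf '%s\n' 'app/conf/site.ini' 'docs' > "$work/files.txt"

# -a alone: the listed file arrives with its path preserved (implied --relative);
# "docs" is created (implied --dirs) but its contents are not sent.
rsync -a --files-from="$work/files.txt" "$work/src/" "$work/dst1/"
without_r=$(cd "$work/dst1" && find . -type f | sort)

# -a plus an explicit -r: the contents of "docs" are transferred as well.
rsync -a -r --files-from="$work/files.txt" "$work/src/" "$work/dst2/"
with_r=$(cd "$work/dst2" && find . -type f | sort)

printf 'without -r:\n%s\n' "$without_r"
printf 'with -r:\n%s\n' "$with_r"
rm -rf "$work"
```

Note that app/index.php is never transferred in either run, because it is not in the list and no listed entry covers it.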
Direct Command-line Argument Passing
As an alternative, file lists can be passed directly as command-line arguments:
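rsync -avP -e ssh `cat deploy/rsync_include.txt` root@0.0.0.0:/var/www/
This method assumes the file list is short enough to stay within command-line length limits and contains only real paths (no comments or regex patterns). Because the command substitution is unquoted, the shell splits its output on whitespace, so paths containing spaces or glob characters will not survive intact. Note also that, without --relative, each file is copied to the destination under its base name only, so the subdirectory structure is not preserved. Its advantage is simplicity, avoiding potential surprises from the implicit behaviors of --files-from.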
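The following sketch illustrates the command-line approach on local throwaway directories, assuming rsync is installed; all names are hypothetical. It runs the same argument list twice, once as-is and once with --relative, to show the difference in where files land.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/app/conf" "$work/src/docs" "$work/flat" "$work/tree"
echo one   > "$work/src/app/conf/site.ini"
echo three > "$work/src/docs/readme.txt"
printf '%s\n' 'app/conf/site.ini' 'docs/readme.txt' > "$work/list.txt"

# The unquoted $(cat ...) is split on whitespace by the shell, so this
# only works for paths without spaces or glob characters.
# Without --relative, each file lands directly in the destination directory.
(cd "$work/src" && rsync -a $(cat "$work/list.txt") "$work/flat/")
flat=$(cd "$work/flat" && find . -type f | sort)

# With --relative, the directory part of each argument is recreated.
(cd "$work/src" && rsync -a --relative $(cat "$work/list.txt") "$work/tree/")
tree=$(cd "$work/tree" && find . -type f | sort)

printf 'plain arguments:\n%s\n' "$flat"
printf 'with --relative:\n%s\n' "$tree"
rm -rf "$work"
```

If two listed files share a base name, the plain form would write both to the same destination path, which is another reason to add --relative when the list spans subdirectories.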
Technical Details and Best Practices
When using --files-from, note these key points:
- Sort the file list to improve rsync efficiency and avoid rescanning shared path elements.
- Leading slashes in paths are removed, and ".." references above the source directory are not allowed.
- File lists can be read from remote hosts by specifying a "host:" prefix.
- Combine --iconv with --protect-args to have the listed filenames converted between character sets when the list is sent from one host to another.
For the command-line argument method, make sure the listed paths contain no whitespace or shell metacharacters, since the unquoted list is subject to word splitting and glob expansion. In automation scripts, using --files-from is recommended for better maintainability and readability.
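The sorting and formatting advice above can be applied with standard text tools before the list is handed to rsync. The sketch below is one possible normalization step, not a required one: it drops comment and blank lines so the same list is also usable as command-line arguments, strips leading slashes (which rsync would remove anyway), and sorts the result. The file names and list contents are hypothetical.

```shell
work=$(mktemp -d)
cat > "$work/raw.txt" <<'EOF'
# hand-maintained deployment list
docs/readme.txt

app/index.php
/app/conf/site.ini
app/index.php
EOF

# 1. Drop comment lines and blank lines.
# 2. Strip leading slashes so every entry is relative to the source directory.
# 3. Sort and de-duplicate in the C locale for a stable, byte-wise order,
#    which keeps entries that share a directory prefix next to each other.
grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$work/raw.txt" \
  | sed 's|^/*||' \
  | LC_ALL=C sort -u > "$work/files.txt"

cleaned=$(cat "$work/files.txt")
printf '%s\n' "$cleaned"
rm -rf "$work"
```

The resulting file can then be passed to --files-from or expanded on the command line.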
Comparison and Selection Advice
Both methods have their strengths and weaknesses:
- --files-from is better for long file lists or scenarios requiring path structure preservation, but watch for its implicit behaviors.
- The command-line argument method is simpler, suitable for short lists and quick operations, but limited by command-line length.
In practice, choose based on file count, path complexity, and automation needs. For example, in CI/CD pipelines, --files-from might be more reliable.
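As a rough illustration of these criteria, a script could inspect the list before deciding how to pass it. The helper below is hypothetical and not part of rsync; the quarter-of-ARG_MAX threshold is an arbitrary safety margin chosen for this sketch, and 4096 is used as a fallback only if getconf is unavailable.

```shell
# Hypothetical helper: print "args" if the list looks safe to expand on the
# command line, otherwise print "files-from".
choose_method() {
    list=$1
    # Size of the list in bytes (arithmetic expansion trims any padding from wc).
    bytes=$(($(wc -c < "$list")))
    # System limit on the combined size of arguments and environment.
    limit=$(getconf ARG_MAX 2>/dev/null || echo 4096)
    # Prefer plain arguments only for small lists whose paths contain no
    # whitespace, since unquoted expansion would split such paths.
    if [ "$bytes" -lt $((limit / 4)) ] && ! grep -q '[[:space:]]' "$list"; then
        echo "args"
    else
        echo "files-from"
    fi
}
```

A deployment script might call choose_method deploy/rsync_include.txt and branch on the result, although always using --files-from is the simpler policy when the implicit --relative behavior is acceptable.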
Conclusion
By analyzing rsync's --files-from option and command-line argument methods, this paper provides a practical guide for syncing specific file lists. Understanding how these tools work and their limitations helps users avoid common pitfalls and optimize file synchronization. Whether for simple command-line use or complex automation scripts, selecting the right method enhances efficiency and reliability.