Keywords: recursive file transfer | directory exclusion | SCP alternatives
Abstract: This paper provides an in-depth exploration of technical solutions for excluding specific directories in recursive file transfer scenarios. By analyzing the limitations of the SCP command, it systematically introduces alternative methods including rsync with --exclude parameters, and find combined with tar and SSH pipelines. The article details the working principles, applicable scenarios, and implementation specifics of each approach, offering complete code examples and configuration instructions to help readers address complex file transfer requirements in practical work.
The Need for Directory Exclusion in Recursive File Transfer
In distributed system management and data synchronization tasks, recursive file transfer is a common operation. The standard SCP (Secure Copy Protocol) command supports recursive copying through the -r option, but it was designed for simple, secure file transfer and has no way to filter what is transmitted. Its limitations become apparent as soon as specific directories need to be excluded.
Analysis of SCP Limitations
SCP has no file filtering mechanism, a consequence of its design goals. Its main advantage is secure file transfer over an SSH-encrypted channel, at the cost of a relatively narrow feature set. When directories matching a pattern such as fl_* must be excluded, SCP alone cannot meet the requirement.
Detailed rsync Solution
rsync, as a more powerful file synchronization tool, provides complete filtering capabilities. The correct rsync command should include the following key parameters:
rsync -avr -e "ssh -l user" --exclude 'fl_*' ./bench* remote:/my/dir
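Parameter analysis: -a enables archive mode, which recurses into directories and preserves symlinks, permissions, timestamps, and (where possible) ownership; -v displays verbose output; -r enables recursion and is redundant here because -a already implies it; -e specifies the remote shell; --exclude defines an exclusion pattern. A common error is running rsync with neither -a nor -r, in which case it reports the directories as skipped and does not copy their contents.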
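The exclusion pattern can be checked locally before any remote host is involved. The following is a minimal sketch, assuming rsync is installed; the scratch tree (bench1, image, fl_cache) is hypothetical and exists only to illustrate the behavior.

```shell
# Build a throwaway source tree; all names here are illustrative.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/bench1/image" "$tmp/src/bench1/fl_cache" "$tmp/dst"
echo data > "$tmp/src/bench1/image/a.img"
echo junk > "$tmp/src/bench1/fl_cache/b.tmp"
cd "$tmp/src" || exit 1

# Dry run (-n): list what would be sent without copying anything.
rsync -avn --exclude 'fl_*' ./bench* "$tmp/dst/"

# Real copy into the local scratch destination; -a already implies recursion.
rsync -a --exclude 'fl_*' ./bench* "$tmp/dst/"

# Show the result: bench1/image/a.img is present, fl_cache is not.
(cd "$tmp/dst" && find . | sort)
```

Once the listing looks right, the same --exclude option can be used with the remote destination and the -e option from the command above.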
find and tar Combination Solution
For more complex filtering requirements, find can be used to locate specific directories, then transfer through tar packaging and SSH pipelines:
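# Select every "image" directory under a bench* path and stream it as one tar archive.
# NUL-separated names (-print0, --null -T -) keep paths with spaces intact.
find . -type d -path '*bench*/image' -print0 \
| tar cf - --null -T - \
| ssh user@remote 'tar xf - -C /my/dir'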
The advantages of this solution are: 1) find provides powerful path matching; 2) tar preserves the directory structure; 3) the SSH pipeline secures the transmission. It is particularly suitable when the set of transferred directories must be controlled precisely.
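The same pipeline shape can also exclude directories rather than select them. The sketch below is one possible variant, not the method given above: it uses find's -prune to skip fl_* directories and, as an assumption made so that it runs without a remote host, replaces the ssh stage with a local tar extraction. It assumes a find that supports -print0 and a tar that accepts --null -T - (GNU and BSD versions do); the scratch tree is hypothetical.

```shell
# Build a throwaway source tree; all names here are illustrative.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/bench1/image" "$tmp/src/bench1/fl_cache" "$tmp/dst"
echo data > "$tmp/src/bench1/image/a.img"
echo junk > "$tmp/src/bench1/fl_cache/b.tmp"
cd "$tmp/src" || exit 1

# Prune every fl_* directory, print the remaining regular files NUL-separated,
# pack them into one archive on stdout, and unpack it in the destination.
find . -type d -name 'fl_*' -prune -o -type f -print0 \
  | tar -cf - --null -T - \
  | tar -xf - -C "$tmp/dst"

# Show the result: only bench1/image/a.img was transferred.
(cd "$tmp/dst" && find . -type f | sort)
```

For a real transfer, the last stage would become ssh user@remote 'tar -xf - -C /my/dir'. Because only regular files are listed, empty directories are not recreated on the receiving side.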
GLOBIGNORE Alternative Solution
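In some simple scenarios, the GLOBIGNORE shell variable can be set to affect wildcard expansion:
(GLOBIGNORE='source/fl_*'; scp -r source/* remoteurl:remoteDir)
This method works only in Bash. Note that: 1) GLOBIGNORE affects only the shell's wildcard expansion, so it must be set before the command line is expanded; a one-line prefix assignment (GLOBIGNORE=... scp ...) has no effect because Bash expands source/* before applying the assignment, which is why the command above uses a subshell; 2) patterns are matched against the whole expanded word, so the pattern must include the source/ prefix; 3) multiple exclusion patterns are separated by colons; 4) only the entries produced by the wildcard are filtered, so fl_* directories nested inside a copied directory are still transferred; 5) while GLOBIGNORE is set, * also matches hidden files.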
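Before relying on GLOBIGNORE, it helps to see when the assignment takes effect. The following sketch compares two ways of setting it, using echo in place of scp so that nothing is transferred. It invokes bash explicitly because GLOBIGNORE is a Bash feature; the directory names are hypothetical.

```shell
# Build a throwaway source tree; all names here are illustrative.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/source/bench1" "$tmp/source/fl_cache" "$tmp/source/data/fl_nested"
cd "$tmp" || exit 1

# Prefix assignment: the wildcard is expanded before the variable is set,
# so fl_cache is still in the list.
prefix_form=$(bash -c "GLOBIGNORE='source/fl_*' echo source/*")

# Separate assignment: the variable is set first, so fl_cache is filtered out.
separate_form=$(bash -c "GLOBIGNORE='source/fl_*'; echo source/*")

echo "prefix:   $prefix_form"    # source/bench1 source/data source/fl_cache
echo "separate: $separate_form"  # source/bench1 source/data
```

In the second case source/data is still listed, so a recursive copy would carry source/data/fl_nested along with it; the filter applies only to the names the wildcard itself produces.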
Solution Comparison and Selection Recommendations
The rsync solution is best suited to regular file synchronization tasks, especially those requiring incremental transfer and attribute preservation. The find+tar solution has the advantage when complex path matching is needed. GLOBIGNORE suits simple cases in which the directories to skip sit at the top level of the source and SCP must be kept. The choice should take into account the network environment, the complexity of the directory structure, and the transfer frequency.
Security Considerations
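All solutions should transmit data through SSH-encrypted channels. rsync's -e option, the SSH pipeline in the find solution, and SCP itself all ensure this. In addition, attention should be paid to preserving file permissions so that important attributes are not lost in transit: rsync -a preserves them, and with tar the -p option on the extracting side restores permission bits as archived rather than filtering them through the umask.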