Keywords: Bash arrays | directory listing | glob patterns | shell programming | Linux commands
Abstract: This article examines two methods for storing a directory listing in a Bash array: parsing ls command output and direct glob pattern expansion. By comparing their syntax, potential issues, and application scenarios, it explains why using the glob pattern (*/) directly with the nullglob option is the more robust and recommended approach, especially when filenames contain special characters. The article includes complete code examples and error handling mechanisms to help developers write more reliable shell scripts.
Basic Syntax for Bash Array Assignment
In Bash shell programming, assigning command output to an array requires the correct syntax. The original code array=${ls -d */} contains a syntax error because ${} performs parameter (variable) expansion and cannot be used for command substitution; Bash rejects it with a "bad substitution" error. The correct command substitution syntax is $(command) or backticks `command`, and the result must be wrapped in parentheses to produce an array rather than a single string.
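The difference between the three forms can be sketched as follows. This is a minimal illustration that assumes a scratch directory created with mktemp and two hypothetical subdirectories, alpha and beta; none of these names come from the original code.

```shell
#!/usr/bin/env bash
# Scratch directory with hypothetical names so the result is predictable.
workdir=$(mktemp -d)
mkdir "$workdir/alpha" "$workdir/beta"

# Wrong: ${...} is parameter expansion; Bash rejects this with "bad substitution".
#   array=${ls -d */}

# $(...) alone captures the output as ONE string (a scalar, not an array).
scalar=$(cd "$workdir" && ls -d */)

# Wrapping the substitution in ( ) word-splits the output into array elements.
array=($(cd "$workdir" && ls -d */))

echo "${#array[@]}"   # → 2
rm -rf "$workdir"
```

The array form only gives the expected count here because neither directory name contains whitespace or glob characters.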
Method 1: Parsing ls Command Output
The first method uses the ls -d */ command to obtain directory listings, then assigns them to an array through command substitution:
array=($(ls -d */))
The execution flow of this method is as follows:
- The shell first executes the ls -d */ command inside the command substitution
- The -d option ensures ls lists directories themselves rather than their contents
- The */ pattern, which the shell expands before ls runs, matches all subdirectories in the current directory
- Command output is captured via $(...)
- The output is word-split (and glob-expanded) and assigned to the array variable array
To print array contents, use:
echo "${array[@]}"
The double quotes are crucial: they keep every array element intact as a single word, so whitespace or glob characters inside an element are not split or expanded again.
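The effect of the quotes is easier to see when each word is wrapped in brackets. The following sketch assumes a hand-built array with one element containing a space; the element values are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical array contents; "my dir/" contains a space on purpose.
array=("my dir/" "other/")

# Quoted: each element is passed to printf as exactly one argument.
quoted=$(printf '[%s]' "${array[@]}")
echo "$quoted"     # → [my dir/][other/]

# Unquoted: elements are word-split again, so "my dir/" becomes two arguments.
unquoted=$(printf '[%s]' ${array[@]})
echo "$unquoted"   # → [my][dir/][other/]
```

Using printf with one format per argument, rather than echo, makes the element boundaries visible in the output.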
Limitations of Method 1
While the above method works in simple cases, it has several significant issues:
- Filename Parsing Problems: When directory names contain spaces, newlines, or other special characters, ls output may be incorrectly parsed. For example, a directory named "my dir" is split into two array elements, "my" and "dir/".
- Unnecessary Process Overhead: Running ls is redundant. The shell already expands the */ pattern itself, so no external command is needed as an intermediary.
- Inconsistent Error Handling: When no directories match, the */ pattern remains unchanged and is passed to ls, which then looks for an entry literally named "*" and reports an error.
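The no-match case from the last point can be reproduced in an empty scratch directory. This sketch assumes Bash's default settings (nullglob and failglob both off); the exact exit status of ls differs between implementations, so only "non-zero" is checked.

```shell
#!/usr/bin/env bash
emptydir=$(mktemp -d)   # scratch directory with no subdirectories

# Default globbing: the unmatched pattern survives as a literal string.
literal=("$emptydir"/*/)
echo "${#literal[@]}"                     # → 1 (the pattern itself, not a directory)

# ls receives the literal pattern, cannot find such an entry, and fails.
ls_status=0
(cd "$emptydir" && ls -d */) >/dev/null 2>&1 || ls_status=$?
echo "ls failed: $(( ls_status != 0 ))"   # → ls failed: 1

rmdir "$emptydir"
```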
Method 2: Direct Glob Pattern Expansion
A more robust approach is to use glob pattern expansion directly, avoiding ls command parsing issues:
shopt -s nullglob
array=(*/)
shopt -u nullglob
This method works as follows:
- shopt -s nullglob enables the nullglob option, causing patterns that match no files to expand to nothing rather than remaining unchanged
- array=(*/) directly assigns the results of */ pattern matching to the array
- shopt -u nullglob disables the nullglob option to avoid affecting subsequent operations
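One caveat with the sequence above is that shopt -u nullglob switches the option off even if the calling code had deliberately enabled it. A possible refinement, sketched here with a hypothetical helper named list_subdirs that fills a global array dirs, is to record the previous state with shopt -q and undo only what the helper changed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stores the subdirectories of $1 (default: .) in the
# global array "dirs" and leaves the caller's nullglob setting untouched.
list_subdirs() {
    local target=${1:-.} had_nullglob=0
    shopt -q nullglob && had_nullglob=1      # remember the current state
    shopt -s nullglob
    dirs=("$target"/*/)
    (( had_nullglob )) || shopt -u nullglob  # undo only what we changed
}

workdir=$(mktemp -d)
mkdir "$workdir/one" "$workdir/two words"
list_subdirs "$workdir"
echo "${#dirs[@]}"    # → 2
rm -rf "$workdir"
```

Running the glob inside a subshell is another way to contain the option change, at the cost of having to pass the result back to the parent shell.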
To safely print array contents, always use double quotes:
echo "${array[@]}"
Error Handling Mechanism
For better user experience, explicit error checking can be added:
if (( ${#array[@]} == 0 )); then
    echo "No subdirectories found" >&2
fi
This code checks the array length and outputs an error message to standard error if it is zero.
Performance and Reliability Comparison
From a performance perspective, direct glob pattern usage avoids the overhead of creating an ls process, resulting in faster execution. From a reliability perspective, the shell places each matched name directly into exactly one array element, so there is no intermediate text output that could be parsed incorrectly.
Consider the following scenario with directory names containing special characters:
# Create test directories
mkdir 'dir with spaces'
mkdir 'dir
with
newlines'
mkdir 'dir*with*stars'
# Method 1 may fail
array1=($(ls -d */))
echo "Method 1 results: ${#array1[@]} elements"
# Method 2 handles correctly
shopt -s nullglob
array2=(*/)
shopt -u nullglob
echo "Method 2 results: ${#array2[@]} elements"
Practical Application Example
The following is a complete script example demonstrating how to use these techniques in practical applications:
#!/bin/bash
# Method 1: Using ls (simple but unreliable)
function get_dirs_ls() {
    local dirs=($(ls -d */ 2>/dev/null))
    echo "Found ${#dirs[@]} directories using ls"
    printf ' %s\n' "${dirs[@]}"
}
# Method 2: Using glob pattern (recommended)
function get_dirs_glob() {
    shopt -s nullglob
    local dirs=(*/)
    shopt -u nullglob
    if (( ${#dirs[@]} == 0 )); then
        echo "No directories found" >&2
        return 1
    fi
    echo "Found ${#dirs[@]} directories using glob"
    printf ' %s\n' "${dirs[@]}"
    return 0
}
# Main program
echo "=== Testing Directory Listing Methods ==="
get_dirs_ls
get_dirs_glob
Best Practices Summary
Based on the above analysis, the following best practices can be summarized:
- Prefer Glob Patterns: Directly use array=(*/) with the nullglob option to avoid ls command parsing issues.
- Always Use Double Quotes: When handling arrays, use "${array[@]}" to ensure special characters in elements are properly preserved.
- Add Error Checking: Check if array length is zero and provide meaningful error messages.
- Consider Portability: If scripts need to run in different shells, note that shopt and its nullglob option are Bash features and are not available in POSIX sh.
- Handle Edge Cases: Consider special cases like hidden directories (starting with dots), symbolic links, etc.
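The edge cases in the last point can be handled along the following lines. This is a sketch under stated assumptions: the directory names and the symlink are hypothetical test fixtures, dotglob is used to include hidden directories, and symbolic links are filtered out with a -L test because */ also matches links that point at directories.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
mkdir "$workdir/visible" "$workdir/.hidden" "$workdir/target"
ln -s target "$workdir/link"   # symbolic link pointing at a directory

# dotglob lets */ match names starting with a dot ("." and ".." stay excluded).
shopt -s nullglob dotglob
all=("$workdir"/*/)
shopt -u nullglob dotglob

real=()
for dir in "${all[@]}"; do
    dir=${dir%/}                # drop the trailing slash added by the pattern
    [[ -L $dir ]] && continue   # skip symlinks; */ follows them to directories
    real+=("${dir##*/}")        # keep just the directory name
done

echo "${#all[@]} matched, ${#real[@]} real directories"   # → 4 matched, 3 real directories
rm -rf "$workdir"
```

Whether symbolic links should count as directories depends on the script's purpose; dropping the -L test keeps them in the result.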
By following these best practices, developers can write more robust and reliable Bash scripts that properly handle various directory structure scenarios.