Keywords: YAML syntax error | indentation issues | Travis CI configuration | literal scalar | multi-line string handling
Abstract: This article analyzes the common YAML syntax error "did not find expected '-' indicator while parsing a block collection", using a Travis CI configuration file as a case study. It explains the root cause of the error and presents the fix: using the YAML literal scalar indicator "|" to handle multi-line strings. The discussion also covers YAML indentation rules, debugging tools, and the limitations of automated formatting utilities, drawing on several answers to the original question.
Problem Context and Error Analysis
Syntax errors in YAML files can disrupt continuous integration workflows. The specific error message discussed here is: syntax error: (<unknown>): did not find expected '-' indicator while parsing a block collection at line 32 column 3. This typically occurs when the YAML parser expects a list item indicator "-" but encounters different content, particularly when dealing with multi-line script blocks.
Root Cause and YAML Indentation Rules
Indentation is fundamental to YAML syntax. In the original problem, the install section of the .travis.yml file contained a Bash script block:
install:
  - if [[ "${TEST_PY3}" == "false" ]]; then
      pip install Cython;
      python setup.py build; # To build networkx-metis
      mkdir core; # For the installation of networkx core
      cd core;
      git clone https://github.com/orkohunter/networkx.git;
      cd networkx/;
      git checkout addons;
      python setup.py install;
      cd ..;
    fi
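The core issue is that the list item starting with if is a plain (unquoted) scalar. A plain scalar may continue over several indented lines, which the parser folds into a single line, but it cannot contain a comment: the " #" after python setup.py build; starts a YAML comment, and that comment ends the scalar. The parser then reads the mkdir line as new content inside the list, expects a "-" indicator for the next item, and fails.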
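The failure can be reproduced with Ruby's standard YAML library, which the article also uses for syntax checks. The following is a minimal sketch, not the original configuration: the two shortened documents and the helper name try_parse are illustrative assumptions, and the line and column reported depend on the input.

```ruby
require 'yaml'

# Illustrative stand-in for the failing section: a multi-line plain
# scalar with a " #" comment in the middle.
WITH_COMMENT = <<~'YAML'
  install:
    - if true; then
        echo one; # a comment here ends the plain scalar
        echo two;
      fi
YAML

# The same item without the comment: a valid multi-line plain scalar.
WITHOUT_COMMENT = <<~'YAML'
  install:
    - if true; then
        echo one;
        echo two;
      fi
YAML

# Returns the parsed data, or the Psych::SyntaxError if parsing fails.
def try_parse(text)
  YAML.safe_load(text)
rescue Psych::SyntaxError => e
  e
end

failure = try_parse(WITH_COMMENT)
puts failure.message if failure.is_a?(Psych::SyntaxError)

# Without the comment, the continuation lines fold into one line.
p try_parse(WITHOUT_COMMENT)['install'].first
```

The second document parses, but the script arrives as one folded line, which is why every command in the original ends with a semicolon.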
Solution: Using Literal Scalar Indicator
The correct approach employs YAML's literal scalar indicator "|", which preserves newlines and indentation within the string:
install:
  - |
    if [[ "${TEST_PY3}" == "false" ]]; then
      pip install Cython;
      python setup.py build; # To build networkx-metis
      mkdir core; # For the installation of networkx core
      cd core;
      git clone https://github.com/orkohunter/networkx.git;
      cd networkx/;
      git checkout addons;
      python setup.py install;
      cd ..;
    fi
This notation offers several advantages:
- Explicitly informs the YAML parser that subsequent content is a multi-line string
- Maintains the original formatting and indentation of the script
- Prevents the parser from misinterpreting script content as YAML structure
Note that inside a literal block, text such as # To build networkx-metis is part of the string rather than a YAML comment; because newlines are preserved, Bash receives it and treats it as a shell comment. Genuine YAML comments should appear on lines before or after the block.
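To see what the parser hands over for a literal block, the value can be loaded and inspected. This is a small sketch using Ruby's standard YAML library; the shortened script is an illustrative assumption, not the original file.

```ruby
require 'yaml'

# Illustrative, shortened version of the corrected install section.
FIXED = <<~'YAML'
  install:
    - |
      if true; then
        echo one; # now part of the string, not a YAML comment
        echo two;
      fi
YAML

SCRIPT = YAML.safe_load(FIXED)['install'].first

# The item is one string: newlines and relative indentation survive,
# and the "#" text is still there.
puts SCRIPT
```

The list item is a single string of four lines, so the shell sees the script exactly as it was written.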
Debugging Tools and Best Practices
The question also asks for an automated tool, similar to autopep8, that fixes YAML indentation. Existing tools have limitations. The yaml utility from the ruamel.yaml package can be used for formatting and validation:
yaml round-trip .travis.yml --save
This command loads the file and writes it back (a round trip), normalizing indentation along the way. It helps only with files that already parse: it cannot repair a file with syntax errors, and it may not pinpoint the error precisely.
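The round-trip idea itself can be sketched with Ruby's standard YAML library. This is not the ruamel.yaml tool and makes no claim about how it works internally; the input document and the helper name reindent are assumptions for illustration, and unlike a comment-preserving round trip this version discards YAML comments.

```ruby
require 'yaml'

# Hypothetical input with uneven but valid indentation.
UNEVEN = <<~'YAML'
  language: python
  install:
        - pip install Cython
        - |
              python setup.py build
              python setup.py install
YAML

# Load the document and emit it again with the emitter's own
# indentation. Any YAML comments in the input are lost.
def reindent(text)
  YAML.dump(YAML.safe_load(text))
end

puts reindent(UNEVEN)
```

If the input does not parse, YAML.safe_load raises before anything is emitted, which is the same limitation described above.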
Additional debugging methods include:
- Using the travis lint command to validate configuration files
- Performing syntax checks via Ruby's YAML library: ruby -e "require 'yaml';puts YAML.load_file('.travis.yml')"
- Comparing historical versions in Git to identify changes that introduced errors
Common Pitfalls and Considerations
When editing YAML files, several common pitfalls should be avoided:
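- Comment Indentation: Keep YAML comments aligned with the indentation level of their containing block. A comment placed inside a multi-line plain scalar ends that scalar, and a "#" line inside a literal block is treated as string content, so a misplaced comment can cause parsing errors or silently change a value.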
- Editor Behavior: Some editors (like Vim) may automatically adjust comment indentation, potentially breaking YAML structure.
- Special Character Handling: Strings containing special characters may require quotation marks, though literal indicators generally avoid such issues.
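The special-character pitfall is easy to demonstrate with a colon followed by a space. The sketch below uses Ruby's standard YAML library to load the same hypothetical command written three ways; the command itself is an assumption made for illustration.

```ruby
require 'yaml'

# The same hypothetical command as a plain scalar, a quoted scalar,
# and a literal block.
SAMPLES = <<~'YAML'
  script:
    - echo status: ok
    - 'echo status: ok'
    - |
      echo status: ok
YAML

ITEMS = YAML.safe_load(SAMPLES)['script']

# Print each item with its class to show how it was interpreted.
ITEMS.each { |item| puts "#{item.class}: #{item.inspect}" }
```

The unquoted item is read as a one-entry mapping because of the ": " in the middle, while the quoted and literal forms stay strings. The literal form also keeps a trailing newline.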
Conclusion
Proper handling of multi-line content in YAML requires understanding its scalar type system. The literal scalar indicator "|" represents best practice for multi-line script blocks, maintaining code readability while preventing parsing errors. Although automated formatting tools exist, manual debugging and correction remain essential skills for files with syntax errors. By mastering YAML fundamentals and debugging techniques, developers can more effectively manage and maintain configuration files.