Keywords: PKG Files | XAR Archives | Linux Unpacking | macOS Installer Packages | Bom Files | Payload Processing
Abstract: This technical paper provides a comprehensive guide for handling macOS PKG files in Linux environments. PKG files are essentially XAR archives with specific hierarchical structures, where Payload files contain the actual installable content. The article demonstrates step-by-step procedures for unpacking PKG files, modifying internal files, updating Bom manifests, and repackaging into functional PKG files. Practical recommendations for tool availability in Linux environments are included, covering mkbom and lsbom utilities.
PKG File Structure Analysis
macOS PKG files are XAR (eXtensible ARchiver) archives that use a different file extension and follow a specific internal layout. Understanding this layout is essential for manipulating PKG files successfully.
Typical PKG files contain the following key components:
- PackageInfo: Contains package metadata such as version numbers and descriptions
- Bom File: The Bill of Materials, which records every file in the package along with its permissions and ownership
- Payload: gzip-compressed cpio archive containing the actual installable file content
- Scripts: Optional pre-installation and post-installation scripts
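Because a flat PKG file is a XAR archive, a quick sanity check is to look for the XAR magic bytes "xar!" at the start of the file. The following is a minimal sketch; the function name is_xar is hypothetical, and the check inspects only the first four bytes rather than validating the whole archive.

```shell
# is_xar FILE: succeed if FILE starts with the XAR magic bytes "xar!".
# Hypothetical helper; it does not validate the rest of the archive.
is_xar() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "xar!" ]
}
```

A typical call would be: is_xar Foo.pkg && echo "looks like a XAR archive"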
Basic Unpacking and Repacking Operations
When only modifications to information files (such as PackageInfo) are required, the process is relatively straightforward:
mkdir Foo
cd Foo
xar -xf ../Foo.pkg
# Edit PackageInfo or other information files
xar -cf ../Foo-new.pkg *
This process leverages the basic functionality of XAR tools to unpack the PKG file into a temporary directory, make modifications, and then repackage. It is essential to maintain the original directory structure during repackaging.
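The unpack and repack steps above can be wrapped in two small functions so that the working directory of the calling shell is never changed. This is only a sketch: the names unpack_pkg and repack_pkg are hypothetical, and it assumes a xar build that supports the -C option for choosing the extraction directory.

```shell
# unpack_pkg PKG DEST: extract a flat PKG (XAR archive) into DEST.
# Hypothetical helper; requires the xar command to be on PATH.
unpack_pkg() (
    pkg=$1
    dest=$2
    [ -f "$pkg" ] || { echo "no such file: $pkg" >&2; exit 1; }
    # Make the archive path absolute so it stays valid after xar changes directory.
    case $pkg in
        /*) ;;
        *) pkg=$PWD/$pkg ;;
    esac
    mkdir -p "$dest" || exit 1
    # -C makes xar change into DEST before extracting.
    xar -xf "$pkg" -C "$dest"
)

# repack_pkg SRC PKG: archive the contents of SRC into a new PKG.
repack_pkg() (
    src=$1
    out=$2
    # Make the output path absolute, since we cd into SRC below.
    case $out in
        /*) ;;
        *) out=$PWD/$out ;;
    esac
    cd "$src" || exit 1
    # Archive the top-level entries so paths stay relative to the package root.
    xar -cf "$out" *
)
```

Both functions run in a subshell, so the cd inside repack_pkg does not affect the caller. For example: unpack_pkg Foo.pkg work, then edit work/PackageInfo, then repack_pkg work Foo-new.pkg.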
Complete Workflow for Modifying Installable Files
When modifications to actual installable files within the Payload are necessary, the procedure becomes more complex:
mkdir Foo
cd Foo
xar -xf ../Foo.pkg
cd foo.pkg  # Enter the component package directory; its name varies by package and is case-sensitive on Linux
cat Payload | gunzip -dc | cpio -i
# Modify files within Foo.app directory
rm Payload
find ./Foo.app | cpio -o --format odc | gzip -c > Payload  # odc is the portable cpio format commonly used for payloads
mkbom Foo.app Bom
# Update PackageInfo file
rm -rf Foo.app
cd ..
xar -cf ../Foo-new.pkg *
Detailed Explanation of Key Steps
Payload File Processing
The Payload file represents the core component of PKG files, storing actual installable content in gzip-compressed cpio format. The extraction process requires sequential use of gunzip and cpio tools:
cat Payload | gunzip -dc | cpio -i
This pipeline reads the Payload file, decompresses it with gunzip, and passes the result to cpio for extraction. The -c option makes gunzip write the decompressed data to standard output (-d, decompress, is already implied by gunzip), while cpio's -i option selects extraction mode, which reads the archive from standard input.
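To keep extracted files out of the package directory, the same pipeline can extract into a separate destination. The sketch below uses a hypothetical function name, extract_payload, and assumes a gzip-compressed Payload and a cpio that accepts the -d and -m options.

```shell
# extract_payload PAYLOAD DEST: unpack a gzip-compressed cpio Payload into DEST.
# Hypothetical helper; assumes gzip compression and a cpio that supports -i, -d, -m.
extract_payload() (
    payload=$1
    dest=$2
    # Make the Payload path absolute, since we cd into DEST below.
    case $payload in
        /*) ;;
        *) payload=$PWD/$payload ;;
    esac
    mkdir -p "$dest" || exit 1
    cd "$dest" || exit 1
    # -i extracts, -d creates leading directories, -m keeps modification times.
    gunzip -dc < "$payload" | cpio -idm
)
```

For example: extract_payload Payload ../payload_root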
Bom File Management
The Bom (Bill of Materials) file records comprehensive details about all files in the package, including:
- File paths and sizes
- Permission settings (chmod values)
- User and group ownership
- Checksum information
After modifying installable files, the Bom file must be regenerated using the mkbom utility:
mkbom Foo.app Bom
This command scans the Foo.app directory and creates a new Bom file, ensuring the installer can correctly identify and process all files.
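On Linux, the files being scanned are usually owned by the local user rather than by the accounts expected on the target Mac. The sketch below assumes the bomutils implementation of mkbom, whose -u and -g options override the recorded user and group IDs; Apple's own mkbom may not accept these options. The function name make_bom and the defaults 0 and 80 (root and admin on macOS) are illustrative assumptions.

```shell
# make_bom DIR BOM [UID GID]: regenerate BOM from DIR with fixed ownership.
# Hypothetical helper; assumes bomutils' mkbom, which accepts -u and -g.
make_bom() {
    dir=$1
    bom=$2
    uid=${3:-0}    # assumed default: root
    gid=${4:-80}   # assumed default: the macOS admin group
    mkbom -u "$uid" -g "$gid" "$dir" "$bom"
}
```

For example: make_bom Foo.app Bom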
Recreating the Payload
After file modifications, the Payload file must be recreated:
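find ./Foo.app | cpio -o --format odc | gzip -c > Payload
This command uses find to list all files, archives them with cpio, compresses the archive with gzip, and writes the result to the Payload file. Keep the archive format and compression consistent with the original PKG file: GNU cpio writes its binary format by default, so the portable odc format, which macOS payloads commonly use, is requested explicitly here.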
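The Payload pipeline can also be wrapped in a function that leaves the caller's working directory untouched. This is a sketch under stated assumptions: the name build_payload is hypothetical, and the --format odc option reflects the assumption that the target installer expects the portable odc cpio format.

```shell
# build_payload ROOT NAME OUT: archive ROOT/NAME as a gzip-compressed cpio file OUT.
# Hypothetical helper; --format odc is an assumption about the expected archive format.
build_payload() (
    root=$1
    name=$2
    out=$3
    # Make the output path absolute, since we cd into ROOT below.
    case $out in
        /*) ;;
        *) out=$PWD/$out ;;
    esac
    cd "$root" || exit 1
    # Paths are stored relative to ROOT, for example ./Foo.app/...
    find "./$name" | cpio -o --format odc | gzip -c > "$out"
)
```

For example: build_payload . Foo.app Payload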
Tool Availability in Linux Environments
Processing macOS PKG files on Linux systems requires ensuring availability of the following tools:
- xar: XAR archive utility, packaged by some Linux distributions and otherwise buildable from source
- cpio: Archive utility, typically pre-installed or available through package managers
- gzip/gunzip: Compression utilities, standard Linux components
- mkbom/lsbom: Bom file processing tools, may require acquisition from third-party sources
For mkbom and lsbom utilities, consider the following acquisition methods:
- Build a third-party compatible implementation, such as the open-source bomutils project, from source
- Search for relevant packages through package managers
- Generate the Bom on a macOS system with the native tools and copy it into the working directory
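Before starting, it can help to confirm that every required tool is actually on the PATH. The following is a minimal sketch; the function name check_tools is hypothetical.

```shell
# check_tools NAME...: report any commands that are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_tools xar cpio gzip gunzip mkbom lsbom || echo "install the missing tools first"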
Considerations and Best Practices
File Permission Preservation
When modifying files, it is crucial to maintain the original file permission settings. The installer relies on the permissions and ownership recorded in the Bom file, so mismatches between the Bom and the Payload may cause installation failures or incorrectly installed files.
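One way to notice accidental permission changes is to record a snapshot of modes and owners before editing and compare it afterwards. The sketch below uses a hypothetical function name, perm_snapshot, and relies on the -printf option of GNU find, so it assumes a GNU/Linux userland.

```shell
# perm_snapshot DIR: print "MODE OWNER:GROUP PATH" for every entry under DIR,
# sorted by path. Hypothetical helper; -printf is a GNU find extension.
perm_snapshot() (
    cd "$1" || exit 1
    find . -printf '%m %u:%g %p\n' | sort -k3
)
```

For example: run perm_snapshot Foo.app > before.txt, make the edits, run perm_snapshot Foo.app > after.txt, and then compare the two with diff before.txt after.txt.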
Temporary Directory Management
Utilize temporary working directories for operations to prevent accidental modifications to original files. Clean up temporary files promptly after operation completion.
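A small wrapper can create the temporary directory and remove it again once the work is done. This is a sketch only; the function name with_workdir is hypothetical, and cleanup is not performed if the shell is killed by a signal.

```shell
# with_workdir COMMAND [ARGS...]: run COMMAND inside a fresh temporary directory
# and remove that directory afterwards, whether COMMAND succeeds or fails.
with_workdir() (
    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1
    "$@"
    status=$?
    # Leave the directory before deleting it, then pass on COMMAND's exit status.
    cd / && rm -rf "$workdir"
    exit "$status"
)
```

For example: with_workdir xar -xf /path/to/Foo.pkg extracts into a scratch directory that is discarded afterwards, which is useful for a dry run.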
Backup Strategy
Always back up a PKG file before modifying it. Signed packages are a particular concern: repackaging invalidates the original signature, so the result must be re-signed or distributed unsigned.
Compatibility Considerations
Different macOS versions may employ slightly varied PKG formats. Consider target system compatibility requirements when modifying PKG files.
Advanced Techniques and Alternative Approaches
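Using macOS-Native Utilities
If a macOS machine is available, pkgutil can expand and rebuild flat packages directly:
pkgutil --expand Foo.pkg extracted_dir
# Modify files
pkgutil --flatten extracted_dir Foo-new.pkg
Note that ditto's -k option selects the PKZip format, so ditto -x -k cannot open a XAR-based PKG file. The ditto utility remains useful for copying the payload's files, because it preserves macOS-specific metadata such as extended attributes and resource forks.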
Handling Compression Variants
Some PKG files may employ different compression algorithms or archive formats, such as bzip2 or the xz-based pbzx container found in some newer packages. If Payload extraction fails, inspect the file before choosing a decompressor:
file Payload # Check file type
head -c 16 Payload | od -c # Examine the leading magic bytes
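When the file utility is unavailable or inconclusive, the leading bytes can be compared against well-known magic numbers. The sketch below uses a hypothetical function name, payload_format, and covers only a handful of formats; anything else is reported as unknown.

```shell
# payload_format FILE: print a guess at the container format from the leading bytes.
# Hypothetical helper; recognizes only a few common magic numbers.
payload_format() {
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case $magic in
        1f8b*)        echo gzip ;;      # 1f 8b
        425a68*)      echo bzip2 ;;     # "BZh"
        fd377a585a00) echo xz ;;        # fd "7zXZ" 00
        70627a78*)    echo pbzx ;;      # "pbzx"
        303730373037) echo cpio-odc ;;  # "070707", an uncompressed odc archive
        *)            echo unknown ;;
    esac
}
```

For example: payload_format Payload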
Conclusion
Processing macOS PKG files on Linux systems requires deep understanding of their internal structures and proficient use of related tools. Through the step-by-step process of unpacking, modifying, and repackaging, customized modifications to PKG files can be achieved. The key lies in ensuring toolchain completeness and operational accuracy, particularly when handling Bom files and Payload components. This methodology provides a viable technical solution for automated PKG file processing in non-macOS environments.