Keywords: Axios | Blob | ArrayBuffer | Node.js | Binary Data Processing
Abstract: This article examines the data discrepancies that occur when using Axios in Node.js with responseType set to 'blob' versus 'arraybuffer'. By analyzing how binary data is altered when it is decoded as UTF-8 text, it explains why certain compression libraries report errors when processing data obtained from a 'blob' request. The article includes code examples and solutions to help developers obtain the original downloaded data intact.
Problem Background and Phenomenon Analysis
When using Axios to download ZIP files, developers often face the dilemma of choosing between responseType as 'blob' or 'arraybuffer'. Superficially, both types should provide the raw binary data of the file, but practical testing reveals significant differences in the returned data content.
When responseType: 'blob' is set in Node.js, response.data is not a Blob at all but a string. Computing MD5 hashes with Node.js's crypto module shows that the string itself and the Buffer produced by Buffer.from(response.data, 'utf8') have identical hashes. This indicates that Axios has decoded the binary response body as UTF-8 text.
In contrast, setting responseType: 'arraybuffer' returns the raw bytes (in Node.js, Axios delivers them as a Buffer), and their hash differs completely from that of the Buffer rebuilt from the 'blob' response. The discrepancy causes inconsistent behavior downstream: some libraries, such as JSZip, cannot process the data rebuilt from the 'blob' response, while adm-zip can still decompress it.
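The hash comparison can be reproduced without a network request. The following is a minimal sketch, assuming that a text-mode response is equivalent to decoding the body with Buffer's 'utf8' decoder. The md5 helper and the sample bytes are hypothetical and chosen only for illustration.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hex MD5 digest of a string or Buffer.
function md5(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// Stand-in for a downloaded file: ZIP signature plus two bytes
// that are not valid UTF-8 on their own.
const originalBytes = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x89, 0xff]);

// Assumption: this mirrors what a text-mode response contains,
// namely the body decoded as UTF-8.
const asText = originalBytes.toString('utf8');

const hashOfString = md5(asText);
const hashOfReencoded = md5(Buffer.from(asText, 'utf8'));
const hashOfOriginal = md5(originalBytes);

console.log(hashOfString === hashOfReencoded); // true
console.log(hashOfString === hashOfOriginal);  // false
```

The first comparison matches because hashing a string encodes it as UTF-8 by default, which is the same step Buffer.from(text, 'utf8') performs explicitly.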
Technical Principle Deep Dive
The root cause lies in how the 'blob' option behaves in Node.js. According to the official Axios documentation, 'blob' is a browser-only option. When it is specified in Node.js, Axios falls back to the default 'json' type and, when JSON parsing fails, returns the body as plain text.
When binary data is treated as text, it is decoded as UTF-8. Any byte sequence that is not valid UTF-8 is replaced with the Unicode replacement character U+FFFD. For example:
// Example: PNG file header bytes [0x89, 0x50, 0x4E, 0x47]
// Original binary representation: 137, 80, 78, 71
// After UTF-8 decoding: "�PNG" (U+FFFD replacement character)
The code point U+0089 is encoded in UTF-8 as the two bytes 0xC2 0x89. A lone 0x89 byte is a continuation byte with no lead byte before it, so it is not valid UTF-8 and is replaced by the U+FFFD replacement character. The conversion is irreversible: the original byte value is permanently lost.
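The PNG header case can be checked directly with Node.js's Buffer. This is a small sketch using only built-in APIs, and the variable names are illustrative.

```javascript
// First four bytes of a PNG file.
const pngHeader = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// Decode the bytes as UTF-8 text, as a text-mode response would.
const decoded = pngHeader.toString('utf8');

// Encode the text back into bytes.
const reencoded = Buffer.from(decoded, 'utf8');

console.log(decoded.codePointAt(0).toString(16)); // fffd
console.log(reencoded.toString('hex'));           // efbfbd504e47
console.log(pngHeader.length, reencoded.length);  // 4 6
```

The single byte 0x89 comes back as the three bytes EF BF BD, the UTF-8 encoding of U+FFFD, so both the content and the length of the data change.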
Browser vs Node.js Environment Comparison
In browser environments, both Blob and ArrayBuffer can correctly represent binary data:
// Browser environment example
import axios from 'axios';

async function fetchBinaryData(url) {
  const response = await axios.get(url, { responseType: 'blob' });
  const blob = response.data;
  // A Blob can be converted directly to an ArrayBuffer
  const arrayBuffer = await blob.arrayBuffer();
  return arrayBuffer;
}
However, Node.js historically lacked a native Blob (it became a global only in Node.js 18), and Axios does not return genuine Blob objects there. This environmental difference is the main cause of confusion.
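If downstream code in Node.js expects an ArrayBuffer or a Blob, either can be built from the raw bytes once they have been obtained intact. The following is a sketch under the assumption of Node.js 18 or later, where Blob is a global. The bufferToArrayBuffer helper is hypothetical and not part of Axios.

```javascript
// Hypothetical helper: copy a Buffer's bytes into a standalone ArrayBuffer.
// A Buffer may be a view into a larger shared pool, so slice by offset.
function bufferToArrayBuffer(buf) {
  return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
}

// Stand-in for bytes received with responseType: 'arraybuffer'.
const bytes = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

const arrayBuffer = bufferToArrayBuffer(bytes);

// Assumes Node.js 18+, where Blob is available globally.
const blob = new Blob([bytes], { type: 'application/zip' });

console.log(arrayBuffer.byteLength); // 4
console.log(blob.size);              // 4
```

Both conversions copy the bytes unchanged, so no text decoding takes place at any point.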
Solutions and Best Practices
To obtain the original bytes intact, always use responseType: 'arraybuffer' in Node.js:
import axios from 'axios';

async function downloadFile(url) {
  const response = await axios.get(url, {
    responseType: 'arraybuffer'
  });
  // In Node.js, response.data is a Buffer holding the unmodified bytes
  return response.data;
}
If you are left with corrupted data from a misconfigured request, the only fix is to request the original data from the server again. No conversion of the corrupted data can recover the lost information.
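The reason recovery is impossible can be shown with two different inputs that decode to the same text. This is a minimal sketch using built-in Buffer methods, and the sample bytes are arbitrary.

```javascript
// Two different byte sequences: 'A' followed by a different invalid byte.
const first = Buffer.from([0x41, 0x89]);
const second = Buffer.from([0x41, 0xff]);

// Both invalid bytes are replaced by U+FFFD during UTF-8 decoding.
const firstText = first.toString('utf8');
const secondText = second.toString('utf8');

console.log(first.equals(second));     // false
console.log(firstText === secondText); // true
```

Because distinct inputs collapse to one output, the decoded string carries no information about which byte was originally present.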
Deep Understanding of Encoding Conversion
Understanding the relationship between text encoding and binary data is crucial. When binary data is misinterpreted as text:
- Valid UTF-8 sequences are correctly parsed
- Invalid byte sequences are replaced with U+FFFD
- The converted text length may differ from the original data
- Hash values change unpredictably
This mechanism explains why data converted from Blobs may appear usable in some cases (when files primarily consist of ASCII characters), but inevitably fail when processing arbitrary binary files.
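The difference between ASCII-only and arbitrary binary content can be tested with a round trip through UTF-8. In the sketch below, survivesUtf8RoundTrip is a hypothetical helper and the sample data is made up for illustration.

```javascript
// Hypothetical helper: true if decoding as UTF-8 and re-encoding
// reproduces exactly the same bytes.
function survivesUtf8RoundTrip(bytes) {
  const roundTripped = Buffer.from(bytes.toString('utf8'), 'utf8');
  return roundTripped.equals(bytes);
}

// ASCII-only content: every byte is a valid one-byte UTF-8 sequence.
const asciiFile = Buffer.from('id,name\n1,Ada\n', 'ascii');

// ZIP signature followed by two bytes that are never valid in UTF-8.
const binaryFile = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0xff, 0xfe]);

console.log(survivesUtf8RoundTrip(asciiFile));  // true
console.log(survivesUtf8RoundTrip(binaryFile)); // false

// Each invalid byte becomes the three-byte encoding of U+FFFD.
console.log(Buffer.from(binaryFile.toString('utf8'), 'utf8').length); // 10
```

A check like this only shows whether a given payload happens to survive. It does not make text mode safe for binary downloads.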
Conclusion and Recommendations
This article provides a detailed analysis of the pitfalls in Axios responseType configuration. Key conclusions include: avoiding the 'blob' option in Node.js, understanding the impact of encoding conversion on binary data, and selecting the correct response type to ensure data integrity. This knowledge has significant guidance value for scenarios involving file downloads, data stream processing, and similar applications.