Keywords: Node.js | Axios | File Download | Stream Processing | Promise
Abstract: This article provides an in-depth exploration of correctly downloading file streams and saving them to disk in Node.js using the Axios library. By analyzing common error cases, it explains backpressure issues in stream processing and offers multiple solutions based on Promises and stream pipelines. The focus is on technical details such as using responseType: 'stream' configuration, createWriteStream piping, and promisify utilities to ensure complete downloads, helping developers avoid file corruption and achieve efficient, reliable file downloading.
Introduction
In Node.js server-side development, downloading remote files and saving them to local disk using the Axios library is a common requirement. However, many developers encounter file corruption issues when attempting to directly write downloaded file streams via fs.writeFile. This article will explore a typical problem scenario, analyze the root causes, and provide solutions based on best practices.
Problem Analysis
The original problem code attempted to download a PDF file as follows:
```javascript
axios.get('https://xxx/my.pdf', { responseType: 'blob' }).then(response => {
  fs.writeFile('/temp/my.pdf', response.data, (err) => {
    if (err) throw err;
    console.log('The file has been saved!');
  });
});
```

Although a file was saved, its content was corrupted. The fundamental issue is that responseType: 'blob' is a browser-oriented option that Node.js does not support, so the binary body ends up decoded as text and mangled; fs.writeFile, in turn, expects a complete Buffer or string rather than stream data. Hand-rolled approaches to writing stream data also run into backpressure problems: when data is produced faster than it can be consumed and flow control is not managed, the result is data loss or corruption.
Core Solutions
The correct approach is to use streams for file downloading. Here are improved solutions based on the best answer:
Solution 1: Promise-Wrapped Stream Pipeline
```typescript
import Axios from 'axios';
import { createWriteStream } from 'fs';

export async function downloadFile(fileUrl: string, outputLocationPath: string) {
  const writer = createWriteStream(outputLocationPath);
  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {
    return new Promise((resolve, reject) => {
      response.data.pipe(writer);
      let error: Error | null = null;
      writer.on('error', err => {
        error = err;
        writer.close();
        reject(err);
      });
      writer.on('close', () => {
        if (!error) {
          resolve(true);
        }
      });
    });
  });
}
```

The key aspects of this solution are:
- Setting `responseType: 'stream'` makes Axios return a readable stream
- `createWriteStream` creates a writable stream for the target file
- `pipe()` connects the readable stream to the writable stream
- Wrapping the stream events in a Promise ensures the file is completely downloaded before resolution
Solution 2: Using Node.js Built-in Promisify Utility
For newer versions of Node.js, a more concise implementation is available:
```typescript
import Axios from 'axios';
import { createWriteStream } from 'fs';
import * as stream from 'stream';
import { promisify } from 'util';

const finished = promisify(stream.finished);

export async function downloadFile(fileUrl: string, outputLocationPath: string): Promise<any> {
  const writer = createWriteStream(outputLocationPath);
  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {
    response.data.pipe(writer);
    return finished(writer);
  });
}
```

This solution relies on the promisified version of stream.finished, resulting in cleaner, more readable code.
Technical Deep Dive
Backpressure Handling Mechanism
The core advantage of Node.js streams lies in their backpressure handling capability. When a writable stream cannot process data quickly enough, the readable stream automatically pauses, preventing memory overflow. Through the pipe() method, this backpressure control is automatic, ensuring stable data transmission.
Error Handling Strategies
Comprehensive error handling is crucial for file downloading functionality. In the above solutions:
- The `writer.on('error')` handler catches write errors
- The Promise's reject mechanism propagates errors to the caller
- Stream resources are properly closed when errors occur
Alternative Approaches Comparison
Referencing other answers and articles, several alternative approaches exist:
Using Stream Pipeline (stream.pipeline):
```javascript
const util = require('util');
const stream = require('stream');
const fs = require('fs');
const axios = require('axios');

const pipeline = util.promisify(stream.pipeline);

const downloadFile = async () => {
  try {
    const request = await axios.get('https://xxx/my.pdf', {
      responseType: 'stream',
    });
    await pipeline(request.data, fs.createWriteStream('/temp/my.pdf'));
    console.log('download pdf pipeline successful');
  } catch (error) {
    console.error('download pdf pipeline failed', error);
  }
};
```

Using ArrayBuffer (Suitable for Small Files):
```javascript
const res = await axios.get(url, { responseType: 'arraybuffer' });
await fs.promises.writeFile(downloadDestination, res.data);
```

This method loads the entire file into memory and is therefore only suitable for small file downloads.
Best Practice Recommendations
- Always Use Streams for Large Files: For files exceeding a few MB, stream processing is the only reliable option
- Correctly Configure the Response Type: In Node.js, `responseType: 'stream'` is the correct configuration for file downloads
- Implement Comprehensive Error Handling: Cover network errors, file system errors, and stream errors
- Consider Memory Management: Close streams and file handles promptly to avoid memory leaks
- Add Progress Monitoring: For large file downloads, consider adding progress indication functionality
Conclusion
When downloading files with Axios and saving them to disk in Node.js, the correct approach is to fully leverage Node.js's stream processing capabilities. By setting responseType: 'stream', piping the response stream to a file stream via pipe(), and using Promises to ensure download completeness, file corruption can be avoided. The various solutions presented in this article each have their applicable scenarios, allowing developers to choose the most suitable implementation based on specific needs.