Keywords: Node.js | Axios | File Download | Stream Processing | Promise
Abstract: This article provides an in-depth exploration of correctly downloading file streams and saving them to disk in Node.js using the Axios library. By analyzing common error cases, it explains backpressure issues in stream processing and offers multiple solutions based on Promises and stream pipelines. The focus is on technical details such as using responseType: 'stream' configuration, createWriteStream piping, and promisify utilities to ensure complete downloads, helping developers avoid file corruption and achieve efficient, reliable file downloading.
Introduction
In Node.js server-side development, downloading remote files and saving them to local disk using the Axios library is a common requirement. However, many developers encounter file corruption issues when attempting to directly write downloaded file streams via fs.writeFile. This article will explore a typical problem scenario, analyze the root causes, and provide solutions based on best practices.
Problem Analysis
The original problem code attempted to download a PDF file as follows:
```javascript
axios.get('https://xxx/my.pdf', { responseType: 'blob' }).then(response => {
  fs.writeFile('/temp/my.pdf', response.data, (err) => {
    if (err) throw err;
    console.log('The file has been saved!');
  });
});
```

Although a file was saved, its content was corrupted. The fundamental issue is that responseType: 'blob' is a browser-oriented option that Node.js does not support, so the binary body ends up decoded as text and mangled; fs.writeFile, in turn, expects a complete Buffer or string rather than stream data. Hand-rolled approaches to writing stream data also run into backpressure problems: when data is produced faster than it can be consumed and flow control is not managed, the result is data loss or corruption.
Core Solutions
The correct approach is to use streams for file downloading. Here are improved solutions based on the best answer:
Solution 1: Promise-Wrapped Stream Pipeline
```typescript
import Axios from 'axios';
import { createWriteStream } from 'fs';

export async function downloadFile(fileUrl: string, outputLocationPath: string) {
  const writer = createWriteStream(outputLocationPath);
  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {
    return new Promise((resolve, reject) => {
      response.data.pipe(writer);
      let error: Error | null = null;
      writer.on('error', err => {
        error = err;
        writer.close();
        reject(err);
      });
      writer.on('close', () => {
        if (!error) {
          resolve(true);
        }
      });
    });
  });
}
```

The key aspects of this solution are:
- Setting `responseType: 'stream'` makes Axios return a readable stream
- `createWriteStream` creates a writable stream for the target file
- `pipe()` connects the readable stream to the writable stream
- Wrapping the stream events in a Promise ensures the file is completely downloaded before resolution
Solution 2: Using Node.js Built-in Promisify Utility
For newer versions of Node.js, a more concise implementation is available:
```typescript
import Axios from 'axios';
import { createWriteStream } from 'fs';
import * as stream from 'stream';
import { promisify } from 'util';

const finished = promisify(stream.finished);

export async function downloadFile(fileUrl: string, outputLocationPath: string): Promise<any> {
  const writer = createWriteStream(outputLocationPath);
  return Axios({
    method: 'get',
    url: fileUrl,
    responseType: 'stream',
  }).then(response => {
    response.data.pipe(writer);
    return finished(writer);
  });
}
```

This solution relies on the promisified version of stream.finished, resulting in cleaner, more readable code.
Technical Deep Dive
Backpressure Handling Mechanism
The core advantage of Node.js streams lies in their backpressure handling capability. When a writable stream cannot process data quickly enough, the readable stream automatically pauses, preventing memory overflow. Through the pipe() method, this backpressure control is automatic, ensuring stable data transmission.
Error Handling Strategies
Comprehensive error handling is crucial for file downloading functionality. In the above solutions:
- The `writer.on('error')` handler catches write errors
- The Promise's reject mechanism propagates errors to the caller
- Stream resources are properly closed when errors occur
Alternative Approaches Comparison
Referencing other answers and articles, several alternative approaches exist:
Using Stream Pipeline (stream.pipeline):
```javascript
const util = require('util');
const stream = require('stream');
const fs = require('fs');
const axios = require('axios');

const pipeline = util.promisify(stream.pipeline);

const downloadFile = async () => {
  try {
    const request = await axios.get('https://xxx/my.pdf', {
      responseType: 'stream',
    });
    await pipeline(request.data, fs.createWriteStream('/temp/my.pdf'));
    console.log('download pdf pipeline successful');
  } catch (error) {
    console.error('download pdf pipeline failed', error);
  }
};
```

Using ArrayBuffer (Suitable for Small Files):
```javascript
const res = await axios.get(url, { responseType: 'arraybuffer' });
await fs.promises.writeFile(downloadDestination, res.data);
```

This method loads the entire file into memory and is therefore only suitable for small file downloads.
Best Practice Recommendations
- Always Use Streams for Large Files: For files exceeding a few MB, stream processing is the only reliable option
- Correctly Configure the Response Type: In Node.js, `responseType: 'stream'` is the correct configuration for file downloads
- Implement Comprehensive Error Handling: Cover network errors, file system errors, and stream errors
- Consider Memory Management: Close streams and file handles promptly to avoid memory leaks
- Add Progress Monitoring: For large file downloads, consider adding progress indication functionality
Conclusion
When downloading files with Axios and saving them to disk in Node.js, the correct approach is to fully leverage Node.js's stream processing capabilities. By setting responseType: 'stream', piping the response stream to a file stream via pipe(), and using Promises to ensure download completeness, file corruption can be avoided. The various solutions presented in this article each have their applicable scenarios, allowing developers to choose the most suitable implementation based on specific needs.