A Comprehensive Guide to Converting HTML to PDF with Node.js

Nov 28, 2025 · Programming

Keywords: Node.js | PDF Generation | HTML to PDF | PhantomJS | Puppeteer

Abstract: This article delves into various methods for converting HTML content to PDF documents in Node.js, focusing on popular libraries like PhantomJS, Puppeteer, jsPDF, and Playwright. Through detailed code examples and comparative analysis, it aids developers in selecting appropriate tools based on project needs, covering scenarios from simple documents to complex web page PDF generation.

Introduction

In modern web development, converting HTML content to PDF documents is a common requirement, especially for generating reports, invoices, or printable versions of web pages. Node.js, with its rich ecosystem, offers multiple libraries for this conversion. This article walks through the mainstream options in turn, with a short example for each.

Using PhantomJS for PDF Generation

PhantomJS is a headless WebKit-based browser that can render web pages and export them as PDF files. It is no longer maintained (development was suspended in 2018), but it was widely used in the past and still turns up in legacy code. Integration with Node.js is possible via the phantomjs-node module, which is published on npm as phantom. Below is a step-by-step implementation example.

First, install the necessary modules:

npm install phantom

The phantom module pulls in a prebuilt PhantomJS binary as a dependency, so a separate PhantomJS installation is normally not required. Then, use the following code to generate a PDF from a URL:

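const phantom = require('phantom');

(async function () {
    const ph = await phantom.create();
    try {
        const page = await ph.createPage();
        // page.open resolves with 'success' or 'fail'
        const status = await page.open('http://www.google.com');
        if (status !== 'success') {
            throw new Error('Failed to load page: ' + status);
        }
        await page.render('google.pdf');
        console.log('Page Rendered');
    } catch (err) {
        console.error('Error rendering PDF:', err);
    } finally {
        // Always shut down the PhantomJS process, even on failure
        await ph.exit();
    }
})();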

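This code initializes a PhantomJS instance, creates a page, navigates to the specified URL, and renders it as a PDF file. Each step returns a promise, so the next step starts only after the previous one has completed. The .pdf extension of the output file name tells PhantomJS to produce a PDF rather than an image.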
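
The rendering step can also be tuned before the page is written out. The following is a minimal sketch, assuming a page object obtained from ph.createPage(); the helper name renderUrlToPdf and the A4 / 1cm values are illustrative choices, not part of the original example. It sets PhantomJS's paperSize page property and rejects when the page fails to load.

```javascript
// Sketch: render a URL to PDF with an explicit paper size.
// `page` is assumed to be a phantomjs-node page (from ph.createPage()).
async function renderUrlToPdf(page, url, outputPath) {
    // paperSize is a PhantomJS page property; the values here are examples.
    await page.property('paperSize', {
        format: 'A4',
        orientation: 'portrait',
        margin: '1cm'
    });
    // page.open resolves with 'success' or 'fail'
    const status = await page.open(url);
    if (status !== 'success') {
        throw new Error('Failed to load ' + url + ' (status: ' + status + ')');
    }
    await page.render(outputPath);
    return outputPath;
}
```

Taking the page as a parameter keeps the helper independent of how the PhantomJS instance is created and shut down.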

Other Popular Libraries

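Given the deprecation of PhantomJS, developers are encouraged to use more modern libraries such as Puppeteer, Playwright, and jsPDF. html-pdf is also covered below because it may still be encountered in existing projects, although it depends on PhantomJS itself. Each has its strengths and is suited for different scenarios.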

Puppeteer

Puppeteer is a Node library developed by Google that provides a high-level API to control headless Chrome or Firefox. It supports full web rendering, including JavaScript execution, making it ideal for complex pages.

Installation:

npm install puppeteer

Example code for generating PDF from a URL:

const puppeteer = require('puppeteer');

async function generatePDF(url, outputPath) {
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        // Wait for network activity to settle so late-loading content is rendered
        await page.goto(url, { waitUntil: 'networkidle0' });
        await page.pdf({ path: outputPath, format: 'A4' });
    } finally {
        // Close the browser even if navigation or rendering fails
        await browser.close();
    }
}

generatePDF('https://google.com', 'google.pdf')
    .then(() => console.log('PDF generated successfully'))
    .catch(err => console.error('Error generating PDF:', err));
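
Reports and invoices are often built as HTML strings rather than fetched from a URL. The sketch below shows one way this might look with Puppeteer's page.setContent; the helper name htmlToPdf and the margin values are assumptions made for illustration, and browser is expected to be an instance returned by puppeteer.launch().

```javascript
// Sketch: render an HTML string (not a URL) to PDF.
// `browser` is assumed to be a Puppeteer Browser, e.g. from puppeteer.launch().
async function htmlToPdf(browser, html, outputPath) {
    const page = await browser.newPage();
    try {
        // Wait until network activity settles so images and fonts are loaded.
        await page.setContent(html, { waitUntil: 'networkidle0' });
        // page.pdf resolves with the PDF bytes and also writes them to `path`.
        return await page.pdf({
            path: outputPath,
            format: 'A4',
            printBackground: true, // include CSS backgrounds
            margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' }
        });
    } finally {
        // Close only the page; the caller owns the browser.
        await page.close();
    }
}
```

A caller would launch the browser once, call htmlToPdf(browser, '<h1>Invoice</h1>', 'invoice.pdf'), and close the browser when finished.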

jsPDF

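jsPDF is a lightweight library that works in both Node.js and browser environments. It builds documents through drawing calls such as text() rather than by rendering a page in a browser engine, so it is best for simple, text-based PDFs and lacks advanced layout and CSS rendering capabilities.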

Installation:

npm install jspdf

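Example for custom text content (doc.text writes the string as-is and does not interpret HTML tags):

// jsPDF 2.x exposes the constructor as a named export
const { jsPDF } = require('jspdf');

function generatePDF(text, outputPath) {
    const doc = new jsPDF();
    // Place the text 10 units from the left and top edges (default unit: mm)
    doc.text(text, 10, 10);
    // In Node.js, save() writes the file to the given path
    doc.save(outputPath);
}

const text = 'Hello World. This is custom content.';
generatePDF(text, 'custom.pdf');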
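
Because doc.text draws its argument literally, any tags in the string would show up in the PDF as text. One possible workaround is to reduce the markup to plain text first. The helper below, htmlToPlainText, is a hypothetical and deliberately naive sketch: it is regex-based, handles only a handful of tags and entities, and is not a substitute for a real HTML parser.

```javascript
// Naive sketch: reduce simple HTML to an array of plain-text lines.
// Assumes trusted, simple markup; this is not a full HTML parser.
function htmlToPlainText(html) {
    return html
        .replace(/<(br|\/p|\/h[1-6]|\/div|\/li)\s*\/?>/gi, '\n') // block ends become newlines
        .replace(/<[^>]+>/g, '')                                   // drop remaining tags
        .replace(/&nbsp;/g, ' ')
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&amp;/g, '&')                                    // decode & last to avoid double-decoding
        .split('\n')
        .map(line => line.trim())
        .filter(line => line.length > 0);
}
```

The returned array can be passed straight to doc.text(lines, 10, 10), since doc.text accepts either a string or an array of strings and draws each array element on its own line.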

Playwright

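Playwright is similar to Puppeteer and can automate multiple browsers (Chromium, WebKit, Firefox). It excels in automation and high-fidelity PDF generation, but its page.pdf() method is supported only in Chromium, so PDF generation should use the Chromium engine.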

Installation:

npm install playwright

Example code:

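const playwright = require('playwright');

async function generatePDF(url, outputPath) {
    // page.pdf() is only supported in Chromium
    const browser = await playwright.chromium.launch();
    try {
        const page = await browser.newPage();
        await page.goto(url);
        await page.pdf({ path: outputPath });
    } finally {
        // Close the browser even if navigation or rendering fails
        await browser.close();
    }
}

generatePDF('https://google.com', 'output.pdf')
    .then(() => console.log('PDF generated successfully'))
    .catch(err => console.error('Error generating PDF:', err));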
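
Since Playwright documents page.pdf() as Chromium-only, code that accepts a configurable engine may want to fail early with a clear message. The following is a sketch under that assumption; the function name generatePdfWith is hypothetical, and browserType is expected to be one of playwright.chromium, playwright.firefox, or playwright.webkit.

```javascript
// Sketch: refuse to call page.pdf() on engines that do not support it.
// `browserType` is assumed to be playwright.chromium, .firefox, or .webkit.
async function generatePdfWith(browserType, url, outputPath) {
    if (browserType.name() !== 'chromium') {
        throw new Error('page.pdf() requires Chromium, got: ' + browserType.name());
    }
    const browser = await browserType.launch();
    try {
        const page = await browser.newPage();
        // 'networkidle' waits until network activity has settled
        await page.goto(url, { waitUntil: 'networkidle' });
        await page.pdf({ path: outputPath, format: 'A4' });
    } finally {
        await browser.close();
    }
}
```

The check runs before launch(), so an unsupported engine is rejected without starting a browser process.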

html-pdf

html-pdf is a Node.js library that internally uses PhantomJS. It is simple but deprecated and not recommended for new projects.

Installation:

npm install html-pdf

Example:

const pdf = require('html-pdf');
function generatePDF(htmlContent, outputPath) {
    pdf.create(htmlContent).toFile(outputPath, function(err, res) {
        if (err) return console.error('Error generating PDF:', err);
        console.log('PDF generated successfully:', res);
    });
}
const htmlContent = '<h1>Hello World</h1><p>This is custom HTML content.</p>';
generatePDF(htmlContent, 'custom.pdf');

Comparison and Recommendations

When selecting a library, consider rendering quality, performance, and community support. Puppeteer and Playwright suit complex web pages: they produce high-fidelity output, but each launches a full browser, which costs more memory and startup time. jsPDF is fast and lightweight and suits simple, text-based documents. html-pdf is outdated because it depends on PhantomJS. For most modern applications, Puppeteer or Playwright is recommended because both are actively developed and offer comprehensive features.
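
One common way to limit the resource cost of browser-based libraries is to launch the browser once and reuse it for many documents. The sketch below illustrates the idea; the function name renderMany and the shape of the jobs array are assumptions made for this example, and browser is expected to be a Puppeteer or Playwright Browser instance (both expose newPage, goto, pdf, and close with these names).

```javascript
// Sketch: reuse one browser for many PDFs instead of launching per document.
// `browser` is assumed to be a Puppeteer or Playwright Browser instance.
// `jobs` is an array of { url, outputPath } objects (hypothetical shape).
async function renderMany(browser, jobs) {
    const written = [];
    for (const { url, outputPath } of jobs) {
        const page = await browser.newPage();
        try {
            await page.goto(url);
            await page.pdf({ path: outputPath, format: 'A4' });
            written.push(outputPath);
        } finally {
            await page.close(); // free the tab before starting the next job
        }
    }
    return written;
}
```

Jobs run one at a time here to keep memory use predictable; the caller remains responsible for launching and closing the browser.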

Conclusion

Converting HTML to PDF in Node.js can be achieved efficiently with libraries like Puppeteer, Playwright, or jsPDF. While PhantomJS was a historical solution, migrating to newer tools provides better performance and support. Developers should weigh their specific needs, such as whether pages depend on JavaScript execution or how documents are templated, when choosing a library.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.