Technical Analysis of Formatting XML Output in PHP

Dec 06, 2025 · Programming

Keywords: PHP | XML | DOMDocument | formatting | UTF-8

Abstract: This article explores methods for outputting formatted XML using PHP's DOMDocument class, including setting the preserveWhiteSpace and formatOutput properties, and introduces alternative approaches such as the tidy extension, to aid developers in generating readable XML documents.

Introduction

In PHP development, handling XML data is a common task. A frequent question is how to output formatted XML, with indentation and line breaks, rather than a single compressed string, for example when inspecting the result in a browser. The DOMDocument class offers robust functionality, but its default output is not easy to read. This article covers the core methods for producing formatted output.

Basic Usage of DOMDocument

DOMDocument is provided by PHP's DOM extension, which is enabled by default, and is used to create and manipulate XML documents. First, instantiate a DOMDocument object, e.g., with new DOMDocument('1.0') to specify the XML version. Then build the document structure with methods such as createElement and appendChild. Without further configuration, saveXML() applies no formatting, so all elements are serialized on a single line.
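To make the default behavior concrete, here is a minimal sketch that builds a two-element document and serializes it without setting any formatting property. The element names and values are illustrative only.

```php
<?php
// Build a small document and serialize it with default settings.
$doc = new DOMDocument('1.0');

$root = $doc->createElement('root');
$doc->appendChild($root);

// The second argument of createElement() sets the element's text content.
$root->appendChild($doc->createElement('a', 'eee'));
$root->appendChild($doc->createElement('b', 'sd'));

// formatOutput defaults to false, so the children are not indented.
$unformatted = $doc->saveXML();
echo $unformatted;
```

The root element and its children come out on one line after the XML declaration, which is the readability problem the next section addresses.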

Setting Formatting Properties

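To achieve formatted output, DOMDocument provides two key properties: preserveWhiteSpace and formatOutput. Setting preserveWhiteSpace = false tells the parser to discard whitespace-only text nodes when a document is loaded. It matters mainly when reformatting existing XML, because leftover whitespace nodes prevent the serializer from re-indenting, and it has no effect on a document built from scratch. Setting formatOutput = true makes saveXML() and save() add indentation and line breaks for readability. Set preserveWhiteSpace before calling load() or loadXML(), and formatOutput before saving; the simplest habit is to set both immediately after creating the document.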
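The role of preserveWhiteSpace is easiest to see when reformatting XML that already exists as a string. The following is a minimal sketch under that assumption; the $input value is a hypothetical, irregularly indented document, not data from the original question.

```php
<?php
// Hypothetical input: XML with irregular, hand-written whitespace.
$input = "<root>\n<error><a>eee</a>\n      <b>sd</b></error>\n</root>";

$doc = new DOMDocument();
// Set before loadXML(): whitespace-only text nodes are dropped at parse time.
$doc->preserveWhiteSpace = false;
// Set before saveXML(): enables indentation and line breaks on output.
$doc->formatOutput = true;
$doc->loadXML($input);

$pretty = $doc->saveXML();
echo $pretty;
```

If preserveWhiteSpace were left at its default of true, the original whitespace nodes would be kept and the output would retain the irregular layout.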

Code Example

The following code demonstrates the complete process of formatting XML. It creates a DOMDocument instance, sets the formatting properties, and then constructs a simple XML structure.

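<?php
// Create the document and enable formatted output.
$doc = new DOMDocument('1.0');
$doc->preserveWhiteSpace = false; // only relevant when loading existing XML
$doc->formatOutput = true;        // indent and add line breaks on save

$root = $doc->createElement('root');
$doc->appendChild($root);

$signed_values = array('a' => 'eee', 'b' => 'sd', 'c' => 'df');

// The outer loop creates one <error> element per entry; each <error>
// then receives the full set of fields as child elements.
foreach ($signed_values as $key => $val) {
    $occ = $doc->createElement('error');
    $root->appendChild($occ);

    foreach ($signed_values as $fieldname => $fieldvalue) {
        $child = $doc->createElement($fieldname);
        $occ->appendChild($child);
        // Text nodes are escaped on output, so & and < in values are safe.
        $value = $doc->createTextNode($fieldvalue);
        $child->appendChild($value);
    }
}

$xml_string = $doc->saveXML();
echo $xml_string;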

Running this code outputs formatted XML with indentation and line breaks, for example:

<?xml version="1.0"?>
<root>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
</root>
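One caveat when viewing the result in a browser: if the XML string is echoed into an HTML page, the browser interprets the tags as markup and collapses whitespace, so the indentation can appear to be lost even though the string itself is formatted. A minimal sketch of one way to display it as text follows; xml_to_html_pre is a hypothetical helper name, not a PHP built-in. Sending the document with a Content-Type: text/xml header is another option.

```php
<?php
// Hypothetical helper: wrap serialized XML for display inside an HTML page.
function xml_to_html_pre(string $xml): string
{
    // htmlspecialchars() escapes <, >, & and quotes so the markup is shown
    // as text; <pre> preserves the indentation and line breaks.
    return '<pre>' . htmlspecialchars($xml, ENT_QUOTES, 'UTF-8') . '</pre>';
}

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;
$root = $doc->appendChild($doc->createElement('root'));
$root->appendChild($doc->createElement('a', 'eee'));

$html = xml_to_html_pre($doc->saveXML());
echo $html;
```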

Alternative Methods

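Beyond DOMDocument, the tidy extension can be used for XML formatting. For instance, the tidy_repair_string function can re-indent an XML string; it indents with spaces, and tab indentation is reportedly not supported. Code example:

$xml_string = tidy_repair_string($xml_string, ['input-xml' => 1, 'indent' => 1, 'wrap' => 0]);

Here input-xml tells tidy to parse the input as XML, indent enables indentation, and wrap set to 0 disables line wrapping. The tidy extension is not enabled by default, so check that it is available before relying on it.

For UTF-8, DOMDocument works with UTF-8 strings internally, so ensure input data is in UTF-8. To have the output declare its encoding and emit non-ASCII characters literally instead of as numeric character references, pass the encoding as the second constructor argument, e.g., new DOMDocument('1.0', 'UTF-8').

As a supplement, SimpleXML objects can also be formatted by converting them to DOM with dom_import_simplexml, at the cost of an extra conversion step.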
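The SimpleXML route can be sketched as follows. The helper name simplexml_pretty is hypothetical, and copying the element into a fresh UTF-8 document is one possible design, chosen here so that the original SimpleXML object is left untouched.

```php
<?php
// Hypothetical helper (not part of PHP): pretty-print a SimpleXMLElement
// by copying its underlying node into a formatting-enabled DOMDocument.
function simplexml_pretty(SimpleXMLElement $xml): string
{
    // dom_import_simplexml() returns the DOM node backing the element.
    $node = dom_import_simplexml($xml);

    // Declaring UTF-8 keeps non-ASCII characters literal in the output.
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    // importNode(..., true) deep-copies the element into the new document.
    $doc->appendChild($doc->importNode($node, true));

    return $doc->saveXML();
}

// "\u{00E9}" is the UTF-8 character e-acute, used to exercise the encoding.
$sx = new SimpleXMLElement("<root><error><a>eee</a><b>caf\u{00E9}</b></error></root>");
$pretty = simplexml_pretty($sx);
echo $pretty;
```

The deep copy is the extra cost mentioned above; for large documents, building the structure directly with DOMDocument avoids it.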

Conclusion

By setting the preserveWhiteSpace and formatOutput properties, developers can easily output formatted XML that is easier to read and debug. The DOMDocument approach is efficient and standard, while extensions like tidy offer more customization. In practice, prefer DOMDocument's built-in features and reach for the alternatives only when they are needed.
