Research on Filename Parameter Encoding in HTTP Content-Disposition Header

Nov 15, 2025 · Programming

Keywords: HTTP | Content-Disposition | Filename Encoding | RFC 5987 | Browser Compatibility

Abstract: This article examines the encoding challenges of the filename parameter in the HTTP Content-Disposition header. Starting from RFC 2183's restriction to the US-ASCII character set, it analyzes the UTF-8 encoding scheme defined in RFC 5987 and how its implementation varies across major browsers. Encoding examples and browser-specific workarounds illustrate practical strategies for correctly handling download filenames that contain non-ASCII characters.

Introduction

In modern web applications, forcing resource downloads instead of inline display is a common requirement, achieved through the Content-Disposition header in HTTP responses. The filename parameter of this header suggests a name for the file when downloaded by the browser. However, RFC 2183 explicitly restricts this parameter to the US-ASCII character set, creating significant limitations in practical applications.

RFC Specification Evolution

The original RFC 2183 states that filename parameters should follow RFC 2045 syntax, limited to US-ASCII characters. While the document acknowledges the need for arbitrary character set support, it does not define specific mechanisms.

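RFC 2184, later replaced by RFC 2231, defined a more comprehensive parameter encoding mechanism for MIME messages. RFC 5987 then adapted that mechanism to HTTP header field parameters, specifying how a parameter declares its character set and language, and RFC 6266 applied it to the Content-Disposition header in HTTP. Together they form the basis for handling non-ASCII filenames. RFC 5987 has since been obsoleted by RFC 8187, which retains the same encoding syntax.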

RFC 5987 Encoding Scheme

RFC 5987 defines the encoding used by the filename* parameter, which carries a UTF-8 filename in the following syntax:

Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt

Here, UTF-8 names the character set, the language tag sits between the two single quotes (it may be empty, as it is here), and the remainder is the percent-encoded UTF-8 byte sequence of the filename. This format can represent any Unicode character, such as the "ï" (U+00EF) that is the third character of "Naïve" in the example and is encoded as %C3%AF.
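As an illustration, the encoding step can be sketched in C# as follows. The names ExtValueSketch and EncodeExtValue are hypothetical and not part of any library; the set of characters left unescaped follows the attr-char rule of RFC 5987, and the language tag is assumed to be empty.

```csharp
using System;
using System.Text;

public static class ExtValueSketch
{
    // Symbols that RFC 5987 (attr-char) allows unescaped, in addition to ASCII letters and digits.
    private const string AttrCharSymbols = "!#$&+-.^_`|~";

    // Hypothetical helper: builds the value that follows "filename*=".
    public static string EncodeExtValue(string fileName)
    {
        // Charset, then an empty language tag between the two single quotes.
        var sb = new StringBuilder("UTF-8''");

        // Work on the UTF-8 bytes so that every non-ASCII character becomes one or more %XX escapes.
        foreach (byte b in Encoding.UTF8.GetBytes(fileName))
        {
            char c = (char)b;
            bool isAsciiLetterOrDigit =
                (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');

            if (isAsciiLetterOrDigit || AttrCharSymbols.IndexOf(c) >= 0)
                sb.Append(c);
            else
                sb.Append('%').Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

Under these assumptions, ExtValueSketch.EncodeExtValue("Naïve file.txt") returns UTF-8''Na%C3%AFve%20file.txt, which matches the header shown above.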

Browser Compatibility Analysis

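Although RFC 5987 provides a standard scheme, browser implementations vary. The code samples later in this article treat Internet Explorer 7 and 8, Safari, and Android's built-in download manager as special cases.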

For maximum compatibility, it is recommended to provide both filename and filename* parameters:

Content-Disposition: attachment; filename="naive file.txt"; filename*=UTF-8''Na%C3%AFve%20file.txt

Browsers that do not support RFC 5987 ignore filename* and use the ASCII approximation in filename as a fallback, while browsers that support it should prefer filename*.
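A minimal sketch of building such a combined header value in C# is shown below. ContentDispositionSketch, ToAsciiFallback, and BuildHeaderValue are hypothetical names introduced for illustration. The fallback is derived by stripping diacritics and replacing any remaining non-ASCII character with an underscore, which is one possible approximation rather than a prescribed one. The sketch also assumes a current .NET version, where Uri.EscapeDataString leaves only letters, digits, "-", ".", "_", and "~" unescaped.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ContentDispositionSketch
{
    // Hypothetical helper: approximates a name in US-ASCII for the plain filename parameter.
    public static string ToAsciiFallback(string fileName)
    {
        var sb = new StringBuilder();

        // Decompose accented letters (e.g. "ï" becomes "i" plus a combining mark),
        // then drop the combining marks.
        foreach (char c in fileName.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            // Replace anything still outside printable ASCII, plus the two characters
            // that would need escaping inside a quoted-string.
            bool safe = c >= ' ' && c <= '~' && c != '"' && c != '\\';
            sb.Append(safe ? c : '_');
        }
        return sb.ToString();
    }

    // Hypothetical helper: ASCII fallback in filename, full name in filename*.
    public static string BuildHeaderValue(string fileName)
    {
        return "attachment; filename=\"" + ToAsciiFallback(fileName) + "\"; filename*=UTF-8''"
            + Uri.EscapeDataString(fileName);
    }
}
```

For the name "Naïve file.txt", this sketch produces attachment; filename="Naive file.txt"; filename*=UTF-8''Na%C3%AFve%20file.txt.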

Practical Encoding Examples

The following C# code demonstrates how to dynamically generate appropriate Content-Disposition headers based on browser type:

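// Request and Response are the ASP.NET (System.Web) request and response objects; fileName is the desired download name.
string encodedName = Uri.EscapeDataString(fileName); // percent-encoded UTF-8 bytes
string quotedName = fileName.Replace("\\", "\\\\").Replace("\"", "\\\""); // escaped for use inside a quoted-string
string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    // IE 7 and 8 do not understand filename*, but they decode a percent-encoded filename.
    contentDisposition = "attachment; filename=" + encodedName;
else if (Request.Browser.Browser == "Safari")
    // Safari receives the raw filename; quoting keeps names with spaces intact.
    contentDisposition = "attachment; filename=\"" + quotedName + "\"";
else
    // All other browsers: quoted filename as a fallback plus the RFC 5987 filename* parameter.
    contentDisposition = "attachment; filename=\"" + quotedName + "\"; filename*=UTF-8''" + encodedName;
Response.AddHeader("Content-Disposition", contentDisposition);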

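This code sends a percent-encoded filename to IE 7 and 8, the raw filename to Safari, and both filename and filename* to all other browsers. Because the branches depend on user-agent detection, the result is only as reliable as that detection.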

Special Handling for Android Devices

Android's built-in download manager has limitations in parsing filenames, requiring additional processing:

// Requires: using System.Collections.Generic;
// Characters treated as safe for Android's download manager; note that the space is not among them.
private static readonly HashSet<char> AndroidAllowedChars = new HashSet<char>("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-+,@£$€!½§~'=()[]{}0123456789");

// Replaces every character outside the allowed set with an underscore.
private static string MakeAndroidSafeFileName(string fileName)
{
    char[] newFileName = fileName.ToCharArray();
    for (int i = 0; i < newFileName.Length; i++)
    {
        if (!AndroidAllowedChars.Contains(newFileName[i]))
            newFileName[i] = '_';
    }
    return new string(newFileName);
}

This function replaces unsupported characters with underscores, ensuring compatibility with Android devices.

Alternative Approaches

Beyond header encoding, filenames can be implied through URL paths:

/download_script.php/na%C3%AFve_file.txt

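Browsers typically use the last segment of the URL path as the default filename, so no filename parameter is needed; a bare Content-Disposition: attachment header can still be sent to force a download. This method offers excellent compatibility, but the server must route the extra path segment to the script, and URL rewriting is required if the actual script path should be hidden.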
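To make the idea concrete, the following C# sketch derives a filename from the last path segment of a URL, roughly mirroring what a browser does when no filename parameter is present. PathFileNameSketch and FileNameFromUrl are hypothetical names, the host example.com is an assumption, and actual browser logic may differ in detail.

```csharp
using System;

public static class PathFileNameSketch
{
    // Hypothetical helper: returns the decoded last path segment of an absolute URL.
    public static string FileNameFromUrl(string url)
    {
        // AbsolutePath is still percent-encoded and excludes the query string and fragment.
        string path = new Uri(url).AbsolutePath;

        // Everything after the final slash is the candidate filename.
        string lastSegment = path.Substring(path.LastIndexOf('/') + 1);

        // Decode the %XX escapes as UTF-8.
        return Uri.UnescapeDataString(lastSegment);
    }
}
```

For example, FileNameFromUrl("http://example.com/download_script.php/na%C3%AFve_file.txt") returns "naïve_file.txt".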

Conclusion

Filename encoding in the HTTP Content-Disposition header is a complex yet important issue. RFC 5987 provides a standard scheme for UTF-8 filenames, but differences in browser support call for a progressive enhancement strategy. Combining the filename and filename* parameters, and applying special handling for specific environments such as Android, yields correct filenames across most platforms. As legacy browsers fall out of use and the standards evolve, these workarounds can be simplified.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.