Keywords: HTTP | Content-Type | Media Types | Validation | IANA
Abstract: This article provides an in-depth exploration of the HTTP Content-Type header field, covering its complete value range, syntax structure, practical application scenarios, and validation methods. Based on the IANA official media type registry, it systematically categorizes and introduces major media types including application, audio, image, multipart, text, video, and vnd, encompassing various content types from common application/json to complex multipart/form-data. The article also offers practical content type validation strategies, including regular expression validation, whitelist mechanisms, and server-side validation best practices, assisting developers in correctly setting and validating Content-Type headers in HTTP requests.
Overview of Content-Type Header Field
The HTTP Content-Type header field indicates the original media type of a resource or message body, that is, its type before any content encoding (such as gzip) is applied. It matters in both requests and responses: in responses, it tells the client the media type of the returned data; in requests (typically POST and PUT), it tells the server the type of the content being sent.
Media Type Classification and Complete Value Range
According to the IANA (Internet Assigned Numbers Authority) official media type registry, Content-Type values can be categorized into several major classes, each containing specific subsets of media types.
Application Types
Application types cover various application-specific data formats, including:
application/java-archive
application/EDI-X12
application/EDIFACT
application/javascript (obsolete)
application/octet-stream
application/ogg
application/pdf
application/xhtml+xml
application/x-shockwave-flash
application/json
application/ld+json
application/xml
application/zip
application/x-www-form-urlencoded
Audio Types
Audio media types are used for various audio formats:
audio/mpeg
audio/x-ms-wma
audio/vnd.rn-realaudio
audio/x-wav
Image Types
Image media types support multiple image formats:
image/gif
image/jpeg
image/png
image/tiff
image/vnd.microsoft.icon
image/x-icon
image/vnd.djvu
image/svg+xml
Multipart Types
Multipart media types are used for messages containing multiple parts:
multipart/mixed
multipart/alternative
multipart/related (used by MHTML for HTML mail)
multipart/form-data
Text Types
Text media types handle various text formats:
text/css
text/csv
text/html
text/javascript
text/plain
text/xml
Video Types
Video media types cover mainstream video formats:
video/mpeg
video/mp4
video/quicktime
video/x-ms-wmv
video/x-msvideo
video/x-flv
video/webm
VND Types
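Vendor-tree media types (subtypes prefixed with vnd.) identify formats tied to a particular vendor or product. They are not a separate top-level type; the ones listed here all live under application. The list also includes application/msword, a legacy registration without the vnd. prefix: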
application/vnd.android.package-archive
application/vnd.oasis.opendocument.text
application/vnd.oasis.opendocument.spreadsheet
application/vnd.oasis.opendocument.presentation
application/vnd.oasis.opendocument.graphics
application/vnd.ms-excel
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.ms-powerpoint
application/vnd.openxmlformats-officedocument.presentationml.presentation
application/msword
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/vnd.mozilla.xul+xml
Content-Type Syntax Structure
The basic syntax of the Content-Type header is: Content-Type: <media-type>, where the media type can include optional parameters. Main parameters include:
charset Parameter
The charset parameter indicates the character encoding standard, with case-insensitive values but lowercase preferred. For example: Content-Type: text/html; charset=utf-8
boundary Parameter
The boundary parameter is required for multipart entities, used to demarcate boundaries between multiple parts of a message. The boundary value consists of 1 to 70 characters, cannot end with whitespace, and typically uses character sequences robust across different systems. For example: Content-Type: multipart/form-data; boundary=ExampleBoundaryString
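The syntax described above can be illustrated with a small parser. This is a minimal sketch, not a full implementation of the HTTP grammar: parseContentType is a hypothetical helper, and it assumes that quoted parameter values do not themselves contain a semicolon.

```javascript
// Split a Content-Type header value into its media type and parameters.
// Simplification: quoted parameter values containing ";" are not handled.
function parseContentType(headerValue) {
  const [essence, ...rawParams] = headerValue.split(';');
  // Type and subtype are case-insensitive, so normalize to lowercase.
  const [type, subtype] = essence.trim().toLowerCase().split('/');
  const parameters = {};
  for (const raw of rawParams) {
    const eq = raw.indexOf('=');
    if (eq === -1) continue; // ignore malformed parameters
    const name = raw.slice(0, eq).trim().toLowerCase();
    let value = raw.slice(eq + 1).trim();
    // Strip surrounding double quotes from quoted-string values.
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      value = value.slice(1, -1);
    }
    parameters[name] = value;
  }
  return { type, subtype, parameters };
}

const parsed = parseContentType('Text/HTML; Charset="utf-8"');
console.log(parsed.type, parsed.subtype, parsed.parameters.charset);
// prints: text html utf-8
```

Parameter values such as the boundary are left in their original case, since a boundary string must match the delimiters in the body exactly.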
Practical Application Scenarios
HTML Form Submission
In HTML form submissions, Content-Type is specified by the form element's enctype attribute:
<form action="/foo" method="post" enctype="multipart/form-data">
  <input type="text" name="description" value="Description input value" />
  <input type="file" name="myFile" />
  <button type="submit">Submit</button>
</form>
Corresponding HTTP request example:
POST /foo HTTP/1.1
Content-Length: 68137
Content-Type: multipart/form-data; boundary=ExampleBoundaryString

--ExampleBoundaryString
Content-Disposition: form-data; name="description"

Description input value
--ExampleBoundaryString
Content-Disposition: form-data; name="myFile"; filename="foo.txt"
Content-Type: text/plain

[content of the file foo.txt chosen by the user]
--ExampleBoundaryString--
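To make the wire format more concrete, the following sketch assembles a multipart/form-data body by hand for text fields only. buildMultipartBody is a hypothetical helper written for illustration; it assumes field names contain no double quotes or line breaks, and browsers and HTTP clients normally generate this body (and a random boundary) for you.

```javascript
// Assemble a multipart/form-data body from plain text fields.
// Illustrative only: no file parts, and field names are not escaped.
function buildMultipartBody(fields, boundary) {
  const CRLF = '\r\n';
  let body = '';
  for (const [name, value] of Object.entries(fields)) {
    body += `--${boundary}${CRLF}`; // each part starts with "--" + boundary
    body += `Content-Disposition: form-data; name="${name}"${CRLF}`;
    body += CRLF; // blank line separates part headers from part content
    body += `${value}${CRLF}`;
  }
  body += `--${boundary}--${CRLF}`; // closing delimiter carries a trailing "--"
  return body;
}

const boundary = 'ExampleBoundaryString';
const body = buildMultipartBody({ description: 'Description input value' }, boundary);
const headers = {
  // The boundary in the header must match the delimiters used in the body.
  'Content-Type': `multipart/form-data; boundary=${boundary}`,
  'Content-Length': String(new TextEncoder().encode(body).length),
};
console.log(headers['Content-Type']);
console.log(body);
```

The boundary string must not occur inside any part's content, which is why real clients pick long random values.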
URL-Encoded Forms
For simple forms without file uploads, use URL-encoded format:
POST /submit HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 14

comment=Hello!
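A URL-encoded body can be produced with the standard URLSearchParams class. The sketch below is one possible way to build the body and matching headers; buildUrlEncodedRequest is a hypothetical helper name, and the field values are made up for illustration.

```javascript
// URLSearchParams serializes key/value pairs in application/x-www-form-urlencoded form.
function buildUrlEncodedRequest(fields) {
  const body = new URLSearchParams(fields).toString();
  return {
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      // Content-Length counts bytes, not characters.
      'Content-Length': String(new TextEncoder().encode(body).length),
    },
    body,
  };
}

const request = buildUrlEncodedRequest({ comment: 'Hello world', lang: 'en' });
console.log(request.body); // comment=Hello+world&lang=en
console.log(request.headers['Content-Length']); // 27
```

Spaces become "+" and reserved characters are percent-encoded, so the encoded body can be longer than the raw field values.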
REST API JSON Interaction
In REST APIs, application/json is commonly used as the content type:
HTTP/1.1 201 Created
Content-Type: application/json

{
  "message": "New user created",
  "user": {
    "id": 123,
    "firstName": "Paul",
    "lastName": "Klee",
    "email": "p.klee@example.com"
  }
}
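On the receiving side, the Content-Type value decides how a body should be decoded. The following sketch shows one way to dispatch on the media type; parseBody is a hypothetical helper that handles only three types and treats everything else as an error.

```javascript
// Decode a request or response body according to its Content-Type.
// parseBody is a hypothetical helper; only three media types are handled.
function parseBody(contentType, rawBody) {
  // Ignore parameters such as charset and compare case-insensitively.
  const mediaType = contentType.split(';')[0].trim().toLowerCase();
  switch (mediaType) {
    case 'application/json':
      return JSON.parse(rawBody);
    case 'application/x-www-form-urlencoded':
      return Object.fromEntries(new URLSearchParams(rawBody));
    case 'text/plain':
      return rawBody;
    default:
      throw new Error(`Unsupported media type: ${mediaType}`);
  }
}

const user = parseBody('application/json; charset=utf-8', '{"id": 123, "firstName": "Paul"}');
console.log(user.firstName); // Paul
const form = parseBody('application/x-www-form-urlencoded', 'comment=Hello!');
console.log(form.comment); // Hello!
```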
Content Type Validation Strategies
Official Registry Reference
The IANA-maintained official media type registry (http://www.iana.org/assignments/media-types/media-types.xhtml) provides the most authoritative reference for media types. Developers should regularly consult this list to ensure usage of the latest standard types.
Regular Expression Validation
Use regular expressions to validate Content-Type format legality:
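function isValidContentType(contentType) {
  if (typeof contentType !== 'string') return false; // header may be absent
  // type/subtype, then optional ";name=value" parameters whose values may be quoted
  const pattern = /^[a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+(\s*;\s*[a-z0-9!#$&^_.+-]+=([a-z0-9!#$&^_.+-]+|"[^"]*"))*$/i;
  return pattern.test(contentType.trim());
}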
Whitelist Mechanism
Establish an allowed content type whitelist based on application requirements:
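// Media types this application accepts (lowercase, without parameters)
const allowedContentTypes = [
  'application/json',
  'application/xml',
  'text/plain',
  'text/html',
  'multipart/form-data',
  'application/x-www-form-urlencoded'
];

function isContentTypeAllowed(contentType) {
  // A missing header is never allowed
  if (typeof contentType !== 'string') return false;
  // Media types are case-insensitive; compare without parameters
  const mediaType = contentType.split(';')[0].trim().toLowerCase();
  return allowedContentTypes.includes(mediaType);
}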
Server-Side Validation
Servers should validate the content type strictly and return a 415 (Unsupported Media Type) status code for types they do not support:
if (!isContentTypeAllowed(req.headers['content-type'])) {
  return res.status(415).json({
    error: 'Unsupported Media Type',
    message: 'The requested content type is not supported'
  });
}
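The same check can be written without tying it to a particular web framework. The sketch below returns a plain result object that a handler could translate into a response; validateContentType and ALLOWED are illustrative names, and treating a missing header as unsupported is a design choice here rather than a protocol requirement.

```javascript
// Framework-independent check: returns what a handler could send back.
// validateContentType and ALLOWED are illustrative names, not a library API.
const ALLOWED = new Set(['application/json', 'application/x-www-form-urlencoded']);

function validateContentType(headers) {
  const raw = headers['content-type'];
  // Assumption: a request with a body but no Content-Type is rejected.
  if (typeof raw !== 'string' || raw.trim() === '') {
    return { ok: false, status: 415, message: 'Content-Type header is missing' };
  }
  const mediaType = raw.split(';')[0].trim().toLowerCase();
  if (!ALLOWED.has(mediaType)) {
    return { ok: false, status: 415, message: `Unsupported media type: ${mediaType}` };
  }
  return { ok: true, mediaType };
}

console.log(validateContentType({ 'content-type': 'application/json; charset=utf-8' }).ok); // true
console.log(validateContentType({ 'content-type': 'image/png' }).status); // 415
console.log(validateContentType({}).status); // 415
```

Keeping the decision separate from the response object makes the rule easy to unit test and to reuse across handlers.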
Security Considerations and Best Practices
MIME Sniffing Protection
To prevent browsers from performing MIME sniffing, set the X-Content-Type-Options header:
X-Content-Type-Options: nosniff
CORS Security Considerations
Content-Type is both a CORS-safelisted response header and a CORS-safelisted request header. As a request header it is safelisted only if its value contains no CORS-unsafe request-header bytes and its parsed media type (ignoring parameters) is application/x-www-form-urlencoded, multipart/form-data, or text/plain. Any other value, such as application/json, causes the browser to send a preflight request before a cross-origin request.
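The rule can be approximated in code to predict whether a given Content-Type value would trigger a preflight. This is a rough sketch based on the Fetch standard's definition as I understand it (unsafe bytes, a 128-byte length limit, and the three permitted media types); the function names are hypothetical, and the media type parsing is deliberately simpler than what browsers do.

```javascript
// Bytes the Fetch standard treats as CORS-unsafe in request header values:
// control characters other than tab, DEL, and the characters "():<>?@[\]{}
function hasCorsUnsafeByte(value) {
  for (const byte of new TextEncoder().encode(value)) {
    if ((byte < 0x20 && byte !== 0x09) || byte === 0x7f) return true;
    if ('"():<>?@[\\]{}'.includes(String.fromCharCode(byte))) return true;
  }
  return false;
}

const SIMPLE_TYPES = [
  'application/x-www-form-urlencoded',
  'multipart/form-data',
  'text/plain',
];

// Approximate check: true means no preflight is needed because of this header.
function isCorsSafelistedContentType(value) {
  if (value.length > 128 || hasCorsUnsafeByte(value)) return false;
  const essence = value.split(';')[0].trim().toLowerCase();
  return SIMPLE_TYPES.includes(essence);
}

console.log(isCorsSafelistedContentType('text/plain; charset=utf-8'));   // true
console.log(isCorsSafelistedContentType('application/json'));            // false
console.log(isCorsSafelistedContentType('text/plain; charset="utf-8"')); // false
```

The last call returns false because the double quote is one of the unsafe bytes, even though the media type itself is permitted.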
Character Encoding Standards
For text-type content, always explicitly specify the charset parameter, prioritizing UTF-8 encoding to ensure proper handling of international characters.
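The effect of the charset parameter can be seen when decoding raw bytes. The sketch below reads the charset from a Content-Type value and falls back to UTF-8 when none is given; charsetOf and decodeText are hypothetical helpers, and the fallback is an assumption of this example rather than a rule that applies to every media type.

```javascript
// Pick the charset from a Content-Type value, falling back to UTF-8.
function charsetOf(contentType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
  return match ? match[1].toLowerCase() : 'utf-8';
}

// Decode body bytes using the declared charset.
// TextDecoder throws a RangeError for labels it does not recognize.
function decodeText(contentType, bytes) {
  return new TextDecoder(charsetOf(contentType)).decode(bytes);
}

const bytes = new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9]); // "café" encoded as UTF-8
console.log(decodeText('text/plain; charset=UTF-8', bytes)); // café
```

Decoding the same bytes with a different charset would produce different characters, which is why the sender should state the encoding explicitly.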
Browser Compatibility
The Content-Type header is well-supported across all major browsers, including Google Chrome, Mozilla Firefox, Apple Safari, Microsoft Edge, and Opera. This feature has been widely available in browsers since July 2015.
Conclusion
Proper understanding and application of the HTTP Content-Type header is crucial for building robust web applications. By combining the official media type registry, implementing strict validation mechanisms, and following security best practices, developers can ensure the reliability and security of HTTP communications. In practical development, it's recommended to select appropriate content types based on specific application scenarios and establish corresponding validation processes to prevent potential security risks.