Comprehensive Guide to Resolving UTF-8 Encoding Issues in Spring MVC

Nov 28, 2025 · Programming

Keywords: Spring MVC | UTF-8 Encoding | Maven Configuration | Character Encoding Filter | Internationalization

Abstract: This article analyzes UTF-8 character encoding problems in Spring MVC applications, with particular focus on the role of the Maven build configuration. Working from a Q&A case, it walks through the relevant settings at each level: the Maven source encoding, CharacterEncodingFilter, JSP page encoding, and server-side URI encoding. Alongside the configuration snippets, it explains the underlying encoding mechanics so that developers can diagnose and resolve garbled international characters in Spring MVC.

Problem Background and Phenomenon Analysis

UTF-8 character encoding issues are a common problem in Spring MVC application development. In the Q&A case examined here, the UTF-8 string "ölm" set in the controller is displayed as "√∂lm" on the JSP page, while an "ö" hardcoded in the JSP itself displays correctly.

This discrepancy shows that the page encoding settings are not at fault: the JSP renders its own "ö" correctly, so the controller string must already be corrupted before it reaches the view. Notably, replacing the literal with the Unicode escape "\u00f6lm" made the problem disappear. An escape sequence consists only of ASCII characters, so it is unaffected by whatever encoding the compiler assumes for the source file. This points to the compilation step of the build as the root cause.

Root Cause Investigation

The core issue is the encoding used by the Maven build. Unless configured otherwise, Maven compiles Java source files using the platform default encoding (MacRoman on the Mac OS X system in this case). When a source file saved as UTF-8 contains non-ASCII characters (such as the German umlaut "ö") and the build assumes a different encoding, the characters are misread during compilation.

In UTF-8, "ö" is encoded as the two-byte sequence C3 B6. A compiler that assumes MacRoman reads those bytes as two separate characters, "√" (C3) and "∂" (B6), which is exactly the "√∂" seen on the page. The misread characters are written into the compiled class file, so the text is already garbled at runtime, regardless of how the page or response is configured.
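The byte-level mismatch described above can be reproduced outside of Maven. The following is a minimal sketch, not part of the original Q&A: the class and method names are hypothetical, and it assumes the JVM ships the "x-MacRoman" charset (most full JDKs do, but it is not guaranteed). The string literal uses a Unicode escape so that the example itself does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode the text as UTF-8 (how the editor saved the file), then decode
    // the same bytes with the charset the compiler wrongly assumed.
    public static String misdecode(String text, Charset assumed) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, assumed);
    }

    public static void main(String[] args) {
        // "\u00f6" is the letter o with diaeresis; escaped to keep this file pure ASCII.
        String original = "\u00f6lm";

        // ISO-8859-1 is guaranteed to exist in every JVM.
        System.out.println(misdecode(original, StandardCharsets.ISO_8859_1));

        // Assumption: "x-MacRoman" is present in this JDK; skip the line if it is not.
        if (Charset.isSupported("x-MacRoman")) {
            System.out.println(misdecode(original, Charset.forName("x-MacRoman")));
        }
    }
}
```

With MacRoman as the assumed charset, the two UTF-8 bytes of "ö" come back as the two characters U+221A and U+2202, matching the garbled output in the Q&A case. How the result looks when printed depends on the console encoding.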

Solution Implementation

To resolve the issue reliably, the encoding should be configured consistently at several levels:

Maven Build Configuration

Explicitly specify source file encoding as UTF-8 in the project's pom.xml file:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

With this property set, Maven reads Java source files as UTF-8 regardless of the platform default, which prevents characters from being misread during compilation. It is also recommended to set the project encoding to UTF-8 in the IDE so that files are saved in the same encoding the build expects.
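To see which encoding a build would fall back on when no source encoding is configured, it can help to print the defaults of the JVM in use. This is a small diagnostic sketch with a hypothetical class name. It assumes that the unconfigured build follows the JVM defaults reported here, and it relies on the `native.encoding` system property, which is only available on JDK 17 and later (older JVMs report null for it).

```java
import java.nio.charset.Charset;

public class DefaultEncodingReport {

    // Collect the encoding-related defaults of the running JVM in one line.
    public static String describe() {
        String defaultCharset = Charset.defaultCharset().name();
        String fileEncoding = System.getProperty("file.encoding");
        // Assumption: JDK 17+; on older JVMs this property is absent (null).
        String nativeEncoding = System.getProperty("native.encoding");
        return "default charset = " + defaultCharset
                + ", file.encoding = " + fileEncoding
                + ", native.encoding = " + nativeEncoding;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported default is anything other than UTF-8 while the source files are saved as UTF-8, the explicit `project.build.sourceEncoding` property is what keeps the build independent of the machine it runs on.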

Spring Framework Configuration

While the Maven encoding setting fixes the root cause in this case, character encoding filtering at the Spring level is still needed for request data. Configure CharacterEncodingFilter in web.xml:

<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>encodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

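This filter sets the request character encoding to UTF-8 and, with forceEncoding enabled, applies it to the response as well, overriding any encoding already declared. This matters most for form submissions and AJAX requests, where parameters arrive in the request body. Because the request encoding must be set before any parameter is read, the filter should be mapped ahead of other filters that access request parameters.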

Server-Side Configuration

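For Servlet containers like Tomcat, the encoding used to decode the request URI is configured on the Connector in server.xml. Tomcat 7 and earlier default to ISO-8859-1, so the attribute is required there; Tomcat 8 and later default to UTF-8, where setting it mainly makes the intent explicit: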

<Connector port="8080" protocol="HTTP/1.1"
           URIEncoding="UTF-8"
           ... />

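This configuration ensures that percent-encoded non-ASCII characters in URLs and query strings are decoded as UTF-8, preventing garbled text in path and GET parameters. It complements CharacterEncodingFilter, which applies to the request body.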
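The effect of the URI encoding setting can be illustrated with the JDK's own URL decoder. This is a sketch of the decoding mismatch only, not of Tomcat's internals; the class name and the sample query string are hypothetical, and the `URLDecoder.decode(String, Charset)` overload requires Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UriDecodingDemo {

    // Decode a percent-encoded string using the given charset.
    public static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        // A browser sending "ölm" as UTF-8 percent-encodes it as %C3%B6lm.
        String query = "name=%C3%B6lm";

        // Decoded with the matching charset, the original text is recovered.
        System.out.println(decode(query, StandardCharsets.UTF_8));

        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(decode(query, StandardCharsets.ISO_8859_1));
    }
}
```

The second call yields two characters (U+00C3 and U+00B6) in place of "ö", which is the kind of garbling a container produces when its URI encoding does not match what the client sent.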

JSP Page Optimization

The JSP pages in the Q&A case already declare their encoding correctly, but for completeness each page should carry the full set of declarations:

<%@ page language="java" pageEncoding="UTF-8"%>
<%@ page contentType="text/html;charset=UTF-8" %>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

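Each declaration addresses a different consumer: pageEncoding tells the JSP engine how to read the JSP file itself, contentType sets the charset of the response sent by the Servlet container, and the meta tag gives the browser a fallback hint. Together they ensure that all three treat the page content as UTF-8.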

Development Environment Configuration Recommendations

Based on the development environment mentioned in the Q&A (Mac OS X + SpringSource Tool Suite), the following configurations are recommended:

In Eclipse/STS, set text file encoding to UTF-8 through Window > Preferences > General > Workspace. For specific projects, also specify UTF-8 encoding in the project properties' Resource settings. This IDE-level configuration, combined with Maven build configuration, forms a comprehensive character encoding solution for the development environment.

Testing and Verification Methods

To verify the encoding configuration, create a test controller that exposes strings from several languages and scripts:

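import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.servlet.ModelAndView;

@Controller
public class EncodingTestController {

    // Renders the "encoding-test" view with sample strings from several scripts.
    // The literals are deliberately not escaped, so they exercise the source encoding.
    @RequestMapping("/encoding-test")
    public ModelAndView testEncoding() {
        ModelAndView mav = new ModelAndView("encoding-test");
        mav.addObject("german", "äöüß");
        mav.addObject("french", "éèêë");
        mav.addObject("chinese", "中文测试");
        mav.addObject("japanese", "日本語テスト");
        return mav;
    }
}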

If all four strings render correctly in the browser, the source, build, and response encodings are consistent. If any of them is garbled, the pattern of the garbling helps identify which stage used the wrong encoding.
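When the output is garbled, it is useful to know whether the string was already wrong inside the JVM or was only corrupted on its way to the browser. One way to check, sketched below with a hypothetical helper class, is to log the string as a list of Unicode code points. The output is pure ASCII, so it cannot itself be distorted by log or console encoding.

```java
import java.util.stream.Collectors;

public class CodePointDump {

    // Render each code point of the string as U+XXXX, separated by spaces.
    public static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Correctly compiled "ölm" (escaped here to keep this file pure ASCII).
        System.out.println(dump("\u00f6lm"));

        // The same text after a UTF-8 source was compiled as MacRoman.
        System.out.println(dump("\u221a\u2202lm"));
    }
}
```

For a correctly compiled "ölm" the dump is `U+00F6 U+006C U+006D`. If a controller string logs as `U+221A U+2202 U+006C U+006D` instead, the corruption happened at compile time and the build encoding is the setting to fix; if the dump is correct but the page is not, the response or page encoding is the more likely culprit.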

Conclusion and Best Practices

UTF-8 encoding issues in Spring MVC are a typical "end-to-end" consistency problem. Text passes through the editor, the build, the framework, the server, and the browser, and a single stage that assumes a different encoding is enough to corrupt it. The core principle is to keep characters encoded as UTF-8 throughout the entire path from source code to final display.

Best practices include: explicitly configuring the Maven build encoding at project inception, setting up the character encoding filter in web.xml, configuring URI encoding on the server container, and unifying encoding settings across the development environment. Applied together, these settings remove the most common sources of character corruption in Spring MVC applications and provide a solid foundation for internationalized application development.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.