Keywords: Gson | MalformedJsonException | JSON Parsing
Abstract: This paper provides an in-depth analysis of the MalformedJsonException thrown by the Gson library during JSON string parsing, focusing on the strict definition of whitespace characters in the JSON specification and common hidden character issues. By comparing two seemingly identical JSON strings in a real-world case, it reveals how invisible trailing characters in HTTP responses can affect the parsing process. The article details the solution using JsonReader's lenient mode and provides complete code examples and best practice recommendations to help developers effectively avoid and resolve such parsing errors.
Problem Background and Phenomenon Description
When using the Gson library for JSON string to Java object conversion, developers often encounter the com.google.gson.JsonSyntaxException: com.google.gson.stream.MalformedJsonException exception. This exception typically manifests when the parser encounters unexpected characters at the position where the end-of-file (EOF) is expected. A common scenario involves JSON strings obtained from HTTP interfaces that validate as correct in online validation tools but throw exceptions during Gson parsing, while manually defined strings with identical content parse successfully.
Root Cause Analysis of the Exception
The Gson library strictly adheres to the JSON specification (RFC 7159) definition of whitespace characters, recognizing only tab (\t), newline (\n), carriage return (\r), and space characters as legitimate whitespace. Any other characters, including common null characters (\0), appearing after the end of a JSON object are considered illegal content, triggering the MalformedJsonException.
In practical development, HTTP responses often contain invisible trailing characters that may not be directly observable in debuggers but significantly affect string length. Comparing the string lengths of result1 (HTTP response) and result2 (manually defined) can quickly identify such issues.
Code Example and Problem Reproduction
Consider the following typical exception scenario code:
public static Userinfo getUserinfo() {
String result1 = http.POST("https://www.bitstamp.net/api/balance/",
postdata, true);
String result2 = "{\"btc_reserved\": \"0\", \"fee\": \"0.5000\", \"btc_available\": \"0.10000000\", \"usd_reserved\": \"0\", \"btc_balance\": \"0.10000000\", \"usd_balance\": \"30.00\", \"usd_available\": \"30.00\"}";
Gson gson = new Gson();
Userinfo userinfo1 = gson.fromJson(result1, Userinfo.class); // throws exception
Userinfo userinfo2 = gson.fromJson(result2, Userinfo.class); // parses successfully
return userinfo1;
}In this example, result1 and result2 appear identical in the debugger, but result1 fails to parse due to invisible trailing characters.
Solution: Lenient Parsing Mode
When trailing characters cannot be eliminated at the source, Gson's JsonReader class can be used to enable lenient parsing mode:
Gson gson = new Gson();
JsonReader reader = new JsonReader(new StringReader(result1));
reader.setLenient(true);
Userinfo userinfo1 = gson.fromJson(reader, Userinfo.class);Setting setLenient(true) allows the parser to tolerate certain formatting errors, including trailing non-whitespace characters. This approach is particularly useful for handling JSON data returned by third-party APIs that cannot be fully controlled.
Best Practices and Preventive Measures
In production environments, the following measures are recommended to prevent such issues:
- Preprocess HTTP response strings using the
trim()method to remove leading and trailing whitespace before parsing - Implement string validation logic to check JSON string length and end-position characters
- For critical business data, implement retry mechanisms and exception handling strategies
- During debugging, use hexadecimal viewers to inspect the complete byte sequence of strings
In-Depth Technical Discussion
Gson's strict parsing mechanism stems from the rigor of the JSON specification. According to RFC 7159, a JSON text must consist of exactly one JSON value, followed only by whitespace characters. Gson scans the input stream character by character through JsonReader, immediately throwing MalformedJsonException when non-whitespace characters are found at the expected EOF position.
This design ensures data consistency but also presents challenges when handling non-compliant inputs. Developers must find a balance between data accuracy and system robustness.