Keywords: MySQL | JSON conversion | data aggregation
Abstract: This technical article provides a comprehensive exploration of methods for converting relational query results into JSON arrays within MySQL. It begins with traditional string concatenation approaches using GROUP_CONCAT and CONCAT functions, then focuses on modern solutions leveraging JSON_ARRAYAGG and JSON_OBJECT functions available in MySQL 5.7 and later. Through detailed code examples, the article demonstrates implementation specifics, compares advantages and disadvantages of different approaches, and offers practical recommendations for real-world application scenarios. Additional discussions cover potential issues such as character encoding and data length limitations, along with their corresponding solutions, providing valuable technical reference for developers working on data transformation and API development.
Introduction
In modern web development and data exchange scenarios, JSON format has become the de facto standard. MySQL, as a widely used relational database, introduced native JSON support starting from version 5.7, providing developers with more convenient data processing capabilities. This article delves into the technical details of converting traditional tabular query results into structured JSON arrays within MySQL.
Problem Context and Requirements Analysis
Consider a typical data transformation scenario: querying name and phone number information from a person table, with the expected output format being a standard JSON array where each element is a JSON object containing name and phone fields.
Original query result example:
SELECT name, phone FROM person;
| name | phone |
| Jack | 12345 |
| John | 23455 |
Expected JSON output:
[
{
"name": "Jack",
"phone": 12345
},
{
"name": "John",
"phone": 23455
}
]
Traditional Concatenation Approach
In earlier versions of MySQL, due to the lack of native JSON support, developers needed to rely on string concatenation functions to achieve similar functionality. The core idea of this method is to manually construct JSON structures using GROUP_CONCAT and CONCAT functions.
Implementation code example:
SELECT CONCAT(
'[',
GROUP_CONCAT(JSON_OBJECT('name', name, 'phone', phone)),
']'
)
FROM person;
Analysis of how this method works:
JSON_OBJECT('name', name, 'phone', phone)generates independent JSON objects for each row of dataGROUP_CONCATfunction connects all JSON objects into a single string with commasCONCATfunction adds square brackets at the beginning and end to form a complete JSON array
While this approach achieves basic functionality, it has several notable drawbacks:
- Poor code readability with deep nesting levels
- Manual handling required for JSON format integrity
- Potential issues with
group_concat_max_lenlimitations
Modern JSON Function Approach
MySQL 5.7 and later versions provide specialized JSON processing functions, making data transformation more concise and reliable. The JSON_ARRAYAGG function is specifically designed to aggregate multiple rows of data into a JSON array.
Optimized implementation code:
SELECT JSON_ARRAYAGG(JSON_OBJECT('name', name, 'phone', phone))
FROM Person;
Advantages of this approach:
- Clear and concise syntax with obvious intent
- Native JSON format handling by MySQL, avoiding manual concatenation errors
- Automatic data type conversion (e.g., numbers don't require quotes)
- Better performance with reduced string operation overhead
Function Details and Technical Specifications
JSON_OBJECT Function
The JSON_OBJECT function is used to create JSON objects, accepting key-value pairs as parameters. Syntax structure:
JSON_OBJECT(key1, value1, key2, value2, ...)
Key characteristics:
- Automatic data type handling - string values get quotes, numbers remain as-is
- NULL value support - NULL values are represented as
nullin JSON - Key names must be strings, values can be any MySQL data type
JSON_ARRAYAGG Function
JSON_ARRAYAGG is an aggregate function used to combine multiple rows of data into a single JSON array. It works similarly to other aggregate functions like SUM or COUNT.
Important considerations:
- Returns a valid JSON format string as result
- Automatically handles comma separation of array elements
- In GROUP BY queries, aggregation occurs per group
- For empty result sets, returns
[](empty array)
Extended Practical Application Scenarios
Complex Data Structure Handling
In real-world applications, more complex data structures often need to be processed, such as cases involving nested objects or arrays:
-- Handling complex structures with address information
SELECT JSON_ARRAYAGG(
JSON_OBJECT(
'name', name,
'phone', phone,
'address', JSON_OBJECT(
'street', street,
'city', city,
'zipcode', zipcode
)
)
)
FROM person_with_address;
Conditional Data Inclusion
In some scenarios, you may need to decide whether to include certain fields based on conditions:
-- Include phone field only when phone number is not empty
SELECT JSON_ARRAYAGG(
CASE
WHEN phone IS NOT NULL THEN
JSON_OBJECT('name', name, 'phone', phone)
ELSE
JSON_OBJECT('name', name)
END
)
FROM person;
Performance Considerations and Best Practices
Character Encoding and Length Limitations
When using the GROUP_CONCAT approach, be aware of the group_concat_max_len system variable limitation. The default value is 1024, which may be insufficient for large amounts of data.
Solution:
-- Temporarily set a larger length limit
SET SESSION group_concat_max_len = 1000000;
-- Or permanently modify configuration file
-- group_concat_max_len = 1000000
Data Type Consistency
Ensuring data type consistency in JSON is important. While MySQL automatically performs type inference, explicit type casting can provide more controllable results:
-- Ensure numeric fields are correctly identified
SELECT JSON_ARRAYAGG(
JSON_OBJECT(
'name', CAST(name AS CHAR),
'phone', CAST(phone AS UNSIGNED)
)
)
FROM person;
Comparison with Alternative Approaches
Application Layer Processing
Another common practice is to perform JSON conversion at the application layer, for example using PHP's json_encode function:
<?php
// Get query results
$result = mysqli_query($conn, "SELECT name, phone FROM person");
$data = [];
while ($row = mysqli_fetch_assoc($result)) {
$data[] = $row;
}
// Convert to JSON
echo json_encode($data);
?>
Comparison between database layer and application layer approaches:
- Database Layer Approach: Reduces network transmission, higher server-side pressure
- Application Layer Approach: More flexible, higher client-side processing pressure
Error Handling and Edge Cases
Empty Result Set Handling
How both approaches handle empty result sets:
JSON_ARRAYAGGreturns[]- Traditional concatenation approach returns
[](sinceGROUP_CONCATon empty set returns empty string)
Special Character Escaping
When data contains JSON special characters, MySQL automatically handles escaping:
-- Data containing quotes
INSERT INTO person (name, phone) VALUES ('O"Reilly', '12345');
-- JSON output automatically escapes
-- {"name": "O\"Reilly", "phone": "12345"}
Version Compatibility Considerations
JSON support across different MySQL versions:
- MySQL 5.7+: Full support for
JSON_OBJECTandJSON_ARRAYAGG - MySQL 5.6 and earlier: Requires traditional concatenation approach or application layer processing
- MySQL 8.0+: Enhanced JSON performance and new function support
Conclusion
MySQL provides a complete solution system ranging from traditional string concatenation to modern JSON functions. For new projects, the combination of JSON_ARRAYAGG and JSON_OBJECT is recommended due to its concise code, superior performance, and ease of maintenance. For legacy systems or lower version MySQL environments, the traditional concatenation approach remains a viable option. In practical applications, the most suitable implementation should be chosen based on specific data scale, performance requirements, and MySQL version.
With the widespread application of JSON in web APIs and data exchange, mastering these data transformation techniques is crucial for modern database development. Developers are advised to thoroughly test the performance of various approaches in actual projects and make reasonable technology selection decisions based on specific requirements.