Keywords: MySQL | Emoji Storage | Character Set Configuration | utf8mb4 | Database Encoding
Abstract: This article provides a comprehensive analysis of encoding issues when storing Emoji characters in MySQL databases. It systematically addresses the common 1366 error through detailed configuration procedures from database level to application level, including character set settings, table structure modifications, connection configurations, and practical code examples with implementation recommendations.
Problem Background and Error Analysis
During MySQL database development, attempts to store text containing Emoji characters frequently fail with error 1366: "Incorrect string value". The root cause is that the target column or the connection uses a character set that cannot represent four-byte UTF-8 sequences, which most Emoji characters require.
Core Principles of the Solution
MySQL's utf8 character set, an alias for utf8mb3, stores at most three bytes per character and therefore cannot represent code points above U+FFFF, where most Emoji are located. The utf8mb4 character set must be used instead: it supports four-byte UTF-8 sequences and can store every Unicode character, including Emoji.
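To make the byte counts concrete, the following query is a small sketch that builds the UTF-8 bytes of two characters from hex literals, so the result does not depend on the client's own encoding. The column aliases are illustrative names only.

```sql
-- UTF-8 byte sequences written as hex, so the client encoding plays no role
SELECT
    LENGTH(UNHEX('E29D8C'))   AS bytes_u274c,   -- U+274C: 3 bytes, still within the three-byte range
    LENGTH(UNHEX('F09F9880')) AS bytes_u1f600,  -- U+1F600: 4 bytes, requires utf8mb4
    CHAR_LENGTH(CONVERT(UNHEX('F09F9880') USING utf8mb4)) AS chars_u1f600;  -- 1 character
```

Any character whose UTF-8 form starts with the byte F0 needs four bytes and is rejected by a three-byte utf8 column.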
Database Level Configuration
First, the correct character set must be configured at the database level. If the database has not been created yet, use the following SQL statement:
CREATE DATABASE database_name DEFAULT CHARSET = utf8mb4 DEFAULT COLLATE = utf8mb4_unicode_ci;
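For existing databases, use the ALTER statement to modify the default character set. This affects only tables created afterwards; existing tables keep their current character set until they are converted: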
ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
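One way to confirm that the database default actually changed is to read it back from information_schema. This is a minimal sketch that assumes the placeholder name database_name used above.

```sql
-- Show the default character set and collation recorded for the database
SELECT SCHEMA_NAME,
       DEFAULT_CHARACTER_SET_NAME,
       DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'database_name';
```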
Table Level Configuration
When creating new tables, explicitly specify the character set and collation:
CREATE TABLE IF NOT EXISTS table_name (
    id INT PRIMARY KEY AUTO_INCREMENT,
    comment TEXT,
    created_at TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci;
For existing tables, character set conversion is required:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
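To modify the character set of a single field only, use MODIFY. MODIFY replaces the whole column definition, so restate the type and any other attributes (such as NOT NULL or DEFAULT) along with the character set: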
ALTER TABLE table_name MODIFY field_name TEXT CHARSET utf8mb4;
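After converting, it can help to list any text columns that were missed. The query below is a sketch that assumes the placeholder schema name database_name; it returns every column that has a character set other than utf8mb4.

```sql
-- Columns that still use a character set other than utf8mb4
-- (non-text columns have a NULL CHARACTER_SET_NAME and are skipped)
SELECT TABLE_NAME,
       COLUMN_NAME,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'database_name'
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8mb4';
```

An empty result suggests that all text columns in the schema have been converted.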
Application Connection Configuration
In PHP applications, set the character set immediately after establishing the database connection:
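<?php
$database_connection = new mysqli($server, $user, $password, $database_name);
if ($database_connection->connect_error) {
    die("Connection failed: " . $database_connection->connect_error);
}
// Switch the connection itself to utf8mb4; set_charset() returns false on failure
if (!$database_connection->set_charset('utf8mb4')) {
    die("Error loading character set utf8mb4: " . $database_connection->error);
}
?>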
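The connection character set can also be inspected from SQL, which is useful when it is unclear whether the application setting took effect. The statements below are a sketch to run on the same connection that the application uses; SET NAMES is the SQL-level way to set the connection character set for clients that offer no dedicated API call.

```sql
-- All three values should read utf8mb4 for Emoji to pass through unchanged
SHOW SESSION VARIABLES
WHERE Variable_name IN ('character_set_client',
                        'character_set_connection',
                        'character_set_results');

-- SQL-level alternative for setting the connection character set
SET NAMES utf8mb4;
```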
Complete Example and Verification
Once the database, table, and connection all use utf8mb4, INSERT statements containing Emoji characters execute without error:
INSERT INTO tablename (column1, column2, column3, column4, column5, column6, column7)
VALUES ('273', '3', 'Hdhdhdhhzhzhzzhjzj 我爱你 ❌', 49, 1, '2016-09-13 08:02:29', '2016-09-13 08:02:29');
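To verify that the text was stored intact, the row can be read back together with its raw bytes. This sketch assumes, following the INSERT above, that column3 holds the text and that column1 identifies the row; the aliases are illustrative.

```sql
-- Read the stored text back and inspect its bytes
SELECT column3,
       HEX(column3)         AS stored_bytes,
       LENGTH(column3)      AS byte_length,   -- length in bytes
       CHAR_LENGTH(column3) AS char_length    -- length in characters
FROM tablename
WHERE column1 = '273';
```

Note that ❌ (U+274C) is stored as the three bytes E29D8C, so it fits even in a three-byte utf8 column. To exercise the four-byte path, test with a character such as 😀 (U+1F600), whose stored bytes should appear as F09F9880. A 3F byte (a question mark) in its place, or a value cut off at that position, suggests that one layer is still not using utf8mb4.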
Common Issues and Troubleshooting
If the error persists after configuration, work through the following checks:
- Ensure all relevant tables are correctly converted to the utf8mb4 character set
- Verify character set settings in application connection strings
- Check that the MySQL server version supports utf8mb4 (it was added in MySQL 5.5.3)
- Confirm that client tools (such as phpMyAdmin) also use the correct character set
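Several of these checks can be run directly from a SQL client. The statements below are a sketch using the placeholder table name table_name from the earlier examples.

```sql
-- Server version
SELECT VERSION();

-- Returns one row (with Maxlen 4) if the server knows utf8mb4, no rows otherwise
SHOW CHARACTER SET LIKE 'utf8mb4';

-- Shows the table default and any per-column character sets
SHOW CREATE TABLE table_name;
```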
Best Practice Recommendations
To ensure correct storage and display of Emoji characters, it is recommended to:
- Use utf8mb4 as the default character set in new projects
- Regularly check character set consistency across database tables
- Uniformly use UTF-8 encoding in applications
- Conduct thorough testing, especially for edge cases