Understanding Default Character Encoding and Collation in SQL Server

Nov 23, 2025 · Programming

Keywords: SQL Server | Character Encoding | Collation

Abstract: This article provides an in-depth exploration of default character encoding settings in Microsoft SQL Server and their relationship with collation. It begins by explaining the different encoding methods for Unicode data (UCS-2/UTF-16) and non-Unicode data (8-bit encoding based on code pages). The article then details how to view current server and database collations using system functions and properties, and how these settings affect character encoding. It discusses the inheritance and override mechanisms of collation at different levels (server, database, column) and provides practical SQL query examples to help readers obtain and understand these critical configuration details.

Character Encoding Fundamentals

In Microsoft SQL Server, character encoding settings are closely tied to collation. Unicode data (XML and the N-prefixed types NCHAR and NVARCHAR) is always stored as UCS-2 / UTF-16 in little-endian byte order, and this is not configurable. Non-Unicode data (CHAR, VARCHAR) is encoded according to the code page of its collation. For most collations that is a single-byte (8-bit) code page, for some East Asian collations it is a double-byte code page, and for the _UTF8 collations available in SQL Server 2019 and later it is UTF-8.
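The storage difference between the two families can be observed directly. The following is a minimal sketch that uses only the built-in DATALENGTH function; the column aliases are illustrative names, not part of any API:

```sql
-- DATALENGTH returns the number of bytes used to store a value.
-- 'abc' is a VARCHAR literal: one byte per ASCII character.
-- N'abc' is an NVARCHAR literal: two bytes per character in UTF-16.
SELECT DATALENGTH('abc')  AS VarcharBytes,   -- 3
       DATALENGTH(N'abc') AS NvarcharBytes;  -- 6
```

ASCII characters are used here on purpose, because how many bytes a non-ASCII VARCHAR character takes depends on the code page of the collation in effect.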

Viewing Current Settings

To view the default collation of a SQL Server instance, use the following query:

SELECT SERVERPROPERTY('Collation')

This query returns the server-level collation name, which is the default collation for newly created databases. For example, if the result is SQL_Latin1_General_CP1_CI_AS, non-Unicode data in databases and columns that inherit this collation is encoded with Windows code page 1252.
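Rather than inferring the code page from the collation name, it can be looked up with the built-in COLLATIONPROPERTY function. A short sketch follows; the aliases are illustrative, and the explicit CONVERT is needed because SERVERPROPERTY returns sql_variant:

```sql
-- Code page of a collation given by name.
SELECT COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS CodePage;  -- 1252

-- Code page of the collation currently configured for the instance.
SELECT COLLATIONPROPERTY(
           CONVERT(nvarchar(128), SERVERPROPERTY('Collation')),
           'CodePage') AS InstanceCodePage;
```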

Database-Level Collation

The default collation of a database can be retrieved with this query:

SELECT DATABASEPROPERTYEX('DBName', 'Collation')

Replace DBName with the actual database name. This setting determines the default collation for string literals, variables, and newly created columns, thereby affecting character encoding.
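When several databases need to be checked at once, the sys.databases catalog view exposes the same information without naming each database. A minimal sketch, with an illustrative alias for the code page column:

```sql
-- List every database on the instance with its default collation
-- and the code page that collation uses for CHAR/VARCHAR data.
SELECT d.name,
       d.collation_name,
       COLLATIONPROPERTY(d.collation_name, 'CodePage') AS CodePage
FROM sys.databases AS d
ORDER BY d.name;
```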

Collation Hierarchy and Encoding Impact

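Collation can be set at several levels in SQL Server: server, database, and column. It can also be overridden for a single expression with the COLLATE clause. The server collation is the default for the system databases and for newly created user databases, and it applies to instance-level identifiers such as logins and database names. The database collation governs the identifiers of objects inside the database and is the default for new columns, variables, and string literals. A column-level collation can be specified explicitly when the table is created or altered; otherwise the column inherits the database default. For CHAR and VARCHAR data, the encoding is determined by the collation that finally applies to the column or expression, not by the server setting alone.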
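Inheritance and override at the column level can be sketched as follows. The table dbo.CollationDemo and its column names are hypothetical and exist only for this illustration; Latin1_General_100_CI_AS is used simply as an example of a collation that may differ from the database default:

```sql
-- Hypothetical demo table: one column inherits the database default,
-- one column overrides it with an explicit COLLATE clause.
CREATE TABLE dbo.CollationDemo
(
    InheritedName varchar(50),
    ExplicitName  varchar(50) COLLATE Latin1_General_100_CI_AS,
    UnicodeName   nvarchar(50)
);

-- Inspect the collation that each column actually received.
-- The code page matters only for the varchar columns; nvarchar
-- data is stored as UTF-16 regardless of the collation's code page.
SELECT c.name           AS ColumnName,
       c.collation_name AS ColumnCollation,
       COLLATIONPROPERTY(c.collation_name, 'CodePage') AS CodePage
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.CollationDemo');

-- Clean up the demo table.
DROP TABLE dbo.CollationDemo;
```

InheritedName and UnicodeName should report the database default collation, while ExplicitName reports the collation given in its COLLATE clause.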

Practical Query Examples

The following query combines the OS language version, the instance LCID, and the instance and database collations in one result. Replace {database_name} with the name of the database to inspect:

SELECT os_language_version,
       SERVERPROPERTY('LCID') AS [Instance-LCID],
       SERVERPROPERTY('Collation') AS [Instance-Collation],
       DATABASEPROPERTYEX(N'{database_name}', 'Collation') AS [Database-Collation]
FROM sys.dm_os_windows_info;

By understanding these settings, users can better manage character data processing and storage in SQL Server.
