Keywords: SQL Server | Unique Identifier | Auto-generated ID | Computed Column | IDENTITY Property
Abstract: This article provides an in-depth exploration of solutions for automatically generating custom-formatted unique identifiers with prefixes in SQL Server databases. By combining IDENTITY columns with computed columns, it enables the automatic generation of IDs in formats like UID00000001. The paper thoroughly analyzes implementation principles, performance considerations, and practical application scenarios.
Problem Background and Requirements Analysis
In database system design, there is often a need to generate unique identifiers with specific formats for entities. While the traditional uniqueidentifier data type guarantees uniqueness, its generated GUID format (such as 550e8400-e29b-41d4-a716-446655440000) is often unsuitable for user-facing identifiers. Users typically prefer ID formats that are readable and include specific prefixes, such as UID12345678 or CUSID5000.
Core Solution: Combination of IDENTITY and Computed Columns
In SQL Server, the most reliable and efficient solution involves combining the IDENTITY property with persisted computed columns. This approach leverages the database engine's built-in capabilities to ensure data integrity and performance.
Implementation Code Example
CREATE TABLE dbo.tblUsers
(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
UserID AS 'UID' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) PERSISTED,
UserName NVARCHAR(50) NOT NULL,
Email NVARCHAR(100) NULL
)
In this implementation:
- The
ID INT IDENTITY(1,1)column handles automatic generation of incrementing numeric sequences, starting from 1 and increasing by 1 each time UserIDis a computed column that uses string concatenation and formatting functions to convert the numeric ID into the desired format- The
PERSISTEDkeyword ensures that the computed column's value is physically stored, improving query performance
Insert Operation Example
INSERT INTO dbo.tblUsers (UserName, Email)
VALUES ('John Doe', 'john.doe@example.com')
After executing the above insert statement, the system automatically generates:
IDcolumn: 1 (auto-incremented)UserIDcolumn:UID00000001(automatically computed)
Technical Details Deep Dive
Working Mechanism of IDENTITY Property
The IDENTITY property provides an atomic auto-increment mechanism in SQL Server, ensuring no duplicate values are generated in multi-user concurrent environments. Its internal implementation relies on the system table sys.identity_columns to maintain the current maximum value, automatically incrementing with each insert operation.
Formatting Logic in Computed Columns
The formatting expression 'UID' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) in the computed column involves the following key steps:
CAST(ID AS VARCHAR(8)): Converts the numeric ID to string type'00000000' + ...: Pads the string with 8 zeros on the left to ensure a minimum length of 8 digitsRIGHT(..., 8): Extracts 8 characters from the right side to achieve fixed-length formatting'UID' + ...: Adds the prefix string
Advantages of Persisted Computed Columns
Computed columns using the PERSISTED keyword offer the following significant advantages:
- Performance Optimization: Values are computed during insertion and physically stored, avoiding repeated calculations during each query
- Index Support: Indexes can be created on persisted computed columns, further enhancing query performance
- Foreign Key References: Other tables can reference this column as a foreign key
Extended Applications and Variants
Generating IDs with Different Prefixes
Based on business requirements, prefixes can be easily modified to generate different types of identifiers:
-- Customer ID: CUSID00000001
CustomerID AS 'CUSID' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) PERSISTED
-- Order ID: ORD00000001
OrderID AS 'ORD' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) PERSISTED
-- Product ID: PROD00000001
ProductID AS 'PROD' + RIGHT('00000000' + CAST(ID AS VARCHAR(8)), 8) PERSISTED
Handling Variable-Length Formats
For scenarios that don't require fixed lengths, simpler formatting approaches can be used:
-- Generate UID1, UID2, ..., UID10 format
SimpleUserID AS 'UID' + CAST(ID AS VARCHAR(10)) PERSISTED
Performance Considerations and Best Practices
Concurrency Safety
This solution relies entirely on SQL Server's IDENTITY mechanism, which has built-in concurrency control to ensure ID uniqueness even in multi-user, high-concurrency environments.
Storage Efficiency
Although computed columns add additional storage overhead, the use of fixed-length string formats makes storage requirements predictable. For large-scale data tables, regular monitoring of storage usage is recommended.
Indexing Strategy
Creating an index on the UserID column can significantly improve query performance based on user IDs:
CREATE INDEX IX_tblUsers_UserID ON dbo.tblUsers(UserID)
Comparison with Alternative Solutions
Comparison with GUID Approach
Compared to the uniqueidentifier data type, this solution offers the following advantages:
- Readability: Generated IDs have clear business meaning and readability
- Sequentiality: IDs increment in insertion order, facilitating sorting and analysis
- Storage Efficiency: Fixed-length strings consume less storage space than GUIDs
Comparison with Application-Generated Solutions
Compared to generating IDs at the application layer, database-level generation provides:
- Data Consistency: Avoids synchronization issues between applications and databases
- Transaction Safety: ID generation and data insertion occur within the same transaction
- Architectural Simplification: Reduces complexity in application code
Practical Application Scenarios
User Management Systems
In user registration systems, automatically generate uniformly formatted user IDs that are easy for users to remember and administrators to identify.
Order Processing Systems
Generate unique order numbers for each new order, such as ORD202400001, containing year information and sequence numbers.
Inventory Management Systems
Assign unique SKU codes to products, combining product category prefixes with sequence numbers for easy inventory management and querying.
Conclusion
By combining IDENTITY columns with persisted computed columns, SQL Server provides a powerful and flexible mechanism for automatic ID generation. This approach not only ensures data uniqueness and integrity but also offers excellent readability and performance. In practical applications, developers can adjust prefixes and formats according to specific business requirements to create the most suitable identifier generation strategy.