Keywords: T-SQL | String Splitting | SQL Server 2008 | Custom Functions | XML Parsing | Recursive CTE
Abstract: This article provides an in-depth exploration of various methods for implementing string splitting functionality in SQL Server 2008 and later versions, focusing on solutions based on XML parsing, recursive CTE, and custom functions. Through detailed code examples and performance comparisons, it offers practical guidance for developers to choose appropriate splitting strategies in different scenarios. The article also discusses the advantages, disadvantages, applicable scenarios, and best practices in modern SQL Server versions.
Introduction
In database development, there is often a need to split comma-separated strings into individual rows of data. For example, user input like "1,2,3,4,5" needs to be converted into multiple records for further processing. Although SQL Server 2016 introduced the built-in STRING_SPLIT function, in earlier versions like SQL Server 2008, developers must implement splitting functionality themselves.
XML-Based String Splitting Method
The first common method utilizes XML parsing capabilities. By converting the original string into XML format and then using XQuery to extract node values:
DECLARE @xml xml, @str varchar(100), @delimiter varchar(10)
SET @str = '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15'
SET @delimiter = ','
SET @xml = cast(('<X>'+replace(@str, @delimiter, '</X><X>')+'</X>') as xml)
SELECT C.value('.', 'varchar(10)') as value FROM @xml.nodes('X') as X(C)The core idea of this method is to replace delimiters with XML tags, construct a temporary XML document, and then use the .nodes() method to expand each <X> element into separate rows.
Recursive CTE Splitting Method
The second method uses recursive Common Table Expressions (CTE) to locate delimiter positions:
DECLARE @str varchar(100), @delimiter varchar(10)
SET @str = '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15'
SET @delimiter = ','
;WITH cte AS
(
SELECT 0 a, 1 b
UNION ALL
SELECT b, CHARINDEX(@delimiter, @str, b) + LEN(@delimiter)
FROM CTE
WHERE b > a
)
SELECT SUBSTRING(@str, a,
CASE WHEN b > LEN(@delimiter)
THEN b - a - LEN(@delimiter)
ELSE LEN(@str) - a + 1 END) value
FROM cte WHERE a > 0The recursive CTE divides string boundaries by continuously searching for delimiter positions. The anchor member initializes the search position, and the recursive member progressively moves until the end of the string.
Custom Table-Valued Function Implementation
To provide a reusable solution, a custom table-valued function can be created:
CREATE FUNCTION [dbo].[SDF_SplitString]
(
@sString nvarchar(2048),
@cDelimiter nchar(1)
)
RETURNS @tParts TABLE ( part nvarchar(2048) )
AS
BEGIN
if @sString is null return
declare @iStart int,
@iPos int
if substring( @sString, 1, 1 ) = @cDelimiter
begin
set @iStart = 2
insert into @tParts
values( null )
end
else
set @iStart = 1
while 1=1
begin
set @iPos = charindex( @cDelimiter, @sString, @iStart )
if @iPos = 0
set @iPos = len( @sString )+1
if @iPos - @iStart > 0
insert into @tParts
values ( substring( @sString, @iStart, @iPos-@iStart ))
else
insert into @tParts
values( null )
set @iStart = @iPos+1
if @iStart > len( @sString )
break
end
RETURN
ENDThis function uses a WHILE loop to traverse the string, locates delimiter positions with CHARINDEX, and extracts substrings using SUBSTRING. The function handles special cases where the string starts with a delimiter and returns a table variable containing all split parts.
Built-in Functions in Modern SQL Server
Starting from SQL Server 2016, Microsoft provides the built-in STRING_SPLIT function:
SELECT TRY_CAST(value AS INT)
FROM STRING_SPLIT ('1,2,3,4,5,6,7,8,9,10,11,12,13,14,15', ',')This built-in function greatly simplifies string splitting operations but is not available in SQL Server 2008.
Performance Analysis and Selection Recommendations
When choosing a string splitting method, the following factors should be considered:
The XML method is suitable for simple splitting needs but may have performance issues with large data volumes. The recursive CTE method provides better control but has recursion depth limitations. Custom functions are most suitable for scenarios requiring repeated use, offering the best encapsulation and maintainability.
For SQL Server 2008 environments, custom table-valued functions are recommended as they combine performance, reusability, and error handling capabilities. If upgrading to newer versions, the built-in STRING_SPLIT function should be prioritized.
Conclusion
String splitting is a common requirement in database development, and in SQL Server 2008, it must be implemented through custom solutions. The three methods introduced in this article each have their advantages, and developers should choose the most appropriate solution based on specific scenarios. Custom table-valued functions provide the best balance, ensuring both performance and good maintainability.