Transact-SQL: How to Tokenize a String

Transact-SQL: How do I tokenize a string?

Using a string split function (for example, like this one), you could have something like this:

SELECT t.*
FROM atable t
INNER JOIN dbo.Split(@UserInput, ' ') s ON t.Name LIKE '%' + s.Data + '%'
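If no custom split function is available, the same join can be sketched with the built-in STRING_SPLIT function. This assumes SQL Server 2016 or later with database compatibility level 130 or higher; the table variable and sample values are hypothetical stand-ins for atable and the user input. STRING_SPLIT exposes its tokens in a column named value rather than Data.

```sql
-- Hypothetical sample data standing in for atable and @UserInput.
DECLARE @UserInput NVARCHAR(200) = N'red green';
DECLARE @atable TABLE (Name NVARCHAR(50));
INSERT INTO @atable (Name) VALUES (N'dark red'), (N'light green'), (N'blue');

-- One output row per (table row, matching token) pair.
SELECT t.*
FROM @atable AS t
INNER JOIN STRING_SPLIT(@UserInput, ' ') AS s
    ON s.value <> ''                       -- skip empty tokens produced by repeated spaces
   AND t.Name LIKE '%' + s.value + '%';
```

A row whose Name matches more than one token is returned once per token, so add DISTINCT if that matters.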

Can I tokenize a string using t-SQL

The main problem with this type of code is re-use of calculations.

SQL Server is good at caching results (if you type the exact same CHARINDEX() calculation 5 times, it only calculates it once and re-uses that result 4 times).

That's little consolation for the poor coder who has to type or maintain that code though.

SQL Server 2005 onward has CROSS APPLY, which does help somewhat. The logic is still repeated, but the results can be referenced repeatedly, rather than the calculation being typed repeatedly.

SELECT
*,
SUBSTRING(dvt, 1, ISNULL(comma1.pos-1, LEN(dvt)) ) AS item1,
SUBSTRING(dvt, comma1.pos+1, ISNULL(comma2.pos-1, LEN(dvt))-comma1.pos) AS item2,
SUBSTRING(dvt, comma2.pos+1, ISNULL(comma3.pos-1, LEN(dvt))-comma2.pos) AS item3
FROM
(
SELECT 'ab,c,def,hij' AS dvt
UNION ALL
SELECT 'xyz,abc' AS dvt
)
AS data
OUTER APPLY
(SELECT NULLIF(CHARINDEX(',', data.dvt, 1 ), 0) AS pos ) AS comma1
OUTER APPLY
(SELECT NULLIF(CHARINDEX(',', data.dvt, comma1.pos+1), 0) AS pos WHERE comma1.pos > 0) AS comma2
OUTER APPLY
(SELECT NULLIF(CHARINDEX(',', data.dvt, comma2.pos+1), 0) AS pos WHERE comma2.pos > 0) AS comma3
OUTER APPLY
(SELECT NULLIF(CHARINDEX(',', data.dvt, comma3.pos+1), 0) AS pos WHERE comma3.pos > 0) AS comma4


Another option is to simply write a table valued user defined function that does this (even when the result of the function is always one row). Then you simply CROSS APPLY that function.
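As a minimal sketch of that idea, the position logic from the query above can be moved into an inline table-valued function that always returns one row. The function name dbo.FirstThreeItems and its VARCHAR(8000) parameter are assumptions made for illustration only.

```sql
-- Hypothetical helper: returns exactly one row holding the first three comma-separated items.
CREATE FUNCTION dbo.FirstThreeItems (@dvt VARCHAR(8000))
RETURNS TABLE
AS
RETURN
(
    SELECT
        SUBSTRING(@dvt, 1, ISNULL(comma1.pos - 1, LEN(@dvt))) AS item1,
        SUBSTRING(@dvt, comma1.pos + 1, ISNULL(comma2.pos - 1, LEN(@dvt)) - comma1.pos) AS item2,
        SUBSTRING(@dvt, comma2.pos + 1, ISNULL(comma3.pos - 1, LEN(@dvt)) - comma2.pos) AS item3
    FROM
        (SELECT NULLIF(CHARINDEX(',', @dvt, 1), 0) AS pos) AS comma1
    OUTER APPLY
        (SELECT NULLIF(CHARINDEX(',', @dvt, comma1.pos + 1), 0) AS pos WHERE comma1.pos > 0) AS comma2
    OUTER APPLY
        (SELECT NULLIF(CHARINDEX(',', @dvt, comma2.pos + 1), 0) AS pos WHERE comma2.pos > 0) AS comma3
);
GO

-- The calling query no longer repeats any of the position logic.
SELECT data.dvt, items.item1, items.item2, items.item3
FROM
(
    SELECT 'ab,c,def,hij' AS dvt
    UNION ALL
    SELECT 'xyz,abc' AS dvt
) AS data
CROSS APPLY dbo.FirstThreeItems(data.dvt) AS items;
```

Items that do not exist come back as NULL, so the 'xyz,abc' row has a NULL item3.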

T-SQL split string

I've used this SQL before which may work for you:-

CREATE FUNCTION dbo.splitstring ( @stringToSplit VARCHAR(MAX) )
RETURNS
@returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN

DECLARE @name NVARCHAR(500) -- same width as the [Name] column, so long tokens are not truncated
DECLARE @pos INT

WHILE CHARINDEX(',', @stringToSplit) > 0
BEGIN
SELECT @pos = CHARINDEX(',', @stringToSplit)
SELECT @name = SUBSTRING(@stringToSplit, 1, @pos-1)

INSERT INTO @returnList
SELECT @name

SELECT @stringToSplit = SUBSTRING(@stringToSplit, @pos+1, LEN(@stringToSplit)-@pos)
END

INSERT INTO @returnList
SELECT @stringToSplit

RETURN
END

and to use it:-

SELECT * FROM dbo.splitstring('91,12,65,78,56,789')

How do I split a delimited string so I can access individual items?

You may find the solution in SQL User Defined Function to Parse a Delimited String helpful (from The Code Project).

You can use this simple logic:

Declare @products varchar(200) = '1|20|3|343|44|6|8765'
Declare @individual varchar(20) = null

WHILE LEN(@products) > 0
BEGIN
IF PATINDEX('%|%', @products) > 0
BEGIN
SET @individual = SUBSTRING(@products,
0,
PATINDEX('%|%', @products))
SELECT @individual

SET @products = SUBSTRING(@products,
LEN(@individual + '|') + 1,
LEN(@products))
END
ELSE
BEGIN
SET @individual = @products
SET @products = NULL
SELECT @individual
END
END

SQL server tokenizer

For the first question (how many tokens the string contains), count the delimiters and add one:

declare @token varchar(20)
set @token = 'ABC,DEF,GHI'

select len(@token) - len(replace(@token ,',','')) + 1

T-SQL split string based on delimiter

Maybe this will help you.

SELECT SUBSTRING(myColumn, 1, CASE CHARINDEX('/', myColumn)
WHEN 0
THEN LEN(myColumn)
ELSE CHARINDEX('/', myColumn) - 1
END) AS FirstName
,SUBSTRING(myColumn, CASE CHARINDEX('/', myColumn)
WHEN 0
THEN LEN(myColumn) + 1
ELSE CHARINDEX('/', myColumn) + 1
END, 1000) AS LastName
FROM MyTable
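The same split can also be written so that the delimiter position is computed only once per row, using the APPLY pattern shown earlier. This is a sketch under assumptions: the table variable and its two sample values are hypothetical stand-ins for MyTable.

```sql
-- Hypothetical sample data standing in for MyTable.
DECLARE @MyTable TABLE (myColumn VARCHAR(100));
INSERT INTO @MyTable (myColumn) VALUES ('John/Smith'), ('Cher');

SELECT t.myColumn,
       -- Everything before the first '/', or the whole value when there is no '/'
       LEFT(t.myColumn, ISNULL(p.pos - 1, LEN(t.myColumn))) AS FirstName,
       -- Everything after the first '/', or an empty string when there is no '/'
       SUBSTRING(t.myColumn, ISNULL(p.pos, LEN(t.myColumn)) + 1, 1000) AS LastName
FROM @MyTable AS t
CROSS APPLY (SELECT NULLIF(CHARINDEX('/', t.myColumn), 0) AS pos) AS p;
```

For 'John/Smith' this gives John and Smith; for 'Cher' it gives Cher and an empty LastName, matching the CASE-based query.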

How to split a string in TSQL by space character

Here you go:

First add a space before every comma (so that a comma is treated as a word of its own), then split the string on each space into rows using OPENJSON, then assign groups to pair up the rows using modulo and LAG() OVER(), then aggregate based on the groups:

declare @s varchar(100)='This approach is traditional, and is supported in all versions and editions of SQL Server';

select Result = String_Agg(string,' ') within group (order by seq)
from (
select j.[value] string, Iif(j.[key] % 2 = 1, Lag(seq) over(order by seq) ,seq) gp, seq
from OpenJson(Concat('["',replace(Replace(@s,',',' ,'), ' ', '","'), '"]')) j
cross apply(values(Convert(tinyint,j.[key])))x(seq)
)x
group by gp;

Capitalize first letter, every word, fix possible string split order issues, multiple delimiters in T-SQL 2017 without using a user-defined function

Started from what @Larnu liked from this answer

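/* Excerpt from a function body: @string, @first and @last are declared outside this excerpt. */
/* REPLACE compares using the collation of its input, so under a case-insensitive collation
the search for ' A' also matches ' a' and rewrites it in upper case. */
/* Following spaces */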
IF CHARINDEX(' ', @string)<>0 BEGIN
DECLARE @i INT=@first;
DECLARE @delimiter CHAR(1) =' ';
WHILE @i<=@last BEGIN
SET @string=REPLACE(@string, @delimiter+CHAR(@i), @delimiter+CHAR(@i));
SET @i=@i+1;
END;
END;
/* Following dashes */
IF CHARINDEX('-', @string)<>0 BEGIN
SET @i=@first;
SET @delimiter='-';
WHILE @i<=@last BEGIN
SET @string=REPLACE(@string, @delimiter+CHAR(@i), @delimiter+CHAR(@i));
SET @i=@i+1;
END;
END;
/* First Letter */
SET @string=UPPER(LEFT(@string, 1))+RIGHT(@string, LEN(@string)-1);
RETURN @string;

Then I ended up using the VB function in SSRS. Thanks for your opinion @Larnu

=StrConv(Fields!Column.Value, vbProperCase)

How to split a comma-separated value to columns

CREATE FUNCTION [dbo].[fn_split_string_to_column] (
@string NVARCHAR(MAX),
@delimiter CHAR(1)
)
RETURNS @out_put TABLE (
[column_id] INT IDENTITY(1, 1) NOT NULL,
[value] NVARCHAR(MAX)
)
AS
BEGIN
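DECLARE @value NVARCHAR(MAX),
@pos INT = 1,
@next INT = 0

-- Make sure the string ends with a delimiter so the last token is emitted too
SET @string = CASE
WHEN RIGHT(@string, 1) != @delimiter
THEN @string + @delimiter
ELSE @string
END

SET @next = CHARINDEX(@delimiter, @string, @pos)

WHILE @next > 0
BEGIN
-- The token runs from @pos up to (not including) the next delimiter; it may be empty
SET @value = SUBSTRING(@string, @pos, @next - @pos)

INSERT INTO @out_put ([value])
SELECT LTRIM(RTRIM(@value)) AS [column]

-- Continue right after the delimiter just found, so consecutive delimiters are not skipped
SET @pos = @next + 1
SET @next = CHARINDEX(@delimiter, @string, @pos)
END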

RETURN
END
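The function returns one row per token, numbered by column_id. To get actual columns, one option is conditional aggregation over that id. The sketch below assumes the function above has been created and that three output columns are wanted; the sample string and the names col1 to col3 are illustrative.

```sql
-- Turn the numbered rows into a single row with one column per token.
SELECT
    MAX(CASE WHEN s.column_id = 1 THEN s.[value] END) AS col1,
    MAX(CASE WHEN s.column_id = 2 THEN s.[value] END) AS col2,
    MAX(CASE WHEN s.column_id = 3 THEN s.[value] END) AS col3
FROM dbo.fn_split_string_to_column(N'red, green, blue', ',') AS s;
```

Because the function trims each token, this yields red, green and blue. A string with fewer than three tokens leaves the remaining columns NULL.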

