Splitting variable length delimited string across multiple rows (SQL)
First, let me just say that this is the reason that you shouln't have comma separated data in a field in the first place. There is no easy or efficient way to work with it.
That said, you can use a recursive query to split the string and get the numbers from it:
with split as
(
select
item = cast('' as varchar(max)),
source = cast('20,0, 5,,^24,0, 0,,^26,0, 0,,^281,0, 0,,^34,0, 2,,^48,0, 2,,^44,0, 2,,^20,0, 10,,^20,5, 5,,^379,1, 1,,^26,1, 2,,^32,0, 1,,^71,0, 2,,^' as varchar(max))
union all
select
item = substring(source, 1, charindex(',,', source)),
source = substring(source, charindex(',,', source) + 2, 10000)
from split
where source > ''
)
select substring(item, 1, charindex(',', item) -1)
from split
where item > ''
Result:
20
^24
^26
^281
^34
^48
^44
^20
^20
^379
^26
^32
^71
Split string into multiple rows and columns
I rewrote this as a set based query instead of a cursor and this whole script, including the generation of 100k test records runs in about 40 seconds on my Azure SQL DB, which is a Serverless gen 5 with only 1 vCore. Work through the script to make sure you understand it.
NB I'm dropping the tables as this is a test rig - not production code:
------------------------------------------------------------------------------------------------
-- Setup START
------------------------------------------------------------------------------------------------
DROP TABLE IF EXISTS dbo.sourceTable
DROP TABLE IF EXISTS dbo.finalTable
GO
CREATE TABLE dbo.sourceTable
(
PersonId INT IDENTITY PRIMARY KEY,
LocationId INT,
StartDate DATE,
AttendanceString VARCHAR(1000)
)
GO
CREATE TABLE dbo.finalTable
(
DayDate DATE,
LocationId INT,
PersonId INT,
MorningValue CHAR(1),
AfternoonValue CHAR(1)
)
GO
-- Generate some test data
SET IDENTITY_INSERT dbo.sourceTable ON
INSERT INTO dbo.sourceTable ( PersonId, LocationId, StartDate, AttendanceString )
VALUES
( 123, 987, '2018-09-01', 'XXXXZZZZ######PPLL' ),
( 234, 678, '2018-10-01', 'PPPPLL######ZZZZXX' ),
( 567, 999, '2018-10-01', 'abcdefghijklmnopqr' )
SET IDENTITY_INSERT dbo.sourceTable OFF
GO
-- Setup END
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-- Test Data START
------------------------------------------------------------------------------------------------
;WITH cte AS (
SELECT 1 rn, 1 locationId, CAST( '1 Jan 2018' AS DATE ) startDate, REPLACE( NEWID(), '-', '' ) AttendanceString
UNION ALL
SELECT rn + 1, rn % 42, DATEADD( day, 1, startDate ), REPLACE( NEWID(), '-', '' )
FROM cte
WHERE rn < 100
)
INSERT INTO dbo.sourceTable ( LocationId, StartDate, AttendanceString )
SELECT LocationId, StartDate, AttendanceString
FROM cte
ORDER BY 1;
GO 1000
-- Test Data END
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-- Rewritten query START
------------------------------------------------------------------------------------------------
DROP TABLE IF EXISTS #tmp;
;WITH cte AS (
SELECT 1 n, 1 x, 2 y
UNION ALL
SELECT n + 1, x + 2, y + 2
FROM cte
WHERE n < 20
)
SELECT
personId,
locationId,
DATEADD( day, c.n - 1, startDate ) xdate,
SUBSTRING ( attendanceString, c.x, 1 ) a,
SUBSTRING ( attendanceString, c.y, 1 ) b
INTO #tmp
FROM dbo.sourceTable s
CROSS APPLY cte c
WHERE c.y <= LEN(attendanceString);
select *
from sourceTable
WHERE personId = 999
select *
from #tmp
WHERE personId = 999
select *
from #tmp
WHERE locationId = 999
-- Rewritten query END
------------------------------------------------------------------------------------------------
Altered version of the script for longer attendanceIds here.
Split string of variable length, variable delimiters
You can also use a string splitter. Here is an excellent one that works with your version. DelimitedSplit8K
Now we need some sample data.
declare @Something table
(
Category varchar(100)
)
insert @Something values
('Zoo - Animals - Lions')
, ('Zoo - Personnel')
, ('Zoo - Operating Costs - Power / Cooling')
Now that we have a function and sample data the code for this is quite nice and tidy.
select s.Category
, Location = max(case when x.ItemNumber = 1 then Item end)
, Category = max(case when x.ItemNumber = 2 then Item end)
, SubCategory = max(case when x.ItemNumber = 3 then Item end)
from @Something s
cross apply dbo.DelimitedSplit8K(s.Category, '-') x
group by s.Category
And this will return:
Category |Location|Category |SubCategory
Zoo - Animals - Lions |Zoo |Animals |Lions
Zoo - Operating Costs - Power / Cooling |Zoo |Operating Costs|Power / Cooling
Zoo - Personnel |Zoo |Personnel |NULL
SQL split values to multiple rows
If you can create a numbers table, that contains numbers from 1 to the maximum fields to split, you could use a solution like this:
select
tablename.id,
SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.name, ',', numbers.n), ',', -1) name
from
numbers inner join tablename
on CHAR_LENGTH(tablename.name)
-CHAR_LENGTH(REPLACE(tablename.name, ',', ''))>=numbers.n-1
order by
id, n
Please see fiddle here.
If you cannot create a table, then a solution can be this:
select
tablename.id,
SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.name, ',', numbers.n), ',', -1) name
from
(select 1 n union all
select 2 union all select 3 union all
select 4 union all select 5) numbers INNER JOIN tablename
on CHAR_LENGTH(tablename.name)
-CHAR_LENGTH(REPLACE(tablename.name, ',', ''))>=numbers.n-1
order by
id, n
an example fiddle is here.
Split String into given length but do not split word
Assuming you have a table of addresses, I'd use a recursive CTE.
On each iteration, find the last possible space to break on, then start the next iteration for the character after the space.
- take 31 characters
- reverse them
- find the position of the first space
Extra care to be taken for corner cases:
- The remaining string to be searched is less than 30 characters
- The current string being searched has no space in the first 31 characters
Using the following test data...
CREATE TABLE test (
address VARCHAR(MAX)
);
INSERT INTO
test
VALUES
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork'),
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTownxx Upper1LocationUpper1LocationUpper1Location Another5 NewYork'),
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTownxx Upper1LocationUpper1LocationUpper1Location Another5 NewYork x')
;
Using the following CTE...
DECLARE @chars BIGINT = 30;
WITH
parts AS
(
SELECT
address,
LEN(address) AS length,
CAST(0 AS BIGINT) AS last_space,
CAST(1 AS BIGINT) AS next,
address AS fragment
FROM
test
UNION ALL
SELECT
parts.address,
parts.length,
last_space.pos,
parts.next + COALESCE(last_space.pos, @chars),
SUBSTRING(parts.address, parts.next, COALESCE(last_space.pos - 1, @chars))
FROM
parts
CROSS APPLY
(
SELECT
@chars + 2
-
NULLIF(
CHARINDEX(
' ',
REVERSE(
SUBSTRING(
parts.address + ' ',
parts.next,
@chars + 1
)
)
)
, 0
)
)
last_space(pos)
WHERE
parts.next <= parts.length
)
SELECT
*, len(fragment) AS chars
FROM
parts
WHERE
next > 1
ORDER BY
address,
next
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=acd11f2bc73e5036bd82498ecf14b08f
Split value into multiple rows
You are on the right track with the table of numbers. You should start by adding more rows, so it matches (or exceeds) the maximum possible number of elements in a CSV list.
Then, you can use a join condition to generate the relevant number of rows per name only
select
t.*,
substring_index(substring_index(t.name, ';', n.n), ';', -1) name
from numbers n
inner join testLocation t
on n <= length(t.name) - length(replace(t.name, ';', '')) + 1
Demo on DB Fiddle (I expanded the numbers to 8
):
id | State | name | name
-: | :------ | :-------------------------------------- | :-----------
1 | Alabama | Birmingham;Huntsville;Mobile;Montgomery | Birmingham
1 | Alabama | Birmingham;Huntsville;Mobile;Montgomery | Huntsville
1 | Alabama | Birmingham;Huntsville;Mobile;Montgomery | Mobile
1 | Alabama | Birmingham;Huntsville;Mobile;Montgomery | Montgomery
2 | Florida | Tampa;Jacksonville;Destin | Tampa
2 | Florida | Tampa;Jacksonville;Destin | Jacksonville
2 | Florida | Tampa;Jacksonville;Destin | Destin
Note, that, as commented already by others, storing CSV lists in a database is bad practice and should almost always be avoided. Recommended related reading: Is storing a delimited list in a database column really that bad?
SQL server split string into columns by delimiter (dynamic length)
An xml-based solution
declare @tmp table (STRING varchar(500))
insert into @tmp
values
('AA.0.HJ')
,('AABBCC.099.0')
,('0.91.JAH21')
;WITH Splitted
AS (
SELECT STRING
,CAST('<x>' + REPLACE(STRING, '.', '</x><x>') + '</x>' AS XML) AS Parts
FROM @tmp
)
SELECT STRING
,Parts.value(N'/x[1]', 'varchar(50)') AS [First]
,Parts.value(N'/x[2]', 'varchar(50)') AS [Second]
,Parts.value(N'/x[3]', 'varchar(50)') AS [Third]
FROM Splitted;
Output:
Related Topics
Find First Non-Null Values for Multiple Columns
SQL Server: Any Equivalent of Strpos()
Oracle Insert Failure:Not a Valid Month
Rails, Ransack: How to Search Habtm Relationship for "All" Matches Instead of "Any"
Using Pivot to Flip Data from Wide to Tall
Select the First 3 Rows of Each Table in a Database
SQL - Use a Reference of a Cte to Another Cte
Native Insert Query in Hibernate + Spring Data
How to Use a SQL Select Statement with Access Vba
Query Featuring Outer Joins Behaves Differently in Oracle 12C
Powershell SQL Select Output to Variable
What Is the Equivalent of Regexp_Substr in MySQL
SQL - Should I Use a Junction Table or Not
Merge Overlapping Time Intervals, How