Tsql Aggregate String for Group By

How to use GROUP BY to concatenate strings in SQL Server?

No CURSOR, WHILE loop, or User-Defined Function needed.

Just need to be creative with FOR XML and PATH.

[Note: This solution only works on SQL 2005 and later. Original question didn't specify the version in use.]

CREATE TABLE #YourTable ([ID] INT, [Name] CHAR(1), [Value] INT)

INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (1,'A',4)
INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (1,'B',8)
INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (2,'C',9)

SELECT
[ID],
STUFF((
SELECT ', ' + [Name] + ':' + CAST([Value] AS VARCHAR(MAX))
FROM #YourTable
WHERE (ID = Results.ID)
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS NameValues
FROM #YourTable Results
GROUP BY ID

DROP TABLE #YourTable

Optimal way to concatenate/aggregate strings

SOLUTION

The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact SQL, which should work fine in Azure.

;WITH Partitioned AS
(
SELECT
ID,
Name,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
COUNT(*) OVER (PARTITION BY ID) AS NameCount
FROM dbo.SourceTable
),
Concatenated AS
(
SELECT
ID,
CAST(Name AS nvarchar) AS FullName,
Name,
NameNumber,
NameCount
FROM Partitioned
WHERE NameNumber = 1

UNION ALL

SELECT
P.ID,
CAST(C.FullName + ', ' + P.Name AS nvarchar),
P.Name,
P.NameNumber,
P.NameCount
FROM Partitioned AS P
INNER JOIN Concatenated AS C
ON P.ID = C.ID
AND P.NameNumber = C.NameNumber + 1
)
SELECT
ID,
FullName
FROM Concatenated
WHERE NameNumber = NameCount

EXPLANATION

The approach boils down to three steps:

  1. Number the rows using OVER and PARTITION grouping and ordering them as needed for the concatenation. The result is Partitioned CTE. We keep counts of rows in each partition to filter the results later.

  2. Using recursive CTE (Concatenated) iterate through the row numbers (NameNumber column) adding Name values to FullName column.

  3. Filter out all results but the ones with the highest NameNumber.

Please keep in mind that in order to make this query predictable one has to define both grouping (for example, in your scenario rows with the same ID are concatenated) and sorting (I assumed that you simply sort the string alphabetically before concatenation).

I've quickly tested the solution on SQL Server 2012 with the following data:

INSERT dbo.SourceTable (ID, Name)
VALUES
(1, 'Matt'),
(1, 'Rocks'),
(2, 'Stylus'),
(3, 'Foo'),
(3, 'Bar'),
(3, 'Baz')

The query result:

ID          FullName
----------- ------------------------------
2 Stylus
3 Bar, Baz, Foo
1 Matt, Rocks

SQL string aggregation based on a group by

Sybase supports the LIST() function. So:

select chrs, count(*) as cnt
from (select t.number, list(t.chr, '' order by t.chr) as chrs
from t
group by t.number
) n
group by chrs
order by count(*) desc;

Does T-SQL have an aggregate function to concatenate strings?

for SQL Server 2017 and up use:

STRING_AGG()

set nocount on;
declare @YourTable table (RowID int, HeaderValue int, ChildValue varchar(5))
insert into @YourTable VALUES (1,1,'CCC')
insert into @YourTable VALUES (2,2,'B<&>B')
insert into @YourTable VALUES (3,2,'AAA')
insert into @YourTable VALUES (4,3,'<br>')
insert into @YourTable VALUES (5,3,'A & Z')
set nocount off
SELECT
t1.HeaderValue
,STUFF(
(SELECT
', ' + t2.ChildValue
FROM @YourTable t2
WHERE t1.HeaderValue=t2.HeaderValue
ORDER BY t2.ChildValue
FOR XML PATH(''), TYPE
).value('.','varchar(max)')
,1,2, ''
) AS ChildValues
FROM @YourTable t1
GROUP BY t1.HeaderValue

SELECT
HeaderValue, STRING_AGG(ChildValue,', ')
FROM @YourTable
GROUP BY HeaderValue

OUTPUT:

HeaderValue 
----------- -------------
1 CCC
2 B<&>B, AAA
3 <br>, A & Z

(3 rows affected)

for SQL Server 2005 and up to 2016, you need to do something like this:

--Concatenation with FOR XML and eleminating control/encoded character expansion "& < >"
set nocount on;
declare @YourTable table (RowID int, HeaderValue int, ChildValue varchar(5))
insert into @YourTable VALUES (1,1,'CCC')
insert into @YourTable VALUES (2,2,'B<&>B')
insert into @YourTable VALUES (3,2,'AAA')
insert into @YourTable VALUES (4,3,'<br>')
insert into @YourTable VALUES (5,3,'A & Z')
set nocount off
SELECT
t1.HeaderValue
,STUFF(
(SELECT
', ' + t2.ChildValue
FROM @YourTable t2
WHERE t1.HeaderValue=t2.HeaderValue
ORDER BY t2.ChildValue
FOR XML PATH(''), TYPE
).value('.','varchar(max)')
,1,2, ''
) AS ChildValues
FROM @YourTable t1
GROUP BY t1.HeaderValue

OUTPUT:

HeaderValue ChildValues
----------- -------------------
1 CCC
2 AAA, B<&>B
3 <br>, A & Z

(3 row(s) affected)

Also, watch out, not all FOR XML PATH concatenations will properly handle XML special characters like my above example will.

Grouped string aggregation / LISTAGG for SQL Server

http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/

It is an interesting problem in Transact SQL, for which there are a number of solutions and considerable debate. How do you go about producing a summary result in which a distinguishing column from each row in each particular category is listed in a 'aggregate' column? A simple, and intuitive way of displaying data is surprisingly difficult to achieve. Anith Sen gives a summary of different ways, and offers words of caution over the one you choose...

Aggregate string values to a list

Based on this SO question, Redshift now has a LISTAGG() analytic function which you can use.

SELECT CUST_ID,
LISTAGG("ORDER", ', ')
WITHIN GROUP (ORDER BY "ORDER")
OVER (PARTITION BY CUST_ID) AS CUST_ID
FROM Table
ORDER BY CUST_ID

How to concatenate text from multiple rows into a single text string in SQL Server

If you are on SQL Server 2017 or Azure, see Mathieu Renda answer.

I had a similar issue when I was trying to join two tables with one-to-many relationships. In SQL 2005 I found that XML PATH method can handle the concatenation of the rows very easily.

If there is a table called STUDENTS

SubjectID       StudentName
---------- -------------
1 Mary
1 John
1 Sam
2 Alaina
2 Edward

Result I expected was:

SubjectID       StudentName
---------- -------------
1 Mary, John, Sam
2 Alaina, Edward

I used the following T-SQL:

SELECT Main.SubjectID,
LEFT(Main.Students,Len(Main.Students)-1) As "Students"
FROM
(
SELECT DISTINCT ST2.SubjectID,
(
SELECT ST1.StudentName + ',' AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH (''), TYPE
).value('text()[1]','nvarchar(max)') [Students]
FROM dbo.Students ST2
) [Main]

You can do the same thing in a more compact way if you can concat the commas at the beginning and use substring to skip the first one so you don't need to do a sub-query:

SELECT DISTINCT ST2.SubjectID, 
SUBSTRING(
(
SELECT ','+ST1.StudentName AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH (''), TYPE
).value('text()[1]','nvarchar(max)'), 2, 1000) [Students]
FROM dbo.Students ST2


Related Topics



Leave a reply



Submit