Sql: Aggregating Strings Together

Optimal way to concatenate/aggregate strings

SOLUTION

The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact SQL, which should work fine in Azure.

;WITH Partitioned AS
(
SELECT
ID,
Name,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
COUNT(*) OVER (PARTITION BY ID) AS NameCount
FROM dbo.SourceTable
),
Concatenated AS
(
SELECT
ID,
CAST(Name AS nvarchar) AS FullName,
Name,
NameNumber,
NameCount
FROM Partitioned
WHERE NameNumber = 1

UNION ALL

SELECT
P.ID,
CAST(C.FullName + ', ' + P.Name AS nvarchar),
P.Name,
P.NameNumber,
P.NameCount
FROM Partitioned AS P
INNER JOIN Concatenated AS C
ON P.ID = C.ID
AND P.NameNumber = C.NameNumber + 1
)
SELECT
ID,
FullName
FROM Concatenated
WHERE NameNumber = NameCount

EXPLANATION

The approach boils down to three steps:

  1. Number the rows using OVER and PARTITION grouping and ordering them as needed for the concatenation. The result is Partitioned CTE. We keep counts of rows in each partition to filter the results later.

  2. Using recursive CTE (Concatenated) iterate through the row numbers (NameNumber column) adding Name values to FullName column.

  3. Filter out all results but the ones with the highest NameNumber.

Please keep in mind that in order to make this query predictable one has to define both grouping (for example, in your scenario rows with the same ID are concatenated) and sorting (I assumed that you simply sort the string alphabetically before concatenation).

I've quickly tested the solution on SQL Server 2012 with the following data:

INSERT dbo.SourceTable (ID, Name)
VALUES
(1, 'Matt'),
(1, 'Rocks'),
(2, 'Stylus'),
(3, 'Foo'),
(3, 'Bar'),
(3, 'Baz')

The query result:

ID          FullName
----------- ------------------------------
2 Stylus
3 Bar, Baz, Foo
1 Matt, Rocks

How to use GROUP BY to concatenate strings in SQL Server?

No CURSOR, WHILE loop, or User-Defined Function needed.

Just need to be creative with FOR XML and PATH.

[Note: This solution only works on SQL 2005 and later. Original question didn't specify the version in use.]

CREATE TABLE #YourTable ([ID] INT, [Name] CHAR(1), [Value] INT)

INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (1,'A',4)
INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (1,'B',8)
INSERT INTO #YourTable ([ID],[Name],[Value]) VALUES (2,'C',9)

SELECT
[ID],
STUFF((
SELECT ', ' + [Name] + ':' + CAST([Value] AS VARCHAR(MAX))
FROM #YourTable
WHERE (ID = Results.ID)
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS NameValues
FROM #YourTable Results
GROUP BY ID

DROP TABLE #YourTable

SQL: Aggregating strings together

WITH Data AS (
SELECT 1 UserId, 'A' Code
UNION ALL
SELECT 1, 'C5'
UNION ALL
SELECT 1, 'X'
UNION ALL
SELECT 2, 'V3'
UNION ALL
SELECT 3, 'B'
UNION ALL
SELECT 3, 'D'
UNION ALL
SELECT 3, NULL
UNION ALL
SELECT 3, 'F4'
UNION ALL
SELECT 4, NULL
)
SELECT U.UserId, STUFF((
SELECT ','+Code FROM Data WHERE Data.UserID = U.UserID FOR XML PATH('')
), 1, 1, '') Code
FROM (SELECT DISTINCT UserID FROM Data) U

Just replace the Data CTE with your table name and you're done.

Does T-SQL have an aggregate function to concatenate strings?

for SQL Server 2017 and up use:

STRING_AGG()

set nocount on;
declare @YourTable table (RowID int, HeaderValue int, ChildValue varchar(5))
insert into @YourTable VALUES (1,1,'CCC')
insert into @YourTable VALUES (2,2,'B<&>B')
insert into @YourTable VALUES (3,2,'AAA')
insert into @YourTable VALUES (4,3,'<br>')
insert into @YourTable VALUES (5,3,'A & Z')
set nocount off
SELECT
t1.HeaderValue
,STUFF(
(SELECT
', ' + t2.ChildValue
FROM @YourTable t2
WHERE t1.HeaderValue=t2.HeaderValue
ORDER BY t2.ChildValue
FOR XML PATH(''), TYPE
).value('.','varchar(max)')
,1,2, ''
) AS ChildValues
FROM @YourTable t1
GROUP BY t1.HeaderValue

SELECT
HeaderValue, STRING_AGG(ChildValue,', ')
FROM @YourTable
GROUP BY HeaderValue

OUTPUT:

HeaderValue 
----------- -------------
1 CCC
2 B<&>B, AAA
3 <br>, A & Z

(3 rows affected)

for SQL Server 2005 and up to 2016, you need to do something like this:

--Concatenation with FOR XML and eleminating control/encoded character expansion "& < >"
set nocount on;
declare @YourTable table (RowID int, HeaderValue int, ChildValue varchar(5))
insert into @YourTable VALUES (1,1,'CCC')
insert into @YourTable VALUES (2,2,'B<&>B')
insert into @YourTable VALUES (3,2,'AAA')
insert into @YourTable VALUES (4,3,'<br>')
insert into @YourTable VALUES (5,3,'A & Z')
set nocount off
SELECT
t1.HeaderValue
,STUFF(
(SELECT
', ' + t2.ChildValue
FROM @YourTable t2
WHERE t1.HeaderValue=t2.HeaderValue
ORDER BY t2.ChildValue
FOR XML PATH(''), TYPE
).value('.','varchar(max)')
,1,2, ''
) AS ChildValues
FROM @YourTable t1
GROUP BY t1.HeaderValue

OUTPUT:

HeaderValue ChildValues
----------- -------------------
1 CCC
2 AAA, B<&>B
3 <br>, A & Z

(3 row(s) affected)

Also, watch out, not all FOR XML PATH concatenations will properly handle XML special characters like my above example will.

How to concatenate text from multiple rows into a single text string in SQL Server

If you are on SQL Server 2017 or Azure, see Mathieu Renda answer.

I had a similar issue when I was trying to join two tables with one-to-many relationships. In SQL 2005 I found that XML PATH method can handle the concatenation of the rows very easily.

If there is a table called STUDENTS

SubjectID       StudentName
---------- -------------
1 Mary
1 John
1 Sam
2 Alaina
2 Edward

Result I expected was:

SubjectID       StudentName
---------- -------------
1 Mary, John, Sam
2 Alaina, Edward

I used the following T-SQL:

SELECT Main.SubjectID,
LEFT(Main.Students,Len(Main.Students)-1) As "Students"
FROM
(
SELECT DISTINCT ST2.SubjectID,
(
SELECT ST1.StudentName + ',' AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH (''), TYPE
).value('text()[1]','nvarchar(max)') [Students]
FROM dbo.Students ST2
) [Main]

You can do the same thing in a more compact way if you can concat the commas at the beginning and use substring to skip the first one so you don't need to do a sub-query:

SELECT DISTINCT ST2.SubjectID, 
SUBSTRING(
(
SELECT ','+ST1.StudentName AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH (''), TYPE
).value('text()[1]','nvarchar(max)'), 2, 1000) [Students]
FROM dbo.Students ST2

How to concatenate strings of a string field in a PostgreSQL 'group by' query?

PostgreSQL 9.0 or later:

Modern Postgres (since 2010) has the string_agg(expression, delimiter) function which will do exactly what the asker was looking for:

SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;

Postgres 9 also added the ability to specify an ORDER BY clause in any aggregate expression; otherwise you have to order all your results or deal with an undefined order. So you can now write:

SELECT company_id, string_agg(employee, ', ' ORDER BY employee)
FROM mytable
GROUP BY company_id;

PostgreSQL 8.4.x:

PostgreSQL 8.4 (in 2009) introduced the aggregate function array_agg(expression) which collects the values in an array. Then array_to_string() can be used to give the desired result:

SELECT company_id, array_to_string(array_agg(employee), ', ')
FROM mytable
GROUP BY company_id;

PostgreSQL 8.3.x and older:

When this question was originally posed, there was no built-in aggregate function to concatenate strings. The simplest custom implementation (suggested by Vajda Gabo in this mailing list post, among many others) is to use the built-in textcat function (which lies behind the || operator):

CREATE AGGREGATE textcat_all(
basetype = text,
sfunc = textcat,
stype = text,
initcond = ''
);

Here is the CREATE AGGREGATE documentation.

This simply glues all the strings together, with no separator. In order to get a ", " inserted in between them without having it at the end, you might want to make your own concatenation function and substitute it for the "textcat" above. Here is one I put together and tested on 8.3.12:

CREATE FUNCTION commacat(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;

This version will output a comma even if the value in the row is null or empty, so you get output like this:

a, b, c, , e, , g

If you would prefer to remove extra commas to output this:

a, b, c, e, g

Then add an ELSIF check to the function like this:

CREATE FUNCTION commacat_ignore_nulls(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSIF instr IS NULL OR instr = '' THEN
RETURN acc;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;

String aggregate using a variable

From nvarchar concatenation / index / nvarchar(max) inexplicable behavior
"The correct behavior for an aggregate concatenation query is undefined."

How to use GROUP BY to concatenate strings in MySQL?

SELECT id, GROUP_CONCAT(name SEPARATOR ' ') FROM table GROUP BY id;

https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_group-concat

From the link above, GROUP_CONCAT: This function returns a string result with the concatenated non-NULL values from a group. It returns NULL if there are no non-NULL values.

How to aggregate and string concatenate at same time in Oracle?

Would this do?

SQL> with test (customer, product, amount) as
2 (select 'a', 'table', 500 from dual union all
3 select 'a', 'table', 300 from dual union all
4 select 'a', 'chair', 100 from dual union all
5 select 'b', 'rug' , 50 from dual union all
6 select 'b', 'chair', 200 from dual
7 )
8 select customer,
9 listagg (product, ', ') within group (order by null) product,
10 sum(sum_amount) amount
11 from (select customer, product, sum(amount) sum_amount
12 from test
13 group by customer, product
14 )
15 group by customer
16 order by customer;

C PRODUCT AMOUNT
- -------------------- ----------
a chair, table 900
b chair, rug 250

SQL>

Sql PIVOT and string concatenation aggregate

In order to get the result, first you should concatenate the values into the comma separated list.

I would use CROSS APPLY and FOR XML PATH:

SELECT distinct e.[Event Name],
e.[Resource Type],
LEFT(r.ResourceName , LEN(r.ResourceName)-1) ResourceName
FROM yourtable e
CROSS APPLY
(
SELECT r.[Resource Name] + ', '
FROM yourtable r
where e.[Event Name] = r.[Event Name]
and e.[Resource Type] = r.[Resource Type]
FOR XML PATH('')
) r (ResourceName)

See SQL Fiddle with Demo. The gives you result:

| EVENT NAME |   RESOURCE TYPE |                          RESOURCENAME |
------------------------------------------------------------------------
| Event 1 | Resource Type 1 | Resource 1, Resource 2 |
| Event 1 | Resource Type 2 | Resource 3, Resource 4 |
| Event 1 | Resource Type 3 | Resource 5, Resource 6, Resource 7 |
| Event 1 | Resource Type 4 | Resource 8 |
| Event 2 | Resource Type 2 | Resource 3 |
| Event 2 | Resource Type 3 | Resource 11, Resource 12, Resource 13 |
| Event 2 | Resource Type 4 | Resource 14 |
| Event 2 | Resource Type 5 | Resource 1, Resource 9, Resource 16 |

Then you will apply your PIVOT to this result:

SELECT [Event Name],
[Resource Type 1], [Resource Type 2],
[Resource Type 3], [Resource Type 4],
[Resource Type 5]
FROM
(
SELECT distinct e.[Event Name],
e.[Resource Type],
LEFT(r.ResourceName , LEN(r.ResourceName)-1) ResourceName
FROM yourtable e
CROSS APPLY
(
SELECT r.[Resource Name] + ', '
FROM yourtable r
where e.[Event Name] = r.[Event Name]
and e.[Resource Type] = r.[Resource Type]
FOR XML PATH('')
) r (ResourceName)
) src
pivot
(
max(ResourceName)
for [Resource Type] in ([Resource Type 1], [Resource Type 2],
[Resource Type 3], [Resource Type 4],
[Resource Type 5])
) piv

See SQL Fiddle with Demo. Your final result will then be:

| EVENT NAME |        RESOURCE TYPE 1 |        RESOURCE TYPE 2 |                       RESOURCE TYPE 3 | RESOURCE TYPE 4 |                     RESOURCE TYPE 5 |
----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Event 1 | Resource 1, Resource 2 | Resource 3, Resource 4 | Resource 5, Resource 6, Resource 7 | Resource 8 | (null) |
| Event 2 | (null) | Resource 3 | Resource 11, Resource 12, Resource 13 | Resource 14 | Resource 1, Resource 9, Resource 16 |


Related Topics



Leave a reply



Submit