Get unique values using STRING_AGG in SQL Server
Use the DISTINCT keyword in a subquery to remove duplicates before combining the results:
SELECT
    ProjectID
    ,STRING_AGG(value, ',') WITHIN GROUP (ORDER BY value) AS NewField
FROM (
    SELECT DISTINCT ProjectID, newID.value
    FROM [dbo].[Data] WITH(NOLOCK)
    CROSS APPLY STRING_SPLIT([bID],';') AS newID
    WHERE newID.value IN ( 'O95833' , 'Q96NY7-2' )
) x
GROUP BY ProjectID
ORDER BY ProjectID
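To try the pattern without the real table, here is a minimal self-contained sketch. The Data CTE and its three sample rows are hypothetical stand-ins for [dbo].[Data]; it assumes SQL Server 2017 or later, where both STRING_SPLIT and STRING_AGG are available.

```sql
-- Hypothetical sample data standing in for [dbo].[Data].
WITH Data AS (
    SELECT * FROM (VALUES
        (1, 'O95833;Q96NY7-2;O95833'),  -- O95833 appears twice in one row
        (1, 'O95833'),                  -- and again in a second row
        (2, 'Q96NY7-2;P12345')
    ) AS v (ProjectID, bID)
)
SELECT
    x.ProjectID,
    STRING_AGG(x.value, ',') WITHIN GROUP (ORDER BY x.value) AS NewField
FROM (
    -- DISTINCT collapses repeated (ProjectID, value) pairs before aggregation.
    SELECT DISTINCT d.ProjectID, s.value
    FROM Data AS d
    CROSS APPLY STRING_SPLIT(d.bID, ';') AS s
    WHERE s.value IN ('O95833', 'Q96NY7-2')
) AS x
GROUP BY x.ProjectID
ORDER BY x.ProjectID;
```

With this sample data, project 1 should come back as O95833,Q96NY7-2 and project 2 as Q96NY7-2, each value listed once.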
Produce DISTINCT values in STRING_AGG
Here is one way to do it.
Since you want the distinct counts as well, it can be done simply by grouping the rows twice. The first GROUP BY will remove duplicates, the second GROUP BY will produce the final result.
WITH
Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
,CTE_Animals
AS
(
SELECT
State, City, Siting
FROM Sitings
GROUP BY State, City, Siting
)
SELECT
State, City, COUNT(1) AS [# Of Sitings], STRING_AGG(Siting,',') AS Animals
FROM CTE_Animals
GROUP BY State, City
ORDER BY
State
,City
;
Result
+---------+-----------+--------------+----------+
| State | City | # Of Sitings | Animals |
+---------+-----------+--------------+----------+
| Arizona | Flagstaff | 1 | dog |
| Arizona | Phoenix | 2 | bird,dog |
| Florida | Orlando | 2 | bird,dog |
+---------+-----------+--------------+----------+
If you are still getting an error message about exceeding 8000 characters, then cast the values to varchar(max) before STRING_AGG. Something like
STRING_AGG(CAST(Siting AS varchar(max)),',') AS Animals
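Putting the pieces together, a possible variant of the query above is sketched here. It is an illustration rather than part of the original answer: the SitingCount column and the two output column names are assumptions, and the sample rows are cut down to one city. It keeps a per-animal count in the first GROUP BY so the second can report both the total and the distinct count, and it adds WITHIN GROUP because STRING_AGG does not guarantee an order without it.

```sql
WITH Sitings AS (
    SELECT * FROM (VALUES
        (1, 'Arizona', 'Phoenix', 'dog'),
        (2, 'Arizona', 'Phoenix', 'bird'),
        (3, 'Arizona', 'Phoenix', 'bird')
    ) AS F (ID, State, City, Siting)
),
CTE_Animals AS (
    -- First GROUP BY: one row per distinct animal per city, remembering how many rows it had.
    SELECT State, City, Siting, COUNT(*) AS SitingCount
    FROM Sitings
    GROUP BY State, City, Siting
)
SELECT
    State,
    City,
    SUM(SitingCount) AS [Total Sitings],     -- all rows, duplicates included
    COUNT(*)         AS [Distinct Animals],  -- rows left after deduplication
    -- Cast to varchar(max) to avoid the 8000-character limit; ORDER BY fixes the list order.
    STRING_AGG(CAST(Siting AS varchar(max)), ',') WITHIN GROUP (ORDER BY Siting) AS Animals
FROM CTE_Animals
GROUP BY State, City;
```

For the three sample rows this should return a single Phoenix row with 3 total sitings, 2 distinct animals, and the list bird,dog.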
SQL: How to get distinct values out of string_Agg() function?
You need to subquery it and group again. Note that DISTINCT is not a function: it acts over the whole result set, and is the same as grouping by all columns.
SELECT
ID
, string_agg(Code, ',') AS Code
, [Year]
FROM (
SELECT
p.ID
, PIT.Code AS Code
, year(PT.Date) AS Year
FROM fact.PreT PT
INNER JOIN dim.ProdIType PIT
ON PIT.ProdITypeSKey = PT.ProdITypeSKey
INNER JOIN dim.Proudct P
ON P.ProductSKey = pt.ProductSKey
WHERE p.ID = '15'
GROUP BY p.ID, year(PT.Date), PIT.Code
) p
GROUP BY p.ID, p.[Year];
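The equivalence between DISTINCT and grouping by every selected column can be checked on a small example. The following is a sketch only; the @Codes table variable and its rows are hypothetical and are not taken from the question's schema.

```sql
-- Hypothetical sample rows; the table variable and its columns are assumptions.
DECLARE @Codes TABLE (ID int, Code varchar(10), [Year] int);
INSERT INTO @Codes (ID, Code, [Year])
VALUES (15, 'A', 2020), (15, 'A', 2020), (15, 'B', 2020), (15, 'A', 2021);

-- DISTINCT applies to the whole select list...
SELECT DISTINCT ID, Code, [Year] FROM @Codes;

-- ...and returns the same rows as grouping by every selected column.
SELECT ID, Code, [Year] FROM @Codes GROUP BY ID, Code, [Year];

-- Grouping the deduplicated rows again gives one code list per ID and year.
SELECT d.ID, STRING_AGG(d.Code, ',') WITHIN GROUP (ORDER BY d.Code) AS Code, d.[Year]
FROM (
    SELECT ID, Code, [Year] FROM @Codes GROUP BY ID, Code, [Year]
) AS d
GROUP BY d.ID, d.[Year];
```

The first two queries should each return three rows, and the last one should return A,B for 2020 and A for 2021.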
Get unique value using STRING_AGG in SQL Server 2017
First get the unique values, then apply the string aggregate, as below:
;WITH CTE_UniqueValues AS
(
SELECT Reported_Name, Entry, MAX(ID) AS ID
FROM Table1
GROUP BY Reported_Name, Entry
)
SELECT T1.REPORTED_NAME, STRING_AGG(CAST(T1.ENTRY AS NVARCHAR(MAX)),',') AS Average_Str
FROM CTE_UniqueValues T1
INNER JOIN Table2 T2 ON T1.ID = T2.ProdID
WHERE T1.ENTRY like '%[A-Za-z]%'
GROUP BY T1.REPORTED_NAME
ORDER BY T1.REPORTED_NAME
SQL Server; How to incorporate unique values from STRING_AGG?
Just put it in a subquery with DISTINCT:
SELECT
#fact1.dim1Key,
#fact1.factvalue1,
#fact1.groupKey,
#dim1.attributeTwo,
#dim1.attributeThree,
ISNULL(#dim2.attributeOne, '<missing>')
FROM #fact1
JOIN #dim1 ON #dim1.dim1key = #fact1.dim1key
CROSS APPLY (
SELECT
attributeOne = STRING_AGG(ISNULL(d2.attributeOne, '<missing>'), ', ') WITHIN GROUP (ORDER BY d2.attributeOne)
FROM (
SELECT DISTINCT
#dim2.attributeOne
FROM #bridge b
JOIN #dim2 ON #dim2.dim2key = b.dim2key
WHERE b.groupKey = #fact1.groupKey
) d2
) #dim2
How to use DISTINCT with string_agg() and to_timestamp()?
DISTINCT is neither a function nor an operator but an SQL construct or syntax element. It can be added as a leading keyword to the whole SELECT list or within most aggregate functions.
Add it to the SELECT list (consisting of a single column in your case) in a subselect, where you can also cheaply add ORDER BY. This should yield the best performance:
SELECT string_agg(to_char(the_date, 'DD-MM-YYYY'), ',') AS the_dates
FROM (
SELECT DISTINCT to_timestamp(from_date / 1000)::date AS the_date
FROM trn_day_bookkeeping_income_expense
WHERE enterprise_id = 5134650
ORDER BY the_date -- assuming this is the order you want
) sub;
- First generate dates (multiple distinct values may result in the same date!).
- Then the DISTINCT step (or GROUP BY).
- (While you are at it, optionally add ORDER BY.)
- Finally aggregate.
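The steps above can be tried on inline data. This is a minimal PostgreSQL sketch: the three epoch-millisecond values are made up for illustration, and the session time zone is set to UTC so that the cast to date is predictable.

```sql
SET TIME ZONE 'UTC';  -- assumption: makes the timestamp-to-date cast deterministic

SELECT string_agg(to_char(the_date, 'DD-MM-YYYY'), ',') AS the_dates
FROM (
    -- Two of the three values fall on the same day, so DISTINCT keeps two dates.
    SELECT DISTINCT to_timestamp(from_date / 1000)::date AS the_date
    FROM (VALUES
        (1577836800000::bigint),  -- 2020-01-01 00:00 UTC
        (1577840400000),          -- 2020-01-01 01:00 UTC
        (1577923200000)           -- 2020-01-02 00:00 UTC
    ) AS t (from_date)
    ORDER BY the_date
) sub;
```

With these values the query should return 01-01-2020,02-01-2020.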
An index on (enterprise_id) or, better, (enterprise_id, from_date) should greatly improve performance.
Ideally, timestamps are stored as type timestamp to begin with, or timestamptz. See:
- Ignoring time zones altogether in Rails and PostgreSQL
DISTINCT ON is a Postgres-specific extension of standard SQL DISTINCT functionality. See:
- Select first row in each GROUP BY group?
Alternatively, you could also add DISTINCT (and ORDER BY) to the aggregate function string_agg() directly:
SELECT string_agg(DISTINCT to_char(to_timestamp(from_date / 1000), 'DD-MM-YYYY'), ',' ORDER BY to_char(to_timestamp(from_date / 1000), 'DD-MM-YYYY')) AS the_dates
FROM trn_day_bookkeeping_income_expense
WHERE enterprise_id = 5134650
But that would be ugly, hard to read and maintain, and more expensive. (Test with EXPLAIN ANALYZE.)
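For comparison, here is the in-aggregate form on a tiny inline data set (a PostgreSQL sketch; the animal column and its values are hypothetical). When DISTINCT is used inside an aggregate, PostgreSQL requires the ORDER BY expressions to appear in the argument list, which is why the longer query above has to repeat the whole to_char(...) expression.

```sql
-- DISTINCT and ORDER BY placed directly inside the aggregate call.
SELECT string_agg(DISTINCT animal, ',' ORDER BY animal) AS animals
FROM (VALUES ('dog'), ('bird'), ('bird')) AS t (animal);
```

This should return bird,dog, with the duplicate bird removed.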