Produce DISTINCT values in STRING_AGG
Here is one way to do it.
Since you want the distinct counts as well, it can be done simply by grouping the rows twice. The first GROUP BY
will remove duplicates, the second GROUP BY
will produce the final result.
WITH
Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
,CTE_Animals
AS
(
SELECT
State, City, Siting
FROM Sitings
GROUP BY State, City, Siting
)
SELECT
State, City, COUNT(1) AS [# Of Sitings], STRING_AGG(Siting,',') AS Animals
FROM CTE_Animals
GROUP BY State, City
ORDER BY
State
,City
;
Result
+---------+-----------+--------------+----------+
| State | City | # Of Sitings | Animals |
+---------+-----------+--------------+----------+
| Arizona | Flagstaff | 1 | dog |
| Arizona | Phoenix | 2 | bird,dog |
| Florida | Orlando | 2 | bird,dog |
+---------+-----------+--------------+----------+
If you are still getting an error message about exceeding 8000 characters, then cast the values to varchar(max)
before STRING_AGG
.
Something like
STRING_AGG(CAST(Siting AS varchar(max)),',') AS Animals
Get unique values using STRING_AGG in SQL Server
Use the DISTINCT
keyword in a subquery to remove duplicates before combining the results: SQL Fiddle
SELECT
ProjectID
,STRING_AGG(value, ',') WITHIN GROUP (ORDER BY value) AS
NewField
from (
select distinct ProjectId, newId.value
FROM [dbo].[Data] WITH(NOLOCK)
CROSS APPLY STRING_SPLIT([bID],';') AS newID
WHERE newID.value IN ( 'O95833' , 'Q96NY7-2' )
) x
GROUP BY ProjectID
ORDER BY ProjectID
How to use DISTINCT with string_agg() and to_timestamp()?
DISTINCT
is neither a function nor an operator but an SQL construct or syntax element. Can be added as leading keyword to the whole SELECT
list or within most aggregate functions.
Add it to the SELECT
list (consisting of a single column in your case) in a subselect where you can also cheaply add ORDER BY
. Should yield best performance:
SELECT string_agg(to_char(the_date, 'DD-MM-YYYY'), ',') AS the_dates
FROM (
SELECT DISTINCT to_timestamp(from_date / 1000)::date AS the_date
FROM trn_day_bookkeeping_income_expense
WHERE enterprise_id = 5134650
ORDER BY the_date -- assuming this is the order you want
) sub;
First generate dates (multiple distinct values may result in the same date!).
Then the DISTINCT
step (or GROUP BY
).
(While being at it, optionally add ORDER BY
.)
Finally aggregate.
An index on (enterprise_id)
or better (enterprise_id, from_date)
should greatly improve performance.
Ideally, timestamps are stored as type timestamp
to begin with. Or timestamptz
. See:
- Ignoring time zones altogether in Rails and PostgreSQL
DISTINCT ON
is a Postgres-specific extension of standard SQL DISTINCT
functionality. See:
- Select first row in each GROUP BY group?
Alternatively, you could also add DISTINCT
(and ORDER BY
) to the aggregate function string_agg()
directly:
SELECT string_agg(DISTINCT to_char(to_timestamp(from_date / 1000), 'DD-MM-YYYY'), ',' ORDER BY to_char(to_timestamp(from_date / 1000), 'DD-MM-YYYY')) AS the_dates
FROM trn_day_bookkeeping_income_expense
WHERE enterprise_id = 5134650
But that would be ugly, hard to read and maintain, and more expensive. (Test with EXPLAIN ANALYZE
).
SQL Server; How to incorporate unique values from STRING_AGG?
Just put it in a subquery with DISTINCT
SELECT
#fact1.dim1Key,
#fact1.factvalue1,
#fact1.groupKey,
#dim1.attributeTwo,
#dim1.attributeThree,
ISNULL(#dim2.attributeOne, '<missing>')
FROM #fact1
JOIN #dim1 ON #dim1.dim1key = #fact1.dim1key
CROSS APPLY (
SELECT
attributeOne = STRING_AGG(ISNULL(d2.attributeOne, '<missing>'), ', ') WITHIN GROUP (ORDER BY d2.attributeOne)
FROM (
SELECT DISTINCT
#dim2.attributeOne
FROM #bridge b
JOIN #dim2 ON #dim2.dim2key = b.dim2key
WHERE b.groupKey = #fact1.groupKey
) d2
) #dim2
Distinct inside String_Agg
Your GROUP BY
should have the unaggregated columns in the SELECT
. STRING_AGG()
is an aggregation function. So you probably want:
select f.FagID, FagNavn, STRING_AGG(l.LaererID, ',') as Lærer
from Fag$ f join
Lektion$ l
on f.FagID = l.FagID
group by f.FagID, FagNavn;
If you still duplicates (which seems unlikely with an id
, then you need to use a subquery of some sort to remove the duplicates.
Find and concatenate all & distinct values in one query
I used one column to illustrate the solution but you got the idea:
select id,
STRING_AGG(FirstEvent,';') as FirstEvent
, STRING_AGG(case when rw = 1 then FirstEvent else null end,';') as allevents
from (
select * , row_number() over (partition by id,firstevent order by id) rw
from xx
) t
group by t.id
Getting a distinct comma-delimited string in T-SQL
Like this:
with q as
(
SELECT * FROM (VALUES
(1, 'Alice', 'Alice'),
(2, 'Robert', 'Robert'),
(3, 'Robert', 'Bob'),
(4, 'Richard', 'Rich'),
(5, 'Richard', 'Rick'),
(6, 'Richard', 'Dick')
) tmp (ID, RealName, NickName)
)
select RealName,
string_agg(case when RealName <> NickName then NickName else null end, ', ') AKA
from q
group by RealName
outputs
RealName AKA
-------- ---------------------
Alice NULL
Richard Rick, Rich, Dick
Robert Bob
Related Topics
How to Use System Username Directly in Ms Access Query
What Is It Exactly a Blob in a Dbms Context
String Concatenation Does Not Work in SQLite
Changing the Commandtimeout in SQL Management Studio
Query Runs Slow with Date Expression, But Fast with String Literal
Why Using a Udf in a SQL Query Leads to Cartesian Product
How to Get Export Output in "Real" CSV Format in SQL Server Management Studio
Select Rows Where Column Value Has Changed
How to Create a Pivot Query in SQL Server Without Aggregate Function
Delete Parent If It's Not Referenced by Any Other Child
Varchar Variable Is Not Working in Where Clause
Slow Simple Update Query on Postgresql Database with 3 Million Rows
How to Find Rows in One Table That Have No Corresponding Row in Another Table
Sqlite Select with Condition on Date
SQL Where Joined Set Must Contain All Values But May Contain More