Get the distinct sum of a joined table column
To get the result without a subquery, you have to resort to advanced window function trickery:
SELECT sum(count(*)) OVER () AS tickets_count
, sum(min(a.revenue)) OVER () AS attendees_revenue
FROM tickets t
JOIN attendees a ON a.id = t.attendee_id
GROUP BY t.attendee_id
LIMIT 1;
How does it work?
The key to understanding this is the sequence of events in the query:
aggregate functions -> window functions -> DISTINCT -> LIMIT
More details:
- Best way to get result count before LIMIT was applied
Step by step:
- I GROUP BY t.attendee_id, which you would normally do in a subquery.
- Then I sum over the counts to get the total count of tickets. Not very efficient, but forced by your requirement. The aggregate function count(*) is wrapped in the window function sum( ... ) OVER () to arrive at the not-so-common expression: sum(count(*)) OVER ().
- And sum the minimum revenue per attendee to get the sum without duplicates. You could also use max() or avg() instead of min() to the same effect, as revenue is guaranteed to be the same for every row per attendee.
- This could be simpler if DISTINCT was allowed in window functions, but PostgreSQL has not (yet) implemented this feature. Per documentation: "Aggregate window functions, unlike normal aggregate functions, do not allow DISTINCT or ORDER BY to be used within the function argument list."
- Final step is to get a single row. This could be done with DISTINCT (SQL standard) since all rows are the same. LIMIT 1 will be faster, though. Or the SQL-standard form FETCH FIRST 1 ROWS ONLY.
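To see why grouping per attendee matters, here is a minimal sketch using Python's standard-library sqlite3 module (an assumption for the sake of a runnable demo; the answer itself targets PostgreSQL). The sample rows are hypothetical. It contrasts a naive sum over the join, which counts an attendee's revenue once per ticket, with the subquery form that the query above is designed to avoid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue INTEGER);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, attendee_id INTEGER);
    INSERT INTO attendees VALUES (1, 100), (2, 50);
    -- hypothetical data: attendee 1 holds two tickets, attendee 2 holds one
    INSERT INTO tickets VALUES (1, 1), (2, 1), (3, 2);
""")

# Naive version: the join repeats attendee 1, so revenue 100 is summed twice.
naive = conn.execute("""
    SELECT count(*), sum(a.revenue)
    FROM tickets t
    JOIN attendees a ON a.id = t.attendee_id
""").fetchone()

# Subquery version: collapse to one row per attendee first, then sum.
deduplicated = conn.execute("""
    SELECT sum(ticket_count), sum(revenue)
    FROM (
        SELECT count(*) AS ticket_count, min(a.revenue) AS revenue
        FROM tickets t
        JOIN attendees a ON a.id = t.attendee_id
        GROUP BY t.attendee_id
    ) AS per_attendee
""").fetchone()

print(naive)         # (3, 250)
print(deduplicated)  # (3, 150)
```

Both versions agree on the ticket count, but only the per-attendee grouping yields the revenue without duplicates.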
SQL SUM values by DISTINCT column after JOIN
If a city has no donation records, the donations column shows NULL. To convert it to 0, use COALESCE(SUM(d.amount), 0).
Schema and insert statements:
create table members(id int, name varchar(50), city varchar(50));
insert into members values(1, 'John' ,'Boston');
insert into members values(2, 'Maria' ,'Boston');
insert into members values(3, 'Steve' ,'London');
insert into members values(4, 'Oscar' ,'London');
insert into members values(5, 'Ben' ,'Singapore');
create table donations(member_id int, amount int);
insert into donations values(1, 100);
insert into donations values(1, 150);
insert into donations values(2, 300 );
insert into donations values(3, 50);
insert into donations values(3, 100);
insert into donations values(3, 50);
insert into donations values(4, 75);
insert into donations values(5, 200);
Query:
SELECT m.city, SUM(d.amount) AS total_donations
FROM members m LEFT JOIN
donations d
ON d.member_id = m.id
GROUP BY m.city
ORDER BY city;
Output:
| city | total_donations |
|---|---|
| Boston | 550 |
| London | 275 |
| Singapore | 200 |
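The COALESCE variant mentioned above only makes a visible difference when some city has no donations, which is not the case in the sample data. As a rough sketch, the following Python snippet (using the standard-library sqlite3 module rather than the original database, which is an assumption) loads the same rows plus one hypothetical member, 'Aiko' in 'Tokyo', who has donated nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INT, name VARCHAR(50), city VARCHAR(50));
    CREATE TABLE donations (member_id INT, amount INT);
    INSERT INTO members VALUES
        (1, 'John', 'Boston'), (2, 'Maria', 'Boston'),
        (3, 'Steve', 'London'), (4, 'Oscar', 'London'),
        (5, 'Ben', 'Singapore'),
        (6, 'Aiko', 'Tokyo');  -- hypothetical member without donations
    INSERT INTO donations VALUES
        (1, 100), (1, 150), (2, 300), (3, 50),
        (3, 100), (3, 50), (4, 75), (5, 200);
""")

# LEFT JOIN keeps Tokyo in the result; COALESCE turns its NULL sum into 0.
rows = conn.execute("""
    SELECT m.city, COALESCE(SUM(d.amount), 0) AS total_donations
    FROM members m
    LEFT JOIN donations d ON d.member_id = m.id
    GROUP BY m.city
    ORDER BY m.city
""").fetchall()

for city, total in rows:
    print(city, total)
# Boston 550
# London 275
# Singapore 200
# Tokyo 0
```

Without COALESCE, the Tokyo row would carry NULL (None in Python) instead of 0.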
Join tables then sum on distinct values
use AdventureWorks2012;
select
datename(dw, soh.OrderDate) as "Day",
SumLineTotal = SUM(sod.LineTotal),
SumOrderQty = SUM(sod.OrderQty)
from
Sales.SalesOrderDetail sod
inner join Sales.SalesOrderHeader soh on sod.SalesOrderID = soh.SalesOrderID
group by
datename(dw, soh.OrderDate);
Getting sum of a column that needs a distinct value from other column
I guess this is a job for a subquery. So let's take your problem step by step.
I'm trying to find all the rows in the balance column that are the same and have the same date,
This subquery gets you that, I believe. It gives the same result as SELECT DISTINCT, but it also counts the duplicated rows.
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
and then find the sum of the balance column.
Nest the subquery like this.
SELECT SUM(balance) summed_balance, date
FROM (
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
) subquery
GROUP BY date
If you only want to consider rows that actually have duplicates, change your subquery to
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
HAVING COUNT(*) > 1
Be careful here, though. You didn't tell us what you want to do, only how you want to do it. The way you described your problem calls for discarding duplicated data before doing the sums. Is that right? Do you want to discard data?
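To make the effect of discarding duplicates concrete, here is a small sketch in Python with the standard-library sqlite3 module. The table name balances, the column name entry_date, and all rows are hypothetical stand-ins for the asker's table, which is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balances (entry_date TEXT, balance INTEGER);
    INSERT INTO balances VALUES
        ('2021-03-01', 100),
        ('2021-03-01', 100),  -- exact duplicate of the row above
        ('2021-03-01', 40),
        ('2021-04-01', 70),
        ('2021-12-01', 999);  -- outside the date range, filtered out
""")

# The inner query collapses identical (entry_date, balance) pairs to one row;
# the outer query then sums the surviving balances per date.
rows = conn.execute("""
    SELECT entry_date, SUM(balance) AS summed_balance
    FROM (
        SELECT entry_date, balance, COUNT(*) AS num_same_rows
        FROM balances
        WHERE entry_date BETWEEN '2021-01-01' AND '2021-09-01'
        GROUP BY entry_date, balance
    ) AS deduplicated
    GROUP BY entry_date
    ORDER BY entry_date
""").fetchall()

print(rows)  # [('2021-03-01', 140), ('2021-04-01', 70)]
```

A plain SUM(balance) grouped by date would return 240 for 2021-03-01, so the duplicated 100 really is dropped. Whether that is the desired outcome is exactly the question raised above.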
SQL INNER JOIN of sum distinct values
Well, first of all, your sample data was wrong, so let's first create the right structure.
And please keep in mind for the future: if you share SQL scripts that create your sample data, you will get more answers.
-- declare tables
DECLARE @singers TABLE (singerID INT, name NVARCHAR(255), country NVARCHAR(255));
DECLARE @musics TABLE (musicID INT, singerID INT, songName NVARCHAR(255));
DECLARE @playlistInfo TABLE (singerID INT, musicID INT, listened INT);
-- Load sample data
INSERT INTO @singers VALUES (1, 'Adele', 'England'), (2, 'Mozart', 'Austria'), (3, 'DuaLipa', 'England')
INSERT INTO @musics VALUES (1, 1, 'Rolling in the Deep'), (2, 2, 'Symphony No 40'), (3, 3, 'One Kiss')
INSERT INTO @playlistInfo VALUES (1, 1, 25), (2, 2, 15), (3, 3, 20)
And then query our tables for the top 10 singers from England.
SELECT TOP 10
s.name as Singer, ISNULL(SUM(pl.listened), 0) as TotalListened
FROM
@singers s
LEFT JOIN @musics m ON m.singerID = s.singerID
LEFT JOIN @playlistInfo pl ON pl.musicID = m.musicID AND pl.singerID = m.singerID
-- LEFT JOIN keeps singers with 0 listens in the result; change it to JOIN to exclude them
WHERE
s.country = 'England'
GROUP BY
s.name
ORDER BY
SUM(pl.listened) DESC
A small extra, if you want the most listened songs instead:
-- get the most listened song within country
SELECT TOP 10
s.name as Singer, m.songName, ISNULL(SUM(pl.listened), 0) as TotalListened
FROM
@singers s
LEFT JOIN @musics m ON m.singerID = s.singerID
LEFT JOIN @playlistInfo pl ON pl.musicID = m.musicID AND pl.singerID = m.singerID
WHERE
s.country = 'England'
GROUP BY
s.name,
m.songName
ORDER BY
SUM(pl.listened) DESC
Get distinct values and sum their respective quantities
You can use group by:
select ItemNo, sum(Qty) as QtyTotal
from QueryOutput q
group by ItemNo;
You can replace QueryOutput with a query that produces your example table.
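As an illustration of that substitution, the sketch below puts a query in the place of QueryOutput as a derived table. It uses Python's standard-library sqlite3 module, and the order_lines table, its filter, and its rows are hypothetical, since the original example table is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (ItemNo TEXT, Qty INTEGER, cancelled INTEGER);
    INSERT INTO order_lines VALUES
        ('A1', 2, 0), ('B2', 5, 0), ('A1', 3, 0),
        ('B2', 4, 1);  -- cancelled line, excluded by the inner query
""")

# The parenthesised query plays the role of QueryOutput.
rows = conn.execute("""
    SELECT q.ItemNo, SUM(q.Qty) AS QtyTotal
    FROM (
        SELECT ItemNo, Qty
        FROM order_lines
        WHERE cancelled = 0
    ) AS q
    GROUP BY q.ItemNo
    ORDER BY q.ItemNo
""").fetchall()

print(rows)  # [('A1', 5), ('B2', 5)]
```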
MySql Join Tables With Sum Of A Column
If I'm understanding correctly, I believe the following should work.
The key is to use a LEFT JOIN, so that even a server with no matching record in the server_hit table still shows up in the final output, with a sum of 0.
SELECT s.server_id, s.server_name, s.server_url, s.category_id, c.category_name, IFNULL(SUM(sh.hit_count), 0) AS total_hit_count
FROM server s
INNER JOIN category c ON s.category_id = c.category_id
LEFT JOIN server_hit sh ON s.server_id = sh.server_id
GROUP BY s.server_id, s.server_name, s.server_url, s.category_id, c.category_name
Alternatively, use IF(EXISTS(...)) to handle the NULL case:
SELECT s.server_id, s.server_name, s.server_url, s.category_id, c.category_name,
IF(EXISTS(SELECT 1 FROM server_hit x WHERE x.server_id = s.server_id), SUM(sh.hit_count), 0) AS total_hit_count
FROM server s
INNER JOIN category c ON s.category_id = c.category_id
LEFT JOIN server_hit sh ON s.server_id = sh.server_id
GROUP BY s.server_id, s.server_name, s.server_url, s.category_id, c.category_name