How to Sum Distinct Rows

How do I SUM DISTINCT Rows?

If you just want an overall figure, wrap the DISTINCT query in a subquery and sum over it:

SELECT SUM(PopEstimate2005), SUM(EstimatesBase2000)
FROM (
    SELECT DISTINCT
        Zipcodes.CountyID,
        us_co_est2005_allData.PopEstimate2005,
        us_co_est2005_allData.EstimatesBase2000,
        Users_link_territory.userID
    FROM Zipcodes
    INNER JOIN Users_link_territory
        ON Zipcodes.CountyID = Users_link_territory.CountyID
    INNER JOIN us_co_est2005_allData
        ON Zipcodes.FIPS = us_co_est2005_allData.State
        AND Zipcodes.code = us_co_est2005_allData.County
    WHERE Users_link_territory.userID = 4
) AS foo
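
To see why the inner SELECT DISTINCT matters, here is a minimal sketch using Python's built-in sqlite3 module. The table `county_rows` and its values are made up for illustration; they stand in for the joined result above, where a county repeats once per zip code. It runs on SQLite, so treat it as an illustration of the idea rather than SQL Server behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE county_rows (county_id INTEGER, pop2005 INTEGER)")
# Hypothetical data: county 1 shows up three times (one row per zip code).
conn.executemany(
    "INSERT INTO county_rows VALUES (?, ?)",
    [(1, 100), (1, 100), (1, 100), (2, 50)],
)

# Summing the raw rows counts county 1 three times.
naive_total = conn.execute(
    "SELECT SUM(pop2005) FROM county_rows"
).fetchone()[0]

# Summing over a DISTINCT subquery counts each county once.
distinct_total = conn.execute(
    """
    SELECT SUM(pop2005)
    FROM (SELECT DISTINCT county_id, pop2005 FROM county_rows) AS foo
    """
).fetchone()[0]

print(naive_total, distinct_total)
```

Keeping the county key in the DISTINCT list is what stops two different counties with the same population from collapsing into one row.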

How to sum distinct rows in a pandas Dataframe

IIUC, you need:

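import pandas as pd
df = pd.DataFrame({'col1': list('aabcddd'), 'col2': [2, 2, 2, 4, 3, 3, 3], 'vote': [5, 5, 2, 1, 5, 5, 5]})  # sample data matching the output below
s = df.drop_duplicates(['col1','col2']).groupby('col2')['vote'].sum() #thanks @jez
df['aggrVote']=df.col2.map(s)  # map each row's col2 to its de-duplicated sum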
print(df)

  col1  col2  vote  aggrVote
0    a     2     5         7
1    a     2     5         7
2    b     2     2         7
3    c     4     1         1
4    d     3     5         5
5    d     3     5         5
6    d     3     5         5

Getting sum of a column that needs a distinct value from other column

I guess this is a job for a subquery. So let's take your problem step by step.

I'm trying to find all the rows in the balance column that are the same and have the same date,

This subquery gets you that, I believe. It gives the same result as SELECT DISTINCT, but it also counts the duplicated rows.

SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance

and then find the sum of the balance column.

Nest the subquery like this.

SELECT SUM(balance) summed_balance, date
FROM (
    SELECT COUNT(*) num_same_rows, balance, date
    FROM `table`
    WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
    GROUP BY date, balance
) subquery
GROUP BY date

If you only want to consider rows that actually have duplicates, change your subquery to

SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
HAVING COUNT(*) > 1

Be careful here, though. You didn't tell us what you want to do, only how you want to do it. The way you described your problem calls for discarding duplicated data before doing the sums. Is that right? Do you want to discard data?
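
If it helps to see what gets discarded, here is a small sketch with Python's built-in sqlite3 module. The table name `balances` and the sample rows are invented for the example, and the duplicates-only variant uses `HAVING COUNT(*) > 1` so that only groups with more than one row survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (date TEXT, balance REAL)")
# Hypothetical rows: the 10.0 balance on 2021-01-05 is duplicated.
conn.executemany(
    "INSERT INTO balances VALUES (?, ?)",
    [
        ("2021-01-05", 10.0),
        ("2021-01-05", 10.0),
        ("2021-01-05", 4.0),
        ("2021-02-01", 7.0),
    ],
)


def summed_by_date(having_clause=""):
    """Sum distinct (date, balance) pairs per date, optionally filtered."""
    sql = f"""
        SELECT date, SUM(balance) AS summed_balance
        FROM (
            SELECT COUNT(*) AS num_same_rows, balance, date
            FROM balances
            WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
            GROUP BY date, balance
            {having_clause}
        ) AS subquery
        GROUP BY date
        ORDER BY date
    """
    return conn.execute(sql).fetchall()


all_groups = summed_by_date()                      # every distinct pair counted once
dup_only = summed_by_date("HAVING COUNT(*) > 1")   # only pairs that were duplicated

print(all_groups)
print(dup_only)
```

In the first result the duplicated 10.0 contributes only once to its date's total, which is exactly the "discarding data" the warning above is about.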

SQL Server - select distinct rows and sum of duplicates

Check the basic syntax for GROUP BY:

SELECT MIN(ID), Name, SUM(Salary)
FROM Employee
GROUP BY Name

The interesting part here is that aggregate functions don't need to be at the end of the SELECT list, even though that is how the examples usually show them.
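
A quick way to try this out, sketched with Python's built-in sqlite3 module rather than SQL Server. The `Employee` rows below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER, Name TEXT, Salary INTEGER)")
# Hypothetical data: 'Ann' appears twice, so her rows should be merged.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Ann", 100), (2, "Ann", 50), (3, "Bob", 70)],
)

# One row per Name: the lowest ID of the group and the summed salary.
result = conn.execute(
    """
    SELECT MIN(ID), Name, SUM(Salary)
    FROM Employee
    GROUP BY Name
    ORDER BY Name
    """
).fetchall()

print(result)
```

Each name comes back once, with the aggregate columns sitting on either side of the grouping column.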

How to extract and sum distinct values from a column and create a column with the sum

We group by 'TYPE', take the unique 'SIZE' values within each group, and return their sum in summarise:

library(dplyr)
df1 %>%
group_by(TYPE) %>%
summarise(Sum = sum(unique(SIZE), na.rm = TRUE))

-output

# A tibble: 1 x 2
  TYPE        Sum
  <chr>     <dbl>
1 A     68409188.

data

df1 <- structure(list(TYPE = c("A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A"), SIZE = c(24522145.17, 35359867.65, 35359867.65, 35359867.65,
35359867.65, 35359867.65, 24522145.17, 35359867.65, 35359867.65,
8527174.786)), class = "data.frame", row.names = c(NA, -10L))
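
For comparison, the same "sum of unique values per group" logic can be sketched in plain Python. This is only an illustration of what `sum(unique(SIZE))` does per group, using the sample values above; it does not handle missing values the way `na.rm = TRUE` does.

```python
from collections import defaultdict
import math

# Same sample data as df1 above: (TYPE, SIZE) pairs.
sizes = [24522145.17, 35359867.65, 35359867.65, 35359867.65, 35359867.65,
         35359867.65, 24522145.17, 35359867.65, 35359867.65, 8527174.786]
rows = [("A", size) for size in sizes]

# Collect the distinct SIZE values for each TYPE.
unique_sizes = defaultdict(set)
for type_, size in rows:
    unique_sizes[type_].add(size)

# Sum the distinct values per group.
sums = {type_: math.fsum(values) for type_, values in unique_sizes.items()}

print(sums)
```

Only three distinct sizes remain for type "A", so the total is far smaller than the sum over all ten rows.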

SQL Server SUM() for DISTINCT records

Use count()

SELECT count(DISTINCT table_name.users)
FROM table_name
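
Note that COUNT(DISTINCT ...) tells you how many distinct values there are; it does not add them up. If a total over distinct values is what you are after, SUM(DISTINCT ...) is the matching aggregate. A small sketch with Python's built-in sqlite3 module shows the difference; the `amount` column and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (users TEXT, amount INTEGER)")
# Hypothetical rows: 'alice' is recorded twice with the same amount.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("alice", 10), ("alice", 10), ("bob", 5)],
)

distinct_users, distinct_sum, plain_sum = conn.execute(
    """
    SELECT COUNT(DISTINCT users),  -- number of different users
           SUM(DISTINCT amount),   -- each distinct amount added once
           SUM(amount)             -- every row added
    FROM table_name
    """
).fetchone()

print(distinct_users, distinct_sum, plain_sum)
```

Keep in mind that SUM(DISTINCT amount) de-duplicates on the value alone, so two different users with the same amount would also be counted once.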
