Aggregate Columns With Additional (Distinct) Filters

Aggregate columns with additional (distinct) filters

The aggregate FILTER clause in Postgres 9.4 or newer is shorter and faster:

SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

See: the manual, the Postgres Wiki, and the Depesz blog post.

In Postgres 9.3 (or any version) this is still shorter and faster than nested sub-selects or CASE expressions:

SELECT u.name
     , count(g.winner_id  > 0 OR NULL)    AS played
     , count(g.winner_id  = u.id OR NULL) AS won
     , count(g.winner_id <> u.id OR NULL) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;
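
To check that the two spellings agree, here is a minimal sketch that runs both against an in-memory SQLite database from Python. SQLite only stands in for Postgres here (it accepts FILTER from version 3.30 on), and the sample rows are invented. The sketch assumes an unplayed game has a NULL winner_id.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE games (player_1_id INT, player_2_id INT, winner_id INT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    -- the third game has no winner yet (NULL), so it is not counted as played
    INSERT INTO games VALUES (1, 2, 1), (1, 2, 2), (1, 2, NULL), (2, 1, 1);
""")

with_filter = """
    SELECT u.name
         , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
         , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
         , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
    FROM   games g
    JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
    GROUP  BY u.name
    ORDER  BY u.name
"""

# "condition OR NULL" is TRUE or NULL, never FALSE, and count() skips NULLs
with_or_null = """
    SELECT u.name
         , count(g.winner_id  > 0 OR NULL)    AS played
         , count(g.winner_id  = u.id OR NULL) AS won
         , count(g.winner_id <> u.id OR NULL) AS lost
    FROM   games g
    JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
    GROUP  BY u.name
    ORDER  BY u.name
"""

rows_filter = con.execute(with_filter).fetchall()
rows_or_null = con.execute(with_or_null).fetchall()
print(rows_filter)   # one row per user: (name, played, won, lost)
print(rows_or_null)
```

Both queries return the same rows. The trick in the second form is that a false comparison turns into NULL, which count() ignores.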

Details: "For absolute performance, is SUM faster or COUNT?"

SQL Filter two aggregate functions with different conditions

You can use conditional aggregation:

SELECT airline_name,
       (AVG(CASE WHEN fl_date BETWEEN '2017-07-24' and '2017-07-31' THEN arr_delay_new END) -
        AVG(CASE WHEN fl_date BETWEEN '2017-07-01' and '2017-07-23' THEN arr_delay_new END)
       ) as AVG_DIFF
FROM Flight_delays F JOIN
     Airlines A
     ON A.airline_id = F.airline_id
GROUP BY airline_name;

This assumes that arr_delay_new has a type that can be averaged. Some databases are reluctant to do averages on date/times directly.
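
A small sketch of why this works, run on SQLite from Python with made-up rows (the lower-case table names and the REAL column type are assumptions for the demo). A CASE without ELSE yields NULL for rows outside the date range, and AVG skips NULLs instead of treating them as zero.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE airlines (airline_id INT, airline_name TEXT);
    CREATE TABLE flight_delays (airline_id INT, fl_date TEXT, arr_delay_new REAL);
    INSERT INTO airlines VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO flight_delays VALUES
        (1, '2017-07-10', 10), (1, '2017-07-20', 20),  -- Alpha, Jul 1-23: avg 15
        (1, '2017-07-25', 30), (1, '2017-07-30', 50),  -- Alpha, Jul 24-31: avg 40
        (2, '2017-07-05',  5);                         -- Beta has no late-July rows
""")

avg_diff = con.execute("""
    SELECT airline_name,
           AVG(CASE WHEN fl_date BETWEEN '2017-07-24' AND '2017-07-31' THEN arr_delay_new END) -
           AVG(CASE WHEN fl_date BETWEEN '2017-07-01' AND '2017-07-23' THEN arr_delay_new END) AS avg_diff
    FROM   flight_delays f
    JOIN   airlines a ON a.airline_id = f.airline_id
    GROUP  BY airline_name
    ORDER  BY airline_name
""").fetchall()
print(avg_diff)
```

Note that an airline without any rows in one of the two periods gets a NULL difference, since AVG over only NULLs is NULL.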

Combine two queries to count distinct strings with different filters

Much faster and simpler with conditional aggregates using the aggregate FILTER clause:

SELECT source
     , count(DISTINCT sku) FILTER (WHERE product_gap = 'yes') AS yes_gap
     , count(DISTINCT sku) FILTER (WHERE product_gap = 'no')  AS no_gap
FROM   product_gaps
WHERE  ingestion_date <= '2021-05-25'
GROUP  BY source;

See "Aggregate columns with additional (distinct) filters" above.

Aside 1: DISTINCT is a key word, not a function. Don't add parentheses for the single column. distinct(sku) is short notation for DISTINCT ROW(sku). It happens to work because Postgres strips the ROW wrapper for a single column, but it's just noise.

Aside 2: product_gap should probably be boolean.
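
For engines without the FILTER clause, the same distinct counts can be had by moving the condition into a CASE inside count(DISTINCT ...). The sketch below compares both forms on SQLite (3.30 or later for FILTER) with invented sample rows; it is an illustration, not a benchmark.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE product_gaps (source TEXT, sku TEXT, product_gap TEXT, ingestion_date TEXT);
    INSERT INTO product_gaps VALUES
        ('web', 'A', 'yes', '2021-05-01'),
        ('web', 'A', 'yes', '2021-05-02'),  -- same sku again, counted once
        ('web', 'B', 'no',  '2021-05-03'),
        ('web', 'C', 'yes', '2021-06-01'),  -- excluded by the date filter
        ('app', 'A', 'no',  '2021-05-10'),
        ('app', 'B', 'no',  '2021-05-11');
""")

filter_rows = con.execute("""
    SELECT source
         , count(DISTINCT sku) FILTER (WHERE product_gap = 'yes') AS yes_gap
         , count(DISTINCT sku) FILTER (WHERE product_gap = 'no')  AS no_gap
    FROM   product_gaps
    WHERE  ingestion_date <= '2021-05-25'
    GROUP  BY source
    ORDER  BY source
""").fetchall()

# CASE without ELSE yields NULL for non-matching rows; count() ignores NULLs
case_rows = con.execute("""
    SELECT source
         , count(DISTINCT CASE WHEN product_gap = 'yes' THEN sku END) AS yes_gap
         , count(DISTINCT CASE WHEN product_gap = 'no'  THEN sku END) AS no_gap
    FROM   product_gaps
    WHERE  ingestion_date <= '2021-05-25'
    GROUP  BY source
    ORDER  BY source
""").fetchall()
print(filter_rows)
print(case_rows)
```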

Aggregating a table by multiple different column filters

Use conditional aggregation:

select user_id, max(value) as max_value,
       min(case when event_type = 'click' then value end) as min_click_value
from my_table
group by user_id;
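
A quick sketch of the query on SQLite via Python, with invented rows. It shows that the unconditional max() sees every row of the user, while the conditional min() only sees the 'click' rows and comes back NULL for a user without clicks.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE my_table (user_id INT, event_type TEXT, value INT);
    INSERT INTO my_table VALUES
        (1, 'click', 5), (1, 'click', 3), (1, 'view', 1), (1, 'view', 9),
        (2, 'view', 4);  -- user 2 never clicked
""")

rows = con.execute("""
    select user_id, max(value) as max_value,
           min(case when event_type = 'click' then value end) as min_click_value
    from my_table
    group by user_id
    order by user_id
""").fetchall()
print(rows)
```

For user 1 the smallest value overall is 1 (a 'view'), but min_click_value is 3, because the CASE hides non-click rows from min().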

Can we use same aggregate function more than once on same table field or column using Different filter conditions?

You can use an aggregate function with a CASE:

SELECT Date1,
  CC,
  BU, 
  SUM(case when mode = '011' then Amount end) Mode011,
  SUM(case when mode = '012' then Amount end) Mode012,
  SUM(case when mode = '013' then Amount end) Mode013,
  SUM(case when mode = '014' then Amount end) Mode014
FROM MainTable
GROUP BY CC,BU,Date1;

Or you can use the PIVOT function:

select date1, CC, BU,
  [011] Mode011, 
  [012] Mode012, 
  [013] Mode013, 
  [014] Mode014
from
(
  select date1, CC, BU, mode, amount
  from maintable
) src
pivot
(
  sum(amount)
  for mode in ([011], [012], [013], [014])
) piv
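
PIVOT is SQL Server syntax, so the sketch below only runs the portable CASE form, on SQLite from Python with invented rows (the column types are assumptions). It shows the shape of the result: one row per group, one column per mode, NULL where a group has no rows for that mode.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE MainTable (Date1 TEXT, CC TEXT, BU TEXT, mode TEXT, Amount INT);
    INSERT INTO MainTable VALUES
        ('2013-01-01', 'C1', 'B1', '011', 10),
        ('2013-01-01', 'C1', 'B1', '011',  5),
        ('2013-01-01', 'C1', 'B1', '012',  7),
        ('2013-01-02', 'C1', 'B1', '014',  3);
""")

pivoted = con.execute("""
    SELECT Date1, CC, BU,
      SUM(case when mode = '011' then Amount end) Mode011,
      SUM(case when mode = '012' then Amount end) Mode012,
      SUM(case when mode = '013' then Amount end) Mode013,
      SUM(case when mode = '014' then Amount end) Mode014
    FROM MainTable
    GROUP BY CC, BU, Date1
    ORDER BY Date1
""").fetchall()
for row in pivoted:
    print(row)
```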

Get conditional count and conditional DISTINCT count in a single SELECT

Use the aggregate FILTER clause. Then you can combine your count with DISTINCT:

SELECT s.logged_on::date AS login_date
     , count(*)                FILTER (WHERE s.device = 'mobile') AS mobile_count
     , count(DISTINCT user_id) FILTER (WHERE s.device = 'web') AS web_count
FROM   session_log s
JOIN   standard_users su USING (user_id)
GROUP  BY login_date;

See "Aggregate columns with additional (distinct) filters" above.

I also simplified your twisted formulation with LEFT JOIN and then IS NOT NULL. Boils down to a plain JOIN.

If referential integrity between session_log.user_id and standard_users.user_id is enforced with a FK constraint, and standard_users.user_id is defined UNIQUE or PK - as seems reasonable - you can drop the JOIN completely:

SELECT logged_on::date AS login_date
     , count(*)                FILTER (WHERE device = 'mobile') AS mobile_count
     , count(DISTINCT user_id) FILTER (WHERE device = 'web') AS web_count
FROM   session_log
GROUP  BY 1;
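
To see that dropping the JOIN does not change the result, here is a sketch on SQLite from Python. Assumptions: date(logged_on) replaces the Postgres cast logged_on::date, the rows are invented, and user_id is additionally declared NOT NULL, since a row with a NULL user_id would pass the FK check but be dropped by the JOIN.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    PRAGMA foreign_keys = ON;
    CREATE TABLE standard_users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE session_log (
        user_id   INT  NOT NULL REFERENCES standard_users (user_id),
        device    TEXT,
        logged_on TEXT
    );
    INSERT INTO standard_users VALUES (1), (2);
    INSERT INTO session_log VALUES
        (1, 'mobile', '2021-01-01 08:00:00'),
        (1, 'web',    '2021-01-01 09:00:00'),
        (1, 'web',    '2021-01-01 10:00:00'),  -- same user on web again
        (2, 'web',    '2021-01-01 11:00:00'),
        (2, 'mobile', '2021-01-02 12:00:00');
""")

with_join = con.execute("""
    SELECT date(s.logged_on) AS login_date
         , count(*)                FILTER (WHERE s.device = 'mobile') AS mobile_count
         , count(DISTINCT user_id) FILTER (WHERE s.device = 'web')    AS web_count
    FROM   session_log s
    JOIN   standard_users su USING (user_id)
    GROUP  BY login_date
    ORDER  BY login_date
""").fetchall()

without_join = con.execute("""
    SELECT date(logged_on) AS login_date
         , count(*)                FILTER (WHERE device = 'mobile') AS mobile_count
         , count(DISTINCT user_id) FILTER (WHERE device = 'web')    AS web_count
    FROM   session_log
    GROUP  BY 1
    ORDER  BY 1
""").fetchall()
print(with_join)
print(without_join)
```

On the first day two web sessions belong to the same user, so the plain count would be 3 while the DISTINCT count is 2.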

Athena array aggregate and filter multiple columns on condition

You should be able to do something like this:

SELECT
  uuid,
  SUM(fee.price) AS total_fee,
  SUM(fee.price) FILTER (WHERE fee.feetype = 'discount') AS total_discount,
  ARBITRARY(fee.title) FILTER (WHERE fee.feetype = 'discount') AS discount_type
FROM …
GROUP BY uuid

(I'm assuming the data column in your example is the same as the fee column in your query).

Aggregate functions support a FILTER clause that selects the rows to include in the aggregation. The same can be achieved with e.g. SUM(IF(fee.feetype = 'discount', fee.price, 0)), which is more compact but not as elegant.

The ARBITRARY aggregate function picks an arbitrary value from the group. I don't know if that's appropriate in your case, but I assume that there will only be one discount row per group. If there is more than one, you might want to use ARRAY_AGG with the DISTINCT clause (e.g. ARRAY_AGG(DISTINCT fee.title)) to get them all.
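
One difference between the two spellings is worth a sketch: for a group without any discount row, the FILTER form returns NULL while the form with a 0 fallback returns 0. The example runs on SQLite from Python rather than Athena, so CASE ... ELSE 0 stands in for IF(..., 0); the flat fees table and its rows are invented for the demo.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE fees (uuid TEXT, feetype TEXT, title TEXT, price INT);
    INSERT INTO fees VALUES
        ('a', 'base',     'Base fee', 100),
        ('a', 'discount', 'Promo',    -10),
        ('b', 'base',     'Base fee',  50);  -- group 'b' has no discount row
""")

totals = con.execute("""
    SELECT uuid
         , SUM(price)                                                AS total_fee
         , SUM(price) FILTER (WHERE feetype = 'discount')            AS discount_filter
         , SUM(CASE WHEN feetype = 'discount' THEN price ELSE 0 END) AS discount_else_0
    FROM   fees
    GROUP  BY uuid
    ORDER  BY uuid
""").fetchall()
print(totals)
```

Which of the two is preferable depends on whether "no discount" should read as missing or as zero downstream; wrapping the FILTER form in COALESCE(..., 0) gives the zero as well.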
