Group By and Count in PostgreSQL

GROUP BY and COUNT in PostgreSQL

I think you just need SELECT COUNT(DISTINCT post_id) FROM votes.

See "4.2.7. Aggregate Expressions" section in http://www.postgresql.org/docs/current/static/sql-expressions.html.

EDIT: Corrected my careless mistake per Erwin's comment.
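As a rough illustration of the difference between counting rows and counting distinct values, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. Only the votes table and the post_id column come from the answer above; the voter_id column and the sample rows are made up for the example.

```python
import sqlite3

def count_votes(rows):
    """Return (total_votes, distinct_posts) for (voter_id, post_id) pairs."""
    conn = sqlite3.connect(":memory:")
    # voter_id is a hypothetical column; only post_id appears in the answer.
    conn.execute("CREATE TABLE votes (voter_id INTEGER, post_id INTEGER)")
    conn.executemany("INSERT INTO votes VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT post_id) FROM votes"
    ).fetchone()
    conn.close()
    return result

# Five votes spread over three posts (made-up data).
sample_votes = [(1, 10), (2, 10), (3, 10), (1, 20), (2, 30)]
print(count_votes(sample_votes))
```

COUNT(*) counts every vote, while COUNT(DISTINCT post_id) counts each post once and skips NULL post_id values.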

PostgreSQL group by then count by value

status = 'new' is

  • true for all rows with status = 'new'
  • false for all rows with status <> 'new'
  • null for all rows with status is null.

COUNT( <expression> ) counts all non-null occurrences of the expression. This means COUNT(status = 'new') counts both 'new' and 'old' rows, as neither true nor false is null, when you only want to count 'new'. Use a CASE expression instead:

count(case when status = 'new' then 1 end)

which is short for

count(case when status = 'new' then 1 else null end)

or the same with SUM:

sum(case when status = 'new' then 1 else 0 end)

Some DBMSs (MySQL, for instance) treat true as 1 and false as 0. There you can even use:

sum(status = 'new')

In PostgreSQL (9.4 and later) you can also use the FILTER clause:

count(*) filter (where status = 'new')
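To make the difference between these variants concrete, here is a small sketch that runs them side by side. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption for the sake of a self-contained example), and the tickets table and its rows are hypothetical. The FILTER form is left out because older SQLite builds do not support it.

```python
import sqlite3

def count_new(statuses):
    """Return (count of comparison, count of CASE, sum of CASE) over the statuses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (status TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO tickets VALUES (?)", [(s,) for s in statuses])
    row = conn.execute("""
        SELECT COUNT(status = 'new'),                           -- counts every non-null comparison
               COUNT(CASE WHEN status = 'new' THEN 1 END),      -- counts only 'new'
               SUM(CASE WHEN status = 'new' THEN 1 ELSE 0 END)  -- sums 1 for each 'new'
        FROM tickets
    """).fetchone()
    conn.close()
    return row

# Two 'new' rows, one 'old' row, one row with a NULL status.
print(count_new(["new", "new", "old", None]))
```

With this data the plain comparison is counted three times (both 'new' rows and the 'old' row), while the two CASE forms give 2. One difference between the CASE forms: over zero rows, COUNT returns 0 but SUM returns NULL.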

Count on GROUP BY and also on total number of results

You can use an analytic/window function count without any partition:

SELECT users.id AS user_id,
       COUNT(*) AS houses_sold,
       COUNT(*) OVER() AS total_count -- count of rows returned by query
FROM users
JOIN users_with_house_permissions hp ON hp.user_id = users.id
LEFT JOIN houses ON houses.user_id = users.id AND houses.sold_at IS NOT NULL
GROUP BY users.id

It works like any other analytic function: it counts over a partition, but if no partition is specified, it counts over the entire dataset. In this case the COUNT(*) OVER() is evaluated after the grouping is done, so while COUNT(*) counts the number of rows in each group, COUNT(*) OVER() counts the number of groups in the dataset.

Someone else posted SUM(COUNT(*)) OVER(), which is effectively equivalent to counting the rows before they are grouped. Suppose you had a dataset of "who sold the house" that looked like this:

john
john
john
mary

4 houses have been sold, john sold 3, mary sold 1. There are 2 sales reps working for the agency.

SELECT name, COUNT(*) FROM ... GROUP BY name gives "john sold 3, mary sold 1" and results in a dataset of:

john, 3
mary, 1

If we were to SUM that count, we'd have 4, i.e. 3+1. This is effectively the count of houses before the grouping was done. SUM(COUNT(*)) OVER() is hence the count of rows we had before we did the group. It's important to bear in mind that the COUNT(*) belongs to the GROUP BY and will become an integer count that is then later SUMmed by the SUM OVER. It would probably be easier to see if we used a subquery:

SELECT name, the_count, SUM(the_count) OVER()
FROM (SELECT name, count(*) as the_count FROM sales GROUP BY name) subquery

But because analytics are calculated after grouping is done, there isn't really any need to write it like this; the database would process it in the same way as:

SELECT name, count(*), sum(count(*)) over() FROM sales GROUP BY name

Once you appreciate that analytics are applied after the grouping is performed, it follows that COUNT(*) OVER() is a count of the rows in the dataset after the grouping operation has finished. The grouping produced john,3|mary,1, so COUNT(*) OVER() produces 2, the number of rows in the grouped dataset.
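The john/mary walkthrough can be checked with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; window functions need SQLite 3.25 or newer) and the subquery form shown earlier, with the sales table holding just a name column. The output column aliases are made up for readability.

```python
import sqlite3

def sales_summary(names):
    """Return (name, the_count, rows_before_grouping, groups_after_grouping) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (name TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?)", [(n,) for n in names])
    rows = conn.execute("""
        SELECT name,
               the_count,
               SUM(the_count) OVER () AS rows_before_grouping,   -- 3 + 1
               COUNT(*) OVER ()       AS groups_after_grouping   -- number of grouped rows
        FROM (SELECT name, COUNT(*) AS the_count
              FROM sales
              GROUP BY name) AS subquery
        ORDER BY name
    """).fetchall()
    conn.close()
    return rows

for row in sales_summary(["john", "john", "john", "mary"]):
    print(row)
```

Each output row carries the per-group count, followed by the sum over all groups (4 houses) and the number of groups (2 sales reps).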

The documentation, if you want to read more, is titled "Window Functions" and can be found somewhere like here: https://www.postgresql.org/docs/9.1/tutorial-window.html

This is for PG 9.1; remember to switch the page to your specific version of PG.

Postgres get count but group by another column

GROUP BY and a conditional COUNT should do:

SELECT customer, COUNT(CASE WHEN liked = 'true' THEN 1 END) AS likes
FROM yourtable
group by customer

If it's a boolean column, do:

SELECT customer, COUNT(CASE WHEN liked THEN 1 END) AS likes
FROM yourtable
group by customer
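One reason to count conditionally rather than filter with WHERE is that customers without any likes still show up with a count of 0. The sketch below shows this, using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; SQLite stores booleans as 0/1). The sample customers are made up.

```python
import sqlite3

def likes_per_customer(rows):
    """Return (conditional_count_rows, where_filtered_rows) for (customer, liked) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (customer TEXT, liked BOOLEAN)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)
    # Conditional count: every customer is kept, likes may be 0.
    conditional = conn.execute("""
        SELECT customer, COUNT(CASE WHEN liked THEN 1 END) AS likes
        FROM yourtable
        GROUP BY customer
        ORDER BY customer
    """).fetchall()
    # WHERE filter: customers without a liked row disappear from the result.
    filtered = conn.execute("""
        SELECT customer, COUNT(*) AS likes
        FROM yourtable
        WHERE liked
        GROUP BY customer
        ORDER BY customer
    """).fetchall()
    conn.close()
    return conditional, filtered

sample_likes = [("alice", True), ("alice", True), ("alice", False), ("bob", False)]
conditional, filtered = likes_per_customer(sample_likes)
print(conditional)
print(filtered)
```

Here bob appears with 0 likes in the conditional result but is missing entirely from the WHERE-filtered one.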

PostgreSQL grouping query for count field as part of smth and the total count field

You want conditional aggregation. In Postgres, I recommend filter:

SELECT purchase_date, city, market,
COUNT(*) FILTER (WHERE category = 'vegetables') as cnt_vegetables,
COUNT(*) as cnt
FROM table1
WHERE market = ? AND city = ? AND purchase_date = ?
GROUP BY purchase_date, city, market;

PostgreSQL Count DISTINCT from one column when grouped by another

Just found out that there's a COUNT(DISTINCT ...) option, which doesn't require the distinct column to be placed in the grouping clause.

SELECT state, COUNT(DISTINCT userid) FROM data GROUP BY state

That does the trick.
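A quick way to see what this returns is to run it against a few rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), with the data table and its state and userid columns taken from the query above and made-up sample rows. It also selects state and a plain COUNT(*) for comparison.

```python
import sqlite3

def distinct_users_per_state(rows):
    """Return (state, row_count, distinct_users) for (state, userid) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (state TEXT, userid INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT state,
               COUNT(*)               AS row_count,       -- every row in the group
               COUNT(DISTINCT userid) AS distinct_users   -- each user once per state
        FROM data
        GROUP BY state
        ORDER BY state
    """).fetchall()
    conn.close()
    return result

# User 1 appears twice in CA (made-up data).
sample_data = [("CA", 1), ("CA", 1), ("CA", 2), ("NY", 3)]
print(distinct_users_per_state(sample_data))
```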

How to add default values to group by & count in PostgreSQL?

You need to list the values somehow. One method uses a values clause:

select brand, count(c.brand)
from (values ('BMW'), ('Audi'), ('Jaguar')) v(brand) left join
cars c
using (brand)
group by brand;

In practice, you should have the brands in their own table, so you could use:

select brand, count(c.brand)
from brands b left join
cars c
using (brand)
group by brand;
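The sketch below shows why the query counts c.brand rather than using count(*). It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), writes the join with an explicit ON condition instead of USING, and fills the brands and cars tables with made-up rows.

```python
import sqlite3

def brand_counts(brands, cars):
    """Return (brand, count_of_cars, count_star) per brand, including brands with no cars."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE brands (brand TEXT)")
    conn.execute("CREATE TABLE cars (brand TEXT)")
    conn.executemany("INSERT INTO brands VALUES (?)", [(b,) for b in brands])
    conn.executemany("INSERT INTO cars VALUES (?)", [(c,) for c in cars])
    result = conn.execute("""
        SELECT b.brand,
               COUNT(c.brand) AS car_count,  -- NULL for unmatched brands, so not counted
               COUNT(*)       AS row_count   -- counts the NULL-extended row too
        FROM brands b
        LEFT JOIN cars c ON c.brand = b.brand
        GROUP BY b.brand
        ORDER BY b.brand
    """).fetchall()
    conn.close()
    return result

print(brand_counts(["BMW", "Audi", "Jaguar"], ["BMW", "BMW", "Jaguar"]))
```

Audi has no cars, so the LEFT JOIN produces one row for it with c.brand set to NULL: count(c.brand) reports 0, whereas count(*) would report 1.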

postgresql group count distinct using fast way

Try this query if you want to avoid the DISTINCT keyword.

Sample Data:

stackoverflow=# select * from T;
 c1 | c2
----+----
 A  | x
 A  | x
 A  | y
 B  | x
 B  | x
(5 rows)

Query:

stackoverflow=# WITH count_distinct AS (SELECT c1 FROM T GROUP BY c1, c2)
SELECT c1, count(c1) FROM count_distinct GROUP BY c1;

Output:

 c1 | count
----+-------
 B  |     1
 A  |     2
(2 rows)

For this data the output is the same as COUNT(DISTINCT c2) with GROUP BY c1, but you should compare the performance first.
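To check the equivalence on the sample data, the sketch below runs both forms and compares them. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), so it says nothing about PostgreSQL performance; it only compares results.

```python
import sqlite3

def compare_distinct_counts(rows):
    """Return (group_twice_result, count_distinct_result) for (c1, c2) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (c1 TEXT, c2 TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?)", rows)
    # Form from the answer: deduplicate (c1, c2) pairs, then count per c1.
    group_twice = conn.execute("""
        WITH count_distinct AS (SELECT c1 FROM T GROUP BY c1, c2)
        SELECT c1, COUNT(c1) FROM count_distinct GROUP BY c1 ORDER BY c1
    """).fetchall()
    # Direct form using the DISTINCT keyword.
    with_distinct = conn.execute("""
        SELECT c1, COUNT(DISTINCT c2) FROM T GROUP BY c1 ORDER BY c1
    """).fetchall()
    conn.close()
    return group_twice, with_distinct

sample_rows = [("A", "x"), ("A", "x"), ("A", "y"), ("B", "x"), ("B", "x")]
print(compare_distinct_counts(sample_rows))
```

The two forms agree on the sample data. They can differ when c2 contains NULLs: GROUP BY keeps NULL as its own group, so the first form counts it, while COUNT(DISTINCT c2) ignores NULL values.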


