How to include zero / 0 results in COUNT aggregate?
You want an outer join for this (and you need to use person as the "driving" table)
SELECT person.person_id, COUNT(appointment.person_id) AS "number_of_appointments"
FROM person
LEFT JOIN appointment ON person.person_id = appointment.person_id
GROUP BY person.person_id;
The reason why this is working, is that the outer (left) join will return NULL
for those persons that do not have an appointment. The aggregate function count()
will not count NULL
values and thus you'll not get a zero.
If you want to learn more about outer joins, here is a nice tutorial: http://sqlzoo.net/wiki/Using_Null
How to include zero-count results in query
Because you have a reference to Calls.CallDate
in your HAVING
clause, you are removing operators where there are no calls. If there were no calls, then CallDate
would be NULL
, and NULL=20170104
is not true, so these rows are excluded. You need to move this predicate to your join clause:
SELECT Operators.id, Operators.Nome, Count(Calls.OpId) AS CountCalls
FROM Operators LEFT JOIN Calls ON (Operators.id = Calls.OpId AND Calls.CallDate=20170104)
GROUP BY Calls.CallDate, Operators.id, Operators.Nome;
You also don't need to group by Calls.CallDate
, since you only have one anyway, so you can just use:
SELECT Operators.id, Operators.Nome, Count(Calls.OpId) AS CountCalls
FROM Operators LEFT JOIN Calls ON (Operators.id = Calls.OpId AND Calls.CallDate=20170104)
GROUP BY Operators.id, Operators.Nome;
As an aside HAVING
is the wrong operator. HAVING
is for filtering aggregates, since you are not filtering an aggregate, you should simply use WHERE
SELECT Operators.id, Operators.Nome, Count(Calls.OpId) AS CountCalls
FROM Operators LEFT JOIN Calls ON Operators.id = Calls.OpId
WHERE Calls.CallDate=20170104
GROUP BY Calls.CallDate, Operators.id, Operators.Nome;
You would use HAVING
if you wanted to fliter on CountCalls
, e.g if you only wanted operators that had made more than 1 call you might use:
SELECT Operators.id, Operators.Nome, Count(Calls.OpId) AS CountCalls
FROM Operators LEFT JOIN Calls ON Operators.id = Calls.OpId
WHERE Calls.CallDate=20170104
GROUP BY Calls.CallDate, Operators.id, Operators.Nome
HAVING Count(Calls.OpId) > 1;
This would only return
Id Nome CountCalls
--+----+----------
1 JDOE 2
How to add condition on the left table to include “zero” / “0” results in COUNT aggregate?
In a LEFT JOIN
, conditions on the second table need to be in the ON
clause:
SELECT p.id, COUNT(a.id)
FROM Person p LEFT JOIN
Account a
ON a.person_id = p.id AND a.state = '0'
WHERE p.id = 1
GROUP BY p.id;
The rule is pretty simple to follow. A LEFT JOIN
keeps all rows in the first table, even when there is no match in the second table. The values in the second table become NULL
. The NULL
value will fail the condition a.state = '0'
.
How to get the aggregate results for missing values as zero
Instead, use conditional aggregation:
select test, sum(case when test_value = 'CC2' then count else 0 end)
from sample_t
group by test;
Alternatively, if you have a table of all test
values:
select t.test, coalesce(sum(count), 0)
from test t left join
sample_t s
on s.test = t.test and s.test_value = 'CC2'
group by t.test;
Warning: Null value is eliminated by an aggregate or other SET operation in Aqua Data Studio
You would mostly be using COUNT
to summarize over a UID. Therefore
COUNT([uid])
will produce the warning:
Warning: Null value is eliminated by an aggregate or other SET operation.
whilst being used with a left join, where the counted object does not exist.
Using COUNT(*)
in this case would also render incorrect results, as you would then be counting the total number of results (ie parents) that exist.
Using COUNT([uid])
IS a valid way of counting, and the warning is nothing more than a warning. However if you are concerned, and you want to get a true count of uids in this case then you could use:
SUM(CASE WHEN [uid] IS NULL THEN 0 ELSE 1 END) AS [new_count]
This would not add a lot of overheads to your query.
(tested mssql 2008)
How to have GROUP BY and COUNT include zero sums?
Without convoluted (in my opinion) queries, your output data-set won't include dates that don't exist in your input data-set. This means that you need a data-set with the 5 days to join on to.
The simple version would be to create a table with the 5 dates, and join on that. I typically create and keep (effectively caching) a calendar table with every date I could ever need. (Such as from 1900-01-01 to 2099-12-31.)
SELECT
calendar.calendar_date,
Count(People.created_at)
FROM
Calendar
LEFT JOIN
People
ON Calendar.calendar_date = People.created_at
WHERE
Calendar.calendar_date >= '2012-05-01'
GROUP BY
Calendar.calendar_date
Return zero if no results using COUNT-GROUP BY
Assuming you have some data in the table for the months you want, you can switch to conditional aggregation:
select datename(month, date_occu) as MonthName,
datepart(month, date_occu) as MonthNumber,
sum(case when agency = 'WCSO' and offense = 'DEATH INVESTIGATION' then 1 else 0 end) as Quantity
from crimes.rms4gis.dbo.tmprms4gisall
where datediff(month, date_occu, getdate()) between 1 and 3
group by datename(month, date_occu), datepart(month, date_occu)
order by MonthNumber;
There are other methods that involve generating the three months involved. However, that requires a somewhat more complex query. The above is a simple solution that should work in this -- and many other -- situations.
COUNT DISTINCT with CONDITIONS
You can try this:
select
count(distinct tag) as tag_count,
count(distinct (case when entryId > 0 then tag end)) as positive_tag_count
from
your_table_name;
The first count(distinct...)
is easy.
The second one, looks somewhat complex, is actually the same as the first one, except that you use case...when
clause. In the case...when
clause, you filter only positive values. Zeros or negative values would be evaluated as null
and won't be included in count.
One thing to note here is that this can be done by reading the table once. When it seems that you have to read the same table twice or more, it can actually be done by reading once, in most of the time. As a result, it will finish the task a lot faster with less I/O.
Django Aggregate Query Include Zero Count
Along the same lines as Attempt 2, I also tried the following using
Sum(Case( which yielded the same result, as I recognized that the
Attempt 2 raw SQL actually utilizes Case(When, but seems to only count
when Student.pk is present in the Paper.objects.values "list" (while
not accounting for how many times it is present).
Either I'm not understanding the problem/question, but your Attempt 2 example is filtering the count to only Paper.objects.values "list"
, its normal to act like this ?
Have you tried with the simple:
Student.objects.annotate(num_papers=Count('paper'))
If you want to make an additional filter on the count, my suggestion is to use subqueries here is an example:
Student.objects.annotate(
num_papers=Subquery(
Paper.objects.filter(student=OuterRef('pk'))
# The first .values call defines our GROUP BY clause
# Its important to have a filtration on every field defined here
# Otherwise you will have more than one row per group!
# In this example we group only by student
# and we already filtered by student.
# any extra filtration you want should be make here too (before the grouping).
.values('student')
# Here we say: count how many rows we have per group
.annotate(cnt=Count('pk'))
# Here we say: return only the count
.values('cnt')
)
)
Related Topics
How to Do a Case Sensitive Search in Where Clause (I'M Using SQL Server)
MySQL Delete from With Subquery as Condition
MySQL: Invalid Use of Group Function
Creating Table Names That Are Reserved Words/Keywords in Ms SQL Server
Fastest Way to Count Exact Number of Rows in a Very Large Table
Sql: If Clause Within Where Clause
Foreign Key Referring to Primary Keys Across Multiple Tables
T-SQL Datetime Rounded to Nearest Minute and Nearest Hours With Using Functions
SQL Server: Group by Clause to Get Comma-Separated Values
Sql: Find Missing Ids in a Table
MySQL - Subtracting Value from Previous Row, Group By
T-Sql: Round to Nearest 15 Minute Interval
What Is the Maximum Number of Columns in a Postgresql Select Query