Performance of Count SQL Function

Performance of COUNT SQL function

Performance should not matter because they do 2 different aggregates

  • COUNT(*) is all rows, including NULLs
  • COUNT(some_column_name), excludes NULL in "some_column_name"

See the "Count(*) vs Count(1)" question for more

SQL count(*) performance

Mikael Eriksson has a good explanation bellow why the first query is fast:

SQL server optimize it into:
if exists(select * from BookChapters). So it goes looking for the presence of one row instead of counting all the rows in the table.

For the other two queries, SQL Server would use the following rule. To perform a query like SELECT COUNT(*), SQL Server will use the narrowest
non-clustered index to count the rows. If the table does not have any
non-clustered index, it will have to scan the table.

Also, if your table has a clustered index you can get your count even faster using the following query (borrowed from this site Get Row Counts Fast!)

--SQL Server 2005/2008
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rowcnt desc

--SQL Server 2000
SELECT OBJECT_NAME(i.id) [Table_Name], i.rows [Row_Count]
FROM sysindexes i (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rows desc

It uses sysindexes system table. More info you can find here SQL Server 2000, SQL Server 2005, SQL Server 2008, SQL Server 2012

Here is another link Why is my SELECT COUNT(*) running so slow? with another solution. It shows technique that Microsoft uses to quickly display the number of rows when you right click on the table and select properties.

select sum (spart.rows)
from sys.partitions spart
where spart.object_id = object_id(’YourTable’)
and spart.index_id < 2

You should find that this returns very quickly no matter how many tables you have.

If you are using SQL 2000 still you can use the sysindexes table to get the number.

select max(ROWS)
from sysindexes
where id = object_id(’YourTable’)

This number may be slightly off depending on how often SQL updates the sysindexes table, but it’s usually corrent (or at least close enough).

Will I get better performance if I use COUNT query instead of looping through entities in Symfony 4?

Of course count query will be faster because it will result into single SQL query that will return single value.

Iteration over entities will require:

  1. Run of SQL query for fetching data rows
  2. Actual fetching of data
  3. Entity objects instantiation and persisting fetched data into them

Depending on amount of affected data difference may be very big.

The only case when running count over entities may be fast enough is a case when you already have all entities fetched and just need to count them.

Count(*) vs Count(1) - SQL Server

There is no difference.

Reason:

Books on-line says "COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )"

"1" is a non-null expression: so it's the same as COUNT(*).
The optimizer recognizes it for what it is: trivial.

The same as EXISTS (SELECT * ... or EXISTS (SELECT 1 ...

Example:

SELECT COUNT(1) FROM dbo.tab800krows
SELECT COUNT(1),FKID FROM dbo.tab800krows GROUP BY FKID

SELECT COUNT(*) FROM dbo.tab800krows
SELECT COUNT(*),FKID FROM dbo.tab800krows GROUP BY FKID

Same IO, same plan, the works

Edit, Aug 2011

Similar question on DBA.SE.

Edit, Dec 2011

COUNT(*) is mentioned specifically in ANSI-92 (look for "Scalar expressions 125")

Case:

a) If COUNT(*) is specified, then the result is the cardinality of T.

That is, the ANSI standard recognizes it as bleeding obvious what you mean. COUNT(1) has been optimized out by RDBMS vendors because of this superstition. Otherwise it would be evaluated as per ANSI

b) Otherwise, let TX be the single-column table that is the
result of applying the <value expression> to each row of T
and eliminating null values. If one or more null values are
eliminated, then a completion condition is raised: warning-

SQL Server Count is slow

Very close approximate (ignoring any in-flight transactions) would be:

SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND p.index_id IN (0,1);

This will return much, much quicker than COUNT(*), and if your table is changing quickly enough, it's not really any less accurate - if your table has changed between when you started your COUNT (and locks were taken) and when it was returned (when locks were released and all the waiting write transactions were now allowed to write to the table), is it that much more valuable? I don't think so.

If you have some subset of the table you want to count (say, WHERE some_column IS NULL), you could create a filtered index on that column, and structure the where clause one way or the other, depending on whether it was the exception or the rule (so create the filtered index on the smaller set). So one of these two indexes:

CREATE INDEX IAmTheException ON dbo.table(some_column)
WHERE some_column IS NULL;

CREATE INDEX IAmTheRule ON dbo.table(some_column)
WHERE some_column IS NOT NULL;

Then you could get the count in a similar way using:

SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.indexes AS i
ON p.index_id = i.index_id
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND i.name = N'IAmTheException' -- or N'IAmTheRule'
AND p.index_id IN (0,1);

And if you want to know the opposite, you just subtract from the first query above.



Related Topics



Leave a reply



Submit