Is Count(*) Indexed

Is COUNT(*) indexed?

SELECT Count(*)
FROM SomeTableName

will always count all rows. Unlike SELECT *, though, it does not have to read all columns, and it can use the narrowest non-filtered index available to do so.

Unlike MySQL with the MyISAM engine, SQL Server does not retrieve the value from metadata.

A row count value is available in the metadata and can be retrieved from sys.partitions, but it is never used for COUNT queries and isn't always accurate.
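As a rough sketch of reading that metadata value directly, the query below sums the per-partition row counts for one table. The dbo schema is an assumption; SomeTableName is the placeholder name used above. The result is an estimate and may differ from what COUNT(*) returns.

```sql
-- Approximate row count from SQL Server metadata; does not scan the table.
-- Assumption: the table lives in the dbo schema.
SELECT SUM(p.rows) AS ApproxRowCount
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.SomeTableName')
  AND p.index_id IN (0, 1);  -- 0 = heap, 1 = clustered index
```

Summing is needed because a partitioned table has one sys.partitions row per partition for the heap or clustered index.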

SELECT COUNT query on indexed column

Hash indexes don't store the indexed value in the index, just its 32-bit hash and the ctid (a pointer to the table row). That means they can't resolve hash collisions on their own, so the scan has to go to the table to obtain the value and then recheck it. This can involve a lot of extra I/O compared to a btree index, which does store the value and can support index-only scans.
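One way to observe the difference in PostgreSQL is to compare plans for the same count with each index type. This is only a sketch: the items table, its columns, and the index names are hypothetical, and the planner is free to choose a different plan depending on statistics and settings.

```sql
-- Hypothetical test table with 100,000 rows.
CREATE TABLE items (id serial PRIMARY KEY, code text);
INSERT INTO items (code)
SELECT md5(g::text) FROM generate_series(1, 100000) AS g;

-- Case 1: btree index. After VACUUM sets the visibility map,
-- the planner can consider an index-only scan.
CREATE INDEX items_code_btree ON items USING btree (code);
VACUUM ANALYZE items;
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM items WHERE code = md5('42');

-- Case 2: hash index only. Hash indexes do not support index-only
-- scans, so matching rows must be fetched from the table and rechecked.
DROP INDEX items_code_btree;
CREATE INDEX items_code_hash ON items USING hash (code);
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM items WHERE code = md5('42');
```

Comparing the buffer counts in the two plans gives a rough idea of the extra table I/O the hash index incurs.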

Does a SELECT COUNT(*) query have to do a full table scan?

The server will always read all records (if there's an index then it will scan the entire index) to count the rows. You can't escape this as long as you are doing SELECT COUNT(*) FROM Table.

If your table has a clustered index, you can replace your query with an "under the hood" metadata query that retrieves the count without actually reading the records:

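-- sys.sysindexes is a compatibility view kept for backward compatibility.
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid IN (0, 1)  -- 0 = heap, 1 = clustered index
ORDER BY i.rowcnt DESC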

If you are looking for an approximate count of the records, you can also use the following query:

SELECT
    TableName = t.NAME,
    SchemaName = s.Name,
    [RowCount] = p.rows,
    TotalSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.total_pages) * 8 / 1024.0),
    UsedSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.used_pages) * 8 / 1024.0),
    UnusedSpaceMB = CONVERT(DECIMAL(18,2), (SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024.0)
FROM
    sys.tables t
    INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
    INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
    INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
    LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
    t.NAME NOT LIKE 'dt%'
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255
GROUP BY
    t.Name,
    s.Name,
    p.Rows
ORDER BY
    TotalSpaceMB DESC

This shows non-system tables with their estimated (not exact) row counts and the total size of their data, including any indexes they might have. It runs relatively fast because it does not retrieve the records.

mysql COUNT(*) vs COUNT(DISTINCT col)

If the column is indexed, COUNT(DISTINCT id) just needs to return the number of items in the index for the column. COUNT(id) has to add up the number of rows that each index entry points to, or scan all the rows.
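To make the difference in results concrete, here is a small sketch with a hypothetical visits table (the table, column, and index names are made up for illustration). It only shows what each form counts, not how a given engine executes it.

```sql
-- Hypothetical table: four rows, one with a NULL user_id.
CREATE TABLE visits (id INT PRIMARY KEY, user_id INT);
CREATE INDEX idx_visits_user ON visits (user_id);

INSERT INTO visits (id, user_id) VALUES (1, 7), (2, 7), (3, 8), (4, NULL);

SELECT COUNT(*)                AS all_rows,        -- 4: every row
       COUNT(user_id)          AS non_null_users,  -- 3: NULL is skipped
       COUNT(DISTINCT user_id) AS distinct_users   -- 2: values 7 and 8
FROM visits;
```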

For your second question, see "count(*) and count(column_name), what's the diff?". Most of the time, COUNT(*) is the most appropriate choice. There are some situations, such as counting rows joined with an outer join, where you need to use COUNT(columnname) because you don't want to count the rows where the column is NULL.
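The outer join case can be sketched as follows. The customers and orders tables are hypothetical and exist only to illustrate the point.

```sql
-- Hypothetical tables: Ann has two orders, Bob has none.
CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT);

INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 1);

SELECT c.name,
       COUNT(*)    AS joined_rows,  -- counts the NULL-padded row for Bob
       COUNT(o.id) AS order_count   -- skips rows where o.id is NULL
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name;
-- Ann: joined_rows = 2, order_count = 2
-- Bob: joined_rows = 1, order_count = 0
```

Bob still appears in the result because of the LEFT JOIN, so COUNT(*) reports 1 for him even though he has no orders; COUNT(o.id) gives the intended 0.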


