Is COUNT(*) indexed?
SELECT Count(*)
FROM SomeTableName
will always count all rows. Unlike SELECT *, though, it does not have to read all columns and can use the narrowest non-filtered index available to do so.
Unlike MySQL with the MyISAM engine, SQL Server does not retrieve the value from metadata.
A row count is available in the metadata and can be retrieved from sys.partitions, but it is never used for COUNT queries and isn't always accurate.
SELECT COUNT query on indexed column
Hash indexes don't store the indexed value in the index, just its 32-bit hash and the ctid (pointer to the table row). That means they can't resolve hash collisions on their own, so the query has to go to the table to obtain the value and then recheck it. This can involve a lot of extra I/O compared to a btree index, which does store the value and can support index-only scans.
Does a SELECT COUNT(*) query have to do a full table scan?
The server will always read all records (if there's an index, it will scan the entire index) to count the rows. You can't escape this as long as you are doing SELECT COUNT(*) FROM Table.
If your table has a clustered index, you can replace your query with an "under the hood" metadata query that retrieves the count without actually fetching the records:
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rowcnt desc
If you are looking for an approximate count of the records, you can also use the following query:
SELECT
    TableName = t.name,
    SchemaName = s.name,
    [RowCount] = p.rows,
    TotalSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.total_pages) * 8 / 1024.0),
    UsedSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.used_pages) * 8 / 1024.0),
    UnusedSpaceMB = CONVERT(DECIMAL(18,2), (SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024.0)
FROM
    sys.tables t
    INNER JOIN sys.indexes i ON t.object_id = i.object_id
    INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
    INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
    LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
    t.name NOT LIKE 'dt%'
    AND t.is_ms_shipped = 0
    AND i.object_id > 255
GROUP BY
    t.name,
    s.name,
    p.rows
ORDER BY
    TotalSpaceMB DESC
This will show non-system tables with their calculated (not exact) row count and the sum of the sizes of their data (with any index they might have), relatively fast without retrieving the records.
mysql COUNT(*) vs COUNT(DISTINCT col)
If the column is indexed, COUNT(DISTINCT id) just needs to return the number of items in the index for the column. COUNT(id) has to add up the number of rows that each index entry points to, or scan all the rows.
For your second question, see "count(*) and count(column_name), what's the diff?". Most of the time, COUNT(*) is most appropriate; there are some situations, such as counting rows joined with an outer join, where you need to use COUNT(columnname) because you don't want to count the null rows.
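The difference between the three forms is easiest to see on a small outer join. The following is a minimal sketch, not taken from the original answers: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the customers/orders tables and their columns are hypothetical. The NULL-handling of COUNT shown here is standard SQL, so the same counts would be expected on other engines.

```python
import sqlite3

# Hypothetical schema (an assumption for illustration only):
# three customers, one of whom has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, coupon TEXT);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1, 'A'), (11, 1, 'A'), (12, 2, NULL);
""")

rows = conn.execute("""
    SELECT c.name,
           COUNT(*)                 AS all_rows,         -- counts every joined row
           COUNT(o.id)              AS matched_orders,   -- skips NULLs from the outer join
           COUNT(DISTINCT o.coupon) AS distinct_coupons  -- skips NULLs and duplicates
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Customer 'cy' has no orders, yet COUNT(*) still reports 1 for that group because the outer join produces one NULL-extended row; COUNT(o.id) reports 0, which is usually the number you actually want in that situation.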