What is the reason not to use select *?
The point of the quote about not prematurely optimizing is to write simple, straightforward code first, then use a profiler to find the hot spots and optimize those.
When you use select * you make the query much harder to profile, so you are not writing clear and straightforward code and you are going against the spirit of the quote. select * is an anti-pattern.
So selecting columns is not a premature optimization. A few things off the top of my head:
- If you specify columns in a SQL statement, the SQL execution engine will error if that column is removed from the table and the query is executed.
- You can more easily scan the code to see where a column is being used.
- You should always write queries to bring back the least amount of information.
- As others mention, if you use ordinal column access you should never use select *.
- If your SQL statement joins tables, select * gives you all columns from all tables in the join
The corollary is that when you use select *:
- The columns used by the application are opaque.
- DBAs and their query profilers are unable to help with your application's poor performance.
- The code is more brittle when changes occur.
- Your database and network suffer because they bring back too much data (I/O).
- Database engine optimizations are minimal because you're bringing back all the data regardless (logical).
Writing correct SQL is just as easy as writing select *. So the really lazy person writes proper SQL because they don't want to revisit the code and try to remember what they were doing when they wrote it. They don't want to explain every bit of code to the DBAs. They don't want to explain to their clients why the application runs like a dog.
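The points about removed columns and ordinal access can be sketched with Python's built-in sqlite3 module. The `users` table, its columns, and the helper names are assumptions made up for this illustration, and the column removal is simulated by rebuilding the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ann', 'ann@example.com', '555-0100')")

def emails_by_ordinal():
    # SELECT * plus ordinal access: index 2 is assumed to be the email column.
    return [row[2] for row in conn.execute("SELECT * FROM users ORDER BY id")]

def emails_by_name():
    # Explicit column list: the query states exactly what the code depends on.
    return [row[0] for row in conn.execute("SELECT email FROM users ORDER BY id")]

before = emails_by_ordinal()

# Simulate removing the email column by rebuilding the table (hypothetical migration).
conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO users_new SELECT id, name, phone FROM users")
conn.execute("DROP TABLE users")
conn.execute("ALTER TABLE users_new RENAME TO users")

after = emails_by_ordinal()  # no error, but index 2 is now the phone column

try:
    emails_by_name()
    explicit_error = None
except sqlite3.OperationalError as exc:
    explicit_error = str(exc)  # the engine rejects the missing column

print(before, after, explicit_error)
```

The select * version keeps running and quietly hands back phone numbers where emails were expected, while the explicit version fails at the database, which is usually the failure you would rather have.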
Performance issue in using SELECT *?
If you only need a subset of the columns, SELECT * gives the optimizer less to work with: it may not be able to choose an index, or it cannot answer the query from the index alone.
Some databases can retrieve data from indexes only. This is very helpful and can give a large speedup. SELECT * queries usually rule this out, because an index rarely contains every column.
In any case, from the application's point of view it is not good practice.
An example:
- You have a table T with 20 columns (C1, C2, ..., C19, C20).
- You have an index on T for (C1, C2).
- You run
SELECT C1, C2 FROM T WHERE C1=123
- The optimizer has all the information it needs in the index and does not need to go to the table data.
If you instead run SELECT * FROM T WHERE C1=123, the optimizer needs all the column data, so the index on (C1, C2) cannot answer the query on its own: at best it is used to locate the rows, which must then be read from the table.
This helps even more in joins over multiple tables.
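One way to see this for yourself is to ask the engine for its plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (other engines report plans differently), with a four-column T instead of twenty and an index name, `idx_t_c1_c2`, that is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A smaller stand-in for the 20-column table T from the example.
conn.execute("CREATE TABLE T (C1 INTEGER, C2 INTEGER, C3 TEXT, C4 TEXT)")
conn.execute("CREATE INDEX idx_t_c1_c2 ON T (C1, C2)")

def plan(sql):
    # The last field of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

narrow_plan = plan("SELECT C1, C2 FROM T WHERE C1 = 123")
star_plan = plan("SELECT * FROM T WHERE C1 = 123")

print("explicit columns:", narrow_plan)
print("select *:        ", star_plan)
```

In SQLite the first plan should mention a covering index, meaning the table itself is never touched, while the second plan does not. The exact wording of the plan text varies between SQLite versions.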
SQL query - Select * from view or Select col1, col2, ... colN from view
NEVER, EVER USE "SELECT *"!!!!
This is the cardinal rule of query design!
There are multiple reasons for this. One is that even if your table has only three fields and the code that calls the query uses all three, there is a good chance that more fields will be added to that table as the application grows. If your select * query was only meant to return those 3 fields to the calling code, you are then pulling much more data from the database than you need.
Another reason is performance. In query design, don't think about reusability as much as this mantra:
TAKE ALL YOU CAN EAT, BUT EAT ALL YOU TAKE.
SELECT * - pros/cons
In general, the use of SELECT * is not a good idea.
Pros:
- When you add/remove columns, you don't have to make changes where you used SELECT *
- It is shorter to write
- Also see the answers to: Can select * usage ever be justified?
Cons:
- You are returning more data than you need. Say you add a VARBINARY column that contains 200k per row. You only need this data in one place for a single record; using SELECT * you can end up returning 2MB per 10 rows that you don't need
- Specifying columns is explicit about what data is used
- Specifying columns means you get an error when a column is removed
- The query processor has to do some more work: figuring out what columns exist on the table (thanks @vinodadhikary)
- You can find where a column is used more easily when columns are specified
- You get all columns in joins if you use SELECT *
- You can't use ordinal referencing (though using ordinal references for columns is bad practice in itself)
- Also see the answers to: What is the reason not to use select *?
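The 200k-per-row arithmetic from the first con can be checked with a small sketch. It uses SQLite's BLOB type as a stand-in for VARBINARY; the `documents` table and the `fetched_bytes` helper are hypothetical, and the helper only gives a rough count of what the client receives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, payload BLOB)")

blob = bytes(200_000)  # 200k of zero bytes standing in for the large column
conn.executemany(
    "INSERT INTO documents (title, payload) VALUES (?, ?)",
    [(f"doc {i}", blob) for i in range(10)],
)

def fetched_bytes(sql):
    # Rough size of the result set: add up the lengths of text and binary values.
    total = 0
    for row in conn.execute(sql):
        for value in row:
            if isinstance(value, (bytes, str)):
                total += len(value)
    return total

star_bytes = fetched_bytes("SELECT * FROM documents")
narrow_bytes = fetched_bytes("SELECT id, title FROM documents")

print(star_bytes, narrow_bytes)
```

Ten rows through select * carry roughly 2MB of payload that a listing page would never display, while the explicit query carries only the titles.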
How to SELECT * and rename a column?
Yes, you can do the following:
SELECT bar AS foobar, a.* from foo as a;
But in this case you will get bar twice: once under the name foobar and once as bar, because * fetched it too.
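A quick way to confirm the duplicated column is to look at the result's column names. This sketch runs the same query against SQLite through Python's sqlite3 module; the shape of the `foo` table (an `id` plus `bar`) is an assumption, since the question does not show it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar TEXT)")
conn.execute("INSERT INTO foo (bar) VALUES ('hello')")

cur = conn.execute("SELECT bar AS foobar, a.* FROM foo AS a")
columns = [d[0] for d in cur.description]  # names of the result columns, in order
row = cur.fetchone()

print(columns)
print(row)
```

The value of bar appears both under foobar and under its original name. If you want it only once, the alternative is to list the remaining columns explicitly instead of using a.*.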
Does the number of columns returned affect the speed of a query?
You'd better avoid SELECT *:
- It leads to confusion when you change the table layout.
- It selects unneeded columns, and your data packets get larger.
- The columns can get duplicate names, which is also not good for some applications.
- If all the columns you need are covered by an index, a SELECT with an explicit column list can be answered from that index alone, while SELECT * will need to visit the table records to get the values you don't need. Also bad for performance.
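The duplicate-name problem is easy to reproduce with a join. The sketch below uses SQLite through Python's sqlite3 module; the `customers` and `orders` tables are hypothetical, and turning rows into dictionaries is just one common way an application can be bitten by it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.execute("INSERT INTO customers VALUES (7, 'Ann')")
conn.execute("INSERT INTO orders VALUES (1, 7, 9.5)")

join = "FROM orders JOIN customers ON customers.id = orders.customer_id"

cur = conn.execute("SELECT * " + join)
star_columns = [d[0] for d in cur.description]
# Building a dict keeps only one value per name, so one of the two ids is lost.
star_as_dict = dict(zip(star_columns, cur.fetchone()))

cur = conn.execute(
    "SELECT orders.id AS order_id, customers.name AS customer_name, orders.total " + join
)
explicit_columns = [d[0] for d in cur.description]

print(star_columns, star_as_dict)
print(explicit_columns)
```

With select *, both tables contribute a column called id, and the dictionary ends up holding the customer id under a key the code may assume is the order id. Naming and aliasing the columns removes the ambiguity.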
View - Return 0 if no rows found in a grouped by query
If there is no entry in the view results, then this will always return NULL. That's SQL. If you change the SELECT that you use against the view, you can achieve what you want:
SELECT IFNULL(total, 0) FROM total_transactions WHERE account_id = 2060
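Edit: IFNULL alone only helps when a row exists with a NULL total. To also get 0 when no row matches, add a fallback row with UNION (note that when a matching non-zero total exists, the result then contains both that row and the 0 row):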
(SELECT IFNULL(total, 0) total FROM total_transactions WHERE account_id = 2060)
UNION
(SELECT 0 total)
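An alternative that always yields exactly one row is to wrap the lookup in a scalar subquery and apply IFNULL to its result. This is a sketch of a different technique from the UNION above, run against SQLite through Python's sqlite3 module; the underlying `transactions` table, the view definition, and the helper names are assumptions, since the original view is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (account_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", [(1001, 50), (1001, 25)])
# Hypothetical definition of the grouped view queried in the answer.
conn.execute(
    "CREATE VIEW total_transactions AS "
    "SELECT account_id, SUM(amount) AS total FROM transactions GROUP BY account_id"
)

def total_plain(account_id):
    # IFNULL on the row itself: returns no rows at all when nothing matches.
    sql = "SELECT IFNULL(total, 0) FROM total_transactions WHERE account_id = ?"
    return conn.execute(sql, (account_id,)).fetchall()

def total_with_fallback(account_id):
    # A scalar subquery yields NULL when no row matches, so IFNULL can replace it.
    sql = (
        "SELECT IFNULL("
        "(SELECT total FROM total_transactions WHERE account_id = ?), 0) AS total"
    )
    return conn.execute(sql, (account_id,)).fetchone()[0]

print(total_plain(2060), total_with_fallback(2060), total_with_fallback(1001))
```

For an account with no transactions the plain query returns an empty result, while the subquery form returns a single 0; for an account that has transactions it returns just the total.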