SQL Query - Select * from View or Select Col1, Col2, ... Coln from View

SQL query - Select * from view or Select col1, col2, ... colN from view

NEVER, EVER USE "SELECT *"!!!!

This is the cardinal rule of query design!

There are multiple reasons for this. One of which is, that if your table only has three fields on it and you use all three fields in the code that calls the query, there's a great possibility that you will be adding more fields to that table as the application grows, and if your select * query was only meant to return those 3 fields for the calling code, then you're pulling much more data from the database than you need.

Another reason is performance. In query design, don't think about reusability as much as this mantra:

TAKE ALL YOU CAN EAT, BUT EAT ALL YOU TAKE.

Which is faster/best? SELECT * or SELECT column1, colum2, column3, etc

One reason that selecting specific columns is better is that it raises the probability that SQL Server can access the data from indexes rather than querying the table data.

Here's a post I wrote about it: The real reason select queries are bad index coverage

It's also less fragile to change, since any code that consumes the data will be getting the same data structure regardless of changes you make to the table schema in the future.

What is the reason not to use select *?

The essence of the quote of not prematurely optimizing is to go for simple and straightforward code and then use a profiler to point out the hot spots, which you can then optimize to be efficient.

When you use select * you're make it impossible to profile, therefore you're not writing clear & straightforward code and you are going against the spirit of the quote. select * is an anti-pattern.


So selecting columns is not a premature optimization. A few things off the top of my head ....

  1. If you specify columns in a SQL statement, the SQL execution engine will error if that column is removed from the table and the query is executed.
  2. You can more easily scan code where that column is being used.
  3. You should always write queries to bring back the least amount of information.
  4. As others mention if you use ordinal column access you should never use select *
  5. If your SQL statement joins tables, select * gives you all columns from all tables in the join

The corollary is that using select * ...

  1. The columns used by the application is opaque
  2. DBA's and their query profilers are unable to help your application's poor performance
  3. The code is more brittle when changes occur
  4. Your database and network are suffering because they are bringing back too much data (I/O)
  5. Database engine optimizations are minimal as you're bringing back all data regardless (logical).

Writing correct SQL is just as easy as writing Select *. So the real lazy person writes proper SQL because they don't want to revisit the code and try to remember what they were doing when they did it. They don't want to explain to the DBA's about every bit of code. They don't want to explain to their clients why the application runs like a dog.

Performance issue in using SELECT *?

If you need a subset of the columns, you are giving bad help to the optimizer (cannot choose for index, or cannot go only to index, ...)

Some database can choose to retrieve data from indexes only. That thing is very very helpfull and give an incredible speedup. Running SELECT * queries does not allow this trick.

Anyway, from the point of view of application is not a good practice.


Example on this:

  • You have a table T with 20 columns (C1, C2, ..., C19 C20).
  • You have an index on T for (C1,C2)
  • You make SELECT C1, C2 FROM T WHERE C1=123
  • The optimizer have all the information on index, does not need to go to the table Data

Instead if you SELECT * FROM T WHERE C1=123, the optimizer needs to get all the columns data, then the index on (C1,C2) cannot be used.

In joins for multiple tables is a lot helpful.

Sql wildcard: performance overhead?

SELECT * FROM...

and

SELECT every, column, list, ... FROM...

will perform the same because both are an unoptimised scan

The difference is:

  • the extra lookup in sys.columns to resolve *
  • the contract/signature change when the table schema changes
  • inability to create a covering index. In fact, no tuning options at all, really
  • have to refresh views needed if non schemabound
  • can not index or schemabind a view using *
  • ...and other stuff

Other SO questions on the same subject...

  • What is the reason not to use select * ?
  • Is there a difference betweeen Select * and Select list each col
  • SQL Query Question - Select * from view or Select col1,col2…from view
  • “select * from table” vs “select colA,colB,etc from table” interesting behaviour in SqlServer2005

MySql queries: really never use SELECT *?

SELECT * FROM ... is not always the best way to go unless you need all columns. That's because if for example a table has 10 columns and you only need 2-3 of them and these columns are indexed, then if you use SELECT * the query will run slower, because the server must fetch all rows from the datafiles. If instead you used only the 2-3 columns that you actually needed, then the server could run the query much faster if the rows were fetch from a covering index. A covering index is one that is used to return results without reading the datafile.

So, use SELECT * only when you actually need all columns.



Related Topics



Leave a reply



Submit