Measuring Query Performance: "Execution Plan Query Cost" vs "Time Taken"

The profiler trace puts it into perspective.

  • Query A: 1.3 secs CPU, 1.4 secs duration
  • Query B: 2.3 secs CPU, 1.2 secs duration

Query B is using parallelism: CPU > duration
e.g. the query uses 2 CPUs, averaging 1.15 secs each

Query A probably is not: CPU < duration

This explains the cost relative to batch: 17% for the simpler, non-parallel query plan.

The optimiser works out that query B is more expensive and will benefit from parallelism, even though it takes extra effort to do so.

Remember, though, that query B uses 100% of 2 CPUs (so 50% on a 4-CPU box) for about a second. Query A uses 100% of a single CPU for 1.5 seconds.

The peak for query A is lower, at the expense of increased duration.
With one user, who cares? With 100, perhaps it makes a difference...
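You can reproduce these CPU-versus-duration numbers without a profiler trace using SET STATISTICS TIME. A minimal sketch for SQL Server follows; the table and column names are placeholders, and the sample output is illustrative, not from a real run:

```sql
-- Report CPU time and elapsed time for each statement in this session.
SET STATISTICS TIME ON;

-- Hypothetical aggregate query; substitute your own table.
SELECT COUNT(*), AVG(Amount)
FROM dbo.Orders;

SET STATISTICS TIME OFF;

-- The Messages tab then shows something like:
--   SQL Server Execution Times:
--     CPU time = 2300 ms, elapsed time = 1200 ms.
-- CPU time greater than elapsed time is the signature of a parallel plan.
```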

Query Cost vs. Execution Speed + Parallelism

The rule your DBA gave you doesn't make a lot of sense. Worrying about the cost that is reported for a query is very seldom productive. First, you cannot directly compare the cost of two different queries: one query with a cost in the millions may run very quickly and consume very few system resources, while another query with a cost in the hundreds may run for hours and bring the server to its knees. Second, cost is an estimate. If the optimizer made an accurate estimate of the cost, that strongly implies that it has come up with the optimal query plan, which would mean that it is unlikely you'd be able to modify the query to return the same results while using fewer resources. If the optimizer made an inaccurate estimate of the cost, that strongly implies that it has come up with a poor query plan, in which case the reported cost would have no relationship to any useful metric you'd come up with. Most of the time, the queries you're trying to optimize are the queries where the optimizer generated an incorrect query plan because it incorrectly estimated the cost of various steps.

Tricking the optimizer by using hints that may or may not actually change the query plan (depending on how parallelism is configured, for example) is very unlikely to solve a problem: it's much more likely to make the optimizer's estimates less accurate and make it more likely to choose a query plan that consumes far more resources than it needs to. A parallel hint with a high degree of parallelism, for example, tells Oracle to drastically reduce the cost of a full table scan, which makes it more likely that the optimizer would choose that over an index scan. That is seldom something your DBAs would want to see.
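For illustration, an Oracle parallel hint looks like this (the `orders` table is hypothetical). Note that the hint biases the cost model rather than guaranteeing faster execution:

```sql
-- Hint requests degree-of-parallelism 8 for the scan of orders.
-- The optimizer divides its full-scan cost estimate accordingly,
-- which can make it abandon an index path that was actually cheaper.
SELECT /*+ PARALLEL(o, 8) */ o.customer_id, SUM(o.amount)
FROM   orders o
GROUP  BY o.customer_id;
```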

If you're looking for a single metric that tells you whether a query plan is reasonable, I'd use the amount of logical I/O. Logical I/O is correlated pretty well with actual query performance and with the amount of resources your query consumes. Looking at execution time can be problematic because it varies significantly based on what data happens to be cached (which is why queries often run much faster the second time they're executed) while logical I/O doesn't change based on what data is in cache. It also lets you scale your expectations as the number of rows your queries need to process change. If you're writing a query that needs to aggregate data from 1 million rows, for example, that should consume far more resources than a query that needs to return 100 rows of data from a table with no aggregation. If you're looking at logical I/O, you can easily scale your expectations to the size of the problem to figure out how efficient your queries could realistically be.
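In Oracle, one easy way to see logical I/O is SQL*Plus autotrace. A sketch, assuming a hypothetical `orders` table:

```sql
-- SQL*Plus / SQLcl: suppress the result rows, show execution statistics only.
SET AUTOTRACE TRACEONLY STATISTICS

SELECT customer_id, SUM(amount)
FROM   orders
GROUP  BY customer_id;

-- The statistics section reports "consistent gets" (logical reads);
-- divide that by the number of rows returned to judge efficiency.
```

(The SQL Server equivalent is `SET STATISTICS IO ON`, which reports per-table logical reads in the Messages tab.)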

In Christian Antognini's "Troubleshooting Oracle Performance" (page 450), for example, he gives a rule of thumb that is pretty reasonable:

  • 5 logical reads per row that is returned/aggregated is probably very good
  • 10 logical reads per row that is returned/aggregated is probably adequate
  • 20+ logical reads per row that is returned/aggregated is probably inefficient and needs to be tuned

Different systems with different data models may merit tweaking the buckets a bit but those are likely to be good starting points.
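Applied mechanically, the bucketing is simple arithmetic. A standalone T-SQL sketch, where the two variables stand in for numbers you would take from autotrace or `SET STATISTICS IO`:

```sql
-- Hypothetical measurements: 2.5M logical reads to return 100k rows,
-- i.e. 25 reads per row.
DECLARE @logical_reads BIGINT = 2500000,
        @rows_returned BIGINT = 100000;

SELECT CASE
         WHEN @logical_reads <= 5  * @rows_returned THEN 'very good'
         WHEN @logical_reads <= 10 * @rows_returned THEN 'adequate'
         WHEN @logical_reads <  20 * @rows_returned THEN 'borderline'
         ELSE 'probably inefficient - worth tuning'
       END AS verdict;
-- 25 reads/row falls in the last bucket.
```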

My guess is that if you're researchers rather than developers, you're probably running queries that need to aggregate or fetch relatively large data sets, at least in comparison to those that application developers commonly write. If you're scanning a million rows of data to generate some aggregate results, your queries are naturally going to consume far more resources than those of an application developer who is reading or writing a handful of rows. Your queries may be just as efficient from a logical-I/O-per-row perspective; you may just be looking at many more rows.

If you are running queries against the live production database, you may well be in a situation where it makes sense to start segregating workload. Most organizations reach a point where running reporting queries against the live database starts to create issues for the production system. One common solution to this sort of problem is to create a separate reporting database that is fed from the production system (either via a nightly snapshot or by an ongoing replication process) where reporting queries can run without disturbing the production application. Another common solution is to use something like Oracle Resource Manager to limit the amount of resources available to one group of users (in this case, report developers) in order to minimize the impact on higher priority users (in this case, users of the production system).

How to understand the time spent in each step of an execution plan?

Execution time: the total time SQL Server takes to complete execution of the compiled plan.

CPU time: the actual time spent on the CPU. If CPU time is 0, the plan was fetched directly from the plan cache. If the query is compiling and recompiling, CPU usage and CPU time will be higher. Other factors also affect CPU time.

Elapsed time: the total wall-clock time to completion, from parse and compile through returning the results to the client.

Both are important to understand. If showing results to the client quickly is what matters most, I may ignore CPU time; in that case elapsed time is more important than CPU time.

In other cases I can tolerate a slightly higher elapsed time and concentrate on reducing a high CPU time.
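The same CPU-versus-elapsed trade-off can be inspected for cached plans via the `sys.dm_exec_query_stats` DMV. A sketch (the ratio interpretation is a heuristic, not a rule):

```sql
-- Top cached statements by CPU, comparing CPU time to elapsed time.
-- Times are in microseconds. A CPU/elapsed ratio above 1 suggests a
-- parallel plan; well below 1 suggests waits (I/O, blocking, network).
SELECT TOP (10)
       qs.total_worker_time  / qs.execution_count AS avg_cpu_us,
       qs.total_elapsed_time / qs.execution_count AS avg_elapsed_us,
       SUBSTRING(st.text, 1, 200)                 AS query_text
FROM   sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER  BY qs.total_worker_time DESC;
```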

Without that I can't figure out what the bottleneck in my SQL
statement is

Execution time (CPU + elapsed) is not itself a bottleneck; it is just an indicator.
If it is not acceptable, you have to investigate which operator is the bottleneck
and take suitable action.

For example: rewriting the query more efficiently, tuning indexes, changing the table schema, or something else.

In your query plan, the Hash Match appears to be the most expensive operator.
Your main query is not visible.

You can also look at wait time.

Why is the total time spent in each plan far less than the elapsed
time reported in the message output window, which is 331 ms?

Suppose you have a very simple query like:

SELECT * FROM Person.Person

In this case, each operator starts executing after the operator to its right has finished. In such a scenario, theoretically speaking,

the total elapsed time is the sum of each operator's elapsed time.

If the query is executed directly from a cached plan, or an operator's elapsed time is too small, it is rounded to 0. But theoretically, the total elapsed time is the sum of each operator's elapsed time.

Now take a complex query like your example: one or more operators start executing simultaneously and later merge into one, so the final elapsed time may be different.

Your main query plan is not completely visible, otherwise I would have tried to explain what I mean.

If you are using SQL Server 2016 or above, you can view how data flows and how multiple operators work together using:

Live Query Statistics

Even if you don't have SQL Server 2016, you can still understand what is actually happening.
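On SQL Server 2014 and later you can sample the same per-operator counters yourself from `sys.dm_exec_query_profiles`. A sketch, assuming the monitored session is running with profiling enabled (e.g. it issued `SET STATISTICS PROFILE ON`):

```sql
-- Per-operator progress for currently executing queries.
SELECT session_id,
       node_id,
       physical_operator_name,
       row_count,          -- rows produced so far
       estimate_row_count  -- optimizer's estimate, for comparison
FROM   sys.dm_exec_query_profiles
ORDER  BY session_id, node_id;
```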
