Are Stored Procedures More Efficient, in General, Than Inline Statements on Modern Rdbms'S

Are Stored Procedures more efficient, in general, than inline statements on modern RDBMS's?


NOTE that this is a general look at stored procedures not regulated to a specific
DBMS. Some DBMS (and even, different
versions of the same DBMS!) may operate
contrary to this, so you'll want to
double-check with your target DBMS
before assuming all of this still holds.

I've been a Sybase ASE, MySQL, and SQL Server DBA on-and off since for almost a decade (along with application development in C, PHP, PL/SQL, C#.NET, and Ruby). So, I have no particular axe to grind in this (sometimes) holy war.

The historical performance benefit of stored procs have generally been from the following (in no particular order):

  • Pre-parsed SQL
  • Pre-generated query execution plan
  • Reduced network latency
  • Potential cache benefits

Pre-parsed SQL -- similar benefits to compiled vs. interpreted code, except on a very micro level.

Still an advantage?
Not very noticeable at all on the modern CPU, but if you are sending a single SQL statement that is VERY large eleventy-billion times a second, the parsing overhead can add up.

Pre-generated query execution plan.
If you have many JOINs the permutations can grow quite unmanageable (modern optimizers have limits and cut-offs for performance reasons). It is not unknown for very complicated SQL to have distinct, measurable (I've seen a complicated query take 10+ seconds just to generate a plan, before we tweaked the DBMS) latencies due to the optimizer trying to figure out the "near best" execution plan. Stored procedures will, generally, store this in memory so you can avoid this overhead.

Still an advantage?
Most DBMS' (the latest editions) will cache the query plans for INDIVIDUAL SQL statements, greatly reducing the performance differential between stored procs and ad hoc SQL. There are some caveats and cases in which this isn't the case, so you'll need to test on your target DBMS.

Also, more and more DBMS allow you to provide optimizer path plans (abstract query plans) to significantly reduce optimization time (for both ad hoc and stored procedure SQL!!).

WARNING Cached query plans are not a performance panacea. Occasionally the query plan that is generated is sub-optimal.
For example, if you send SELECT *
FROM table WHERE id BETWEEN 1 AND
99999999
, the DBMS may select a
full-table scan instead of an index
scan because you're grabbing every row
in the table (so sayeth the
statistics). If this is the cached
version, then you can get poor
performance when you later send
SELECT * FROM table WHERE id BETWEEN
1 AND 2
. The reasoning behind this is
outside the scope of this posting, but
for further reading see:
http://www.microsoft.com/technet/prodtechnol/sql/2005/frcqupln.mspx
and
http://msdn.microsoft.com/en-us/library/ms181055.aspx
and http://www.simple-talk.com/sql/performance/execution-plan-basics/

"In summary, they determined that
supplying anything other than the
common values when a compile or
recompile was performed resulted in
the optimizer compiling and caching
the query plan for that particular
value. Yet, when that query plan was
reused for subsequent executions of
the same query for the common values
(‘M’, ‘R’, or ‘T’), it resulted in
sub-optimal performance. This
sub-optimal performance problem
existed until the query was
recompiled. At that point, based on
the @P1 parameter value supplied, the
query might or might not have a
performance problem."

Reduced network latency
A) If you are running the same SQL over and over -- and the SQL adds up to many KB of code -- replacing that with a simple "exec foobar" can really add up.
B) Stored procs can be used to move procedural code into the DBMS. This saves shuffling large amounts of data off to the client only to have it send a trickle of info back (or none at all!). Analogous to doing a JOIN in the DBMS vs. in your code (everyone's favorite WTF!)

Still an advantage?
A) Modern 1Gb (and 10Gb and up!) Ethernet really make this negligible.
B) Depends on how saturated your network is -- why shove several megabytes of data back and forth for no good reason?

Potential cache benefits
Performing server-side transforms of data can potentially be faster if you have sufficient memory on the DBMS and the data you need is in memory of the server.

Still an advantage?
Unless your app has shared memory access to DBMS data, the edge will always be to stored procs.

Of course, no discussion of Stored Procedure optimization would be complete without a discussion of parameterized and ad hoc SQL.

Parameterized / Prepared SQL

Kind of a cross between stored procedures and ad hoc SQL, they are embedded SQL statements in a host language that uses "parameters" for query values, e.g.:

SELECT .. FROM yourtable WHERE foo = ? AND bar = ?

These provide a more generalized version of a query that modern-day optimizers can use to cache (and re-use) the query execution plan, resulting in much of the performance benefit of stored procedures.

Ad Hoc SQL
Just open a console window to your DBMS and type in a SQL statement. In the past, these were the "worst" performers (on average) since the DBMS had no way of pre-optimizing the queries as in the parameterized/stored proc method.

Still a disadvantage?
Not necessarily. Most DBMS have the ability to "abstract" ad hoc SQL into parameterized versions -- thus more or less negating the difference between the two. Some do this implicitly or must be enabled with a command setting (SQL server: http://msdn.microsoft.com/en-us/library/ms175037.aspx , Oracle: http://www.praetoriate.com/oracle_tips_cursor_sharing.htm).

Lessons learned?
Moore's law continues to march on and DBMS optimizers, with every release, get more sophisticated. Sure, you can place every single silly teeny SQL statement inside a stored proc, but just know that the programmers working on optimizers are very smart and are continually looking for ways to improve performance. Eventually (if it's not here already) ad hoc SQL performance will become indistinguishable (on average!) from stored procedure performance, so any sort of massive stored procedure use ** solely for "performance reasons"** sure sounds like premature optimization to me.

Anyway, I think if you avoid the edge cases and have fairly vanilla SQL, you won't notice a difference between ad hoc and stored procedures.

Stored procedures or inline queries?

It doesn't need to be one or the other. If it's a simple query, use the SubSonic query tool. If it's more complex, use a stored procedure and load up a collection or create a dataset from the results.

See here: What are the pros and cons to keeping SQL in Stored Procs versus Code and here SubSonic and Stored Procedures

What is your best-practice advice on implementing SQL stored procedures (in a C# winforms application)?

Consider skipping stored procedures for an ORM. Consider using:

  • LINQ To SQL
  • Entity Framework
  • SubSonic

  • You'll be writing less boiler plate ListCustomer and GetCustomerByID code when you could be adding more value to your application.

  • IMO, there isn't any real compelling reason to choose stored procedures with the modern toolset that we have in the Microsoft stack.

  • The move away from inline SQL statements is good, and an ORM will help parameterize your queries for you. You don't have to think about it.

  • You don't have to mess with ADO.NET objects at all. Code your data access in an object oriented fashion.

Are there significant performance benefits from changing involved queries to stored procedures?

Read the other solution before reading this one.

Essentially, I ran some tests and swapped out my project's dynamic queries to stored procedures and saw massive performance bumps.

I was reading the SAP HANA Best Practices reference and came across this passage under Performance:

Executing dynamic SQL is slow because compile time checks and query
optimization must be done for every invocation of the procedure.
Another related problem is security because constructing SQL
statements without proper checks of the variables used may harm
security.

Which is better: Ad hoc queries or stored procedures?

In my experience writing mostly WinForms Client/Server apps these are the simple conclusions I've come to:

Use Stored Procedures:

  1. For any complex data work. If you're going to be doing something truly requiring a cursor or temp tables it's usually fastest to do it within SQL Server.
  2. When you need to lock down access to the data. If you don't give table access to users (or role or whatever) you can be sure that the only way to interact with the data is through the SP's you create.

Use ad-hoc queries:

  1. For CRUD when you don't need to restrict data access (or are doing so in another manner).
  2. For simple searches. Creating SP's for a bunch of search criteria is a pain and difficult to maintain. If you can generate a reasonably fast search query use that.

In most of my applications I've used both SP's and ad-hoc sql, though I find I'm using SP's less and less as they end up being code just like C#, only harder to version control, test, and maintain. I would recommend using ad-hoc sql unless you can find a specific reason not to.

Stored Procedures not that fast?

I agree with Jeremy. One of the better reasons for using a stored procedures is to avoid circumstances where you call the database an unpredictable number of times within a loop. You should move the loop to the stored procedure itself and call it once. Therefore, your example shows a poor usage of stored procedures. A bit like racing a tractor and a Ferrari across a ploughed field, and then claiming that the Ferrari is slow.

Does SQL Server cache LINQ to SQL queries?

When it comes to the execution plan in SQL Server there is no major differences in speed with stored procedures vs normal non-compiled queries. See this for more information about other factors about speed.

LINQ-to-SQL generates normal paramatised queries (SQL Server knows nothing about LINQ, the app generates normal SQL) so the balance is the same. That is if you aren't using the LINQ-to-SQL functions which call Stored Procs (assuming you meant the query stuff).

You may also want to read this, which is a similar question, but focuses on the more important benefits of LINQ vs StoredProcs

Why execute stored procedures is faster than SQL query from a script?

SQL Server basically goes through these steps to execute any query (stored procedure call or ad-hoc SQL statement):

1) syntactically check the query

2) if it's okay - it checks the plan cache to see if it already has an execution plan for that query

3) if there is an execution plan - that plan is (re-)used and the query executed

4) if there is no plan yet, an execution plan is determined

5) that plan is stored into the plan cache for later reuse

6) the query is executed

The point is: ad-hoc SQL and stored procedures are treatly no differently.

If an ad-hoc SQL query is properly using parameters - as it should anyway, to prevent SQL injection attacks - its performance characteristics are no different and most definitely no worse than executing a stored procedure.

Stored procedure have other benefits (no need to grant users direct table access, for instance), but in terms of performance, using properly parametrized ad-hoc SQL queries is just as efficient as using stored procedures.

Update: using stored procedures over non-parametrized queries is better for two main reasons:

  • since each non-parametrized query is a new, different query to SQL Server, it has to go through all the steps of determining the execution plan, for each query (thus wasting time - and also wasting plan cache space, since storing the execution plan into plan cache doesn't really help in the end, since that particular query will probably not be executed again)

  • non-parametrized queries are at risk of SQL injection attack and should be avoided at all costs



Related Topics



Leave a reply



Submit