Is the NOLOCK (SQL Server Hint) Bad Practice?

Is the NOLOCK (SQL Server hint) bad practice?

With the NOLOCK hint, the SELECT statement effectively runs at the READ UNCOMMITTED transaction isolation level. This means the query may see dirty and inconsistent data.
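As a quick illustration (dbo.Orders below is a hypothetical table), the per-table hint and the session-level isolation setting produce the same dirty-read behavior:

    -- Both reads skip shared locks and may return uncommitted ("dirty") data.

    -- Per-table hint
    SELECT OrderId, Status
    FROM dbo.Orders WITH (NOLOCK);

    -- Session-level setting: applies to every table touched by later statements
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    SELECT OrderId, Status
    FROM dbo.Orders;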

Applying it as a rule is not a good idea. Even if this dirty-read behavior is acceptable for your mission-critical, web-based application, a NOLOCK scan can cause error 601, which terminates the query due to data movement resulting from the lack of locking protection.

I suggest reading When Snapshot Isolation Helps and When It Hurts - the MSDN recommends using READ COMMITTED SNAPSHOT rather than SNAPSHOT under most circumstances.
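For reference, switching a database to READ COMMITTED SNAPSHOT is a one-time setting (the database name below is a placeholder); after that, plain READ COMMITTED readers get the last committed version instead of blocking or dirty-reading:

    -- Placeholder database name; WITH ROLLBACK IMMEDIATE disconnects other
    -- sessions so the option change can take effect.
    ALTER DATABASE YourDatabase
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;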

Is the WITH (NOLOCK) hint dangerous in my case?

Yes, there could well be issues if you are collecting data that has to be accurate - you may grab "dirty" / incorrect data and then persist it. Have you looked at whether indexing can correct the deadlock problem instead? Often a deadlock scenario can be solved by controlling the access pattern to the data pages, which is controlled by the available indexes.
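As a rough illustration of that idea (all names below are made up), a covering index can let the reading query seek straight to its rows instead of scanning over the pages a concurrent writer is holding locks on:

    -- Hypothetical covering index: the report's SELECT no longer has to touch
    -- the clustered index pages that the concurrent UPDATE is modifying.
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId)
    INCLUDE (Status, OrderDate);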

What I like to do is look at the deadlock graph to see where the conflict is, then look at the order of operations in the code, and which indexes are being used to access/modify the data, to see if I can adjust to remove the deadlock risk.
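One common way to get at the deadlock graph, assuming the default system_health Extended Events session is running (SQL Server 2012 or later), is to read it from the session's ring buffer:

    -- Pull recent deadlock graphs captured by the built-in system_health session.
    SELECT XEventData.XEvent.query('(data/value/deadlock)[1]') AS DeadlockGraph
    FROM
    (
        SELECT CAST(st.target_data AS XML) AS TargetData
        FROM sys.dm_xe_session_targets AS st
        JOIN sys.dm_xe_sessions AS s
            ON s.address = st.event_session_address
        WHERE s.name = N'system_health'
          AND st.target_name = N'ring_buffer'
    ) AS XmlData
    CROSS APPLY XmlData.TargetData.nodes
        ('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(XEvent);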

To NOLOCK or not to NOLOCK?

I see you've read a lot about it, but allow me to point you to a very good explanation of the dangers of using NOLOCK (that is, the READ UNCOMMITTED isolation level): SQL Server NOLOCK Hint & other poor ideas.

Apart from that, I'll add some quotes and comments. The worst part of NOLOCK is this:

It creates “incredibly hard to reproduce” bugs.

The problem is that when you read uncommitted data, most of the time that data does get committed, so everything looks fine. But it will randomly go wrong whenever the transaction is not committed after all. And that doesn't usually happen, right? Nope: first, a single error is a very bad thing (your customers don't like it). And second, things can get much worse:

The issue is that transactions do more than just update the row. Often they require an index to be updated or they run out of space on the data page. This may require new pages to be allocated and existing rows on that page to be moved, which is called a page split. It is possible for your SELECT to completely miss a number of rows and/or count other rows twice. More info on this is in the linked article.

So, that means that even if the uncommitted transaction you've read is eventually committed, you can still read bad data. And this will happen at random times. That's ugly, very ugly!
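Here is a minimal repro sketch of that effect (all object names are made up, and the details vary by version and data size): grow rows in one session so pages split while another session counts with NOLOCK. The true row count never changes, yet the NOLOCK count occasionally drifts.

    -- Setup (run once): lots of small rows that can later grow and force page splits.
    CREATE TABLE dbo.SplitDemo
    (
        Id     INT IDENTITY(1, 1) PRIMARY KEY,
        Filler VARCHAR(4000) NOT NULL DEFAULT ''
    );

    INSERT INTO dbo.SplitDemo (Filler)
    SELECT TOP (100000) ''
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b;

    -- Session 1: keep widening random rows so pages split while the scan runs.
    WHILE 1 = 1
        UPDATE dbo.SplitDemo
        SET Filler = REPLICATE('x', 3000)
        WHERE Id = ABS(CHECKSUM(NEWID())) % 100000 + 1;

    -- Session 2: run this repeatedly; the result should always be 100000,
    -- but under NOLOCK a row moved by a split can be read twice or missed.
    SELECT COUNT(*) AS RowsSeen
    FROM dbo.SplitDemo WITH (NOLOCK);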

What about corruption?

As Remus Rusanu said, it's not "hard" but "soft" corruption. And it especially affects aggregates, because you're reading what you shouldn't while those aggregates are being updated. This can lead, for example, to a wrong account balance.

Haven't you heard of big LOB apps that have procedures to rebuild account balances? Why? The balances should be correctly updated inside transactions! (That can be acceptable if the balances are rebuilt at critical moments, for example while calculating taxes.)
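The hedged sketch below (hypothetical dbo.Ledger and dbo.Accounts tables) shows the idea: post the ledger entry and adjust the balance in the same transaction, so there is nothing to rebuild later from dirty aggregate reads.

    DECLARE @AccountId INT = 42,               -- hypothetical account
            @Amount    DECIMAL(18, 2) = 100.00;

    BEGIN TRANSACTION;

        -- The movement and the running balance change commit (or roll back) together.
        INSERT INTO dbo.Ledger (AccountId, Amount, PostedAt)
        VALUES (@AccountId, @Amount, SYSUTCDATETIME());

        UPDATE dbo.Accounts
        SET Balance = Balance + @Amount
        WHERE AccountId = @AccountId;

    COMMIT TRANSACTION;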

What can I do without corrupting data (and thus is relatively safe)?

Let's say it's "quite safe" to read uncommitted data when you're not using it to update other existing data in the DB. That is, if you use NOLOCK only for reporting purposes (without write-back), you're on the "quite safe" side. The only "tiny trouble" is that the report can show wrong data, but at least the data in the DB stays consistent.

Whether this is safe depends on the purpose of what you're reading. If it's something informational that is not going to be used to make decisions, it's quite safe (for example, it's not very bad to have some errors in a report of the best customers or the most-sold products). But if you're getting this information to make decisions, things can be much worse (you can make a decision on a wrong basis!).
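A sketch of that "quite safe" reporting case (hypothetical dbo.OrderLines and dbo.Products tables): nothing read here is ever written back, so at worst the numbers are slightly off.

    -- Read-only "most sold products" report; tolerable if a few rows are off.
    SELECT TOP (10)
           p.ProductName,
           SUM(ol.Quantity) AS UnitsSold
    FROM dbo.OrderLines AS ol WITH (NOLOCK)
    JOIN dbo.Products   AS p  WITH (NOLOCK)
        ON p.ProductId = ol.ProductId
    GROUP BY p.ProductName
    ORDER BY UnitsSold DESC;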

A particular experience

I worked on the development of a 'crowded' application, with some 1,500 users, which used NOLOCK for reading data, modifying it and updating it in the DB (an HHRR/TEA company). And (apparently) there were no problems. The trick was that each employee read "atomic data" (a single employee's record) in order to modify it, and it was nearly impossible for two people to read and modify the same data at the same time. Besides, this "atomic data" didn't influence any aggregate data. So everything was fine. But from time to time there were problems in the reporting area, which read "aggregated data" with NOLOCK. So the critical reports had to be scheduled for moments when no one was working in the DB. The small deviations in non-critical reports were overlooked and considered acceptable.

Now you know. You have no excuses. You decide: to NOLOCK or not to NOLOCK.

Does NOLOCK hint slow down operation?

The short answer to the question as stated is: "No."

In most cases the NOLOCK hint will speed up the query in question, as well as any other queries operating against the specified table at the same time. The reason is that no locks are checked or obtained. You've listed the possible side effects in your question, so I won't cover those here.

At the end of the day the query will be faster, but the results will be suspect.
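A two-session sketch (hypothetical dbo.Orders table) of where that speed comes from: the NOLOCK read simply skips waiting on the other session's exclusive lock and accepts whatever value is on the page.

    -- Session 1: update a row and leave the transaction open (holds an X lock).
    BEGIN TRANSACTION;
    UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = 1;
    -- ...no COMMIT yet...

    -- Session 2: a normal read blocks until session 1 commits or rolls back.
    SELECT Status FROM dbo.Orders WHERE OrderId = 1;

    -- Session 2 with NOLOCK: returns immediately, but reports 'Shipped'
    -- even if session 1 later rolls back.
    SELECT Status FROM dbo.Orders WITH (NOLOCK) WHERE OrderId = 1;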

Do I get any performance gain by using WITH (NOLOCK) on SQL Server database where READ_COMMITTED_SNAPSHOT is switched on?

In theory there should be an improvement: although the query will not block in either case, in the READ_COMMITTED_SNAPSHOT case the bookkeeping associated with the snapshot is still required, so that the database knows when it needs to create, retain, and clean up the versioned rows/pages.

However, as with all performance questions, you should try it and see whether the difference exists in practice and, if it does, whether it matters for your use case.
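A minimal way to "try it and see" (dbo.BigTable is a placeholder for one of your large tables) on a database where READ_COMMITTED_SNAPSHOT is switched on:

    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    -- Versioned read: the default once READ_COMMITTED_SNAPSHOT is enabled.
    SELECT COUNT_BIG(*) FROM dbo.BigTable;

    -- Dirty read: no shared locks, no version lookups.
    SELECT COUNT_BIG(*) FROM dbo.BigTable WITH (NOLOCK);

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;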

Understanding the NOLOCK hint

The NOLOCK table hint means that no shared locks will be taken on the table in question; the READUNCOMMITTED isolation level does the same, but it applies not to a single table but to everything involved in the query. So the answer is 'no, it won't lock the table'. Note that schema stability locks will still be held even with READUNCOMMITTED.
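You can see that schema stability (Sch-S) lock for yourself; a rough sketch (dbo.BigTable is a placeholder, and the scan has to run long enough for you to look from a second session):

    -- Session 1: a long NOLOCK scan.
    SELECT COUNT_BIG(*) FROM dbo.BigTable WITH (NOLOCK);

    -- Session 2 (while session 1 is still running): the Sch-S lock shows up.
    SELECT request_session_id, resource_type, request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE request_mode = 'Sch-S';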

Because it does not ask for shared locks, the read operation will potentially read dirty data (updated but not yet committed) and non-existent data (updated, but rolled back), which is transactionally inconsistent in any case, and it might even skip whole pages (if page splits happen simultaneously with the read operation).

Specifying either NOLOCK or READUNCOMMITTED is not considered good practice. It will be faster, of course. Just make sure you're aware of the consequences.

Also, according to the docs, support for these hints in UPDATE and DELETE statements will be removed in a future version.

Can NOLOCK cause DISTINCT to fail?

Sure, this can happen. The engine is smart enough to know that, since ID is your primary key, it isn't going to waste resources looking for duplicates. However, you have introduced the dreaded NOLOCK hint, and you said that TableB is being updated during this process.

What you are almost certainly experiencing here is one of the side effects of NOLOCK brought on by page splits. These page splits can cause the engine to return duplicate rows, and, as I said before, the engine assumes you have no duplicates because you are selecting the primary key and there can't be duplicates. This is NOT a bug in SQL Server; it is yet another reason to stop using the hint.
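A sketch of the failing pattern (TableA and TableB stand in for the tables in the question): because ID is TableB's primary key, the engine can treat the DISTINCT as already satisfied, so an ID handed back twice by the NOLOCK scan reaches TableA and violates its primary key.

    -- Can fail with a PK violation while TableB is being updated concurrently:
    INSERT INTO dbo.TableA (ID)
    SELECT DISTINCT ID
    FROM dbo.TableB WITH (NOLOCK);

    -- The straightforward fix is to drop the hint:
    INSERT INTO dbo.TableA (ID)
    SELECT DISTINCT ID
    FROM dbo.TableB;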

Effect of NOLOCK hint in SELECT statements

1) Yes, a SELECT with NOLOCK will complete faster than a normal SELECT.

2) Yes, a SELECT with NOLOCK will allow other queries against the affected table to complete faster than a normal SELECT.

Why would this be?

NOLOCK typically (depending on your DB engine) means "give me your data, and I don't care what state it is in, and don't bother holding it still while you read from it." It is at once faster, less resource-intensive, and very, very dangerous.

Be warned: never do an update from, or perform anything system-critical or requiring absolute correctness with, data that originated from a NOLOCK read. It is absolutely possible that this data contains rows that were deleted during the query's run or that have been deleted in other sessions but not yet finalized. It is possible that this data includes rows that have been partially updated. It is possible that this data contains records that violate foreign key constraints. It is possible that this data excludes rows that have been added to the table but have yet to be committed.

You really have no way to know what the state of the data is.

If you're trying to get things like a Row Count or other summary data where some margin of error is acceptable, then NOLOCK is a good way to boost performance for these queries and avoid having them negatively impact database performance.

Always use the NOLOCK hint with great caution and treat any data it returns suspiciously.


