Are There Any Disadvantages to Always Using Nvarchar(Max)

Are there any disadvantages to always using nvarchar(MAX)?

The same question was asked on the MSDN Forums:

  • Varchar(max) vs Varchar(255)

From the original post (much more information there):

When you store data to a VARCHAR(N) column, the values are physically stored in the same way. But when you store it to a VARCHAR(MAX) column, behind the scenes the data is handled as a TEXT value. So there is some additional processing needed when dealing with a VARCHAR(MAX) value (only if the size exceeds 8,000 bytes).

VARCHAR(MAX) or NVARCHAR(MAX) is considered as a 'large value type'. Large value types are usually stored 'out of row'. It means that the data row will have a pointer to another location where the 'large value' is stored...
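
As far as I know, the default is for a MAX value to stay in the row when it fits and to move out of row only when it does not; the table option 'large value types out of row' changes that. A minimal sketch, assuming a hypothetical table dbo.NoteDemo (the table and column names are made up for illustration):

```sql
-- Hypothetical demo table; the names are illustrative only.
IF OBJECT_ID(N'dbo.NoteDemo', N'U') IS NOT NULL
    DROP TABLE dbo.NoteDemo;

CREATE TABLE dbo.NoteDemo
(
    NoteId   int IDENTITY(1,1) PRIMARY KEY,
    NoteText nvarchar(max) NULL
);

-- 0 (the default): MAX values are kept in the row when they fit.
-- 1: MAX values are always stored out of row; the row keeps only a pointer.
EXEC sp_tableoption N'dbo.NoteDemo', 'large value types out of row', 1;

-- Check the current setting for the table.
SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE object_id = OBJECT_ID(N'dbo.NoteDemo');
```

Changing the option should only affect values written afterwards; existing rows are not rewritten until they are updated.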

Good practices of SQL Server: nvarchar(max) performance

Best practice is to perform the appropriate data analysis BEFORE you design your table. Without context, one can assume that a description does not consist of pages and pages of text, so the "max" choice is probably not appropriate. As an additional consideration when choosing varchar(max), remember that you typically need to provide support for displaying such values in an application. If you do not intend to design a GUI to do so, then the choice is probably not appropriate.
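
One way to do that analysis, when sample or legacy data already exists, is simply to measure it. A rough sketch, using a table variable with made-up rows in place of a real table (the @Sample name, the Description column and the values are all assumptions for illustration):

```sql
-- Stand-in for real data; replace @Sample with the table being analysed.
DECLARE @Sample TABLE ([Description] nvarchar(max) NULL);

INSERT INTO @Sample ([Description])
VALUES (N'Red widget'),
       (N'Blue widget, large'),
       (N'Replacement gasket for model X');

-- LEN counts characters and ignores trailing spaces.
SELECT MAX(LEN([Description])) AS MaxChars,
       AVG(LEN([Description])) AS AvgChars,
       COUNT(*)                AS TotalRows
FROM @Sample;
```

If the longest real value is a few hundred characters, that is a reasonable hint that a bounded nvarchar(n) fits the data better than nvarchar(max).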

And one more caveat - it is generally futile to attempt to future-proof your schema by choosing datatypes that exceed your foreseeable needs.

Are there disadvantages to using VARCHAR(MAX) in a table?

Sounds to me like you plan to use the varchar(MAX) data type for its intended purpose.

When data in a MAX data type exceeds 8 KB, an overflow page is used. SQL Server 2005 automatically assigns an overflow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types.
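
To see where the pages actually end up, the per-partition page counts can be inspected. This is only a sketch, assuming a hypothetical table dbo.OverflowDemo and a login that is allowed to read sys.dm_db_partition_stats:

```sql
-- Hypothetical demo table; the names are illustrative only.
IF OBJECT_ID(N'dbo.OverflowDemo', N'U') IS NOT NULL
    DROP TABLE dbo.OverflowDemo;

CREATE TABLE dbo.OverflowDemo
(
    Id      int IDENTITY(1,1) PRIMARY KEY,
    Payload varchar(max) NULL
);

-- One value that fits in the row and one that cannot (20,000 bytes).
-- The CAST matters: without it REPLICATE truncates at 8,000 bytes.
INSERT INTO dbo.OverflowDemo (Payload) VALUES ('short value');
INSERT INTO dbo.OverflowDemo (Payload)
VALUES (REPLICATE(CAST('x' AS varchar(max)), 20000));

-- Page counts per storage category for the table.
SELECT in_row_used_page_count,
       lob_used_page_count,
       row_overflow_used_page_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID(N'dbo.OverflowDemo');
```

My understanding is that an oversized MAX value is counted under the LOB pages, while the row-overflow counter is for varchar(n) columns pushed off a row that exceeds the row size limit.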

For further reading, check out Books Online: char and varchar

Is an nvarchar(max) less performant than an nvarchar(100), for instance?

The same question was answered here (SO) and here (MSDN).

Quoting David Kreps's answer:

When you store data to a VARCHAR(N) column, the values are physically stored in the same way. But when you store it to a VARCHAR(MAX) column, behind the scenes the data is handled as a TEXT value. So there is some additional processing needed when dealing with a VARCHAR(MAX) value (only if the size exceeds 8,000 bytes).

VARCHAR(MAX) or NVARCHAR(MAX) is considered as a 'large value type'. Large value types are usually stored 'out of row'. It means that the data row will have a pointer to another location where the 'large value' is stored...

How much length can NVARCHAR(MAX) store?

An nvarchar(MAX) can store up to 2 GB of data, or more precisely 2^31-1 bytes (2,147,483,647 bytes). Each character in an nvarchar is 2 bytes in size (supplementary characters stored as surrogate pairs take 4), so an nvarchar(MAX) can store up to 2,147,483,647 / 2 = 1,073,741,823 characters.

So, to answer your question "Could you fit 20,000 characters into an nvarchar(MAX)?": Yes, you could (more than 50,000 times over).
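
A quick way to convince yourself is to build a 20,000-character value and measure it. A small sketch (the variable name is arbitrary; the inline DECLARE initialiser needs SQL Server 2008 or later):

```sql
-- The CAST to nvarchar(max) matters: without it, REPLICATE truncates
-- its result at 8,000 bytes (4,000 nvarchar characters).
DECLARE @s nvarchar(max) = REPLICATE(CAST(N'x' AS nvarchar(max)), 20000);

SELECT LEN(@s)        AS CharCount,  -- 20000 characters
       DATALENGTH(@s) AS ByteCount;  -- 40000 bytes (2 bytes per character)
```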

Implications of using nvarchar(max) with non-indexed fields

It being indexed or not comes into play when you're trying to search on that data. So, if it's data that just comes along for the ride when you query on other fields, then it shouldn't be too bad. There's a little more to it (i.e. covering indexes), but that's the basic gist of it.
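
On the covering-index point: as far as I know, an nvarchar(max) column cannot be an index key column, but it can ride along as an included column. A sketch, assuming a hypothetical dbo.ArticleDemo table (all names here are made up):

```sql
-- Hypothetical demo table; the names are illustrative only.
IF OBJECT_ID(N'dbo.ArticleDemo', N'U') IS NOT NULL
    DROP TABLE dbo.ArticleDemo;

CREATE TABLE dbo.ArticleDemo
(
    ArticleId int IDENTITY(1,1) PRIMARY KEY,
    Title     nvarchar(200) NOT NULL,
    Body      nvarchar(max) NULL
);

-- Search on the bounded column; Body is carried in the index leaf level
-- so the query below can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_ArticleDemo_Title
    ON dbo.ArticleDemo (Title)
    INCLUDE (Body);

-- This variant is rejected, because a MAX column cannot be an index key:
-- CREATE NONCLUSTERED INDEX IX_ArticleDemo_Body ON dbo.ArticleDemo (Body);

SELECT Title, Body
FROM dbo.ArticleDemo
WHERE Title = N'Some title';
```

Including a large column duplicates its data in the index, so this trades storage and write cost for read speed.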

Efficiency of varchar(max) in T-SQL code

One answer to "Are there any disadvantages to always using nvarchar(MAX)?" (https://stackoverflow.com/a/26120578/489865) deals with the performance of T-SQL variables rather than column definitions.

The gist of that post is to run SELECT @var='ABC' queries returning 1,000,000 rows, assigning to variables defined as nvarchar(n) versus nvarchar(max).

Under SQL Server 2008 R2, I concur with the poster's findings that nvarchar(max) is 4 times slower than nvarchar(n) in the example. Interestingly, if the assignment is changed to do slightly more work, as in:

SET NOCOUNT ON;

--===== Test Variable Assignment 1,000,000 times using NVARCHAR(300)
DECLARE @SomeString NVARCHAR(300),
        @StartTime  DATETIME
;
 SELECT @StartTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2  -- deliberate cross join to generate rows
;
 SELECT Duration = DATEDIFF(ms,@StartTime,GETDATE())
;
GO
--===== Test Variable Assignment 1,000,000 times using NVARCHAR(4000)
DECLARE @SomeString NVARCHAR(4000),
        @StartTime  DATETIME
;
 SELECT @StartTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2
;
 SELECT Duration = DATEDIFF(ms,@StartTime,GETDATE())
;
GO
--===== Test Variable Assignment 1,000,000 times using NVARCHAR(MAX)
DECLARE @SomeString NVARCHAR(MAX),
        @StartTime  DATETIME
;
 SELECT @StartTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2
;
 SELECT Duration = DATEDIFF(ms,@StartTime,GETDATE())
;
GO

(note the + ac1.[name] + ac2.[name]), then nvarchar(max) takes only twice as long. So in practice the performance hit for nvarchar(max) may be smaller than it first seems.

Is varchar(MAX) always preferable?

There is a very good article on this subject by SO user @Remus Rusanu. Here is a snippet that I've stolen, but I suggest you read the whole thing:

The code path that handles the MAX types (varchar, nvarchar and
varbinary) is different from the code path that handles their
equivalent non-max length types. The non-max types can internally be
represented as an ordinary pointer-and-length structure. But the max
types cannot be stored internally as a contiguous memory area, since
they can possibly grow up to 2 GB. So they have to be represented by a
streaming interface, similar to COM’s IStream. This carries over to
every operation that involves the max types, including simple
assignment and comparison, since these operations are more complicated
over a streaming interface. The biggest impact is visible in the code
that allocates and assigns max-type variables (my first test), but the
impact is visible on every operation.

In the article he shows several examples that demonstrate that using varchar(n) typically improves performance.

You can find the entire article here.


