Varchar(Max) Ms SQL Server 2000, Problems

varchar(max) MS SQL Server 2000, problems?

It sounds like the varchar(MAX) limitations are a moot point if your live DB is SQL Server 2000, which doesn't support them. If you have more than 8K characters to store you are pretty much left with the only other option, a TEXT column. However, beware that TEXT columns have a lot of limitations too.

For example you can't sort or group on them easily, nor can you compare them for equivalency with other columns. That is you can't say Select * from mytable where Mytext1 = mytext2.

Other relevant concerns:

  • I'd suggest using an NText or NVarchar column regardless of the way you go to support Unicode.
  • If the table has a lot of other columns and the varchar(8000) column is likely to be frequently close to full, you may have problems with the row limit of 8K. Keep this in mind too.

Why not use varchar(max)?

My answer to this, isn't about the usage of Max, as much as it is about the reason for VARCHAR(max) vs TEXT.

In my book; first of all, Unless you can be absolutely certain that you'll never encode anything but english text and people won't refer to names of foreign locations, then you should use NVARCHAR or NTEXT.

Secondly, it's what the fields allow you to do.

TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.

On the other hand, VARCHAR(MAX) has some ambiguity, if the size of the cell is < 8000 chars, it will be treated as Row data. If it's greater, it will be treated as a LOB for storage purposes.
Because you can't know this without querying RBAR, this may have optimization strategies for places where you need to be sure about your data and how many reads it costs.

Otherwise, if your usage is relatively mundane and you don't expect to have problems with the size of data (IE you're using .Net and therefore don't have to be concerned about the size of your string/char* objects) then using VARCHAR(max) is fine.

Varchar(max) showing as text in SQL Server Management Studio

If your database engine is running on SQL Server 2000, or if your database compatibility level is set to "80" (SQL Server 2000), then that database doesn't know about varchar(max) yet.

Change your compatibility level to a more recent version!

SQL NVARCHAR and VARCHAR Limits

I understand that there is a 4000 max set for NVARCHAR(MAX)

Your understanding is wrong. nvarchar(max) can store up to (and beyond sometimes) 2GB of data (1 billion double byte characters).

From nchar and nvarchar in Books online the grammar is

nvarchar [ ( n | max ) ]

The | character means these are alternatives. i.e. you specify either n or the literal max.

If you choose to specify a specific n then this must be between 1 and 4,000 but using max defines it as a large object datatype (replacement for ntext which is deprecated).

In fact in SQL Server 2008 it seems that for a variable the 2GB limit can be exceeded indefinitely subject to sufficient space in tempdb (Shown here)

Regarding the other parts of your question

Truncation when concatenating depends on datatype.

  1. varchar(n) + varchar(n) will truncate at 8,000 characters.
  2. nvarchar(n) + nvarchar(n) will truncate at 4,000 characters.
  3. varchar(n) + nvarchar(n) will truncate at 4,000 characters. nvarchar has higher precedence so the result is nvarchar(4,000)
  4. [n]varchar(max) + [n]varchar(max) won't truncate (for < 2GB).
  5. varchar(max) + varchar(n) won't truncate (for < 2GB) and the result will be typed as varchar(max).
  6. varchar(max) + nvarchar(n) won't truncate (for < 2GB) and the result will be typed as nvarchar(max).
  7. nvarchar(max) + varchar(n) will first convert the varchar(n) input to nvarchar(n) and then do the concatenation. If the length of the varchar(n) string is greater than 4,000 characters the cast will be to nvarchar(4000) and truncation will occur.

Datatypes of string literals

If you use the N prefix and the string is <= 4,000 characters long it will be typed as nvarchar(n) where n is the length of the string. So N'Foo' will be treated as nvarchar(3) for example. If the string is longer than 4,000 characters it will be treated as nvarchar(max)

If you don't use the N prefix and the string is <= 8,000 characters long it will be typed as varchar(n) where n is the length of the string. If longer as varchar(max)

For both of the above if the length of the string is zero then n is set to 1.

Newer syntax elements.

1. The CONCAT function doesn't help here

DECLARE @A5000 VARCHAR(5000) = REPLICATE('A',5000);

SELECT DATALENGTH(@A5000 + @A5000),
DATALENGTH(CONCAT(@A5000,@A5000));

The above returns 8000 for both methods of concatenation.

2. Be careful with +=

DECLARE @A VARCHAR(MAX) = '';

SET @A+= REPLICATE('A',5000) + REPLICATE('A',5000)

DECLARE @B VARCHAR(MAX) = '';

SET @B = @B + REPLICATE('A',5000) + REPLICATE('A',5000)

SELECT DATALENGTH(@A),
DATALENGTH(@B);`

Returns

-------------------- --------------------
8000 10000

Note that @A encountered truncation.

How to resolve the problem you are experiencing.

You are getting truncation either because you are concatenating two non max datatypes together or because you are concatenating a varchar(4001 - 8000) string to an nvarchar typed string (even nvarchar(max)).

To avoid the second issue simply make sure that all string literals (or at least those with lengths in the 4001 - 8000 range) are prefaced with N.

To avoid the first issue change the assignment from

DECLARE @SQL NVARCHAR(MAX);
SET @SQL = 'Foo' + 'Bar' + ...;

To

DECLARE @SQL NVARCHAR(MAX) = ''; 
SET @SQL = @SQL + N'Foo' + N'Bar'

so that an NVARCHAR(MAX) is involved in the concatenation from the beginning (as the result of each concatenation will also be NVARCHAR(MAX) this will propagate)

Avoiding truncation when viewing

Make sure you have "results to grid" mode selected then you can use

select @SQL as [processing-instruction(x)] FOR XML PATH 

The SSMS options allow you to set unlimited length for XML results. The processing-instruction bit avoids issues with characters such as < showing up as <.

Maximum size of a varchar(max) variable

As far as I can tell there is no upper limit in 2008.

In SQL Server 2005 the code in your question fails on the assignment to the @GGMMsg variable with

Attempting to grow LOB beyond maximum allowed size of 2,147,483,647
bytes.

the code below fails with

REPLICATE: The length of the result exceeds the length limit (2GB) of
the target large type.

However it appears these limitations have quietly been lifted. On 2008

DECLARE @y VARCHAR(MAX) = REPLICATE(CAST('X' AS VARCHAR(MAX)),92681); 

SET @y = REPLICATE(@y,92681);

SELECT LEN(@y)

Returns

8589767761

I ran this on my 32 bit desktop machine so this 8GB string is way in excess of addressable memory

Running

select internal_objects_alloc_page_count
from sys.dm_db_task_space_usage
WHERE session_id = @@spid

Returned

internal_objects_alloc_page_co 
------------------------------
2144456

so I presume this all just gets stored in LOB pages in tempdb with no validation on length. The page count growth was all associated with the SET @y = REPLICATE(@y,92681); statement. The initial variable assignment to @y and the LEN calculation did not increase this.

The reason for mentioning this is because the page count is hugely more than I was expecting. Assuming an 8KB page then this works out at 16.36 GB which is obviously more or less double what would seem to be necessary. I speculate that this is likely due to the inefficiency of the string concatenation operation needing to copy the entire huge string and append a chunk on to the end rather than being able to add to the end of the existing string. Unfortunately at the moment the .WRITE method isn't supported for varchar(max) variables.

Paul White confirms the above supposition here and also provides the information that the variable is held entirely in memory if <= 512KB and changes to the tempdb-based backup scheme for values larger than that.

Addition

I've also tested the behaviour with concatenating nvarchar(max) + nvarchar(max) and nvarchar(max) + varchar(max). Both of these allow the 2GB limit to be exceeded. Trying to then store the results of this in a table then fails however with the error message Attempting to grow LOB beyond maximum allowed size of 2147483647 bytes. again. The script for that is below (may take a long time to run).

DECLARE @y1 VARCHAR(MAX) = REPLICATE(CAST('X' AS VARCHAR(MAX)),2147483647); 
SET @y1 = @y1 + @y1;
SELECT LEN(@y1), DATALENGTH(@y1) /*4294967294, 4294967292*/

DECLARE @y2 NVARCHAR(MAX) = REPLICATE(CAST('X' AS NVARCHAR(MAX)),1073741823);
SET @y2 = @y2 + @y2;
SELECT LEN(@y2), DATALENGTH(@y2) /*2147483646, 4294967292*/

DECLARE @y3 NVARCHAR(MAX) = @y2 + @y1
SELECT LEN(@y3), DATALENGTH(@y3) /*6442450940, 12884901880*/

/*This attempt fails*/
SELECT @y1 y1, @y2 y2, @y3 y3
INTO Test

Using varchar(MAX) vs TEXT on SQL Server

The VARCHAR(MAX) type is a replacement for TEXT. The basic difference is that a TEXT type will always store the data in a blob whereas the VARCHAR(MAX) type will attempt to store the data directly in the row unless it exceeds the 8k limitation and at that point it stores it in a blob.

Using the LIKE statement is identical between the two datatypes. The additional functionality VARCHAR(MAX) gives you is that it is also can be used with = and GROUP BY as any other VARCHAR column can be. However, if you do have a lot of data you will have a huge performance issue using these methods.

In regard to if you should use LIKE to search, or if you should use Full Text Indexing and CONTAINS. This question is the same regardless of VARCHAR(MAX) or TEXT.

If you are searching large amounts of text and performance is key then you should use a Full Text Index.

LIKE is simpler to implement and is often suitable for small amounts of data, but it has extremely poor performance with large data due to its inability to use an index.

Are there any disadvantages to always using nvarchar(MAX)?

Same question was asked on MSDN Forums:

  • Varchar(max) vs Varchar(255)

From the original post (much more information there):

When you store data to a VARCHAR(N) column, the values are physically stored in the same way. But when you store it to a VARCHAR(MAX) column, behind the screen the data is handled as a TEXT value. So there is some additional processing needed when dealing with a VARCHAR(MAX) value. (only if the size exceeds 8000)

VARCHAR(MAX) or NVARCHAR(MAX) is considered as a 'large value type'. Large value types are usually stored 'out of row'. It means that the data row will have a pointer to another location where the 'large value' is stored...



Related Topics



Leave a reply



Submit