varchar(max) MS SQL Server 2000, problems?
It sounds like the varchar(MAX)
limitations are a moot point if your live DB is SQL Server 2000, which doesn't support them. If you have more than 8K characters to store you are pretty much left with the only other option, a TEXT
column. However, beware that TEXT
columns have a lot of limitations too.
For example you can't sort or group on them easily, nor can you compare them for equivalency with other columns. That is you can't say Select * from mytable where Mytext1 = mytext2
.
Other relevant concerns:
- I'd suggest using an
NText
orNVarchar
column regardless of the way you go to support Unicode. - If the table has a lot of other columns and the
varchar(8000)
column is likely to be frequently close to full, you may have problems with the row limit of 8K. Keep this in mind too.
Why not use varchar(max)?
My answer to this, isn't about the usage of Max, as much as it is about the reason for VARCHAR(max) vs TEXT.
In my book; first of all, Unless you can be absolutely certain that you'll never encode anything but english text and people won't refer to names of foreign locations, then you should use NVARCHAR or NTEXT.
Secondly, it's what the fields allow you to do.
TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.
On the other hand, VARCHAR(MAX) has some ambiguity, if the size of the cell is < 8000 chars, it will be treated as Row data. If it's greater, it will be treated as a LOB for storage purposes.
Because you can't know this without querying RBAR, this may have optimization strategies for places where you need to be sure about your data and how many reads it costs.
Otherwise, if your usage is relatively mundane and you don't expect to have problems with the size of data (IE you're using .Net and therefore don't have to be concerned about the size of your string/char* objects) then using VARCHAR(max) is fine.
Varchar(max) showing as text in SQL Server Management Studio
If your database engine is running on SQL Server 2000, or if your database compatibility level is set to "80" (SQL Server 2000), then that database doesn't know about varchar(max)
yet.
Change your compatibility level to a more recent version!
SQL NVARCHAR and VARCHAR Limits
I understand that there is a 4000 max set for
NVARCHAR(MAX)
Your understanding is wrong. nvarchar(max)
can store up to (and beyond sometimes) 2GB of data (1 billion double byte characters).
From nchar and nvarchar in Books online the grammar is
nvarchar [ ( n | max ) ]
The |
character means these are alternatives. i.e. you specify either n
or the literal max
.
If you choose to specify a specific n
then this must be between 1 and 4,000 but using max
defines it as a large object datatype (replacement for ntext
which is deprecated).
In fact in SQL Server 2008 it seems that for a variable the 2GB limit can be exceeded indefinitely subject to sufficient space in tempdb
(Shown here)
Regarding the other parts of your question
Truncation when concatenating depends on datatype.
varchar(n) + varchar(n)
will truncate at 8,000 characters.nvarchar(n) + nvarchar(n)
will truncate at 4,000 characters.varchar(n) + nvarchar(n)
will truncate at 4,000 characters.nvarchar
has higher precedence so the result isnvarchar(4,000)
[n]varchar(max)
+[n]varchar(max)
won't truncate (for < 2GB).varchar(max)
+varchar(n)
won't truncate (for < 2GB) and the result will be typed asvarchar(max)
.varchar(max)
+nvarchar(n)
won't truncate (for < 2GB) and the result will be typed asnvarchar(max)
.nvarchar(max)
+varchar(n)
will first convert thevarchar(n)
input tonvarchar(n)
and then do the concatenation. If the length of thevarchar(n)
string is greater than 4,000 characters the cast will be tonvarchar(4000)
and truncation will occur.
Datatypes of string literals
If you use the N
prefix and the string is <= 4,000 characters long it will be typed as nvarchar(n)
where n
is the length of the string. So N'Foo'
will be treated as nvarchar(3)
for example. If the string is longer than 4,000 characters it will be treated as nvarchar(max)
If you don't use the N
prefix and the string is <= 8,000 characters long it will be typed as varchar(n)
where n
is the length of the string. If longer as varchar(max)
For both of the above if the length of the string is zero then n
is set to 1.
Newer syntax elements.
1. The CONCAT
function doesn't help here
DECLARE @A5000 VARCHAR(5000) = REPLICATE('A',5000);
SELECT DATALENGTH(@A5000 + @A5000),
DATALENGTH(CONCAT(@A5000,@A5000));
The above returns 8000 for both methods of concatenation.
2. Be careful with +=
DECLARE @A VARCHAR(MAX) = '';
SET @A+= REPLICATE('A',5000) + REPLICATE('A',5000)
DECLARE @B VARCHAR(MAX) = '';
SET @B = @B + REPLICATE('A',5000) + REPLICATE('A',5000)
SELECT DATALENGTH(@A),
DATALENGTH(@B);`
Returns
-------------------- --------------------
8000 10000
Note that @A
encountered truncation.
How to resolve the problem you are experiencing.
You are getting truncation either because you are concatenating two non max
datatypes together or because you are concatenating a varchar(4001 - 8000)
string to an nvarchar
typed string (even nvarchar(max)
).
To avoid the second issue simply make sure that all string literals (or at least those with lengths in the 4001 - 8000 range) are prefaced with N
.
To avoid the first issue change the assignment from
DECLARE @SQL NVARCHAR(MAX);
SET @SQL = 'Foo' + 'Bar' + ...;
To
DECLARE @SQL NVARCHAR(MAX) = '';
SET @SQL = @SQL + N'Foo' + N'Bar'
so that an NVARCHAR(MAX)
is involved in the concatenation from the beginning (as the result of each concatenation will also be NVARCHAR(MAX)
this will propagate)
Avoiding truncation when viewing
Make sure you have "results to grid" mode selected then you can use
select @SQL as [processing-instruction(x)] FOR XML PATH
The SSMS options allow you to set unlimited length for XML
results. The processing-instruction
bit avoids issues with characters such as <
showing up as <
.
Maximum size of a varchar(max) variable
As far as I can tell there is no upper limit in 2008.
In SQL Server 2005 the code in your question fails on the assignment to the @GGMMsg
variable with
Attempting to grow LOB beyond maximum allowed size of 2,147,483,647
bytes.
the code below fails with
REPLICATE: The length of the result exceeds the length limit (2GB) of
the target large type.
However it appears these limitations have quietly been lifted. On 2008
DECLARE @y VARCHAR(MAX) = REPLICATE(CAST('X' AS VARCHAR(MAX)),92681);
SET @y = REPLICATE(@y,92681);
SELECT LEN(@y)
Returns
8589767761
I ran this on my 32 bit desktop machine so this 8GB string is way in excess of addressable memory
Running
select internal_objects_alloc_page_count
from sys.dm_db_task_space_usage
WHERE session_id = @@spid
Returned
internal_objects_alloc_page_co
------------------------------
2144456
so I presume this all just gets stored in LOB
pages in tempdb
with no validation on length. The page count growth was all associated with the SET @y = REPLICATE(@y,92681);
statement. The initial variable assignment to @y
and the LEN
calculation did not increase this.
The reason for mentioning this is because the page count is hugely more than I was expecting. Assuming an 8KB page then this works out at 16.36 GB which is obviously more or less double what would seem to be necessary. I speculate that this is likely due to the inefficiency of the string concatenation operation needing to copy the entire huge string and append a chunk on to the end rather than being able to add to the end of the existing string. Unfortunately at the moment the .WRITE
method isn't supported for varchar(max) variables.
Paul White confirms the above supposition here and also provides the information that the variable is held entirely in memory if <= 512KB and changes to the tempdb-based backup scheme for values larger than that.
Addition
I've also tested the behaviour with concatenating nvarchar(max) + nvarchar(max)
and nvarchar(max) + varchar(max)
. Both of these allow the 2GB limit to be exceeded. Trying to then store the results of this in a table then fails however with the error message Attempting to grow LOB beyond maximum allowed size of 2147483647 bytes.
again. The script for that is below (may take a long time to run).
DECLARE @y1 VARCHAR(MAX) = REPLICATE(CAST('X' AS VARCHAR(MAX)),2147483647);
SET @y1 = @y1 + @y1;
SELECT LEN(@y1), DATALENGTH(@y1) /*4294967294, 4294967292*/
DECLARE @y2 NVARCHAR(MAX) = REPLICATE(CAST('X' AS NVARCHAR(MAX)),1073741823);
SET @y2 = @y2 + @y2;
SELECT LEN(@y2), DATALENGTH(@y2) /*2147483646, 4294967292*/
DECLARE @y3 NVARCHAR(MAX) = @y2 + @y1
SELECT LEN(@y3), DATALENGTH(@y3) /*6442450940, 12884901880*/
/*This attempt fails*/
SELECT @y1 y1, @y2 y2, @y3 y3
INTO Test
Using varchar(MAX) vs TEXT on SQL Server
The VARCHAR(MAX)
type is a replacement for TEXT
. The basic difference is that a TEXT
type will always store the data in a blob whereas the VARCHAR(MAX)
type will attempt to store the data directly in the row unless it exceeds the 8k limitation and at that point it stores it in a blob.
Using the LIKE statement is identical between the two datatypes. The additional functionality VARCHAR(MAX)
gives you is that it is also can be used with =
and GROUP BY
as any other VARCHAR
column can be. However, if you do have a lot of data you will have a huge performance issue using these methods.
In regard to if you should use LIKE
to search, or if you should use Full Text Indexing and CONTAINS
. This question is the same regardless of VARCHAR(MAX)
or TEXT
.
If you are searching large amounts of text and performance is key then you should use a Full Text Index.
LIKE
is simpler to implement and is often suitable for small amounts of data, but it has extremely poor performance with large data due to its inability to use an index.
Are there any disadvantages to always using nvarchar(MAX)?
Same question was asked on MSDN Forums:
- Varchar(max) vs Varchar(255)
From the original post (much more information there):
When you store data to a VARCHAR(N) column, the values are physically stored in the same way. But when you store it to a VARCHAR(MAX) column, behind the screen the data is handled as a TEXT value. So there is some additional processing needed when dealing with a VARCHAR(MAX) value. (only if the size exceeds 8000)
VARCHAR(MAX) or NVARCHAR(MAX) is considered as a 'large value type'. Large value types are usually stored 'out of row'. It means that the data row will have a pointer to another location where the 'large value' is stored...
Related Topics
Insert Binary File into Mssql Db (Varbinary) with Python Pymssql
Sql "If Exists..." Dynamic Query
Cannot Connect to SQL Server Express from Ssms
How to Get Id of Newly Inserted Record Using Excel Vba
How to Execute Different Select Statements Based on a Case
Generated Excel from Ssis But Getting Quote in Every Column
How to Generate All Constraints Scripts
Update Multiple Rows Using Select Statement
Sql: Insert a Linebreak in Varchar String
Importing and Validating Xml File Using Ssis or Just Plain T-Sql
How to Add Months to a Current_Timestamp in Sql
Writing SQL Query for Getting Maximum Occurrence of a Value in a Column
Sql Query - Limiting Query Results
T-Sql Row Number Restart After N
MySQL to Get The Count of Rows That Fall on a Date for Each Day of a Month
Data Type Equivalents: Ms Access Tables ↔ 'Create Table' Queries ↔ Odbc Sql