Difference Between Varchar(500) VS Varchar(Max) in SQL Server

difference between varchar(500) vs varchar(max) in sql server

In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.

To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.

To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.

Microsoft recommend using MAX data types instead of BLOBs in SQL Server 2005. In fact, BLOBs are being deprecated in future releases of SQL Server.

Credit: http://www.teratrax.com/articles/varchar_max.html


In SQL Server 2005 and SQL Server 2008, The maximum storage size for VARCHAR(MAX) is 2^31-1 bytes (2,147,483,647 bytes or 2GB - 1 bytes). The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. Since each character in a VARCHAR data type uses one byte, the maximum length for a VARCHAR(MAX) data type is 2,147,483,645.

Full Interesting read for you: http://www.sql-server-helper.com/faq/sql-server-2005-varchar-max-p01.aspx

Reference: http://msdn.microsoft.com/en-us/library/ms143432.aspx

sql varchar(max) vs varchar(fix)

MSDN

  • Use varchar when the sizes of the column data entries vary
    considerably.
  • Use varchar(max) when the sizes of the column data entries vary
    considerably, and the size might exceed 8,000 bytes.

When the the length is specified in declaring a VARCHAR variable or column, the maximum length allowed is 8000. If the length is greater than 8000, you have to use the MAX specifier as the length. If a length greater than 8000 is specified, the following error will be encountered (assuming that the length specified is 10000):

The size (10000) given to the type 'varchar' exceeds the maximum allowed for any data type (8000).

UPDATE :-
I found a link which I would like to share:-

Here

There is not much performance difference between Varchar[(n)] and Varchar(Max). Varchar[(n)] provides better performance results compared to Varchar(Max). If we know that data to be stored in the column or variable is less than or equal to 8000 characters, then using this Varchar[(n)] data type provides better performance compared to Varchar(Max).Example: When I ran the below script by changing the variable @FirstName type to Varchar(Max) then for 1 million assignments it is consistently taking double time than when we used data type as Varchar(50) for variable @FirstName.

DECLARE @FirstName VARCHAR(50), @COUNT INT=0, @StartTime DATETIME = GETDATE()
WHILE(@COUNT < 1000000)
BEGIN
SELECT @FirstName = 'Suraj', @COUNT = @COUNT +1
END
SELECT DATEDIFF(ms,@StartTime,GETDATE()) 'Time Taken in ms'
GO

is there an advantage to varchar(500) over varchar(8000)?

From a processing standpoint, it will not make a difference to use varchar(8000) vs varchar(500). It's more of a "good practice" kind of thing to define a maximum length that a field should hold and make your varchar that length. It's something that can be used to assist with data validation. For instance, making a state abbreviation be 2 characters or a postal/zip code as 5 or 9 characters. This used to be a more important distinction for when your data interacted with other systems or user interfaces where field length was critical (e.g. a mainframe flat file dataset), but nowadays I think it's more habit than anything else.

Difference between varchar(5) and varchar(5000)?

For one thing - you CANNOT index any column that's over 900 bytes in size.

That alone (to me) is a killer criteria against using varchar(5000) (or varchar(max)) all the time, even if not needed....

Use the appropriate size.

Also: varchar adds at least 2 bytes of overhead, so for strings of 10 characters or less, I would always recommend using char(n) instead.

char(5) will use 5 bytes of storage - always. varchar(5) uses between 2 and 7 bytes of storage.

difference between varchar(500) vs varchar(max) in sql server

In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.

To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.

To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.

Microsoft recommend using MAX data types instead of BLOBs in SQL Server 2005. In fact, BLOBs are being deprecated in future releases of SQL Server.

Credit: http://www.teratrax.com/articles/varchar_max.html


In SQL Server 2005 and SQL Server 2008, The maximum storage size for VARCHAR(MAX) is 2^31-1 bytes (2,147,483,647 bytes or 2GB - 1 bytes). The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. Since each character in a VARCHAR data type uses one byte, the maximum length for a VARCHAR(MAX) data type is 2,147,483,645.

Full Interesting read for you: http://www.sql-server-helper.com/faq/sql-server-2005-varchar-max-p01.aspx

Reference: http://msdn.microsoft.com/en-us/library/ms143432.aspx

Why not use varchar(max)?

My answer to this, isn't about the usage of Max, as much as it is about the reason for VARCHAR(max) vs TEXT.

In my book; first of all, Unless you can be absolutely certain that you'll never encode anything but english text and people won't refer to names of foreign locations, then you should use NVARCHAR or NTEXT.

Secondly, it's what the fields allow you to do.

TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.

On the other hand, VARCHAR(MAX) has some ambiguity, if the size of the cell is < 8000 chars, it will be treated as Row data. If it's greater, it will be treated as a LOB for storage purposes.
Because you can't know this without querying RBAR, this may have optimization strategies for places where you need to be sure about your data and how many reads it costs.

Otherwise, if your usage is relatively mundane and you don't expect to have problems with the size of data (IE you're using .Net and therefore don't have to be concerned about the size of your string/char* objects) then using VARCHAR(max) is fine.

SQL Server varchar(50) and varchar(128) performance difference

varchar(50) and varchar(128) will behave pretty much identical from every point of view. The storage size is identical for values under 50 characters. They can be joined interchangeably (varchar(50) joined with varchar(128)) w/o type convertion issues (ie. an index on varchar(50) can seek a column varchar(128) in a join) and same applies to WHERE predicates. Prior to SQL Server 2012 ncreasing the size of a varchar column is a very fast metadata-only operation, after SQL Server 2012 this operation may be a slow size-of-data-update-each-record operation under certain conditions, similar to those descirbed in Adding a nullable column can update the entire table.

Some issues can arrise from any column length change:

  • application issues from handling unexpected size values. Native ones may run into buffer size issues if improperly codded (ie. larger size can cause buffer overflow). Managed apps are unlikely to have serious issues, but minor issues like values not fitting on column widths on screen or on reports may occur.
  • T-SQL errors from truncating values on insert or update
  • T-SQL silent truncation occuring and resulting in incorrect values (Eg. @variables declared as varchar(50) in stored proc)
  • Limits like max row size or max index size may be reached. Eg. you have today a composite index on 8 columns of type varchar(50), extending to varchar(128) will exceed the max index size of 900 and trigger warnings.

Martin's warning about memory grants incresing is a very valid concern. I would just buy more RAM if that would indeed turn out to be an issue.

Best practices for SQL varchar column length

No DBMS I know of has any "optimization" that will make a VARCHAR with a 2^n length perform better than one with a max length that is not a power of 2.

I think early SQL Server versions actually treated a VARCHAR with length 255 differently than one with a higher maximum length. I don't know if this is still the case.

For almost all DBMS, the actual storage that is required is only determined by the number of characters you put into it, not the max length you define. So from a storage point of view (and most probably a performance one as well), it does not make any difference whether you declare a column as VARCHAR(100) or VARCHAR(500).

You should see the max length provided for a VARCHAR column as a kind of constraint (or business rule) rather than a technical/physical thing.

For PostgreSQL the best setup is to use text without a length restriction and a CHECK CONSTRAINT that limits the number of characters to whatever your business requires.

If that requirement changes, altering the check constraint is much faster than altering the table (because the table does not need to be re-written)

The same can be applied for Oracle and others - in Oracle it would be VARCHAR(4000) instead of text though.

I don't know if there is a physical storage difference between VARCHAR(max) and e.g. VARCHAR(500) in SQL Server. But apparently there is a performance impact when using varchar(max) as compared to varchar(8000).

See this link (posted by Erwin Brandstetter as a comment)

Edit 2013-09-22

Regarding bigown's comment:

In Postgres versions before 9.2 (which was not available when I wrote the initial answer) a change to the column definition did rewrite the whole table, see e.g. here. Since 9.2 this is no longer the case and a quick test confirmed that increasing the column size for a table with 1.2 million rows indeed only took 0.5 seconds.

For Oracle this seems to be true as well, judging by the time it takes to alter a big table's varchar column. But I could not find any reference for that.

For MySQL the manual says "In most cases, ALTER TABLE makes a temporary copy of the original table". And my own tests confirm that: running an ALTER TABLE on a table with 1.2 million rows (the same as in my test with Postgres) to increase the size of a column took 1.5 minutes. In MySQL however you can not use the "workaround" to use a check constraint to limit the number of characters in a column.

For SQL Server I could not find a clear statement on this but the execution time to increase the size of a varchar column (again the 1.2 million rows table from above) indicates that no rewrite takes place.

Edit 2017-01-24

Seems I was (at least partially) wrong about SQL Server. See this answer from Aaron Bertrand that shows that the declared length of a nvarchar or varchar columns makes a huge difference for the performance.



Related Topics



Leave a reply



Submit