difference between varchar(500) vs varchar(max) in sql server
In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per Unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.
To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.
To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an overflow page is used. SQL Server 2005 automatically assigns an overflow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as parameters. You can also use them inside string functions.
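As a rough illustration of the last two points, the sketch below declares a VARCHAR(MAX) variable and runs it through ordinary string functions. The variable name @doc and the sample text are made up for the example. Note the CAST inside REPLICATE: without it, the result would be truncated at 8,000 bytes.

```sql
-- Illustrative sketch; @doc and the sample text are hypothetical.
DECLARE @doc VARCHAR(MAX);

-- REPLICATE truncates at 8,000 bytes unless its input is already a MAX type.
SET @doc = REPLICATE(CAST('a' AS VARCHAR(MAX)), 10000) + 'needle';

SELECT
    LEN(@doc)                     AS total_chars,      -- 10006
    CHARINDEX('needle', @doc)     AS needle_position,  -- 10001
    SUBSTRING(@doc, 10001, 6)     AS tail,             -- needle
    LEN(REPLACE(@doc, 'a', 'bb')) AS replaced_chars;   -- 20006
```

None of this is possible with a TEXT column without going through READTEXT, WRITETEXT, or UPDATETEXT.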
Microsoft recommends using MAX data types instead of BLOBs in SQL Server 2005. In fact, BLOBs are being deprecated in future releases of SQL Server.
Credit: http://www.teratrax.com/articles/varchar_max.html
In SQL Server 2005 and SQL Server 2008, the maximum storage size for VARCHAR(MAX) is 2^31-1 bytes (2,147,483,647 bytes or 2 GB - 1 byte). The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. Since each character in a VARCHAR data type uses one byte, the maximum length for a VARCHAR(MAX) data type is 2,147,483,645.
Further reading: http://www.sql-server-helper.com/faq/sql-server-2005-varchar-max-p01.aspx
Reference: http://msdn.microsoft.com/en-us/library/ms143432.aspx
sql varchar(max) vs varchar(fix)
MSDN
- Use varchar when the sizes of the column data entries vary considerably.
- Use varchar(max) when the sizes of the column data entries vary considerably, and the size might exceed 8,000 bytes.
When a length is specified in declaring a VARCHAR variable or column, the maximum length allowed is 8000. If the length is greater than 8000, you have to use the MAX specifier as the length. If a length greater than 8000 is specified, the following error is raised (assuming that the length specified is 10000):
The size (10000) given to the type 'varchar' exceeds the maximum allowed for any data type (8000).
Update:
I found a link which I would like to share: here.
The performance difference between VARCHAR(n) and VARCHAR(MAX) is usually small, but VARCHAR(n) comes out ahead. If we know that the data to be stored in the column or variable is at most 8000 characters, then VARCHAR(n) gives better performance than VARCHAR(MAX). Example: when I ran the script below with the variable @FirstName declared as VARCHAR(MAX), 1 million assignments consistently took twice as long as when @FirstName was declared as VARCHAR(50).
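-- Time 1 million assignments; change VARCHAR(50) to VARCHAR(MAX) to compare.
DECLARE @FirstName VARCHAR(50), @COUNT INT = 0, @StartTime DATETIME = GETDATE()
WHILE (@COUNT < 1000000)
BEGIN
    -- Assign the string and advance the loop counter in one statement.
    SELECT @FirstName = 'Suraj', @COUNT = @COUNT + 1
END
-- Elapsed wall-clock time in milliseconds.
SELECT DATEDIFF(ms, @StartTime, GETDATE()) AS [Time Taken in ms]
GO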
is there an advantage to varchar(500) over varchar(8000)?
From a processing standpoint, it will not make a difference to use varchar(8000) vs varchar(500). It's more of a "good practice" kind of thing to define a maximum length that a field should hold and make your varchar that length. It's something that can be used to assist with data validation. For instance, making a state abbreviation be 2 characters or a postal/zip code as 5 or 9 characters. This used to be a more important distinction for when your data interacted with other systems or user interfaces where field length was critical (e.g. a mainframe flat file dataset), but nowadays I think it's more habit than anything else.
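To make the data validation point concrete, here is a minimal sketch. The table dbo.AddressDemo and its columns are hypothetical, and it assumes the default ANSI_WARNINGS ON setting, under which an oversized value is rejected instead of being cut off.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE dbo.AddressDemo (
    state_code  CHAR(2)    NOT NULL,  -- two-letter state abbreviation
    postal_code VARCHAR(9) NOT NULL   -- 5- or 9-character ZIP code
);

-- Fits the declared lengths, so it succeeds.
INSERT INTO dbo.AddressDemo (state_code, postal_code) VALUES ('CA', '94105');

BEGIN TRY
    -- 'California' does not fit in CHAR(2), so the insert is rejected.
    INSERT INTO dbo.AddressDemo (state_code, postal_code) VALUES ('California', '94105');
END TRY
BEGIN CATCH
    -- Reports the "String or binary data would be truncated" error.
    SELECT ERROR_NUMBER() AS error_number, ERROR_MESSAGE() AS error_message;
END CATCH;

DROP TABLE dbo.AddressDemo;
```

With both columns declared as varchar(8000), the second insert would go through and the bad value would have to be caught somewhere else.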
Difference between varchar(5) and varchar(5000)?
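For one thing, you CANNOT index any column that's over 900 bytes in size (this is the index key limit; SQL Server 2016 and later allow up to 1,700 bytes for nonclustered indexes).
That alone (to me) is a killer criterion against using varchar(5000) (or varchar(max)) all the time, even if not needed.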
Use the appropriate size.
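A sketch of how the index limit shows up in practice, using a hypothetical table dbo.IndexDemo. The comments describe the behaviour I would expect; the exact warning text and limit depend on the SQL Server version.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE dbo.IndexDemo (
    id        INT IDENTITY(1,1) PRIMARY KEY,
    short_col VARCHAR(500),
    long_col  VARCHAR(5000),
    max_col   VARCHAR(MAX)
);

-- Fits within the key limit: created without complaint.
CREATE INDEX IX_IndexDemo_short ON dbo.IndexDemo (short_col);

-- Created, but with a warning that the maximum key length is exceeded;
-- inserting a value longer than the limit will then fail.
CREATE INDEX IX_IndexDemo_long ON dbo.IndexDemo (long_col);

-- Rejected outright: a MAX column cannot be an index key column.
-- CREATE INDEX IX_IndexDemo_max ON dbo.IndexDemo (max_col);

DROP TABLE dbo.IndexDemo;
```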
Also: varchar adds at least 2 bytes of overhead, so for strings of 10 characters or less, I would always recommend using char(n) instead.
char(5) will use 5 bytes of storage - always. varchar(5) uses between 2 and 7 bytes of storage.
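The padding difference is easy to see with DATALENGTH. This is only a sketch with made-up variable names; note that DATALENGTH reports the data bytes only, so the 2-byte length overhead a varchar column carries inside a row does not show up here.

```sql
-- Illustrative variables; the names are hypothetical.
DECLARE @fixed CHAR(5), @variable VARCHAR(5);
SET @fixed = 'ab';
SET @variable = 'ab';

SELECT
    DATALENGTH(@fixed)    AS char_bytes,     -- 5: padded with trailing spaces
    DATALENGTH(@variable) AS varchar_bytes;  -- 2: only the characters stored
```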
Why not use varchar(max)?
My answer to this isn't about the usage of MAX as much as it is about the reason for VARCHAR(MAX) vs TEXT.
In my book, first of all, unless you can be absolutely certain that you'll never encode anything but English text and people won't refer to names of foreign locations, you should use NVARCHAR or NTEXT.
Secondly, it's what the fields allow you to do.
TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.
On the other hand, VARCHAR(MAX) has some ambiguity: if the size of the cell is < 8000 chars, it will be treated as row data. If it's greater, it will be treated as a LOB for storage purposes.
Because you can't know which case applies without querying row by row (RBAR), this can affect optimization strategies in places where you need to be sure about your data and how many reads it costs.
Otherwise, if your usage is relatively mundane and you don't expect to have problems with the size of data (i.e. you're using .NET and therefore don't have to be concerned about the size of your string/char* objects), then using VARCHAR(MAX) is fine.
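One way to get a table-level view of where the data ends up is to look at the table's allocation units. The sketch below uses a hypothetical table dbo.NotesDemo and the sys.partitions and sys.allocation_units catalog views. It does not tell you which individual rows went off-row, only how many pages each storage category holds.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.NotesDemo (
    id   INT IDENTITY(1,1) PRIMARY KEY,
    body VARCHAR(MAX)
);

-- One value small enough to stay in the row, one far too large for a page.
INSERT INTO dbo.NotesDemo (body) VALUES ('short value');
INSERT INTO dbo.NotesDemo (body) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 20000));

-- One row per allocation unit: IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA.
SELECT au.type_desc, au.total_pages, au.used_pages
FROM sys.partitions AS p
JOIN sys.allocation_units AS au
    ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
    OR (au.type = 2 AND au.container_id = p.partition_id)
WHERE p.object_id = OBJECT_ID(N'dbo.NotesDemo');

DROP TABLE dbo.NotesDemo;
```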
SQL Server varchar(50) and varchar(128) performance difference
varchar(50) and varchar(128) will behave pretty much identically from every point of view. The storage size is identical for values under 50 characters. They can be joined interchangeably (varchar(50) joined with varchar(128)) without type conversion issues (i.e. an index on varchar(50) can seek a varchar(128) column in a join), and the same applies to WHERE predicates. Prior to SQL Server 2012, increasing the size of a varchar column is a very fast metadata-only operation; from SQL Server 2012 on, this operation may be a slow size-of-data operation that updates each record under certain conditions, similar to those described in "Adding a nullable column can update the entire table".
Some issues can arise from any column length change:
- Application issues from handling unexpected size values. Native applications may run into buffer size issues if improperly coded (i.e. a larger size can cause a buffer overflow). Managed apps are unlikely to have serious issues, but minor issues like values not fitting column widths on screen or in reports may occur.
- T-SQL errors from truncating values on insert or update.
- T-SQL silent truncation occurring and resulting in incorrect values (e.g. @variables declared as varchar(50) in a stored proc).
- Limits like max row size or max index size may be reached. E.g. if you have today a composite index on 8 columns of type varchar(50), extending them to varchar(128) will exceed the max index size of 900 bytes and trigger warnings.
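Two of the items in the list above can be checked quickly. The sketch below uses a made-up variable name: it shows that assigning to a too-short variable truncates without any error, and it works out the composite index arithmetic.

```sql
-- Silent truncation: assigning to a too-short variable raises no error.
DECLARE @name VARCHAR(5);
SET @name = 'abcdefghij';
SELECT @name AS stored_value, LEN(@name) AS stored_length;  -- abcde, 5

-- Maximum key size of a composite index on 8 varchar columns.
SELECT 8 * 50  AS key_bytes_varchar50,   -- 400: within the 900-byte limit
       8 * 128 AS key_bytes_varchar128;  -- 1024: over the 900-byte limit
```

The silent case is the dangerous one, because nothing tells you that a widened column is still being read through a narrow variable or parameter.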
Martin's warning about memory grants increasing is a very valid concern. I would just buy more RAM if that would indeed turn out to be an issue.
Best practices for SQL varchar column length
No DBMS I know of has any "optimization" that will make a VARCHAR with a 2^n length perform better than one with a max length that is not a power of 2.
I think early SQL Server versions actually treated a VARCHAR with length 255 differently than one with a higher maximum length. I don't know if this is still the case.
For almost all DBMS, the actual storage that is required is only determined by the number of characters you put into it, not the max length you define. So from a storage point of view (and most probably a performance one as well), it does not make any difference whether you declare a column as VARCHAR(100) or VARCHAR(500).
You should see the max length provided for a VARCHAR column as a kind of constraint (or business rule) rather than a technical/physical thing.
For PostgreSQL the best setup is to use text without a length restriction and a CHECK CONSTRAINT that limits the number of characters to whatever your business requires.
If that requirement changes, altering the check constraint is much faster than altering the table (because the table does not need to be re-written).
The same can be applied for Oracle and others - in Oracle it would be VARCHAR(4000) instead of text though.
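A minimal sketch of the PostgreSQL setup described above. The table name customer, the constraint name customer_name_length and the limits 100 and 200 are all hypothetical.

```sql
-- PostgreSQL; table and constraint names are hypothetical.
CREATE TABLE customer (
    id   serial PRIMARY KEY,
    name text NOT NULL,
    CONSTRAINT customer_name_length CHECK (char_length(name) <= 100)
);

-- When the business rule changes, swap the constraint instead of the column type.
BEGIN;
ALTER TABLE customer DROP CONSTRAINT customer_name_length;
ALTER TABLE customer
    ADD CONSTRAINT customer_name_length CHECK (char_length(name) <= 200);
COMMIT;
```

Adding the new constraint still scans the existing rows to validate them, but it does not rewrite the table.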
I don't know if there is a physical storage difference between VARCHAR(max) and e.g. VARCHAR(500) in SQL Server. But apparently there is a performance impact when using varchar(max) as compared to varchar(8000).
See this link (posted by Erwin Brandstetter as a comment)
Edit 2013-09-22
Regarding bigown's comment:
In Postgres versions before 9.2 (which was not available when I wrote the initial answer) a change to the column definition did rewrite the whole table, see e.g. here. Since 9.2 this is no longer the case and a quick test confirmed that increasing the column size for a table with 1.2 million rows indeed only took 0.5 seconds.
For Oracle this seems to be true as well, judging by the time it takes to alter a big table's varchar column. But I could not find any reference for that.
For MySQL the manual says "In most cases, ALTER TABLE makes a temporary copy of the original table". And my own tests confirm that: running an ALTER TABLE on a table with 1.2 million rows (the same as in my test with Postgres) to increase the size of a column took 1.5 minutes. In MySQL, however, you cannot use the "workaround" of a check constraint to limit the number of characters in a column.
For SQL Server I could not find a clear statement on this but the execution time to increase the size of a varchar column (again the 1.2 million rows table from above) indicates that no rewrite takes place.
Edit 2017-01-24
Seems I was (at least partially) wrong about SQL Server. See this answer from Aaron Bertrand that shows that the declared length of nvarchar or varchar columns makes a huge difference for the performance.