Is There an Advantage to Varchar(500) Over Varchar(8000)

Is there an advantage to varchar(500) over varchar(8000)?

From a processing standpoint, it will not make a difference to use varchar(8000) vs varchar(500). It's more of a "good practice" kind of thing to define the maximum length a field should hold and make your varchar that length. It's something that can be used to assist with data validation, for instance making a state abbreviation 2 characters or a postal/zip code 5 or 9 characters. This used to matter more when your data interacted with other systems or user interfaces where field length was critical (e.g. a mainframe flat file dataset), but nowadays I think it's more habit than anything else.
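As a hypothetical sketch of that idea (table and constraint names invented here), the column lengths themselves plus a CHECK constraint can do the first round of validation:

create table dbo.CustomerAddress
(
    StateCode  char(2)    not null,  -- state abbreviation: exactly 2 characters
    PostalCode varchar(9) not null,  -- 5-digit ZIP or 9-digit ZIP+4
    constraint CK_CustomerAddress_Zip check (len(PostalCode) in (5, 9))
)

insert into dbo.CustomerAddress values ('TX', '75001')   -- succeeds
insert into dbo.CustomerAddress values ('TX', '7500')    -- fails the CHECK constraint
insert into dbo.CustomerAddress values ('TEX', '75001')  -- fails: string would be truncated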

difference between varchar(500) vs varchar(max) in sql server

In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters, and an NVARCHAR column can store up to 4000 characters (two bytes per Unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to store data on disk.

To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.

To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB), and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an overflow page is used. SQL Server 2005 automatically assigns an overflow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as parameters. You can also use them inside string functions.
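As a quick sketch (variable name invented), a MAX variable works directly with the string functions that TEXT could not use:

declare @doc varchar(max)
-- build a ~13,500 character string; the cast to varchar(max) matters,
-- otherwise REPLICATE caps its result at 8000 characters
set @doc = replicate(cast('SQL Server 2005 MAX types. ' as varchar(max)), 500)

select len(@doc) as TotalLength,            -- well past the old 8000 limit
       charindex('MAX', @doc) as FirstMatch -- string functions work on MAX types

set @doc = replace(@doc, '2005', '2008')    -- so does REPLACE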

Microsoft recommends using MAX data types instead of BLOBs in SQL Server 2005. In fact, the TEXT, NTEXT, and IMAGE types are deprecated and will be removed in a future release of SQL Server.

Credit: http://www.teratrax.com/articles/varchar_max.html


In SQL Server 2005 and SQL Server 2008, the maximum storage size for VARCHAR(MAX) is 2^31-1 bytes (2,147,483,647 bytes, or 2 GB minus 1 byte). The storage size is the actual length of the data entered plus 2 bytes, and the data entered can be 0 characters in length. Since each character in a VARCHAR data type uses one byte, the maximum length of a VARCHAR(MAX) value is 2,147,483,645 characters.
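A small sketch of lengths past the old 8000-character ceiling; note the well-known gotcha that REPLICATE only grows past 8000 characters when its input is already a MAX type:

select len(replicate('x', 100000))                        -- returns 8000: result silently capped
select len(replicate(cast('x' as varchar(max)), 100000))  -- returns 100000: can grow toward 2 GB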

A full, interesting read for you: http://www.sql-server-helper.com/faq/sql-server-2005-varchar-max-p01.aspx

Reference: http://msdn.microsoft.com/en-us/library/ms143432.aspx

Difference between varchar(5) and varchar(5000)?

For one thing, you CANNOT index any column that's over 900 bytes in size.

That alone (to me) is a killer criterion against using varchar(5000) (or varchar(max)) all the time, even when it's not needed.
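A quick sketch of that limit (table and index names invented; from SQL Server 2016 the cap for nonclustered index keys is 1700 bytes, but the principle is the same):

create table dbo.WideKeyTest (val varchar(5000))

-- allowed, but raises a warning that the maximum key length is 900 bytes
create index IX_WideKeyTest_val on dbo.WideKeyTest (val)

insert into dbo.WideKeyTest values (replicate('x', 901))  -- fails at runtime: key too long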

Use the appropriate size.

Also: varchar adds at least 2 bytes of overhead, so for strings of 10 characters or less, I would always recommend using char(n) instead.

char(5) will use 5 bytes of storage - always. varchar(5) uses between 2 and 7 bytes of storage.
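A minimal illustration (DATALENGTH shows the stored data bytes, though not the 2-byte length prefix varchar keeps in the row):

declare @c char(5), @v varchar(5)
set @c = 'ab'
set @v = 'ab'

select datalength(@c) as CharBytes,    -- 5: char always pads to the declared length
       datalength(@v) as VarcharBytes  -- 2: varchar stores only the actual data
                                       --    (plus 2 bytes of overhead not shown here)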

Does size of a VARCHAR column matter when used in queries

Yes, the declared length of a varchar column affects the query's estimates: the memory allocated for internal operations (for example, sorting) is derived from the declared size, and CPU usage follows as a consequence. You can reproduce this with the following simple example.

1. Create two tables:

create table varLenTest1
(
a varchar(100)
)

create table varLenTest2
(
a varchar(8000)
)

2. Fill both of them with some data:

declare @i int
set @i = 20000

while (@i > 0)
begin
insert into varLenTest1 (a) values (cast(NEWID() as varchar(36)))
insert into varLenTest2 (a) values (cast(NEWID() as varchar(36)))
set @i = @i - 1
end

3. Execute the following queries with "include actual execution plan":

select a from varLenTest1 order by a OPTION (MAXDOP 1);
select a from varLenTest2 order by a OPTION (MAXDOP 1);

If you inspect the execution plans of these queries, you can see that the estimated I/O cost and estimated CPU cost are very different.

Why not use varchar(max)?

My answer to this isn't about the usage of MAX so much as it is about the reason for VARCHAR(MAX) vs TEXT.

In my book, first of all: unless you can be absolutely certain that you'll never store anything but English text and that people won't refer to names of foreign locations, you should use NVARCHAR or NTEXT.

Secondly, it's what the fields allow you to do.

TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.

On the other hand, VARCHAR(MAX) has some ambiguity: if the value is under 8000 characters, it is stored as in-row data; if it's larger, it is stored as a LOB. Because you can't know which without inspecting each row (RBAR), this matters for optimization in places where you need to be sure about your data and how many reads it costs.
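If you do need to see where the values actually landed, one option (a sketch; the table name is invented) is the allocation-unit breakdown, which separates in-row pages from LOB pages:

select alloc_unit_type_desc, page_count  -- IN_ROW_DATA vs LOB_DATA
from sys.dm_db_index_physical_stats(
         db_id(), object_id('dbo.Documents'), null, null, 'DETAILED')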

Otherwise, if your usage is relatively mundane and you don't expect problems with the size of the data (i.e. you're using .NET and therefore don't have to be concerned about the size of your string/char* objects), then using VARCHAR(MAX) is fine.

SQL Server varchar(50) and varchar(128) performance difference

varchar(50) and varchar(128) behave pretty much identically from every point of view. The storage size is identical for values under 50 characters. They can be joined interchangeably (varchar(50) joined with varchar(128)) without type conversion issues (i.e. an index on a varchar(50) column can seek against a varchar(128) column in a join), and the same applies to WHERE predicates. Prior to SQL Server 2012, increasing the size of a varchar column was a very fast metadata-only operation; from SQL Server 2012 on, this operation may be a slow size-of-data update of each record under certain conditions, similar to those described in Adding a nullable column can update the entire table.
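The widening itself is a one-liner (a sketch; table and column names invented). Note that ALTER COLUMN resets nullability to NULL unless you restate NOT NULL:

alter table dbo.Customer alter column LastName varchar(128) not null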

Some issues can arise from any column length change:

  • application issues from handling unexpected size values. Native apps may run into buffer size issues if improperly coded (i.e. a larger size can cause a buffer overflow). Managed apps are unlikely to have serious issues, but minor issues like values not fitting in column widths on screen or in reports may occur.
  • T-SQL errors from truncating values on insert or update
  • T-SQL silent truncation occurring and resulting in incorrect values (e.g. @variables declared as varchar(50) in a stored procedure; reproduced in the sketch after this list)
  • Limits like max row size or max index size may be reached. E.g. if today you have a composite index on 8 columns of type varchar(50), extending them to varchar(128) will exceed the 900-byte maximum index key size and trigger warnings.
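The silent-truncation item is easy to reproduce (a sketch):

declare @v varchar(50)
set @v = replicate('x', 60)  -- no error: the value is silently cut to 50 characters
select len(@v)               -- returns 50

An INSERT of the same over-long value into a varchar(50) column would instead raise the "String or binary data would be truncated" error from the previous bullet.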

Martin's warning about memory grants increasing is a very valid concern. I would just buy more RAM if that would indeed turn out to be an issue.


