Impact of Defining a VARCHAR2 Column with a Greater Length

Impact of defining a VARCHAR2 column with a greater length

The answer depends on whether you're talking about a column in a database table, or a variable in a PL/SQL program.

Database column

The amount of storage used is proportionate to the size of the data stored.

PL/SQL variable

If the variable is declared with a size from 1 to 4000 (11g and later) or from 1 to 1999 (10g and earlier), memory is allocated for the maximum declared length (i.e. a VARCHAR2(100) will require at least 100 bytes of memory, no matter how short its actual value is).

If the variable is declared with a size of 4001 or greater (11g and later) or 2000 or greater (10g and earlier), memory is allocated according to the size of the data actually stored. (An interesting side question: if the variable's value is changed, how is the memory resized? Does it reallocate another buffer with the new size?)

Reference for 10g: PL/SQL Datatypes

Small VARCHAR2 variables are optimized for performance, and larger ones are optimized for efficient memory use. The cutoff point is 2000 bytes. For a VARCHAR2 that is 2000 bytes or longer, PL/SQL dynamically allocates only enough memory to hold the actual value. For a VARCHAR2 variable that is shorter than 2000 bytes, PL/SQL preallocates the full declared length of the variable. For example, if you assign the same 500-byte value to a VARCHAR2(2000 BYTE) variable and to a VARCHAR2(1999 BYTE) variable, the former takes up 500 bytes and the latter takes up 1999 bytes.

Reference for 11g: Avoiding Memory Overhead in PL/SQL Code

Specify a size of more than 4000 characters for the VARCHAR2 variable; PL/SQL waits until you assign the variable, then only allocates as much storage as needed
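
A minimal PL/SQL sketch of the two regimes on 11g and later (the variable names are illustrative; the comments restate the documented behaviour):

DECLARE
  -- declared size <= 4000: the full declared length is preallocated
  l_small VARCHAR2(4000);   -- ~4000 bytes reserved as soon as the block runs
  -- declared size > 4000: storage is allocated lazily, per assignment
  l_large VARCHAR2(32767);  -- nothing allocated until a value is assigned
BEGIN
  l_small := RPAD('x', 500, 'x');  -- still occupies the full 4000 bytes
  l_large := RPAD('x', 500, 'x');  -- occupies roughly 500 bytes
END;
/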

Impact of editing the field length of a VARCHAR2 column in Oracle

You can check the DBA_DEPENDENCIES view as well:

DBA_DEPENDENCIES describes all dependencies in the database between procedures, packages, functions, package bodies, and triggers, including dependencies on views created without any database links.

Although it doesn't say so there, oddly, it includes package and trigger dependencies on tables, so if you query that view with referenced_name = '<your table>' you'll see everything that would be invalidated by the table change. Some of those won't matter: if a procedure has an argument declared with a %TYPE dependency, it will be invalidated but will work the same after recompilation.
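
For example, a query along these lines lists the dependent objects (MY_TABLE is a placeholder for your table name):

SELECT owner, name, type
  FROM dba_dependencies
 WHERE referenced_type = 'TABLE'
   AND referenced_name = 'MY_TABLE'  -- placeholder: your table name
 ORDER BY type, name;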

But you could have your own variables defined as varchar2(50) instead of table.column%type, and those will be harder to find. The objects listed in that view will be a good starting point, but you might still need to check the source for them manually to look for issues like that, rather than waiting for a test case to fail.

Best practices for SQL varchar column length

No DBMS I know of has any "optimization" that will make a VARCHAR with a 2^n length perform better than one with a max length that is not a power of 2.

I think early SQL Server versions actually treated a VARCHAR with length 255 differently than one with a higher maximum length. I don't know if this is still the case.

For almost all DBMS, the actual storage that is required is only determined by the number of characters you put into it, not the max length you define. So from a storage point of view (and most probably a performance one as well), it does not make any difference whether you declare a column as VARCHAR(100) or VARCHAR(500).

You should see the max length provided for a VARCHAR column as a kind of constraint (or business rule) rather than a technical/physical thing.

For PostgreSQL the best setup is to use text without a length restriction and a CHECK constraint that limits the number of characters to whatever your business requires.

If that requirement changes, altering the check constraint is much faster than altering the table (because the table does not need to be rewritten).
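
A minimal sketch of that setup (table, column and constraint names are hypothetical):

CREATE TABLE customers (
    name text CONSTRAINT customers_name_len CHECK (length(name) <= 100)
);

-- when the business rule changes, only the constraint is touched;
-- the table data is not rewritten
ALTER TABLE customers DROP CONSTRAINT customers_name_len;
ALTER TABLE customers ADD CONSTRAINT customers_name_len CHECK (length(name) <= 250);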

The same approach can be applied for Oracle and others; in Oracle it would be VARCHAR2(4000) instead of text, though.

I don't know if there is a physical storage difference between VARCHAR(max) and e.g. VARCHAR(500) in SQL Server. But apparently there is a performance impact when using varchar(max) as compared to varchar(8000).

See this link (posted by Erwin Brandstetter as a comment)

Edit 2013-09-22

Regarding bigown's comment:

In Postgres versions before 9.2 (which was not available when I wrote the initial answer) a change to the column definition did rewrite the whole table (see e.g. here). Since 9.2 this is no longer the case, and a quick test confirmed that increasing the column size for a table with 1.2 million rows indeed took only 0.5 seconds.
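
A sketch of such a change, with hypothetical table and column names:

-- Postgres 9.2+: increasing a varchar length is a metadata-only change,
-- so the table is not rewritten
ALTER TABLE big_table ALTER COLUMN some_col TYPE varchar(500);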

For Oracle this seems to be true as well, judging by the time it takes to alter a big table's varchar column. But I could not find any reference for that.

For MySQL the manual says "In most cases, ALTER TABLE makes a temporary copy of the original table". And my own tests confirm that: running an ALTER TABLE on a table with 1.2 million rows (the same as in my test with Postgres) to increase the size of a column took 1.5 minutes. In MySQL, however, you cannot use the "workaround" of a check constraint to limit the number of characters in a column.
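
The equivalent MySQL statement, again with hypothetical names:

-- MySQL: at the time of the test this made a temporary copy of the table
ALTER TABLE big_table MODIFY some_col VARCHAR(500);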

For SQL Server I could not find a clear statement on this but the execution time to increase the size of a varchar column (again the 1.2 million rows table from above) indicates that no rewrite takes place.

Edit 2017-01-24

Seems I was (at least partially) wrong about SQL Server. See this answer from Aaron Bertrand that shows that the declared length of an nvarchar or varchar column makes a huge difference to performance.

Why does Oracle varchar2 have a mandatory size as a definition parameter?

It's the same to define a type of varchar2(10) or varchar2(1000).

No, it is not the same thing at all.

  1. The length of the column is useful metadata for developers building screens.
  2. Similarly, automatic query tools like TOAD and SQL Developer use the length of the column when they render results.
  3. The database uses the length of a variable when allocating memory for PL/SQL collections. As that memory comes out of the PGA, supersizing the variable declaration can lead to programs failing because the server has run out of memory.
  4. There are similar issues with the declaration of single variables in PL/SQL programs, it is just that collections tend to multiply the problem.
  5. Supersized columns create problems for compound indexes, as the following example on a database with 8K blocks shows:


SQL> create table t23 (col1 varchar2(4000), col2 varchar2(4000))
  2  /

Table created.

SQL> create index t23_i on t23(col1,col2)
  2  /
create index t23_i on t23(col1,col2)
*
ERROR at line 1:
ORA-01450: maximum key length (6398) exceeded

SQL>

But above all else, column sizes are a form of error checking. If the column is supposed to be ten characters long and some automated process is trying to load a thousand characters, then something is wrong. The process ought to fail, so we can investigate why we are loading duff data. The alternative is a database full of garbage, and if that is what was wanted we should just have given everybody Excel and been done with it.
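
A sketch of that error checking in action (the table is hypothetical):

CREATE TABLE t10 (code VARCHAR2(10));

-- fails with ORA-12899: value too large for column,
-- instead of silently storing duff data
INSERT INTO t10 VALUES (RPAD('x', 1000, 'x'));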

It is true that changing the column size when it turns out we have underestimated can be tiresome. But it doesn't happen very often, and we can mitigate a lot of the pain by using %TYPE and SUBTYPE declarations in our PL/SQL instead of hard-coding variable lengths.
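
A sketch of those anchored declarations (t10.code refers to the hypothetical table above):

DECLARE
  -- a named subtype, declared once (e.g. in a package spec) and reused
  -- instead of hard-coding VARCHAR2(10) everywhere
  SUBTYPE code_t IS VARCHAR2(10);
  l_code2 code_t;

  -- anchored to the column: survives a future ALTER TABLE unchanged
  l_code t10.code%TYPE;
BEGIN
  NULL;  -- declarations only; nothing to execute
END;
/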


"why no such declaration in NUMBER type"

Numbers are different. For a start, the maximum size of a number is much smaller than the text equivalent (38 digits of guaranteed precision).

But the key difference is that Oracle stores numeric values in scientific notation, so there is no straightforward relationship between the arithmetical size of the number and the storage space it consumes.

SQL> select vsize(123456789012345678901) n1
  2       , vsize(999999999999999999999999999999) n2
  3       , vsize(0.000000000000000000001) n3
  4       , vsize(1000000000000000000000000) n4
  5  from dual
  6  /

        N1         N2         N3         N4
---------- ---------- ---------- ----------
        12         16          2          2

SQL>

Nevertheless, it remains good practice to specify precision and scale wherever possible, especially when we are dealing with, say, integers or money.
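
For example (a hedged sketch; the table and the rules are illustrative):

CREATE TABLE payments (
    qty    NUMBER(10,0),  -- whole numbers only
    amount NUMBER(12,2)   -- money: two decimal places
);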

Does VARCHAR size limit matter?

In general, for a VARCHAR field, the amount of data stored in each field determines its footprint on the disk, rather than the maximum size (unlike a CHAR field, which always has the same footprint).

There is an upper limit of 900 bytes on the total size of the data stored within all the fields of an index key (SQL Server's 900-byte index key size limit).
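
A sketch of how that limit bites (T-SQL; table and index names are hypothetical):

CREATE TABLE t25 (col1 varchar(500), col2 varchar(500));

-- creating the index succeeds with a warning that the key can reach
-- 1000 bytes; rows whose combined key exceeds 900 bytes will then
-- fail to insert or update
CREATE INDEX t25_i ON t25 (col1, col2);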

The larger you make the field, the more likely people will try to use it for purposes other than what you intended, and the greater the screen real estate required to show the value, so it's good practice to try to pick the right size, rather than assuming that making it as large as possible will save you having to revisit the design.

Importance of varchar length in MySQL table

No, in the sense that if the values you're storing in that column are always (say) less than 50 characters, declaring the column as varchar(50) or varchar(200) has the same performance.


