What Are the Use Cases for Selecting Char Over Varchar in SQL

What are the use cases for selecting CHAR over VARCHAR in SQL?

The general rule is to pick CHAR if all rows will have close to the same length. Pick VARCHAR (or NVARCHAR) when the length varies significantly. CHAR may also be a bit faster because all the rows are of the same length.

It varies by DB implementation, but generally, VARCHAR (or NVARCHAR) uses one or two more bytes of storage (for length or termination) in addition to the actual data. So, assuming you are using a one-byte character set, storing the word "FooBar" takes:

  • CHAR(6) = 6 bytes (no overhead)
  • VARCHAR(100) = 8 bytes (2 bytes of overhead)
  • CHAR(10) = 10 bytes (4 bytes of waste)
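The arithmetic in the list above can be reproduced with a small sketch. The function names are hypothetical, the 2-byte overhead is the SQL Server figure quoted in this answer, and a one-byte character set is assumed throughout; real engines may differ.

```python
def char_bytes(declared_len: int) -> int:
    """CHAR(n) occupies n bytes in a one-byte character set, whatever is stored."""
    return declared_len


def varchar_bytes(value: str, overhead: int = 2) -> int:
    """VARCHAR occupies the actual data plus a small length overhead.

    overhead=2 is an assumption taken from the SQL Server figure in the text.
    """
    return len(value.encode("latin-1")) + overhead


word = "FooBar"
print("CHAR(6):     ", char_bytes(6))
print("VARCHAR(100):", varchar_bytes(word))
print("CHAR(10):    ", char_bytes(10), "of which wasted:", char_bytes(10) - len(word))
```

Under these assumptions, CHAR only wins when the declared length is within the overhead (here two bytes) of the actual data.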

The bottom line is that CHAR can be faster and more space-efficient when the values are all about the same length (within roughly two characters of each other).

Note: Microsoft SQL Server has 2 bytes of overhead for a VARCHAR. This may vary from DB to DB, but generally at least 1 byte of overhead is needed to indicate the length or the end of the string in a VARCHAR.

As Gaven pointed out in the comments, things change when it comes to multi-byte character sets; that is a case where VARCHAR becomes a much better choice.
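To see why multi-byte character sets change the picture, here is a rough sketch using UTF-8. The helper names are hypothetical, and the "reserve the maximum width per character" model is an assumption about how a fixed-length column might behave; not every engine stores CHAR this way.

```python
def utf8_len(value: str) -> int:
    """Number of bytes the value occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def fixed_reservation(declared_chars: int, max_bytes_per_char: int = 4) -> int:
    """Worst-case space for a fixed-length column that reserves the maximum
    width for every character (an assumption, engine-dependent)."""
    return declared_chars * max_bytes_per_char


for text in ("FooBar", "日本語"):
    print(repr(text), "chars:", len(text), "UTF-8 bytes:", utf8_len(text))

print("worst-case reservation for 10 characters:", fixed_reservation(10))
```

The byte length now depends on the characters themselves, so a fixed reservation sized for the worst case can waste far more than the one or two bytes of VARCHAR overhead.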

A note about the declared length of a VARCHAR: because it stores the length of the actual content, you don't waste the unused length. So storing 6 characters in VARCHAR(6), VARCHAR(100), or VARCHAR(MAX) uses the same amount of storage. Read more about the differences when using VARCHAR(MAX). You declare a maximum size in VARCHAR to limit how much is stored.

In the comments AlwaysLearning pointed out that the Microsoft Transact-SQL docs seem to say the opposite. I would suggest that is an error or at least the docs are unclear.

Any benefit of using CHAR over VARCHAR?


  • VARCHAR

VARCHAR stores variable-length character strings. It can require less storage than fixed-length types because it uses only as much space as it needs.

VARCHAR also uses 1 or 2 extra bytes to record the value's length. For example, VARCHAR(10) will use up to 11 bytes of storage space. VARCHAR helps performance because it saves space. However, because the rows are variable-length, they can grow when you update them, which can cause extra work. If a row grows and no longer fits in its original location, the behavior is storage engine-dependent...
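The "1 or 2 extra bytes" rule can be sketched as follows. This assumes MySQL's documented behavior (one length byte when the column can hold at most 255 bytes, two otherwise) and a single-byte character set; the function names are made up for illustration.

```python
def length_prefix_bytes(max_column_bytes: int) -> int:
    """1 length byte if the column can never exceed 255 bytes, otherwise 2."""
    return 1 if max_column_bytes <= 255 else 2


def varchar_storage(value: str, declared_chars: int) -> int:
    """Bytes used by one VARCHAR value, assuming one byte per character."""
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared maximum")
    return len(value.encode("latin-1")) + length_prefix_bytes(declared_chars)


print(varchar_storage("0123456789", 10))  # a full VARCHAR(10)
print(varchar_storage("abc", 10))         # a short value in VARCHAR(10)
print(varchar_storage("abc", 1000))       # the same value in VARCHAR(1000)
```

Only the length prefix depends on the declared maximum; the data portion depends on what is actually stored.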

  • CHAR

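CHAR is fixed-length: MySQL always allocates enough space for the specified number of characters. Stored CHAR values are right-padded with spaces to the declared length, and MySQL strips trailing spaces when the value is retrieved, so trailing spaces are not preserved. Values are padded with spaces as needed for comparisons.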

CHAR is useful if you want to store very short strings, or if all the values are nearly the same length. For example, CHAR is a good choice for MD5 values for user passwords, which are always the same length.
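As a quick check of the "always the same length" claim, this sketch hashes inputs of different sizes with Python's standard `hashlib` and prints the length of each hex digest. The helper name `md5_hex` is just for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 digest of the UTF-8 bytes of text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


for text in ("a", "a much longer input string than the first one"):
    digest = md5_hex(text)
    print(len(digest), digest)
```

Every digest is 32 hexadecimal characters, which is why a column such as CHAR(32) fits this data with no padding and no length prefix.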

CHAR is also better than VARCHAR for data that's changed frequently, because a fixed-length row is not prone to fragmentation.

Why should I use char instead of varchar?

Prefer VARCHAR.

In olden days of tight storage, it mattered for space. Nowadays, disk storage is cheap, but RAM and IO are still precious. VARCHAR is IO and cache friendly; it allows you to more densely pack the db buffer cache with data rather than wasted literal "space" space, and for the same reason, space padding imposes an IO overhead.

The upside to CHAR() used to be reduced row chaining on frequently updated records. When you update a field and the value is larger than previously allocated, the record may chain. This is manageable, however; databases often support a "percent free" setting on your table storage attributes that tells the DB how much extra space to preallocate per row for growth.

VARCHAR is almost always preferable because space padding requires you to be aware of it and code differently. Different databases handle it differently. With VARCHAR you know your field holds only exactly what you store in it.

I haven't designed a schema in over a decade with CHAR.

What's the difference between VARCHAR and CHAR?

VARCHAR is variable-length.

CHAR is fixed length.

If your content is a fixed size, you'll get better performance with CHAR.

See the MySQL page on CHAR and VARCHAR Types for a detailed explanation (be sure to also read the comments).

CHAR vs. VARCHAR and the ramifications when joining

Trailing spaces are ignored in string comparisons in SQL Server. There is no need to RTRIM the value yourself (which would make the condition unsargable).
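The comparison rule can be modeled roughly like this: the shorter operand is padded with spaces to the length of the longer one before the two are compared. This is only a sketch of the padding behavior; it ignores collation rules such as case-insensitivity, and the function name is hypothetical.

```python
def pad_space_equal(left: str, right: str) -> bool:
    """Compare two strings after right-padding the shorter one with spaces."""
    width = max(len(left), len(right))
    return left.ljust(width) == right.ljust(width)


print(pad_space_equal("NYC", "NYC   "))  # trailing spaces do not matter
print(pad_space_equal("NYC", " NYC"))    # leading spaces still do
```

Because the engine already applies this rule, a CHAR column can be joined to a VARCHAR column without wrapping either side in a function.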

Is there an advantage to varchar(500) over varchar(8000)?

From a processing standpoint, it will not make a difference to use varchar(8000) vs varchar(500). It's more of a "good practice" kind of thing to define a maximum length that a field should hold and make your varchar that length. It's something that can be used to assist with data validation. For instance, making a state abbreviation be 2 characters or a postal/zip code as 5 or 9 characters. This used to be a more important distinction for when your data interacted with other systems or user interfaces where field length was critical (e.g. a mainframe flat file dataset), but nowadays I think it's more habit than anything else.
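The data-validation role of a declared maximum can be illustrated with a small sketch. The column names and limits below are hypothetical, chosen to mirror the state and postal code examples in the answer.

```python
# Hypothetical declared maximum lengths, e.g. state CHAR(2), postal_code VARCHAR(9).
COLUMN_LIMITS = {"state": 2, "postal_code": 9}


def too_long(row: dict) -> list:
    """Return the names of columns whose values exceed their declared maximum."""
    return [
        name
        for name, value in row.items()
        if name in COLUMN_LIMITS and len(value) > COLUMN_LIMITS[name]
    ]


print(too_long({"state": "NY", "postal_code": "10001"}))
print(too_long({"state": "New York", "postal_code": "10001"}))
```

A tight declared length gives the same kind of check for free at the database level, whereas varchar(8000) would accept both rows.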

What is the advantage of using varbinary over varchar here?

I believe the expectation is that the varbinary data will generally consume fewer bytes (5) than the varchar data (10 or 11, I think) per portion of the original string, so for very large numbers of components or comparisons, it should be more efficient.

But I'd recommend that if you were looking to use either solution, that you implement both (they're quite short), and try some profiling against your real data (and query patterns), to see if there are practical differences (I wouldn't expect so).

(Crafty Steal): And as Martin points out, the binary comparisons will be more efficient, since it won't involve all of the code that's there to deal with collations. :-)

Why is VARCHAR slower than CHAR on updating rows?

Rows are laid out with the fixed size columns first, at fixed offsets from the start of the row. Then (after some important bytes in the middle) the variable sized data is placed at the end. Because it's variable sized, the actual offset to the data cannot be computed for the whole table (like the fixed data) but has to be computed on a row-by-row basis.

And if a varchar(5)¹ is storing NYC and is then asked to store NYCX, it may find that there's not a spare byte at the end of NYC (it's being used for another column), so the row has to expand by moving everything after it one byte further along to make space for the extra byte.
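The shifting described above can be imitated with a byte array. This is a toy model, not SQL Server's actual row format: the row here is just one variable-length value followed by the bytes of a later column, and the function name is invented.

```python
def update_column(row: bytearray, start: int, old_len: int, new_value: bytes) -> int:
    """Overwrite row[start:start+old_len] with new_value.

    Returns how many trailing bytes had to move because the length changed.
    """
    moved = len(row) - (start + old_len) if len(new_value) != old_len else 0
    row[start:start + old_len] = new_value  # slice assignment shifts the tail
    return moved


# Variable-length storage: "NYC" sits directly before the next column's bytes.
var_row = bytearray(b"NYC" + b"next-column")
print(update_column(var_row, 0, 3, b"NYCX"), bytes(var_row))

# Fixed-length storage: CHAR(5) is already padded, so the update fits in place.
fixed_row = bytearray(b"NYC  " + b"next-column")
print(update_column(fixed_row, 0, 5, b"NYCX "), bytes(fixed_row))
```

In the variable-length case every byte after the value has to move; in the fixed-length case nothing does.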


¹ I notice in one of your examples you failed to specify a length. Please drill into yourself that that's a bad habit.


