How Much Disk-Space Is Needed to Store a Null Value Using Postgresql Db

How much disk-space is needed to store a NULL value using postgresql DB?

Null columns are not stored. The row has a bitmap at the start and one bit per column that indicates which ones are null or non-null. The bitmap could be omitted if all columns are non-null in a row. So for any given row with one or more nulls, the size added to it would be that of the bitmap(N bits for an N-column table, rounded up).

More in depth discussion from the docs here

Do nullable columns occupy additional space in PostgreSQL?

Basically, NULL values occupy 1 bit in the NULL bitmap. But it's not that simple.

The null bitmap (per row) is only allocated if at least one column in that row holds a NULL value. This can lead to a seemingly paradoxic effect in tables with 9 or more columns: assigning the first NULL value to a column can take up more space on disk than writing a value to it. Conversely, removing the last NULL value from the row also removes the NULL bitmap.

Physically, the initial null bitmap occupies 1 byte between the HeapTupleHeader (23 bytes) and actual column data or the row OID (if you should still be using that) - which always start at a multiple of MAXALIGN (typically 8 bytes). This leaves 1 byte of padding that is utilized by the initial null bitmap.

In effect, NULL storage is absolutely free for tables of 8 columns or less (including dropped, but not yet purged columns).

After that, another MAXALIGN bytes (typically 8) are allocated for the next MAXALIGN * 8 columns (typically 64). Etc.

More details in the manual and under these related questions:

  • How much disk-space is needed to store a NULL value using postgresql DB?
  • Does not using NULL in PostgreSQL still use a NULL bitmap in the header?
  • How many records can I store in 5 MB of PostgreSQL on Heroku?

Once you understand alignment padding of data types, you can further optimize storage:

  • Calculating and saving space in PostgreSQL

But the cases are rare where you can save substantial amounts of space. Normally it's not worth the effort.

@Daniel already covers effects on index size.

Note that dropped columns (though now invisible) are kept in the system catalogs until the table is recreated. Those zombis can force the allocation of an (enlarged) NULL bitmap. See:

  • Dropping column in Postgres on a large dataset

How much space does a null value and an empty value take in postgreSQL?

The documentation gives information about that:

All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines), followed by an optional null bitmap, an optional object ID field, and the user data. [...] The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. [...] In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null.

A NULL value does not take up any extra space on disk – it means that the respective bit in the null bitmap is set to 0. Only of the table row contains no NULL values at all, PostgreSQL will not save a null bitmap and you can actually save a little space.

An empty string will occupy one byte: a short varlena header that indicates that the length is 0. Beware of padding bytes because of alignment: if, for example, a timestamp immediately follows the empty string, you could get up to 7 bytes of empty padding space, so that the timestamp starts at an address that is a multiple of 8.

Does not using NULL in PostgreSQL still use a NULL bitmap in the header?

It's actually more complex than that.

The null bitmap needs one bit per column in the row, rounded up to full bytes. It is only there if the actual row includes at least one NULL value and is fully allocated in that case. NOT NULL constraints do not directly affect that. (Of course, if all fields of your table are NOT NULL, there can never be a null bitmap.)

The "heap tuple header" (per row) is 23 bytes long. Actual data starts at a multiple of MAXALIGN (Maximum data alignment) after that, which is typically 8 bytes on 64-bit OS (4 bytes on 32-bit OS). Run the following command from your PostgreSQL binary dir as root to get a definitive answer:

./pg_controldata /path/to/my/dbcluster

On a typical Debian-based installation of Postgres 12 that would be:

sudo /usr/lib/postgresql/12/bin/pg_controldata /var/lib/postgresql/12/main

Either way, there is one free byte between the header and the aligned start of the data, which the null bitmap can utilize. As long as your table has 8 columns or less, NULL storage is effectively absolutely free (as far as disk space is concerned).

After that, another MAXALIGN (typically 8 bytes) is allocated for the null bitmap to cover another (typically) 64 fields. Etc.

This is valid for at least versions 8.4 - 12 and most likely won't change.

Do null values save storage space?

For fixed width fields like nullable int the storage space required is always the same regardless of whether the value is null or not.

For variable width nullable fields the value ’NULL‘ takes zero bytes of storage space (ignoring the bit to store whether the value is null or not).

A table with a XML field with null value, consumes some space? How much?

The answer to the question "how much space does a null value take" is: no space at all - at least not in the "data" area.

For each nullable column in the table there is one bit in the row header that marks the column value as null (or not null). So the "space" that the null values takes is already present in the row header - regardless whether the column is null or not.

Thus the null "value" does not occupy any space in the data block storing the row.

This is documented in the manual: http://www.postgresql.org/docs/current/static/storage-page-layout.html


Postgres will not store long string values (xml, varchar, text, json, ...) in the actual data block if it exceeds a certain threshold (about 2000 bytes). If the value is longer than that, it will be stored in a special storage area "away" from your actual data. So splitting up the table into two tables with a 1:1 relationship doesn't really by you that much. Unless you are storing a lot of rows (hundreds of millions), I doubt you will be able to notice the difference - but this also depends on your usage patterns.

The data that is stored "out-of-line" is also automatically compressed.

Details about this can be found in the manual: http://www.postgresql.org/docs/current/static/storage-toast.html


One reason why the separate table might be an advantage is the necessary "vacuum" cleanup. If you update the XML data a lot but the rest of the table hardly ever changes, then splitting this up in two tables might improve the overall performance because "XML table" will need less "maintenance" and the "main" table won't be changed at all.

Is disk space consumed when storing null data?

In SQLite's record format, exactly one byte is need to specify that a value is NULL.

If you normalize your database so that you have one record per non-NULL value, you save those NULL bytes, but you have additional overhead for the non-NULL values.
SQLite's minimum record overhead is:

  • two bytes cell pointer,
  • at least one byte payload length,
  • several bytes for the rowid, and
  • for your case, several bytes for the foreign key that points back to the original record.

Additionally, storing non-NULL values in a separate table might require an index to get efficient lookups.

PostgreSQL: Column Disk usage

Try using pg_column_size(), it will return the column size in bytes:

SELECT sum(pg_column_size(column)) FROM yourtable

How much space will be actually allocated in a table if the value for char field is null in Django postgresql database?

NULL values don't take up any storage space in the database.

If a table has nullable columns, each table row has a null bitmap that signifies which columns are NULL. Of course this bitmap occupies storage, but it does so regardless if there are NULL values or not.

Disk space implications of setting MySQL column value to NULL instead of 0 or ''

No, using NULL will not take up less space than an empty VARCHAR or INT field. In fact it may take up more space. Here's why:

A VARCHAR is stored as a size + value. The number of bytes used for the size depends on the max storage of the VARCHAR. VARCHAR(255) requires one byte, VARCHAR(65536) requires two bytes and so on.

So that VARCHAR(255) column takes up one byte even if you store an empty string. The following table would take a minimum of one byte per row (plus some other possible overhead depending on storage engine).

CREATE TABLE sample (
a VARCHAR(255) NOT NULL
);

To save NULL values, MySQL uses a bitmask for each row. Up to 8 nullable columns can be stored per byte. So if you have a table like this:

CREATE TABLE sample (
a VARCHAR(255) NULL
);

Would take a minimum of two bytes per row. Storing NULL just sets the bit, it's already reserved whether you use it or not. The byte for the VARCHAR's size is still used for each row even if the column is set to NULL.



Related Topics



Leave a reply



Submit