PostgreSQL Multiple Nullable Columns in Unique Constraint

Create unique constraint with null columns

Postgres 15 or newer

Postgres 15 adds the clause NULLS NOT DISTINCT. The release notes:

  • Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)

    Previously NULL values were always indexed as distinct values, but
    this can now be changed by creating constraints and indexes using
    UNIQUE NULLS NOT DISTINCT.

With this clause NULL is treated like just another value, and a UNIQUE constraint does not allow more than one row with the same NULL value. The task is simple now:

ALTER TABLE favorites
ADD CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id);

There are examples in the manual chapter "Unique Constraints".

The clause switches behavior for all key columns of the index. You can't treat NULL as equal for one column, but not for another.

NULLS DISTINCT remains the default (in line with standard SQL) and does not have to be spelled out.

The same clause works for a UNIQUE index, too:

CREATE UNIQUE INDEX favo_uni_idx
ON favorites (user_id, menu_id, recipe_id) NULLS NOT DISTINCT;

Note the position of the new clause after the key fields.
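To illustrate, assuming the favorites table above with the constraint favo_uni in place, a second row with the same NULL combination is now rejected:

```sql
-- First row succeeds:
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, NULL, 2);

-- Second row with the identical combination (including the NULL) fails:
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, NULL, 2);
-- ERROR:  duplicate key value violates unique constraint "favo_uni"
```

Under the default NULLS DISTINCT, both inserts would succeed.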

Postgres 14 or older

Create two partial indexes:

CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;

CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;

This way, there can only be one combination of (user_id, recipe_id) where menu_id IS NULL, effectively implementing the desired constraint.

Possible drawbacks:

  • You cannot have a foreign key referencing (user_id, menu_id, recipe_id). (It seems unlikely you'd want an FK spanning three columns - use the PK column instead!)
  • You cannot base CLUSTER on a partial index.
  • Queries without a matching WHERE condition cannot use the partial index.

If you need a complete index, you can alternatively drop the WHERE condition from favo_3col_uni_idx and your requirements are still enforced.

The index, now comprising the whole table, overlaps with the other one and gets bigger. Depending on typical queries and the percentage of NULL values, this may or may not be useful. In extreme situations it may even help to maintain all three indexes (the two partial ones and a total one on top).
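The complete-index variant would simply drop the WHERE clause (a sketch; the 2-column partial index from above must stay in place to cover the NULL case):

```sql
-- Without the WHERE clause this index covers all rows. Under the default
-- NULLS DISTINCT semantics, rows with menu_id IS NULL never collide here,
-- so uniqueness for those rows is still enforced by favo_2col_uni_idx.
CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id);
```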

This is a good solution for a single nullable column, maybe for two. But it gets out of hand quickly for more, as you need a separate partial index for every combination of nullable columns, so the number grows exponentially. For multiple nullable columns, see instead:

  • Why doesn't my UNIQUE constraint trigger?

Aside: I advise against using mixed-case identifiers in PostgreSQL.

Postgres - Unique constraint with multiple columns and NULL values

For PostgreSQL v15 or newer, see Naeel's answer. For lower versions, try the following:

An alternative to the good solution of forbidding NULLs is to create a unique index.

All you need is a value that is guaranteed not to occur in your data set (in my example '@@'):

CREATE UNIQUE INDEX ON test (
coalesce(foo, '@@'),
coalesce(bar, '@@')
);

PostgreSQL multiple nullable columns in unique constraint

Postgres 15 adds the clause NULLS NOT DISTINCT

See:

  • Create unique constraint with null columns

The solution is very simple now:

ALTER TABLE example ADD CONSTRAINT foo
UNIQUE NULLS NOT DISTINCT (field1, field2, field3, field4, field5);

For Postgres 14 or older

You are striving for compatibility with your existing Oracle and SQL Server implementations.

Since Oracle treats empty strings in (var)char columns as NULL, it can't tell the difference between an empty string and NULL anyway. So wouldn't it be prudent to use empty strings ('') instead of NULL values in Postgres as well - for this particular use case?

Define columns included in the unique constraint as NOT NULL DEFAULT '', problem solved:

CREATE TABLE example (
example_id serial PRIMARY KEY
, field1 text NOT NULL DEFAULT ''
, field2 text NOT NULL DEFAULT ''
, field3 text NOT NULL DEFAULT ''
, field4 text NOT NULL DEFAULT ''
, field5 text NOT NULL DEFAULT ''
, CONSTRAINT foo UNIQUE (field1, field2, field3, field4, field5)
);

Notes

What you demonstrate in the question is a unique index:

CREATE UNIQUE INDEX ...

Not the unique constraint you keep talking about. There are subtle, important differences!

  • How does PostgreSQL enforce the UNIQUE constraint / what type of index does it use?

I changed that to an actual constraint like in the title of the question.

The keyword ASC is just noise, since that is the default sort order. I dropped it.

I'm using a serial PK column for simplicity; it's totally optional, but typically preferable to numbers stored as text.

Working with it

Just omit empty / null fields from the INSERT:

INSERT INTO example(field1) VALUES ('F1_DATA');
INSERT INTO example(field1, field2, field5) VALUES ('F1_DATA', 'F2_DATA', 'F5_DATA');

Repeating any of these inserts would violate the unique constraint.

Or, if you insist on omitting the target column list (a bit of an anti-pattern in persisted INSERT statements), or for bulk inserts where all columns need to be listed:

INSERT INTO example VALUES
('1', 'F1_DATA', DEFAULT, DEFAULT, DEFAULT, DEFAULT)
, ('2', 'F1_DATA','F2_DATA', DEFAULT, DEFAULT,'F5_DATA')
;

Or simply:

INSERT INTO example VALUES
('1', 'F1_DATA', '', '', '', '')
, ('2', 'F1_DATA','F2_DATA', '', '','F5_DATA')
;

Or you can write a trigger BEFORE INSERT OR UPDATE that converts NULL to ''.
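Such a trigger could be sketched like this (function and trigger names are made up; it targets the example table defined above):

```sql
-- Normalize NULL to '' before the row is written, so the
-- NOT NULL DEFAULT '' columns never reject or miss a value.
CREATE OR REPLACE FUNCTION example_nulls_to_empty()
  RETURNS trigger
  LANGUAGE plpgsql AS
$$
BEGIN
   NEW.field1 := COALESCE(NEW.field1, '');
   NEW.field2 := COALESCE(NEW.field2, '');
   NEW.field3 := COALESCE(NEW.field3, '');
   NEW.field4 := COALESCE(NEW.field4, '');
   NEW.field5 := COALESCE(NEW.field5, '');
   RETURN NEW;
END
$$;

CREATE TRIGGER example_nulls_to_empty
BEFORE INSERT OR UPDATE ON example
FOR EACH ROW EXECUTE FUNCTION example_nulls_to_empty();
-- On Postgres 10 or older, use EXECUTE PROCEDURE instead of EXECUTE FUNCTION.
```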

Alternative solutions

If you need to use actual NULL values I would suggest the unique index with COALESCE like you mentioned as option (2) and @wildplasser provided as his last example.

The index on an array like @Rudolfo presented is simple, but considerably more expensive. Array handling isn't very cheap in Postgres and there is an array overhead similar to that of a row (24 bytes):

  • Calculating and saving space in PostgreSQL

Arrays are limited to columns of the same data type. You could cast all columns to text if some are not, but it will typically further increase storage requirements. Or you could use a well-known row type for heterogeneous data types ...
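For reference, the array-based index mentioned above might look like this (a sketch, assuming a hypothetical table tbl with integer columns a, b, c):

```sql
-- Expression index on an array value; the extra parentheses are required
-- for an expression. NULL elements compare as equal inside array values,
-- so (1, NULL, 2) collides with (1, NULL, 2) as desired.
CREATE UNIQUE INDEX tbl_abc_uni_idx ON tbl ((ARRAY[a, b, c]));
```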

A corner case: array (or row) types with all NULL values are considered equal (!), so there can only be 1 row with all involved columns NULL. May or may not be as desired. If you want to disallow all columns NULL:

  • NOT NULL constraint over a set of columns

UPSERT based on UNIQUE constraint with NULL values

If you can find a value that can never legally exist in col3 (make sure with a check constraint), you could use a unique index:

CREATE UNIQUE INDEX ON my_table (
col2,
coalesce(col3, -1.0)
);

and use that in your INSERT:

INSERT INTO my_table (col2, col3, col4)
VALUES (p_col2, p_col3, p_col4)
ON CONFLICT (col2, coalesce(col3, -1.0))
DO UPDATE SET col4 = excluded.col4;
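The check constraint guarding the sentinel value could look like this (constraint name is made up):

```sql
-- Guarantee -1.0 can never appear as a real value in col3.
-- IS DISTINCT FROM treats NULL as a known value, so NULLs still pass.
ALTER TABLE my_table
  ADD CONSTRAINT col3_no_sentinel CHECK (col3 IS DISTINCT FROM -1.0);
```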

Unique constraints on multiple columns that cannot both be null

It is possible to do this with just two unique constraints. The second one is:

CREATE UNIQUE INDEX IF NOT EXISTS ab_null_constraint
ON my_table (coalesce(id_a, id_b), (id_a IS NULL))
WHERE id_a IS NULL OR id_b IS NULL;

Here is a db<>fiddle.

Actually, you can combine all this into one unique index:

CREATE UNIQUE INDEX IF NOT EXISTS ab_null_constraint ON
my_table ( coalesce(id_a, id_b),
coalesce(id_b, id_a),
(id_a is null),
(id_b is null)
);

Here is a db<>fiddle for this.

You might find your original formulation more maintainable.

How to create composite UNIQUE constraint with nullable columns?

If you have values that can never appear in those columns, you can use them as a replacement in the index:

create unique index on the_table (coalesce(a,-1), coalesce(b, -1), coalesce(c, -1));

That way NULL values are treated the same inside the index, without the need to use them in the table.

If those columns are numeric or float (rather than integer or bigint) using '-Infinity' might be a better substitution value.


There is a drawback to this, though:

The index will not be usable for queries on those columns unless you also use the coalesce() expression. So with the above index, a query like:

select *
from the_table
where a = 10
and b = 100;

would not use the index. You would need to use the same expressions as used in the index itself:

select *
from the_table
where coalesce(a, -1) = 10
and coalesce(b, -1) = 100;

UNIQUE constraint where NULL is one valid value

You can create two partial indexes. They are supported since version 7.2, which was released in February 2002.

This one will check that any combination of the three columns is unique when emaildomainid isn't null:

CREATE UNIQUE INDEX customexternalemail_sp_emaildomain_unique_not_null 
ON customexternalemail (serviceproviderid, emailprefix, emaildomainid)
WHERE emaildomainid IS NOT NULL;

This one will ensure that for any row which has a null value for emaildomainid, the combination (serviceproviderid, emailprefix) will be unique:

CREATE UNIQUE INDEX customexternalemail_sp_emaildomain_unique_null 
ON customexternalemail (serviceproviderid, emailprefix)
WHERE emaildomainid IS NULL;

