Create Unique Constraint With Null Columns


Postgres 15 or newer

Postgres 15 adds the clause NULLS NOT DISTINCT. From the release notes:

  • Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)

    Previously NULL values were always indexed as distinct values, but
    this can now be changed by creating constraints and indexes using
    UNIQUE NULLS NOT DISTINCT.

With this clause NULL is treated like just another value, and a UNIQUE constraint does not allow more than one row with the same NULL value. The task is simple now:

ALTER TABLE favorites
ADD CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id);

There are examples in the manual chapter "Unique Constraints".

The clause switches behavior for all index keys. You can't treat NULL as equal for one key, but not for another.

NULLS DISTINCT remains the default (in line with standard SQL) and does not have to be spelled out.
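As a quick sanity check of that default, here is a sketch using SQLite via Python (not Postgres itself, but SQLite's unique constraints also treat NULLs as distinct, so the behavior matches):

```python
import sqlite3

# SQLite, like standard SQL, treats NULLs as distinct in unique
# constraints, illustrating the default (NULLS DISTINCT) behavior.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE favorites (
        user_id   INTEGER NOT NULL,
        menu_id   INTEGER,          -- nullable
        recipe_id INTEGER NOT NULL,
        UNIQUE (user_id, menu_id, recipe_id)
    )
""")

# Two rows that differ only by having menu_id = NULL both insert fine:
conn.execute("INSERT INTO favorites VALUES (1, NULL, 10)")
conn.execute("INSERT INTO favorites VALUES (1, NULL, 10)")  # no error!

count = conn.execute("SELECT count(*) FROM favorites").fetchone()[0]
print(count)  # 2 -- the "duplicates" slipped past the UNIQUE constraint
```

This is exactly the loophole that NULLS NOT DISTINCT closes in Postgres 15.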

The same clause works for a UNIQUE index, too:

CREATE UNIQUE INDEX favo_uni_idx
ON favorites (user_id, menu_id, recipe_id) NULLS NOT DISTINCT;

Note the position of the new clause after the key fields.

Postgres 14 or older

Create two partial indexes:

CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;

CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;

This way, there can only be one combination of (user_id, recipe_id) where menu_id IS NULL, effectively implementing the desired constraint.
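The two-partial-index workaround can be sketched end to end using SQLite via Python (SQLite also supports partial indexes and treats NULLs as distinct, so the mechanics carry over; table and index names follow the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE favorites (
        user_id   INTEGER NOT NULL,
        menu_id   INTEGER,          -- nullable
        recipe_id INTEGER NOT NULL
    )
""")
# One partial index for rows with menu_id, one for rows without:
conn.execute("""
    CREATE UNIQUE INDEX favo_3col_uni_idx
    ON favorites (user_id, menu_id, recipe_id)
    WHERE menu_id IS NOT NULL
""")
conn.execute("""
    CREATE UNIQUE INDEX favo_2col_uni_idx
    ON favorites (user_id, recipe_id)
    WHERE menu_id IS NULL
""")

conn.execute("INSERT INTO favorites VALUES (1, NULL, 10)")  # ok
try:
    # Same (user_id, recipe_id) with menu_id IS NULL -> rejected
    conn.execute("INSERT INTO favorites VALUES (1, NULL, 10)")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True
print(blocked)  # True -- the second partial index caught the duplicate
```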

Possible drawbacks:

  • You cannot have a foreign key referencing (user_id, menu_id, recipe_id). (It seems unlikely you'd want an FK spanning three columns - use the PK column instead!)
  • You cannot base CLUSTER on a partial index.
  • Queries without a matching WHERE condition cannot use the partial index.

If you need a complete index, you can alternatively drop the WHERE condition from favo_3col_uni_idx and your requirements are still enforced.

The index, now comprising the whole table, overlaps with the other one and gets bigger. Depending on typical queries and the percentage of NULL values, this may or may not be useful. In extreme situations it may even help to maintain all three indexes (the two partial ones and a total on top).

This is a good solution for a single nullable column, maybe for two. But it gets out of hand quickly for more, as you need a separate partial index for every combination of nullable columns, so the number grows exponentially (2ⁿ partial indexes for n nullable columns). For multiple nullable columns, see instead:

  • Why doesn't my UNIQUE constraint trigger?

Aside: I advise against using mixed-case identifiers in PostgreSQL.

How do I create a unique constraint that also allows nulls?

SQL Server 2008 +

You can create a unique index that accepts multiple NULLs by adding a WHERE clause. See the answer below.

Prior to SQL Server 2008

You cannot create a UNIQUE constraint and allow NULLs. One workaround is to set a default value of NEWID().

Update the existing values to NEWID() where NULL before creating the UNIQUE constraint.

MSSQL: Add Unique Constraint at Create Table and allow NULLs

Assuming that you want multiple rows with the value NULL, you won't be able to use a UNIQUE CONSTRAINT, as NULL is still a value (even if an unknown one). For example:

CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL,
CONSTRAINT UC_UI UNIQUE (UserIdentifier));
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL);
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL);
GO
DROP TABLE dbo.YourTable;

Notice the second INSERT fails.

You can, instead, however, use a conditional UNIQUE INDEX:

CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL);

CREATE UNIQUE NONCLUSTERED INDEX UI_UI ON dbo.YourTable(UserIdentifier) WHERE UserIdentifier IS NOT NULL;

GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL); -- Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Steve'); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Jayne'); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Steve'); --Fails
GO
DROP TABLE dbo.YourTable;

As Jeroen Mostert stated in the comments, though, you cannot (officially) create a unique index as part of creating the table; it must be created in a separate statement. There is no documented syntax to create a UNIQUE INDEX as part of a CREATE TABLE statement.

You can, however, create this inline (it was undocumented at the time this answer was originally written) with either of the following syntaxes in SQL Server 2016+:

CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL INDEX UI_UI UNIQUE WHERE UserIdentifier IS NOT NULL);

CREATE TABLE dbo.SomeTable (UserIdentifier nvarchar(100) NULL,
INDEX UI_UI UNIQUE (UserIdentifier) WHERE UserIdentifier IS NOT NULL);

db<>fiddle 2014, db<>fiddle 2016

Treating `null` as a distinct value in a table unique constraint

Based on the approach recommended in this answer, the solution is to create two partial indexes.

Using sqlalchemy for the example in the question, this looks like:

from sqlalchemy import Column, ForeignKey, Index, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class OptionTable(Base):
    __tablename__ = "option_table"

    id = Column(Integer, primary_key=True)
    custom_id = Column(Integer, ForeignKey("custom.id"), nullable=True)
    client = Column(String, nullable=False)
    option = Column(String, nullable=False)

    __table_args__ = (
        # Unique among rows where custom_id is set
        Index(
            "uix_custom_client_option",
            "custom_id",
            "client",
            "option",
            unique=True,
            postgresql_where=custom_id.isnot(None),
        ),
        # Unique among rows where custom_id is NULL
        Index(
            "uix_client_option",
            "client",
            "option",
            unique=True,
            postgresql_where=custom_id.is_(None),
        ),
    )

UPSERT based on UNIQUE constraint with NULL values

If you can find a value that can never legally exist in col3 (make sure with a check constraint), you could use a unique index:

CREATE UNIQUE INDEX ON my_table (
col2,
coalesce(col3, -1.0)
);

and use that in your INSERT:

INSERT INTO my_table (col2, col3, col4)
VALUES (p_col2, p_col3, p_col4)
ON CONFLICT (col2, coalesce(col3, -1.0))
DO UPDATE SET col4 = excluded.col4;
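The coalesce() trick itself can be demonstrated with SQLite via Python, which also supports expressions in unique indexes (the table and the -1.0 sentinel follow the example above; the sentinel is assumed to never occur as a real col3 value, guarded by a check constraint):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_table (
        col2 INTEGER NOT NULL,
        col3 REAL,
        col4 TEXT,
        CHECK (col3 IS NULL OR col3 <> -1.0)  -- guard the sentinel value
    )
""")
# NULLs in col3 collapse to the sentinel -1.0 inside the index:
conn.execute("""
    CREATE UNIQUE INDEX my_table_uni_idx
    ON my_table (col2, coalesce(col3, -1.0))
""")

conn.execute("INSERT INTO my_table VALUES (1, NULL, 'a')")  # ok
try:
    # Same (col2, NULL) pair -> maps to the same (1, -1.0) index entry
    conn.execute("INSERT INTO my_table VALUES (1, NULL, 'b')")
    caught = False
except sqlite3.IntegrityError:
    caught = True
print(caught)  # True -- NULL duplicates are now rejected
```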

PostgreSQL multiple nullable columns in unique constraint

You are striving for compatibility with your existing Oracle and SQL Server implementations.

Here is a presentation comparing the physical row storage formats of the three RDBMSs involved.

Since Oracle does not implement NULL values at all in row storage, it can't tell the difference between an empty string and NULL anyway. So wouldn't it be prudent to use empty strings ('') instead of NULL values in Postgres as well - for this particular use case?

Define columns included in the unique constraint as NOT NULL DEFAULT '', problem solved:

CREATE TABLE example (
example_id serial PRIMARY KEY
, field1 text NOT NULL DEFAULT ''
, field2 text NOT NULL DEFAULT ''
, field3 text NOT NULL DEFAULT ''
, field4 text NOT NULL DEFAULT ''
, field5 text NOT NULL DEFAULT ''
, CONSTRAINT example_index UNIQUE (field1, field2, field3, field4, field5)
);

Notes

  • What you demonstrate in the question is a unique index:

    CREATE UNIQUE INDEX ...

    not the unique constraint you keep talking about. There are subtle, important differences!

    • How does PostgreSQL enforce the UNIQUE constraint / what type of index does it use?

    I changed that to an actual constraint, since that is what you made the subject of the post.

  • The keyword ASC is just noise, since that is the default sort order. I left it out.

  • I use a serial PK column for simplicity; it is entirely optional, but typically better than numbers stored as text.

Working with it

Just omit empty / null fields from the INSERT:

INSERT INTO example(field1) VALUES ('F1_DATA');
INSERT INTO example(field1, field2, field5) VALUES ('F1_DATA', 'F2_DATA', 'F5_DATA');

Repeating any of these inserts would violate the unique constraint.

Or, if you insist on omitting target columns (a bit of an antipattern in persisted INSERT statements), as in bulk inserts where values for all columns must be listed:

INSERT INTO example VALUES
('1', 'F1_DATA', DEFAULT, DEFAULT, DEFAULT, DEFAULT)
, ('2', 'F1_DATA','F2_DATA', DEFAULT, DEFAULT,'F5_DATA');

Or simply:

INSERT INTO example VALUES
('1', 'F1_DATA', '', '', '', '')
, ('2', 'F1_DATA','F2_DATA', '', '','F5_DATA');

Or you can write a trigger BEFORE INSERT OR UPDATE that converts NULL to ''.
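The core effect can be sketched with SQLite via Python (whose unique-constraint semantics match here): once the "missing" fields collapse to the same value '', they collide in the unique constraint, which NULLs would not:

```python
import sqlite3

# Minimal sketch of the NOT NULL DEFAULT '' approach.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE example (
        example_id INTEGER PRIMARY KEY,
        field1 TEXT NOT NULL DEFAULT '',
        field2 TEXT NOT NULL DEFAULT '',
        UNIQUE (field1, field2)
    )
""")

conn.execute("INSERT INTO example (field1) VALUES ('F1_DATA')")  # field2 -> ''
try:
    # Same ('F1_DATA', '') combination -> rejected, as desired
    conn.execute("INSERT INTO example (field1) VALUES ('F1_DATA')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```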

Alternative solutions

If you need to use actual NULL values I would suggest the unique index with COALESCE like you mentioned as option (2) and @wildplasser provided as his last example.

The index on an array like @Rudolfo presented is simple, but considerably more expensive. Array handling isn't very cheap in Postgres and there is an array overhead similar to that of a row (24 bytes):

  • Calculating and saving space in PostgreSQL

Arrays are limited to columns of the same data type. You could cast all columns to text if some are not, but it will typically further increase storage requirements. Or you could use a well-known row type for heterogeneous data types ...

A corner case: array (or row) types with all NULL values are considered equal (!), so there can only be 1 row with all involved columns NULL. May or may not be as desired. If you want to disallow all columns NULL:

  • NOT NULL constraint over a set of columns

