Create unique constraint with null columns
Postgres 15 or newer
Postgres 15 (currently beta) adds the clause NULLS NOT DISTINCT. The release notes:

Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)

Previously NULL values were always indexed as distinct values, but this can now be changed by creating constraints and indexes using UNIQUE NULLS NOT DISTINCT.

With this clause, NULL is treated like just another value, and a UNIQUE constraint does not allow more than one row with the same NULL value. The task is simple now:
ALTER TABLE favorites
ADD CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id);
There are examples in the manual chapter "Unique Constraints".
The clause switches behavior for all index keys. You can't treat NULL as equal for one key, but not for another. NULLS DISTINCT remains the default (in line with standard SQL) and does not have to be spelled out.
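A minimal demonstration of the new behavior, assuming a simplified favorites table (column types are illustrative):

```sql
CREATE TABLE favorites (
  user_id   int NOT NULL
, menu_id   int            -- nullable
, recipe_id int NOT NULL
, CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id)
);

INSERT INTO favorites VALUES (1, NULL, 2);  -- OK
INSERT INTO favorites VALUES (1, NULL, 2);  -- fails: duplicate key violates favo_uni
```

Without NULLS NOT DISTINCT, the second INSERT would succeed, since NULL values would be indexed as distinct.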
The same clause works for a UNIQUE index, too:
CREATE UNIQUE INDEX favo_uni_idx
ON favorites (user_id, menu_id, recipe_id) NULLS NOT DISTINCT;
Note the position of the new clause after the key fields.
Postgres 14 or older
Create two partial indexes:
CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;
CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;
This way, there can only be one combination of (user_id, recipe_id) where menu_id IS NULL, effectively implementing the desired constraint.
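With both partial indexes in place, duplicates are rejected in either case. A sketch, reusing the favorites table from above:

```sql
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, NULL, 7);  -- OK
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, NULL, 7);  -- fails on favo_2col_uni_idx
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, 5, 7);     -- OK
INSERT INTO favorites (user_id, menu_id, recipe_id) VALUES (1, 5, 7);     -- fails on favo_3col_uni_idx
```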
Possible drawbacks:
- You cannot have a foreign key referencing (user_id, menu_id, recipe_id). (It seems unlikely you'd want a FK referencing three columns - use the PK column instead!)
- You cannot base CLUSTER on a partial index.
- Queries without a matching WHERE condition cannot use the partial index.
If you need a complete index, you can alternatively drop the WHERE condition from favo_3col_uni_idx and your requirements are still enforced.
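That variant looks like this (same index names as above; only the WHERE clause on the three-column index is gone):

```sql
CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id);
CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;
```

The full index lets rows with NULL menu_id through (NULLs are distinct there), but the remaining partial index still catches those duplicates.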
The index, now comprising the whole table, overlaps with the other one and gets bigger. Depending on typical queries and the percentage of NULL values, this may or may not be useful. In extreme situations it may even help to maintain all three indexes (the two partial ones and a total one on top).
This is a good solution for a single nullable column, maybe two. But it gets out of hand quickly for more, as you need a separate partial index for every combination of nullable columns, so the number grows binomially. For multiple nullable columns, see instead:
- Why doesn't my UNIQUE constraint trigger?
Aside: I advise not to use mixed case identifiers in PostgreSQL.
How do I create a unique constraint that also allows nulls?
SQL Server 2008 +
You can create a unique index that accepts multiple NULLs with a WHERE clause. See the answer below.
Prior to SQL Server 2008
You cannot create a UNIQUE constraint and allow NULLs. One workaround is to set a default value of NEWID(), and update the existing NULL values to NEWID() before creating the UNIQUE constraint.
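A sketch of that workaround, assuming a nullable nvarchar column UserIdentifier on dbo.YourTable (constraint names are illustrative). The GUID values stand in for NULL, so application code must treat them accordingly:

```sql
-- T-SQL sketch for versions prior to SQL Server 2008
ALTER TABLE dbo.YourTable
  ADD CONSTRAINT DF_UI DEFAULT CAST(NEWID() AS nvarchar(100)) FOR UserIdentifier;

-- replace existing NULLs with unique placeholder values
UPDATE dbo.YourTable
SET    UserIdentifier = CAST(NEWID() AS nvarchar(100))
WHERE  UserIdentifier IS NULL;

ALTER TABLE dbo.YourTable
  ADD CONSTRAINT UC_UI UNIQUE (UserIdentifier);
```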
MSSQL: Add Unique Constraint at Create Table and allow NULLs
Assuming that you want multiple rows with the value NULL, you won't be able to use a UNIQUE CONSTRAINT, as NULL is still a value (even if an unknown one). For example:
CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL,
CONSTRAINT UC_UI UNIQUE (UserIdentifier));
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL);
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL);
GO
DROP TABLE dbo.YourTable;
Notice the second INSERT fails.
You can, however, use a filtered UNIQUE INDEX instead:
CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL);
CREATE UNIQUE NONCLUSTERED INDEX UI_UI ON dbo.YourTable(UserIdentifier) WHERE UserIdentifier IS NOT NULL;
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL); -- Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(NULL); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Steve'); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Jayne'); --Success
GO
INSERT INTO dbo.YourTable (UserIdentifier)
VALUES(N'Steve'); --Fails
GO
DROP TABLE dbo.YourTable;
As Jeroen Mostert stated in the comments, though, prior to SQL Server 2016 you cannot create a unique index as part of creating the table; it must be created in a separate statement, as there is no syntax to create a UNIQUE INDEX as part of a CREATE TABLE statement.
You can, however, create this inline (it was undocumented at the time this answer was originally written) with either of the following syntaxes in SQL Server 2016+:
CREATE TABLE dbo.YourTable (UserIdentifier nvarchar(100) NULL INDEX UI_UI UNIQUE WHERE UserIdentifier IS NOT NULL);
CREATE TABLE dbo.SomeTable (UserIdentifier nvarchar(100) NULL,
INDEX UI_UI UNIQUE (UserIdentifier) WHERE UserIdentifier IS NOT NULL);
db<>fiddle 2014, db<>fiddle 2016
Treating `null` as a distinct value in a table unique constraint
Based on the approach recommended in this answer, the solution is to create two partial indexes.
Using sqlalchemy for the example in the question, this looks like:
from sqlalchemy import Column, ForeignKey, Index, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class OptionTable(Base):
    __tablename__ = "option_table"

    id = Column(Integer, primary_key=True)
    custom_id = Column(Integer, ForeignKey("custom.id"), nullable=True)
    client = Column(String, nullable=False)
    option = Column(String, nullable=False)

    __table_args__ = (
        # enforced when custom_id is present
        Index(
            "uix_custom_client_option",
            "custom_id",
            "client",
            "option",
            unique=True,
            postgresql_where=custom_id.isnot(None),
        ),
        # enforced when custom_id is NULL
        Index(
            "uix_client_option",
            "client",
            "option",
            unique=True,
            postgresql_where=custom_id.is_(None),
        ),
    )
UPSERT based on UNIQUE constraint with NULL values
If you can find a value that can never legally exist in col3 (make sure with a check constraint), you could use a unique index:
CREATE UNIQUE INDEX ON my_table (
col2,
coalesce(col3, -1.0)
);
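The accompanying check constraint, to guarantee the sentinel -1.0 never occurs as a real value in col3 (the constraint name is hypothetical):

```sql
ALTER TABLE my_table
  ADD CONSTRAINT my_table_col3_not_sentinel CHECK (col3 <> -1.0);
-- NULL values pass the check (the expression evaluates to NULL), which is intended
```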
and use that expression in your INSERT:
INSERT INTO my_table (col2, col3, col4)
VALUES (p_col2, p_col3, p_col4)
ON CONFLICT (col2, coalesce(col3, -1.0))
DO UPDATE SET col4 = excluded.col4;
PostgreSQL multiple nullable columns in unique constraint
You are striving for compatibility with your existing Oracle and SQL Server implementations.
Here is a presentation comparing the physical row storage formats of the three RDBMSs involved.
Since Oracle does not implement NULL values at all in row storage, it can't tell the difference between an empty string and NULL anyway. So wouldn't it be prudent to use empty strings ('') instead of NULL values in Postgres as well - for this particular use case?
Define the columns included in the unique constraint as NOT NULL DEFAULT '', problem solved:
CREATE TABLE example (
example_id serial PRIMARY KEY
, field1 text NOT NULL DEFAULT ''
, field2 text NOT NULL DEFAULT ''
, field3 text NOT NULL DEFAULT ''
, field4 text NOT NULL DEFAULT ''
, field5 text NOT NULL DEFAULT ''
, CONSTRAINT example_index UNIQUE (field1, field2, field3, field4, field5)
);
Notes
What you demonstrate in the question is a unique index:
CREATE UNIQUE INDEX ...
not the unique constraint you keep talking about. There are subtle, important differences!
- How does PostgreSQL enforce the UNIQUE constraint / what type of index does it use?
I changed that to an actual constraint, since that is what you made the subject of the post.
- The keyword ASC is just noise, since that is the default sort order. I dropped it.
- Using a serial PK column for simplicity, which is totally optional, but typically better than numbers stored as text.
Working with it
Just omit empty / null fields from the INSERT:
INSERT INTO example(field1) VALUES ('F1_DATA');
INSERT INTO example(field1, field2, field5) VALUES ('F1_DATA', 'F2_DATA', 'F5_DATA');
Repeating any of these inserts would violate the unique constraint.
Or, if you insist on omitting the target column list (which is a bit of an antipattern in persisted INSERT statements), such as for bulk inserts where all columns need a value:
INSERT INTO example VALUES
('1', 'F1_DATA', DEFAULT, DEFAULT, DEFAULT, DEFAULT)
, ('2', 'F1_DATA','F2_DATA', DEFAULT, DEFAULT,'F5_DATA');
Or simply:
INSERT INTO example VALUES
('1', 'F1_DATA', '', '', '', '')
, ('2', 'F1_DATA','F2_DATA', '', '','F5_DATA');
Or you can write a trigger BEFORE INSERT OR UPDATE that converts NULL to ''.
Alternative solutions
If you need to use actual NULL values, I would suggest the unique index with COALESCE, like you mentioned as option (2) and as @wildplasser provided in his last example.
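That expression index might look like this against the example table (a sketch; the empty string serves as the sentinel here, so it must not occur as a real value):

```sql
CREATE UNIQUE INDEX example_coalesce_uni_idx ON example (
  COALESCE(field1, '')
, COALESCE(field2, '')
, COALESCE(field3, '')
, COALESCE(field4, '')
, COALESCE(field5, '')
);
```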
The index on an array like @Rudolfo presented is simple, but considerably more expensive. Array handling isn't very cheap in Postgres and there is an array overhead similar to that of a row (24 bytes):
- Calculating and saving space in PostgreSQL
Arrays are limited to columns of the same data type. You could cast all columns to text if some are not, but that will typically further increase storage requirements. Or you could use a well-known row type for heterogeneous data types ...
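For reference, the array variant is a single expression index against the example table (a sketch; note the extra parentheses Postgres requires around an index expression):

```sql
CREATE UNIQUE INDEX example_arr_uni_idx
ON example ((ARRAY[field1, field2, field3, field4, field5]));
```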
A corner case: array (or row) types with all NULL values are considered equal (!), so there can only be 1 row with all involved columns NULL. May or may not be as desired. If you want to disallow all columns NULL:
- NOT NULL constraint over a set of columns