How to create unique index where column order is not taken into account (set?)
You can create an index on an expression, in this case least() and greatest():
create unique index idx_obj1_obj2 on table(least(Object1, Object2), greatest(Object1, Object2));
Note: there is one slight weirdness if the columns allow NULL values. In that case, a value paired with NULL would only be allowed once, regardless of the column it is in. This can be fixed with a more complicated expression, if it is actually a problem.
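As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module as a stand-in database. This is an assumption made only to keep the example runnable: SQLite's scalar min() and max() play the role of least() and greatest(), and the table name pairs is hypothetical. The columns are declared NOT NULL because SQLite's min()/max() return NULL when any argument is NULL, which differs from PostgreSQL's least()/greatest().

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for this sketch).
# min()/max() with two arguments are SQLite's counterparts of least()/greatest().
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pairs (object1 INTEGER NOT NULL, object2 INTEGER NOT NULL)"
)
conn.execute(
    "CREATE UNIQUE INDEX idx_obj1_obj2 "
    "ON pairs (min(object1, object2), max(object1, object2))"
)


def try_insert(a, b):
    """Return True if the row was accepted, False if the unique index rejected it."""
    try:
        conn.execute("INSERT INTO pairs VALUES (?, ?)", (a, b))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(1, 2)    # a new pair
swapped = try_insert(2, 1)  # the same pair with the columns swapped
other = try_insert(1, 3)    # a different pair
print(first, swapped, other)
```

Both orderings of a pair map to the same index key, so the swapped row collides with the first one while the unrelated pair is accepted.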
Unique constraint for 2 columns that works both ways
You can create a unique index that always indexes the same order of values:
create unique index
on friends (least(requestor, requestee), greatest(requestor, requestee));
How do I enforce set-like uniqueness between multiple columns?
create unique index idx_unique_ab
on x (least(a,b), greatest(a,b));
Does column order matter when defining unique constraints
The order matters if you ever expect queries to use only a leading part (a leftmost prefix) of the index. For example, suppose you had a unique index on (col1, col2), and you wanted to optimize the following query:
SELECT col1, col2 FROM foo WHERE col1 = 'stack';
The index on (col1, col2) could still be used here, because col1, which appears in the WHERE clause, is the leftmost portion of the index. Had you defined the unique constraint on (col2, col1), the index could not be used to seek directly to the matching rows for this query.
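One way to see the leftmost-prefix effect is to ask a query planner directly. The sketch below does this with SQLite through Python's sqlite3 module; SQLite is only an assumed stand-in here, the index name foo_col1_col2 and the extra column col3 are hypothetical, and other database engines word their plans differently and may make different choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.execute("CREATE UNIQUE INDEX foo_col1_col2 ON foo (col1, col2)")


def plan(sql):
    """Return SQLite's textual query plan for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


# Filter on the leading index column: the planner can seek into the index.
leading = plan("SELECT col1, col2 FROM foo WHERE col1 = 'stack'")
# Filter on the trailing column only: no seek is possible, so it scans.
trailing = plan("SELECT col1, col2 FROM foo WHERE col2 = 'stack'")
print(leading)
print(trailing)
```

In SQLite the first plan is reported as a SEARCH constrained on col1, while the second falls back to a SCAN that reads every entry.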
Declaring an Index as unique in SQL Server
Long story short: if your data are intrinsically UNIQUE, you will benefit from creating a UNIQUE index on them.
See the article in my blog for a detailed explanation:
- Making an index UNIQUE
Now, the gory details.
As @Mehrdad said, UNIQUENESS affects the estimated row count in the plan builder.
A UNIQUE index has the maximal possible selectivity, which is why:
SELECT *
FROM table1 t1, table2 t2
WHERE t1.id = :myid
AND t2.unique_indexed_field = t1.value
almost surely will use NESTED LOOPS, while
SELECT *
FROM table1 t1, table2 t2
WHERE t1.id = :myid
AND t2.non_unique_indexed_field = t1.value
may benefit from a HASH JOIN if the optimizer thinks that non_unique_indexed_field is not selective.
If your index is CLUSTERED (i.e. the rows themselves are contained in the index leaves) and non-UNIQUE, then a special hidden column called uniquifier is added to each index key, thus making the key larger and the index slower.
That's why a UNIQUE CLUSTERED index is in fact a little more efficient than a non-UNIQUE CLUSTERED one.
In Oracle, a join on a UNIQUE INDEX is required for so-called key preservation, which ensures that each row from a table will be selected at most once and makes a view updatable.
This query:
UPDATE (
SELECT *
FROM mytable t1, mytable t2
WHERE t2.reference = t1.unique_indexed_field
)
SET value = other_value
will work in Oracle, while this one:
UPDATE (
SELECT *
FROM mytable t1, mytable t2
WHERE t2.reference = t1.non_unique_indexed_field
)
SET value = other_value
will fail.
This is not an issue with SQL Server, though.
One more thing: for a table like this,
CREATE TABLE t_indexer (id INT NOT NULL PRIMARY KEY, uval INT NOT NULL, ival INT NOT NULL)
CREATE UNIQUE INDEX ux_indexer_ux ON t_indexer (uval)
CREATE INDEX ix_indexer_ux ON t_indexer (ival)
, this query:
/* Sorts on the non-unique index first */
SELECT TOP 1 *
FROM t_indexer
ORDER BY
ival, uval
will use a TOP N SORT, while this one:
/* Sorts on the unique index first */
SELECT TOP 1 *
FROM t_indexer
ORDER BY
uval, ival
will use just an index scan.
For the latter query, there is no point in additional sorting on ival, since the uval values are unique anyway, and the optimizer takes this into account.
On sample data of 200,000 rows (id == uval == ival), the former query runs for 15 seconds, while the latter one is instant.
Create unique constraint with null columns
Postgres 15 or newer
Postgres 15 adds the clause NULLS NOT DISTINCT. The release notes:
Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)
Previously NULL values were always indexed as distinct values, but this can now be changed by creating constraints and indexes using UNIQUE NULLS NOT DISTINCT.
With this clause NULL is treated like just another value, and a UNIQUE constraint does not allow more than one row with the same NULL value. The task is simple now:
ALTER TABLE favorites
ADD CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id);
There are examples in the manual chapter "Unique Constraints".
The clause switches behavior for all keys of the same index. You can't treat NULL as equal for one key, but not for another. NULLS DISTINCT remains the default (in line with standard SQL) and does not have to be spelled out.
The same clause works for a UNIQUE index, too:
CREATE UNIQUE INDEX favo_uni_idx
ON favorites (user_id, menu_id, recipe_id) NULLS NOT DISTINCT;
Note the position of the new clause after the key fields.
Postgres 14 or older
Create two partial indexes:
CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;
CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;
This way, there can only be one combination of (user_id, recipe_id) where menu_id IS NULL, effectively implementing the desired constraint.
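The effect of the two partial indexes can be tried out with a small script. The sketch below runs them in SQLite via Python's sqlite3 module, which is an assumption made for the sake of a self-contained example: SQLite also supports partial unique indexes and also treats NULLs as distinct by default. The column types and nullability of favorites are guesses, since the table definition is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table shape: only menu_id is nullable.
conn.execute(
    "CREATE TABLE favorites ("
    " user_id INTEGER NOT NULL,"
    " menu_id INTEGER,"
    " recipe_id INTEGER NOT NULL)"
)
conn.execute(
    "CREATE UNIQUE INDEX favo_3col_uni_idx "
    "ON favorites (user_id, menu_id, recipe_id) WHERE menu_id IS NOT NULL"
)
conn.execute(
    "CREATE UNIQUE INDEX favo_2col_uni_idx "
    "ON favorites (user_id, recipe_id) WHERE menu_id IS NULL"
)


def try_insert(user_id, menu_id, recipe_id):
    """Return True if the row was accepted, False if a unique index rejected it."""
    try:
        conn.execute(
            "INSERT INTO favorites VALUES (?, ?, ?)",
            (user_id, menu_id, recipe_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, None, 5),  # first row with a NULL menu_id
    try_insert(1, None, 5),  # same (user_id, recipe_id) with NULL menu_id again
    try_insert(1, 2, 5),     # same user and recipe, but a concrete menu_id
    try_insert(1, 2, 5),     # exact duplicate of the previous row
    try_insert(1, None, 6),  # NULL menu_id with a different recipe
]
print(results)
```

The second and fourth inserts are the ones expected to be rejected, each by a different partial index.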
Possible drawbacks:
- You cannot have a foreign key referencing (user_id, menu_id, recipe_id). (It seems unlikely you'd want a FK reference three columns wide - use the PK column instead!)
- You cannot base CLUSTER on a partial index.
- Queries without a matching WHERE condition cannot use the partial index.
If you need a complete index, you can alternatively drop the WHERE condition from favo_3col_uni_idx and your requirements are still enforced.
The index, now comprising the whole table, overlaps with the other one and gets bigger. Depending on typical queries and the percentage of NULL values, this may or may not be useful. In extreme situations it may even help to maintain all three indexes (the two partial ones and a total on top).
This is a good solution for a single nullable column, maybe for two. But it gets out of hand quickly for more, as you need a separate partial index for every combination of nullable columns, so the number grows binomially. For multiple nullable columns, see instead:
- Why doesn't my UNIQUE constraint trigger?
Aside: I advise not to use mixed case identifiers in PostgreSQL.
Enforcing mutual uniqueness across multiple columns
You could create an "external" constraint in the form of an indexed view:
CREATE VIEW dbo.OccupiedRooms
WITH SCHEMABINDING
AS
SELECT r.Id
FROM dbo.Occupants AS o
INNER JOIN dbo.Rooms AS r ON r.Id IN (o.LivingRoomId, o.DiningRoomId)
;
GO
CREATE UNIQUE CLUSTERED INDEX UQ_1 ON dbo.OccupiedRooms (Id);
The view is essentially unpivoting the occupied rooms' IDs, putting them all in one column. The unique index on that column makes sure it does not have duplicates.
Here are demonstrations of how this method works:
failed insert;
successful insert.
UPDATE
As hvd has correctly remarked, the above solution does not catch attempts to insert identical LivingRoomId and DiningRoomId when they are put on the same row. This is because the dbo.Rooms table is matched only once in that case and, therefore, the join produces just one row for the pair of references.
One way to fix that is suggested in the same comment: in addition to the indexed view, use a CHECK constraint on the dbo.Occupants table to prohibit rows with identical room IDs. The suggested LivingRoomId <> DiningRoomId condition, however, will not work for cases where both columns are NULL. To account for that case, the condition could be expanded to this one:
LivingRoomId <> DiningRoomId AND (LivingRoomId IS NOT NULL OR DiningRoomId IS NOT NULL)
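To see how the expanded condition treats NULLs, here is a small sketch that attaches it as a CHECK constraint to a simplified Occupants table. It uses SQLite via Python's sqlite3 module as an assumed stand-in for SQL Server; both engines let a row through when the CHECK expression evaluates to UNKNOWN and reject it only when the expression is false. The table is reduced to the two room columns (spelled DiningRoomId) for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table: only the two nullable room references are modelled.
conn.execute(
    "CREATE TABLE Occupants ("
    " LivingRoomId INTEGER,"
    " DiningRoomId INTEGER,"
    " CHECK (LivingRoomId <> DiningRoomId"
    "        AND (LivingRoomId IS NOT NULL OR DiningRoomId IS NOT NULL)))"
)


def try_insert(living, dining):
    """Return True if the row passed the CHECK constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO Occupants VALUES (?, ?)", (living, dining))
        return True
    except sqlite3.IntegrityError:
        return False


two_rooms = try_insert(1, 2)        # different rooms: condition is true
same_room = try_insert(3, 3)        # identical rooms: condition is false
both_null = try_insert(None, None)  # UNKNOWN AND FALSE evaluates to false
one_null = try_insert(4, None)      # UNKNOWN AND TRUE stays UNKNOWN, so it passes
print(two_rooms, same_room, both_null, one_null)
```

With the plain LivingRoomId <> DiningRoomId condition alone, the both-NULL row would evaluate to UNKNOWN and be accepted; the extra IS NOT NULL term is what turns that case into a rejection.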
Alternatively, you could change the view's SELECT statement to catch all situations. If LivingRoomId and DiningRoomId were NOT NULL columns, you could avoid a join to dbo.Rooms and unpivot the columns using a cross-join to a virtual 2-row table:
SELECT Id = CASE x.r WHEN 1 THEN o.LivingRoomId ELSE o.DiningRoomId END
FROM dbo.Occupants AS o
CROSS JOIN (SELECT 1 UNION ALL SELECT 2) AS x (r)
However, as those columns allow NULLs, this method would not allow you to insert more than one single-reference row. To make it work in your case, you would need to filter out NULL entries, but only if they come from rows where the other reference is not NULL. I believe adding the following WHERE clause to the above query would suffice:
WHERE o.LivingRoomId IS NULL AND o.DiningRoomId IS NULL
OR x.r = 1 AND o.LivingRoomId IS NOT NULL
OR x.r = 2 AND o.DiningRoomId IS NOT NULL
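The following sketch shows which Id values the unpivoting query with that filter would emit for a few sample rows, and therefore which rows a unique index over the view would refuse. It runs the query in SQLite through Python's sqlite3 module purely as an assumption for illustration: SQLite has no indexed views, so the uniqueness test is done in Python, and the helper names view_ids and has_duplicates are hypothetical. The duplicate test counts repeated NULLs as duplicates, matching how a SQL Server unique index treats them.

```python
import sqlite3

# The unpivoting SELECT plus the NULL filter, adapted to SQLite syntax.
VIEW_QUERY = """
SELECT CASE x.r WHEN 1 THEN o.LivingRoomId ELSE o.DiningRoomId END AS Id
FROM Occupants AS o
CROSS JOIN (SELECT 1 AS r UNION ALL SELECT 2) AS x
WHERE (o.LivingRoomId IS NULL AND o.DiningRoomId IS NULL)
   OR (x.r = 1 AND o.LivingRoomId IS NOT NULL)
   OR (x.r = 2 AND o.DiningRoomId IS NOT NULL)
"""


def view_ids(occupants):
    """Return the Ids the query emits for the given (living, dining) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Occupants (LivingRoomId INTEGER, DiningRoomId INTEGER)"
    )
    conn.executemany("INSERT INTO Occupants VALUES (?, ?)", occupants)
    ids = [row[0] for row in conn.execute(VIEW_QUERY)]
    conn.close()
    return ids


def has_duplicates(ids):
    """True if a unique index that treats NULLs as equal would reject these Ids."""
    return len(ids) != len(set(ids))


ok_rows = view_ids([(1, None), (None, 2), (3, 4)])  # single and double references
same_row = view_ids([(5, 5)])                        # same room twice on one row
across_rows = view_ids([(1, 2), (2, 3)])             # room 2 used by two rows
both_null = view_ids([(None, None)])                 # no reference at all
print(sorted(ok_rows), same_row, across_rows, both_null)
```

Single-reference rows contribute one Id each, so several of them can coexist, while a row with both references NULL contributes two NULLs and would collide with itself.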
How can I create a unique constraint on my column (SQL Server 2008 R2)?
To create these constraints through the GUI you need the "Indexes and Keys" dialogue, not the check constraints one.
But in your case you just need to run the piece of code you already have. It doesn't need to be entered into the expression dialogue at all.
How do I create a unique constraint that also allows nulls?
SQL Server 2008 +
You can create a unique index that accepts multiple NULLs with a WHERE clause. See the answer below.
Prior to SQL Server 2008
You cannot create a UNIQUE constraint and allow multiple NULLs. You need to set a default value of NEWID().
Update the existing values to NEWID() where NULL before creating the UNIQUE constraint.