How to Create a Postgres Table with Unique Combined Primary Key

How to create a Postgres table with unique combined primary key?

Create a unique index:

CREATE UNIQUE INDEX matches_uni_idx ON matches
(greatest(winner, loser), least(winner, loser));

Can't be a UNIQUE or PRIMARY KEY constraint, since those only work with columns, not expressions.

You might add a serial column to serve as PK, but with just two integer columns, your original two-column PK is very efficient, too. It also makes both columns NOT NULL automatically. (Else, add NOT NULL constraints.)

You also might add a CHECK constraint to rule out players playing against themselves:

CHECK (winner <> loser)

Hint: To search for a pair of IDs (where you don't know who won), build the same expressions into your query, and the index will be used:

SELECT * FROM matches
WHERE greatest(winner, loser) = 3 -- the greater value, obviously
AND least(winner, loser) = 1;

If you deal with unknown parameters and you don't know which is greater ahead of time:

WITH input AS (SELECT $id1 AS _id1, $id2 AS _id2)  -- input once
SELECT * FROM matches, input
WHERE greatest(winner, loser) = greatest(_id1, _id2)
AND least(winner, loser) = least(_id1, _id2);

The CTE wrapper is just a convenience to enter each parameter only once; it is not necessary in all contexts.
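
Putting the pieces together, a minimal sketch of the whole setup (column types are assumed to be integer):

CREATE TABLE matches (
  winner integer
, loser  integer
, PRIMARY KEY (winner, loser)  -- makes both columns NOT NULL automatically
, CHECK (winner <> loser)      -- no player plays against themselves
);

CREATE UNIQUE INDEX matches_uni_idx
ON matches (greatest(winner, loser), least(winner, loser));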

Postgres: How to do Composite keys?

Your compound PRIMARY KEY specification already does what you want. Omit the line that's giving you a syntax error, and omit the redundant CONSTRAINT (already implied), too:

CREATE TABLE tags (
  question_id INTEGER NOT NULL,
  tag_id      SERIAL  NOT NULL,
  tag1        VARCHAR(20),
  tag2        VARCHAR(20),
  tag3        VARCHAR(20),
  PRIMARY KEY (question_id, tag_id)
);

NOTICE: CREATE TABLE will create implicit sequence "tags_tag_id_seq" for serial column "tags.tag_id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "tags_pkey" for table "tags"
CREATE TABLE
pg=> \d tags
Table "public.tags"
Column | Type | Modifiers
-------------+-----------------------+-------------------------------------------------------
question_id | integer | not null
tag_id | integer | not null default nextval('tags_tag_id_seq'::regclass)
tag1 | character varying(20) |
tag2 | character varying(20) |
tag3 | character varying(20) |
Indexes:
"tags_pkey" PRIMARY KEY, btree (question_id, tag_id)

Unique composite index with primary key

Yes, that additional constraint is meaningless: if id is unique by virtue of being the primary key, the combination of id and name is unique as well.

The need for this conceptually unnecessary unique constraint arises because a foreign key has to reference a primary key or unique constraint that covers exactly the referenced columns. Otherwise it would not be clear which of several candidate constraints a given foreign key refers to.
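
A minimal sketch of that situation (table and column names are made up); the extra UNIQUE (id, name) exists only so the composite foreign key has a matching constraint to point at:

CREATE TABLE parent (
  id   integer PRIMARY KEY
, name text NOT NULL
, UNIQUE (id, name)  -- redundant for uniqueness, required as FK target
);

CREATE TABLE child (
  parent_id   integer
, parent_name text
, FOREIGN KEY (parent_id, parent_name) REFERENCES parent (id, name)
);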

Unique partial composite primary key in Postgres

You can use the INCLUDE option to add extra columns to the index that are not part of the unique key itself.

CREATE TABLE foo (
  id    integer NOT NULL,
  yesno boolean NOT NULL,
  extra text
);

CREATE UNIQUE INDEX foo_uk
  ON foo (id, yesno)
  INCLUDE (extra);

You did not indicate which Postgres version you have, so this may not be an option: INCLUDE requires at least version 11.
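
A nice side effect: a query that filters on the key columns can return extra straight from the index via an index-only scan (a minimal sketch; assumes the visibility map is reasonably up to date):

SELECT extra
FROM   foo
WHERE  id = 1 AND yesno = true;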

Why Composite Primary key when I can use Single Primary key with Unique constraints on composite columns?

You do not need a primary key to enforce uniqueness. You can use a unique constraint or index instead.

I am not a fan of composite primary keys. Here are some reasons:

  • All foreign key references have to include all the key columns, in the correct order and with matching types. This makes it slightly more cumbersome to define those tables.
  • Because the composite keys are included in all referencing tables, those tables are often larger, which results in worse performance.
  • If you decide that you want to change the type of one of the component keys -- say the length of a string or an int to a numeric -- you have to modify lots and lots of tables.
  • When joining tables, you have to include all the keys. If you miss one . . . well, the code is syntactically correct but the results are wrong.

There are occasions where composite keys are acceptable, such as tables that have no foreign key references. Even in those cases, I use synthetic keys, but I totally understand the other perspective.
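
A minimal sketch of the synthetic-key approach described above (table and column names are made up); the natural key is still enforced with a UNIQUE constraint, but referencing tables carry only a single integer:

CREATE TABLE orders (
  order_id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY  -- synthetic key
, customer_id  integer NOT NULL
, order_number text NOT NULL
, UNIQUE (customer_id, order_number)  -- natural key still enforced
);

CREATE TABLE shipments (
  shipment_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, order_id    bigint NOT NULL REFERENCES orders  -- single-column FK with matching type
);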

How do I add a Composite primary key with Knex.js?

As per Knex's documentation:

primary — column.primary([constraintName]); table.primary(columns, [constraintName]) When called on a single column it will set that column as the primary key for a table. If you need to create a composite primary key, call it on a table with an array of column names instead. Constraint name defaults to tablename_pkey unless constraintName is specified.

Therefore, in your case you could add:

table.primary(['name_of_column_1', 'name_of_column_2']);
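
For a PostgreSQL table named your_table (a hypothetical name), that call amounts to DDL roughly like the following; the exact statement Knex emits may differ:

ALTER TABLE your_table
  ADD CONSTRAINT your_table_pkey PRIMARY KEY (name_of_column_1, name_of_column_2);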

Primary key for multiple columns in PostgreSQL?

There can only be one primary key per table - as indicated by the word "primary".

You can have additional UNIQUE columns like:

CREATE TABLE test (
  sl_no    int PRIMARY KEY,  -- NOT NULL due to PK
  emp_id   int UNIQUE NOT NULL,
  emp_name text,
  emp_addr text
);

Columns that are (part of) the PRIMARY KEY are marked NOT NULL automatically.

Or use a table constraint instead of a column constraint to create a single multicolumn primary key. This is semantically different from the above: now only the combination of both columns must be unique; each column can hold duplicates on its own.

CREATE TABLE test (
  sl_no    int,  -- NOT NULL due to PK below
  emp_id   int,  -- NOT NULL due to PK below
  emp_name text,
  emp_addr text,
  PRIMARY KEY (sl_no, emp_id)
);

Multicolumn UNIQUE constraints are possible, too.
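
For example, to add one to the second table above (the constraint name is made up):

ALTER TABLE test ADD CONSTRAINT test_emp_name_addr_uni UNIQUE (emp_name, emp_addr);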

Aside: Don't use CaMeL-case identifiers in Postgres. Use legal, lower-case identifiers so you never have to use double-quotes. Makes your life easier. See:

  • Are PostgreSQL column names case-sensitive?

Composite PRIMARY KEY enforces NOT NULL constraints on involved columns

If you need to allow NULL values, use a UNIQUE constraint (or index) instead of a PRIMARY KEY (and add a surrogate PK column - I suggest a serial or IDENTITY column in Postgres 10 or later).

  • Auto increment table column

A UNIQUE constraint allows columns to be NULL:

CREATE TABLE distributor (
  distributor_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, m_id integer
, x_id integer
, UNIQUE (m_id, x_id)  -- !
-- , CONSTRAINT distributor_my_name_uni UNIQUE (m_id, x_id)  -- verbose form
);

The manual:

For the purpose of a unique constraint, null values are not considered equal, unless NULLS NOT DISTINCT is specified.

In your case, you could enter something like (1, NULL) for (m_id, x_id) any number of times without violating the constraint. Postgres never considers two NULL values equal - as per definition in the SQL standard.

If you need to treat NULL values as equal (i.e. "not distinct") to disallow such "duplicates", I see three options (the first requires Postgres 15):

0. NULLS NOT DISTINCT

This option was added with Postgres 15 and lets you treat NULL values as "not distinct", so two of them conflict in a unique constraint or index. This is the most convenient option going forward. The manual:

That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior can be changed by adding the clause NULLS NOT DISTINCT ...
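
Applied to the table above, a sketch for Postgres 15 or later:

CREATE TABLE distributor (
  distributor_id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, m_id integer
, x_id integer
, UNIQUE NULLS NOT DISTINCT (m_id, x_id)  -- two NULL values now conflict
);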

Detailed instructions:

  • Create unique constraint with null columns

1. Two partial indexes

In addition to the UNIQUE constraint above:

CREATE UNIQUE INDEX dist_m_uni_idx ON distributor (m_id) WHERE x_id IS NULL;
CREATE UNIQUE INDEX dist_x_uni_idx ON distributor (x_id) WHERE m_id IS NULL;

But this gets out of hand quickly with more than two columns that can be NULL. See:

  • Create unique constraint with null columns

2. A multi-column UNIQUE index on expressions

This replaces the UNIQUE constraint above. We need a free default value that never appears in the involved columns, like -1. Add CHECK constraints to disallow it:

CREATE TABLE distributor (
  distributor_id serial PRIMARY KEY
, m_id integer
, x_id integer
, CHECK (m_id <> -1)
, CHECK (x_id <> -1)
);

CREATE UNIQUE INDEX distributor_uni_idx
ON distributor (COALESCE(m_id, -1), COALESCE(x_id, -1));
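
As with the expression index in the first answer, queries have to repeat the same expressions to use this index for lookups (a sketch):

SELECT * FROM distributor
WHERE  COALESCE(m_id, -1) = 5
AND    COALESCE(x_id, -1) = -1;  -- i.e. x_id IS NULL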

Do I need a primary key for my table, which has a UNIQUE (composite 4-columns), one of which can be NULL?

Should I use a "serial" primary key just in case I ever need one?

You can easily add a serial column later if you need one:

ALTER TABLE product_pricebands ADD COLUMN id serial;

The column will be filled with unique values automatically. You can even make it the primary key in the same statement (if no primary key is defined yet):

ALTER TABLE product_pricebands ADD COLUMN id serial PRIMARY KEY;

If you reference the table from other tables, I would advise using such a surrogate primary key, because it is rather unwieldy to link by four columns. Joins on four columns are also slower than joins on a single integer column.

Either way, you should define a primary key. The UNIQUE index including a nullable column is not a full replacement. It allows duplicates for combinations including a NULL value, because two NULL values are never considered the same. This can lead to trouble.


As "the colourid field can be NULL", you might want to create two unique indexes. The combination (template_sku, siteid, currencyid, colourid) cannot be a PRIMARY KEY because of the nullable colourid, but you can create a UNIQUE constraint like you already have (which creates an index automatically):

ALTER TABLE product_pricebands ADD CONSTRAINT product_pricebands_uni_idx
  UNIQUE (template_sku, siteid, currencyid, colourid);

This index perfectly covers the queries you mention in 2).

Create a partial unique index in addition if you want to avoid "duplicates" with (colourid IS NULL):

CREATE UNIQUE INDEX product_pricebands_uni_null_idx
ON product_pricebands (template_sku, siteid, currencyid)
WHERE colourid IS NULL;

That covers all bases. I wrote more about this technique in a related answer on dba.SE.


The simple alternative is to make colourid NOT NULL and create a primary key on all four columns instead of product_pricebands_uni_idx.
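
A sketch of that alternative (it assumes colourid no longer contains NULL values and drops the then-redundant UNIQUE constraint):

ALTER TABLE product_pricebands DROP CONSTRAINT product_pricebands_uni_idx;
ALTER TABLE product_pricebands ALTER COLUMN colourid SET NOT NULL;
ALTER TABLE product_pricebands ADD PRIMARY KEY (template_sku, siteid, currencyid, colourid);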


Also, as you "basically DELETE most of the data" for your refill operation, it will be faster to drop the indexes that are not needed during the refill and recreate them afterwards. It is faster by an order of magnitude to build an index from scratch than to add rows incrementally.

How do you know which indexes are used (needed)?

  • Test your queries with EXPLAIN ANALYZE.
  • Or use the built-in statistics. pgAdmin displays statistics in a separate tab for the selected object; the query sketched below reads the same numbers directly.
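
For reference, the underlying statistics view is pg_stat_user_indexes; a quick sketch for this table:

SELECT relname, indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 'product_pricebands';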

It may also be faster to select the few rows with my_custom_field = TRUE into a temporary table, TRUNCATE the base table and re-INSERT the survivors. Depends on whether you have foreign keys defined. Would look like this:

CREATE TEMP TABLE pr_tmp AS
SELECT * FROM product_pricebands WHERE my_custom_field;

TRUNCATE product_pricebands;
INSERT INTO product_pricebands SELECT * FROM pr_tmp;

This avoids a lot of vacuuming.

PostgreSQL composite primary key

If you create a composite primary key, on (x, y, z), PostgreSQL implements this with the help of one UNIQUE multi-column btree index on (x, y, z). In addition, all three columns are NOT NULL (implicitly), which is the main difference between a PRIMARY KEY and a UNIQUE INDEX.

Besides obvious restrictions on your data, the multi-column index also has a somewhat different effect on the performance of queries than three individual indexes on x, y and z.

Related discussion on dba.SE:

  • Working of indexes in PostgreSQL

With examples, benchmarks, discussion, and an outlook on index-only scans, which were introduced in Postgres 9.2.

In particular, a primary key on (x, y, z) will speed up queries with conditions on x, (x,y) or (x,y,z) optimally. It will also help with queries on y, z, (y,z) or (x,z) but to a far lesser extent.
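
For illustration, with a primary key on (x, y, z) in a hypothetical table tbl:

-- these conditions can use the PK index efficiently:
SELECT * FROM tbl WHERE x = 1;
SELECT * FROM tbl WHERE x = 1 AND y = 2;
SELECT * FROM tbl WHERE x = 1 AND y = 2 AND z = 3;

-- helped only to a far lesser extent (leading column x is not constrained):
SELECT * FROM tbl WHERE y = 2 AND z = 3;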

If you need to speed up queries on the latter combinations, you may want to change the order of columns in your PK constraint and/or create one or more additional indexes. See:

  • Is a composite index also good for queries on the first field?

