How to correctly do upsert in postgres 9.5
The ON CONFLICT ... DO UPDATE construct needs a UNIQUE constraint (or a unique index) to act as the arbiter. From the documentation on the INSERT .. ON CONFLICT clause:
The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error. For each individual row proposed for insertion, either the insertion proceeds, or, if an arbiter constraint or index specified by conflict_target is violated, the alternative conflict_action is taken.
ON CONFLICT DO NOTHING simply avoids inserting a row as its alternative action.
ON CONFLICT DO UPDATE updates the existing row that conflicts with the row proposed for insertion as its alternative action.
Now, the question is not very clear, but you probably need a UNIQUE constraint on the 2 columns combined: (category_id, gallery_id).
ALTER TABLE category_gallery
ADD CONSTRAINT category_gallery_uq
UNIQUE (category_id, gallery_id) ;
If the row to be inserted matches both values of a row already in the table, then instead of an INSERT, do an UPDATE:
INSERT INTO category_gallery (
category_id, gallery_id, create_date, create_by_user_id
) VALUES ($1, $2, $3, $4)
ON CONFLICT (category_id, gallery_id)
DO UPDATE SET
last_modified_date = EXCLUDED.create_date,
last_modified_by_user_id = EXCLUDED.create_by_user_id ;
You can use either the columns of the UNIQUE constraint:
ON CONFLICT (category_id, gallery_id)
or the constraint name:
ON CONFLICT ON CONSTRAINT category_gallery_uq
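For completeness, the DO NOTHING variant mentioned in the documentation excerpt could look roughly like this against the same table. This is only a sketch: it assumes the category_gallery_uq constraint above exists, and the literal values (and the column types they imply) are placeholders.

```sql
-- Sketch: silently skip the insert when (category_id, gallery_id) already exists.
-- Assumes the category_gallery_uq constraint defined above; values are placeholders.
INSERT INTO category_gallery (
    category_id, gallery_id, create_date, create_by_user_id
) VALUES (1, 2, now(), 42)
ON CONFLICT ON CONSTRAINT category_gallery_uq
DO NOTHING;
```

With DO NOTHING the existing row is left untouched, so last_modified_date and last_modified_by_user_id are not changed.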
PostgreSQL 9.5 UPSERT in rule
How do you define uniqueness? If it is the combination of name + code + district, then just add a constraint UNIQUE(name, code, district) on the table geo_pays_gex.voie. The 3, together, must be unique, but you can still have the same name, or code, or district several times.
See it at http://rextester.com/EWR73154
EDIT:
Since you can have NULLs and want to treat them as a single unique value, you can replace the constraint with a unique index that substitutes a placeholder for the NULLs:
CREATE UNIQUE INDEX
voie_uniq ON voie
(COALESCE(name,''), code, COALESCE(district,''));
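An upsert against such an expression index could be sketched as follows. The table definition here is a hypothetical minimal stand-in (the real geo_pays_gex.voie presumably has other columns and types), and the inserted values are made up.

```sql
-- Hypothetical minimal table; the real geo_pays_gex.voie will differ.
CREATE TABLE voie (name text, code text, district text);

CREATE UNIQUE INDEX voie_uniq ON voie
    (COALESCE(name, ''), code, COALESCE(district, ''));

-- The conflict target repeats the index expressions so that
-- PostgreSQL can infer voie_uniq as the arbiter index.
INSERT INTO voie (name, code, district)
VALUES ('Rue A', '001', NULL)
ON CONFLICT (COALESCE(name, ''), code, COALESCE(district, ''))
DO NOTHING;
```

Note that with this index a NULL and an empty string count as the same value, so a row with district '' conflicts with a row whose district is NULL.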
How to find out if an upsert was an update with PostgreSQL 9.5+ UPSERT?
I believe xmax::text::int > 0 would be the easiest trick:
so=# DROP TABLE IF EXISTS tab;
NOTICE: table "tab" does not exist, skipping
DROP TABLE
so=# CREATE TABLE tab(id INT PRIMARY KEY, col text);
CREATE TABLE
so=# INSERT INTO tab(id, col) VALUES (1,'a'), (2, 'b');
INSERT 0 2
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
id | col | case | ctid
----+------+----------+-------
3 | c | inserted | (0,3)
4 | d | inserted | (0,4)
1 | aaaa | updated | (0,5)
(3 rows)
INSERT 0 3
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
id | col | case | ctid
----+------+---------+-------
3 | c | updated | (0,6)
4 | d | updated | (0,7)
1 | aaaa | updated | (0,8)
(3 rows)
INSERT 0 3
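A variant that avoids the cast is to compare xmax with zero directly. The sketch below reuses the tab table from the session above and assumes row 1 already exists. Like the original trick, it relies on how xmax happens to be set by the current implementation, which is not documented for this purpose, so treat it as a convenience rather than a guarantee.

```sql
-- Sketch: reuses the tab table from the session above.
-- xmax is 0 for a freshly inserted row and non-zero for a row that the
-- ON CONFLICT branch updated (implementation behaviour, not a documented contract).
INSERT INTO tab (id, col)
VALUES (5, 'e'), (1, 'z')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
RETURNING id, col, (xmax = 0) AS inserted;
```

Comparing directly also sidesteps the integer cast, which can fail once transaction IDs exceed the range of a signed 32-bit int.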
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
9.5 and newer:
PostgreSQL 9.5 and newer support INSERT ... ON CONFLICT (key) DO UPDATE (and ON CONFLICT (key) DO NOTHING), i.e. upsert.
Comparison with ON DUPLICATE KEY UPDATE.
Quick explanation.
For usage see the manual - specifically the conflict_action clause in the syntax diagram, and the explanatory text.
Unlike the solutions for 9.4 and older that are given below, this feature works with multiple conflicting rows and it doesn't require exclusive locking or a retry loop.
The commit adding the feature is here and the discussion around its development is here.
If you're on 9.5 and don't need to be backward-compatible you can stop reading now.
9.4 and older:
PostgreSQL doesn't have any built-in UPSERT (or MERGE) facility, and doing it efficiently in the face of concurrent use is very difficult.
This article discusses the problem in useful detail.
In general you must choose between two options:
- Individual insert/update operations in a retry loop; or
- Locking the table and doing batch merge
Individual row retry loop
Using individual row upserts in a retry loop is the reasonable option if you want many connections concurrently trying to perform inserts.
The PostgreSQL documentation contains a useful procedure that'll let you do this in a loop inside the database. It guards against lost updates and insert races, unlike most naive solutions. It will only work in READ COMMITTED mode and is only safe if it's the only thing you do in the transaction, though. The function won't work correctly if triggers or secondary unique keys cause unique violations.
This strategy is very inefficient. Whenever practical you should queue up work and do a bulk upsert as described below instead.
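The documented procedure has roughly the following shape. This is a sketch adapted from the PL/pgSQL example in the PostgreSQL documentation; the function name upsert_testtable and its parameter names are hypothetical, and testtable(id, somedata) is assumed to have a primary key or unique constraint on id.

```sql
-- Sketch adapted from the PostgreSQL documentation's retry-loop example.
-- upsert_testtable, p_id and p_data are hypothetical names;
-- testtable.id is assumed to be a primary key or unique.
CREATE FUNCTION upsert_testtable(p_id integer, p_data text) RETURNS void AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE testtable SET somedata = p_data WHERE id = p_id;
        IF found THEN
            RETURN;
        END IF;
        -- No row yet, so try to insert. If another session inserts the
        -- same key concurrently, this raises unique_violation.
        BEGIN
            INSERT INTO testtable (id, somedata) VALUES (p_id, p_data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing; loop back and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT upsert_testtable(2, 'blah');
```

The same caveats apply as for the documented version: READ COMMITTED only, and unique violations raised by triggers or other unique keys would make it loop incorrectly.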
Many attempted solutions to this problem fail to consider rollbacks, so they result in incomplete updates. Two transactions race with each other; one of them successfully INSERTs; the other gets a duplicate key error and does an UPDATE instead. The UPDATE blocks waiting for the INSERT to roll back or commit. When it rolls back, the UPDATE condition re-check matches zero rows, so even though the UPDATE commits it hasn't actually done the upsert you expected. You have to check the result row counts and re-try where necessary.
Some attempted solutions also fail to consider SELECT races. If you try the obvious and simple:
-- THIS IS WRONG. DO NOT COPY IT. It's an EXAMPLE.
BEGIN;
UPDATE testtable
SET somedata = 'blah'
WHERE id = 2;
-- Remember, this is WRONG. Do NOT COPY IT.
INSERT INTO testtable (id, somedata)
SELECT 2, 'blah'
WHERE NOT EXISTS (SELECT 1 FROM testtable WHERE testtable.id = 2);
COMMIT;
then when two run at once there are several failure modes. One is the already discussed issue with an update re-check. Another is where both run the UPDATE at the same time, matching zero rows and continuing. Then they both do the EXISTS test, which happens before the INSERT. Both get zero rows, so both do the INSERT. One fails with a duplicate key error.
This is why you need a re-try loop. You might think that you can prevent duplicate key errors or lost updates with clever SQL, but you can't. You need to check row counts or handle duplicate key errors (depending on the chosen approach) and re-try.
Please don't roll your own solution for this. Like with message queuing, it's probably wrong.
Bulk upsert with lock
Sometimes you want to do a bulk upsert, where you have a new data set that you want to merge into an older existing data set. This is vastly more efficient than individual row upserts and should be preferred whenever practical.
In this case, you typically use the following process:
- CREATE a TEMPORARY table
- COPY or bulk-insert the new data into the temp table
- LOCK the target table IN EXCLUSIVE MODE. This permits other transactions to SELECT, but not make any changes to the table.
- Do an UPDATE ... FROM of existing records using the values in the temp table;
- Do an INSERT of rows that don't already exist in the target table;
- COMMIT, releasing the lock.
For example, for the example given in the question, using multi-valued INSERT to populate the temp table:
BEGIN;
CREATE TEMPORARY TABLE newvals(id integer, somedata text);
INSERT INTO newvals(id, somedata) VALUES (2, 'Joe'), (3, 'Alan');
LOCK TABLE testtable IN EXCLUSIVE MODE;
UPDATE testtable
SET somedata = newvals.somedata
FROM newvals
WHERE newvals.id = testtable.id;
INSERT INTO testtable
SELECT newvals.id, newvals.somedata
FROM newvals
LEFT OUTER JOIN testtable ON (testtable.id = newvals.id)
WHERE testtable.id IS NULL;
COMMIT;
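On 9.5 and newer, the same bulk merge can be sketched with ON CONFLICT instead of an explicit table lock. This assumes testtable.id is a primary key or carries a unique constraint, which the 9.4-style version above does not strictly require.

```sql
-- Sketch for 9.5+: same newvals temp table, no explicit LOCK TABLE.
-- Assumes testtable.id is a primary key or has a unique constraint.
BEGIN;
CREATE TEMPORARY TABLE newvals(id integer, somedata text);
INSERT INTO newvals(id, somedata) VALUES (2, 'Joe'), (3, 'Alan');

INSERT INTO testtable (id, somedata)
SELECT id, somedata FROM newvals
ON CONFLICT (id) DO UPDATE SET somedata = EXCLUDED.somedata;
COMMIT;
```

One caveat: the staged data must not contain the same id twice, because a single INSERT ... ON CONFLICT DO UPDATE statement refuses to update the same row a second time.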
Related reading
- UPSERT wiki page
- UPSERTisms in Postgres
- Insert, on duplicate update in PostgreSQL?
- http://petereisentraut.blogspot.com/2010/05/merge-syntax.html
- Upsert with a transaction
- Is SELECT or INSERT in a function prone to race conditions?
- SQL MERGE on the PostgreSQL wiki
- Most idiomatic way to implement UPSERT in Postgresql nowadays
What about MERGE?
SQL-standard MERGE actually has poorly defined concurrency semantics and is not suitable for upserting without locking a table first.
It's a really useful OLAP statement for data merging, but it's not actually a useful solution for concurrency-safe upsert. There's lots of advice to people using other DBMSes to use MERGE for upserts, but it's actually wrong.
Other DBs:
- INSERT ... ON DUPLICATE KEY UPDATE in MySQL
- MERGE from MS SQL Server (but see above about MERGE problems)
- MERGE from Oracle (but see above about MERGE problems)
Postgres - upsert on passed parameter
BEGIN;
CREATE TABLE users (
user_id bigint PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
name text,
url text
);
INSERT INTO users (name, url)
VALUES ('Hello', 'world');
COMMIT;
Using psql: https://www.postgresql.org/docs/current/app-psql.html
Setting a variable in psql: How do you use script variables in psql?
You can also set a variable inside a transaction.
BEGIN;
\set name 'hi'
\set url 'yech'
INSERT INTO users (user_id, name, url)
VALUES (1, :'name', :'url')
ON CONFLICT (user_id)
DO UPDATE SET
name = EXCLUDED.name, url = EXCLUDED.url
RETURNING
*;
TABLE users;
COMMIT;
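If the parameters should be passed on the server side rather than through psql variables, a prepared statement is one option. The sketch below assumes the users table created above; upsert_user is a hypothetical statement name.

```sql
-- Sketch: server-side parameters via PREPARE / EXECUTE.
-- upsert_user is a hypothetical name; the users table is the one created above.
PREPARE upsert_user (bigint, text, text) AS
INSERT INTO users (user_id, name, url)
VALUES ($1, $2, $3)
ON CONFLICT (user_id)
DO UPDATE SET name = EXCLUDED.name, url = EXCLUDED.url
RETURNING *;

EXECUTE upsert_user(1, 'hi', 'yech');

-- Prepared statements last for the session; drop it when no longer needed.
DEALLOCATE upsert_user;
```

Most client drivers expose the same $1, $2, ... placeholders directly, so an explicit PREPARE is mainly useful in plain SQL scripts.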
Upsert query is not updating the records in postgres
Your syntax looks correct, but I don't think you want the where clause. Instead:
Insert into store ( . . . )
select . . .
from store_temp t
on conflict (id) do update
set source = EXCLUDED.Source;
The . . . are for the column list. I recommend being explicit in inserts.
Then you need to be sure that id is declared as the primary key or at least has a unique constraint or index.
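To illustrate why a where clause can make an upsert appear to skip records: when DO UPDATE carries a WHERE condition, conflicting rows that fail the condition are simply left as they are, with no error. The sketch below uses hypothetical minimal tables standing in for the ones in the question, and the condition shown is only an example, not the one from the question.

```sql
-- Hypothetical minimal tables standing in for the ones in the question.
CREATE TABLE store (id integer PRIMARY KEY, source text);
CREATE TABLE store_temp (id integer, source text);

INSERT INTO store (id, source)
SELECT t.id, t.source
FROM store_temp t
ON CONFLICT (id) DO UPDATE
SET source = EXCLUDED.source
-- Conflicting rows that fail this condition are neither updated nor inserted.
WHERE store.source IS DISTINCT FROM EXCLUDED.source;
```

Here the condition only skips no-op updates, which is usually harmless; a condition that is false or NULL for the rows you care about would leave them unchanged.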