To ignore duplicate keys during 'COPY FROM' in PostgreSQL
Use the same approach as you described, but DELETE (or group, or modify ...) the duplicate primary keys in the temp table before loading into the main table.
Something like:
CREATE TEMP TABLE tmp_table
ON COMMIT DROP
AS
SELECT *
FROM main_table
WITH NO DATA;
COPY tmp_table FROM 'full/file/name/here';
INSERT INTO main_table
SELECT DISTINCT ON (PK_field) *
FROM tmp_table
ORDER BY PK_field, some_fields;  -- the DISTINCT ON expression must lead the ORDER BY list
Details: CREATE TABLE AS, COPY, DISTINCT ON
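The DISTINCT ON step only removes duplicates within the loaded file. If main_table may already contain some of the keys, the staging approach can be combined with ON CONFLICT DO NOTHING on Postgres 9.5 or newer. The following is a rough sketch, not a definitive recipe: the columns id and payload, the CSV format, and the file path are assumptions made for illustration. The explicit transaction matters because ON COMMIT DROP removes the temp table at the end of the transaction that created it.

```sql
-- Assumed schema for illustration: main_table(id integer PRIMARY KEY, payload text).
BEGIN;

CREATE TEMP TABLE tmp_table
ON COMMIT DROP
AS
SELECT *
FROM main_table
WITH NO DATA;

-- Server-side COPY reads a file on the database server; the path is a placeholder.
COPY tmp_table FROM '/path/to/file.csv' WITH (FORMAT csv);

INSERT INTO main_table
SELECT DISTINCT ON (id) *
FROM tmp_table
ORDER BY id, payload             -- decides which duplicate within the file is kept
ON CONFLICT (id) DO NOTHING;     -- skips keys that already exist in main_table

COMMIT;
```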
How to have Postgres ignore inserts with a duplicate key but keep going
If you're using Postgres 9.5 or newer (which I assume you are, since it was released back in January 2016), there's a very useful ON CONFLICT clause you can use:
INSERT INTO mytable (id, col1, col2)
VALUES (123, 'some_value', 'some_other_value')
ON CONFLICT (id) DO NOTHING;
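To illustrate the "keep going" part, here is a small sketch with a multi-row insert. The table definition is an assumption for illustration; only the column names come from the statement above.

```sql
-- Hypothetical table matching the column names used above.
CREATE TABLE mytable (
    id   integer PRIMARY KEY,
    col1 text,
    col2 text
);

INSERT INTO mytable (id, col1, col2)
VALUES (123, 'first', 'row');

-- id 123 already exists, so that row is skipped without raising an error;
-- ids 124 and 125 are still inserted by the same statement.
INSERT INTO mytable (id, col1, col2)
VALUES (123, 'some_value', 'some_other_value'),
       (124, 'another_value', 'x'),
       (125, 'third_value', 'y')
ON CONFLICT (id) DO NOTHING;
```

The existing row for id 123 keeps its original values, since DO NOTHING never modifies a conflicting row.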
How to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE (SQL MERGE) with PostgreSQL?
Try to do an UPDATE. If it doesn't modify any row, the row didn't exist, so do an INSERT. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop to handle the very rare race condition in which another session inserts the same key between your UPDATE and your INSERT.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for a single row or a few rows. If you're dealing with a large number of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect, of course; there's no need to write your main filter twice).
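A minimal sketch of the two-query split might look like the following. The table names target and staging and their columns are hypothetical, staging.id is assumed to be unique, and the pair of statements is not safe against other sessions inserting the same keys at the same time unless you add locking.

```sql
-- Hypothetical tables: target(id integer PRIMARY KEY, val text) and
-- staging(id integer, val text), where staging holds the incoming rows.
BEGIN;

-- 1. Update the rows that already exist in the target.
UPDATE target AS t
SET val = s.val
FROM staging AS s
WHERE t.id = s.id;

-- 2. Insert the rows that are not in the target yet.
INSERT INTO target (id, val)
SELECT s.id, s.val
FROM staging AS s
WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id);

COMMIT;
```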
Ignore duplicate rows in PostgreSQL
The error is clear: you need a unique constraint on your table covering the columns that must be unique:
alter table movie_data
add constraint constraintname unique (title, description);
Now you can use this constraint:
INSERT INTO movie_data(title, description, rating, published, cast_and_crew, age_group, country)
VALUES ('movie name', 'something', 8, 2020, 'an actor', 'pg-13', 'GB')
on conflict on constraint constraintname do nothing;
As Adrian pointed out, if your primary key (or any unique constraint) is on the columns that need to be unique, you can simply list those columns instead:
INSERT INTO movie_data(title, description, rating, published, cast_and_crew, age_group, country)
VALUES ('movie name', 'something', 8, 2020, 'an actor', 'pg-13', 'GB')
on conflict (title, description) do nothing;
Copy data from one table to another - Ignore duplicates Postgresql
Assuming id is your primary key and the table structures are identical (both tables have the same number of columns with matching data types), use NOT EXISTS:
insert into TableB
select *
from TableA a
where not exists ( select 0 from TableB b where b.id = a.id )
Insert, on duplicate update in PostgreSQL?
Since version 9.5, PostgreSQL has UPSERT support through the ON CONFLICT clause, with the following syntax (similar to MySQL):
INSERT INTO the_table (id, column_1, column_2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (3, 'C', 'Z')
ON CONFLICT (id) DO UPDATE
SET column_1 = excluded.column_1,
column_2 = excluded.column_2;
Searching PostgreSQL's mailing list archives for "upsert" leads to an example, in the manual, of what you possibly want to do:
Example 38-2. Exceptions with UPDATE/INSERT
This example uses exception handling to perform either UPDATE or INSERT, as appropriate:
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        -- note that "a" must be unique
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');
There's possibly an example of how to do this in bulk, using CTEs in 9.1 and above, on the hackers mailing list:
WITH foos AS (SELECT (UNNEST(%foo[])).*),
updated AS (UPDATE foo SET a = foos.a ... RETURNING foo.id)
INSERT INTO foo SELECT foos.* FROM foos LEFT JOIN updated USING(id)
WHERE updated.id IS NULL;
See a_horse_with_no_name's answer for a clearer example.
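For readers who want a complete statement to experiment with, here is one way the writable-CTE idea could be spelled out. It is a sketch under assumptions, not the mailing-list original: the tables foo and new_foo and their columns are hypothetical, the ids in new_foo are assumed to be unique, and the statement is not protected against concurrent inserts of the same key.

```sql
-- Hypothetical tables: foo(id integer PRIMARY KEY, a text) and
-- new_foo(id integer, a text), where new_foo holds the incoming rows.
WITH updated AS (
    -- Update the rows that already exist and remember their ids.
    UPDATE foo
    SET a = n.a
    FROM new_foo AS n
    WHERE foo.id = n.id
    RETURNING foo.id
)
-- Insert only the incoming rows that the UPDATE did not touch.
INSERT INTO foo (id, a)
SELECT n.id, n.a
FROM new_foo AS n
LEFT JOIN updated USING (id)
WHERE updated.id IS NULL;
```

On 9.5 or newer, INSERT ... ON CONFLICT DO UPDATE covers the same case in a simpler and concurrency-safe way.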