To Ignore Duplicate Keys During 'Copy From' in Postgresql

To ignore duplicate keys during 'copy from' in postgresql

Use the same approach as you described, but DELETE (or group, or modify ...) duplicate PK in the temp table before loading to the main table.

Something like:

CREATE TEMP TABLE tmp_table 
ON COMMIT DROP
AS
SELECT * 
FROM main_table
WITH NO DATA;

COPY tmp_table FROM 'full/file/name/here';

INSERT INTO main_table
SELECT DISTINCT ON (PK_field) *
FROM tmp_table
ORDER BY (some_fields)

Details: CREATE TABLE AS, COPY, DISTINCT ON

how to have postgres ignore inserts with a duplicate key but keep going

If you're using Postgres 9.5 or newer (which I assume you are, since it was released back in January 2016), there's a very useful ON CONFLICT cluase you can use:

INSERT INTO mytable (id, col1, col2)
VALUES (123, 'some_value', 'some_other_value')
ON CONFLICT (id) DO NOTHING

how to emulate insert ignore and on duplicate key update (sql merge) with postgresql?

Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.

You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop for the very rare race condition in that thinking.

There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.

That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.

This works for single row, or few row, values. If you're dealing with large amounts of rows for example from a subquery, you're best of splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course - no need to write your main filter twice)

Ignore duplicate rows in PostgreSQL

error is clear , you need to have a unique constraint on your table for the columns that needs to be unique

alter table movie_data
add constraint constraintname unique (title, description);

now you can use this constraint :

 INSERT INTO movie_data(title, description, rating, published, cast_and_crew, age_group, country)
 VALUES ('movie name', 'something', 8, 2020, 'an actor', 'pg-13', 'GB')
 on conflict on constraint constraintname do nothing;

as Adrian pointed out , instead of that , if your Primary key is on those column that needs to be unique , you simply could :

 INSERT INTO movie_data(title, description, rating, published, cast_and_crew, age_group, country)
 VALUES ('movie name', 'something', 8, 2020, 'an actor', 'pg-13', 'GB')
 on conflict (title, description) do nothing;

Copy data from one table to another - Ignore duplicates Postgresql

Assuming id is your primary key, and table structures are identical(both table has common columns as number of columns and data type respectively), use not exists :

insert into TableB
select * 
  from TableA a 
 where not exists ( select 0 from TableB b where b.id = a.id )

Insert, on duplicate update in PostgreSQL?

PostgreSQL since version 9.5 has UPSERT syntax, with ON CONFLICT clause. with the following syntax (similar to MySQL)

INSERT INTO the_table (id, column_1, column_2) 
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (3, 'C', 'Z')
ON CONFLICT (id) DO UPDATE 
  SET column_1 = excluded.column_1, 
      column_2 = excluded.column_2;

Searching postgresql's email group archives for "upsert" leads to finding an example of doing what you possibly want to do, in the manual:

Example 38-2. Exceptions with UPDATE/INSERT

This example uses exception handling to perform either UPDATE or INSERT, as appropriate:

CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        -- note that "a" must be unique
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');

There's possibly an example of how to do this in bulk, using CTEs in 9.1 and above, in the hackers mailing list:

WITH foos AS (SELECT (UNNEST(%foo[])).*)
updated as (UPDATE foo SET foo.a = foos.a ... RETURNING foo.id)
INSERT INTO foo SELECT foos.* FROM foos LEFT JOIN updated USING(id)
WHERE updated.id IS NULL;

See a_horse_with_no_name's answer for a clearer example.

To Ignore Duplicate Keys During 'Copy From' in Postgresql