What Happens with Duplicates When Inserting Multiple Rows

What happens with duplicates when inserting multiple rows?

The INSERT will just insert all rows and nothing special will happen, unless you have some kind of constraint disallowing duplicate / overlapping values (PRIMARY KEY, UNIQUE, CHECK or EXCLUDE constraint) - which you did not mention in your question. But that's what you are probably worried about.

Assuming a UNIQUE or PK constraint on (col1,col2), you are dealing with a textbook UPSERT situation. Many related questions and answers to find here.

Generally, if any constraint is violated, an exception is raised which (unless trapped in subtransaction like it's possible in a procedural server-side language like plpgsql) will roll back not only the statement, but the whole transaction.

Without concurrent writes

I.e.: No other transactions will try to write to the same table at the same time.

  • Exclude rows that are already in the table with WHERE NOT EXISTS ... or any other applicable technique:

  • Select rows which are not present in other table

  • And don't forget to remove duplicates within the inserted set as well, which would not be excluded by the semi-anti-join WHERE NOT EXISTS ...

One technique to deal with both at once would be EXCEPT:

INSERT INTO tbl (col1, col2)
VALUES
(text 'v1', text 'v2') -- explicit type cast may be needed in 1st row
, ('v3', 'v4')
, ('v3', 'v4') -- beware of dupes in source
EXCEPT SELECT col1, col2 FROM tbl;

EXCEPT without the key word ALL folds duplicate rows in the source. If you know there are no dupes, or you don't want to fold duplicates silently, use EXCEPT ALL (or one of the other techniques). See:

  • Using EXCEPT clause in PostgreSQL

Generally, if the target table is big, WHERE NOT EXISTS in combination with DISTINCT on the source will probably be faster:

INSERT INTO tbl (col1, col2)
SELECT *
FROM (
SELECT DISTINCT *
FROM (
VALUES
(text 'v1', text'v2')
, ('v3', 'v4')
, ('v3', 'v4') -- dupes in source
) t(c1, c2)
) t
WHERE NOT EXISTS (
SELECT FROM tbl
WHERE col1 = t.c1 AND col2 = t.c2
);

If there can be many dupes, it pays to fold them in the source first. Else use one subquery less.

Related:

  • Select rows which are not present in other table

With concurrent writes

Use the Postgres UPSERT implementation INSERT ... ON CONFLICT ... in Postgres 9.5 or later:

INSERT INTO tbl (col1,col2)
SELECT DISTINCT * -- still can't insert the same row more than once
FROM (
VALUES
(text 'v1', text 'v2')
, ('v3','v4')
, ('v3','v4') -- you still need to fold dupes in source!
) t(c1, c2)
ON CONFLICT DO NOTHING; -- ignores rows with *any* conflict!

Further reading:

  • How to use RETURNING with ON CONFLICT in PostgreSQL?
  • How do I insert a row which contains a foreign key?

Documentation:

  • The manual
  • The commit page
  • The Postgres Wiki page

Craig's reference answer for UPSERT problems:

  • How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?

Mysql insert multiple rows, if duplicate value column then update

You must create unique index which will detect the duplication:

CREATE UNIQUE INDEX idx ON test (customer_id, textkey);

Now you can use INSERT .. ODKU:

INSERT INTO test (customer_id, textkey, value) VALUES
(1, 'text1', 'valueX'),
(1, 'text3', 'valueY'),
(1, 'text5', 'value5'),
(1, 'text6', 'value6')
ON DUPLICATE KEY UPDATE
value = VALUES(value);

https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=6dcc351706f574f4c9f13c1fc95b9225

MySQL ON DUPLICATE KEY UPDATE for multiple rows insert in single query

Beginning with MySQL 8.0.19 you can use an alias for that row (see reference).

INSERT INTO beautiful (name, age)
VALUES
('Helen', 24),
('Katrina', 21),
('Samia', 22),
('Hui Ling', 25),
('Yumie', 29)
AS new
ON DUPLICATE KEY UPDATE
age = new.age
...

For earlier versions use the keyword VALUES (see reference, deprecated with MySQL 8.0.20).

INSERT INTO beautiful (name, age)
VALUES
('Helen', 24),
('Katrina', 21),
('Samia', 22),
('Hui Ling', 25),
('Yumie', 29)
ON DUPLICATE KEY UPDATE
age = VALUES(age),
...

How to INSERT multiple rows when some might be DUPLICATES of an already-existing row?

In my opinion, simply deleting the existing choices and then inserting the new ones is the best way to go. It may not be the most efficient overall, but it is simple to code and thus has a much better chance of being correct.

Otherwise it is necessary to find the intersection of the new choices and old choices. Then either delete the obsolete ones or change them to the new choices (and then insert/delete depending on if the new set of choices is bigger or smaller than the original set). The added risk of the extra complexity does not seem worth it.

Edit As @Andrew points out in the comments, deleting the originals en masse may not be a good plan if these records happened to be "parent" records in a referential integrity definition. My thinking was that this seemed like an unlikely situation based on the OP's description. But it is definitely worth consideration.

MySql Insert multiple and Ignore duplicate records

To start with, you want to create a unique constraint on categories/product tuples to avoid duplicates:

alter table _categories_products  
add constraint _categories_products _bk
unique (product_id, category_id);

From that point on, an attempt to insert duplicates in the table would, by default, raise an error. You can trap and manage that error with MySQL on duplicate key syntax:

insert into _categories_products (product_id, category_id) 
values (1, 14), (1, 8), (1, 1), (1, 23)
on duplicate key update product_id = values(product_id)

In the case of duplicate, the above query performs a dumb update on product_id, which actually turns the insert to a no-op.

Unique constraint on multiple rows..how just not insert duplicate without stopping all

Why not write your insert as select statement in such a way as to filter out duplicates using the where condition.

There's an example of this approach (though obviously not the exact same problem), but see this: Avoid duplicates in INSERT INTO SELECT query in SQL Server

In your case it would involve doing a join between AnnotationCommitReport and the insert as select, to avoid the duplicates.

MySQL INSERT multiple rows ON DUPLICATE KEY UPDATE ... WHERE EXISTS(subquery)

Personally, I would maintain access rights in the application layer and only execute the INSERT query if the user had those access rights. If you must do it solely in MySQL your last query is pretty much your only alternative, although it can be simplified (and made more efficient) by using a CROSS JOIN; this way you don't need the dummy column as all rows of the UNION table will be automatically joined to the single row from the access check (assuming it returns data). Note in my query I've added the id key column for demo purposes:

INSERT INTO mytable (id, col1, col2, col3)
SELECT t1.*
FROM ((
SELECT 2 id, 1 col1, 2 col2, 3 col3
UNION SELECT 1, 4, 5, 6
UNION SELECT 3, 7, 8, 9
) t1
CROSS JOIN (SELECT 1 FROM accesstable WHERE user_id=114 LIMIT 1) t2
)
ON DUPLICATE KEY UPDATE
col1=VALUES(col1), col2=VALUES(col2), col3=VALUES(col3)

Demo on dbfiddle

Insert multiple values and ignore duplicates on not key field

Use LEFT JOIN to determine if record is already in table

SqlFiddleDemo

CREATE TABLE persons(ID INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
content VARCHAR(100));

INSERT INTO persons (name, content)
VALUES ('john', 'some content'), ('bob', 'another content');

INSERT INTO persons (name,content)
SELECT t.name, t.content
FROM
(
SELECT "Foo" AS name,"meh" AS content
UNION ALL
SELECT "Moo","beh"
UNION ALL
SELECT "bob","huh"
UNION ALL
SELECT "dude","haha"
) AS t
LEFT JOIN persons p
ON p.name = t.name
WHERE p.id IS NULL;

SELECT *
FROM persons;

Same using NOT EXISTS

INSERT INTO persons (name,content)
SELECT t.name, t.content
FROM
(
SELECT "Foo" AS name,"meh" AS content
UNION ALL
SELECT "Moo","beh"
UNION ALL
SELECT "bob","huh"
UNION ALL
SELECT "dude","haha"
) AS t
WHERE NOT EXISTS
(SELECT 1 FROM persons p WHERE p.name = t.name)

EDIT:

Add UNIQUE KEY and use INSERT IGNORE INTO

SqlFiddleDemo

CREATE TABLE persons(ID INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) UNIQUE KEY,
content VARCHAR(100));

INSERT INTO persons (name, content)
VALUES ('john', 'some content'), ('bob', 'another content');

INSERT IGNORE INTO persons (name,content)
VALUES
("Foo","meh"),
("Moo","beh"),
("bob","huh"),
("dude","haha");

SELECT *
FROM persons;

MySQL - prevent duplicates for multiple insert statement without using a unique index

You can use insert . . . select:

INSERT INTO `table1` (`a`, `b`, `c`)
select t.*
from (select 'a' as a, 'b' as b, 'c' as c union all
select '1', '2', 'bar' union all
select 'd', 'e', 'f'
) t
where not exists (select 1 from table1 t1 where t1.a = t.a and t1.b = t.b);


Related Topics



Leave a reply



Submit