Fixing Holes/Gaps in Numbers Generated by Postgres Sequence

Sequence incremented randomly postgres

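As discussed above, the real problem was not the sequence itself: we found that we had a large number of aborted transactions. Every nextval() call consumes a value even if the surrounding transaction later rolls back, so each aborted insert left a hole.

Once we fixed the aborted transactions, the sequence issue went away as well.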
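
A minimal sketch of how an aborted transaction leaves a hole. The table seq_demo is hypothetical, and the ids in the comments assume a freshly created table with no concurrent sessions:

```sql
-- Hypothetical demo table; any table with a serial column behaves the same way.
CREATE TABLE seq_demo (id serial PRIMARY KEY, note text);

INSERT INTO seq_demo (note) VALUES ('first');    -- takes id 1

BEGIN;
INSERT INTO seq_demo (note) VALUES ('aborted');  -- nextval() hands out id 2
ROLLBACK;                                        -- the row is discarded, the sequence value is not given back

INSERT INTO seq_demo (note) VALUES ('second');   -- takes id 3

-- Only ids 1 and 3 exist now; 2 is a permanent gap.
SELECT id, note FROM seq_demo ORDER BY id;
```

Sequences behave this way so that concurrent transactions never have to wait for each other to commit before obtaining the next number.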

serial in postgres is being increased even though I added on conflict do nothing

The reason this feels weird to you is that you are thinking of the increment on the counter as part of the insert operation, and therefore the "DO NOTHING" ought to mean "don't increment anything". You're picturing this:

  1. Check values to insert against constraint
  2. If duplicate detected, abort
  3. Increment sequence
  4. Insert data

But in fact, the increment has to happen before the insert is attempted. A SERIAL column in Postgres is implemented as a DEFAULT which executes the nextval() function on a bound SEQUENCE. Before the DBMS can do anything with the data, it's got to have a complete set of columns, so the order of operations is like this:

  1. Resolve default values, including incrementing the sequence
  2. Check values to insert against constraint
  3. If duplicate detected, abort
  4. Insert data

This can be seen intuitively if the duplicate key is in the autoincrement field itself:

CREATE TABLE foo ( id SERIAL NOT NULL PRIMARY KEY, bar text );
-- Insert row 1
INSERT INTO foo ( bar ) VALUES ( 'test' );
-- Reset the sequence so that the next nextval() returns 1 again
-- (setval(..., 0, true) would fail: 0 is below the sequence's minimum of 1)
SELECT setval(pg_get_serial_sequence('foo', 'id'), 1, false);
-- Attempt to insert row 1 again
INSERT INTO foo ( bar ) VALUES ( 'test 2' )
ON CONFLICT (id) DO NOTHING;

Clearly, this can't know if there's a conflict without incrementing the sequence, so the "do nothing" has to come after that increment.
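
The same order of operations applies when the conflict is on some other column, which is the situation in the question. A small sketch, using a hypothetical table conflict_demo with a unique constraint on bar, and assuming a fresh table with no concurrent sessions:

```sql
-- Hypothetical demo table: serial id plus a unique column to conflict on.
CREATE TABLE conflict_demo (id serial PRIMARY KEY, bar text UNIQUE);

INSERT INTO conflict_demo (bar) VALUES ('test');   -- takes id 1

-- Duplicate value: the default (nextval) is resolved first, then the conflict
-- is detected, so no row is inserted but the sequence has already advanced.
INSERT INTO conflict_demo (bar) VALUES ('test')
ON CONFLICT (bar) DO NOTHING;

INSERT INTO conflict_demo (bar) VALUES ('other');  -- takes id 3, not 2

SELECT id, bar FROM conflict_demo ORDER BY id;
```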

How to check a sequence efficiently for used and unused values in PostgreSQL

Consider not doing it. Read these related answers first:

  • Gap-less sequence where multiple transactions with multiple tables are involved
  • Compacting a sequence in PostgreSQL

If you still insist on filling in gaps, here is a rather efficient solution:

1. To avoid searching large parts of the table for the next missing chart_number, create a helper table with all current gaps once:

CREATE TABLE chart_gap AS
SELECT chart_number
FROM   generate_series(1, (SELECT max(chart_number) - 1  -- max is no gap
                           FROM   charts)) chart_number
LEFT   JOIN charts c USING (chart_number)
WHERE  c.chart_number IS NULL;

2. Set charts_chartnumber_seq to the current maximum and convert chart_number to an actual serial column:

SELECT setval('charts_chartnumber_seq', max(chart_number)) FROM charts;

ALTER TABLE charts
   ALTER COLUMN chart_number SET NOT NULL
 , ALTER COLUMN chart_number SET DEFAULT nextval('charts_chartnumber_seq');

ALTER SEQUENCE charts_chartnumber_seq OWNED BY charts.chart_number;

Details:

  • How to reset postgres' primary key sequence when it falls out of sync?
  • Safely and cleanly rename tables that use serial primary key columns in Postgres?

3. While chart_gap is not empty, fetch the next chart_number from there.
To resolve possible race conditions with concurrent transactions without making them wait, use advisory locks:

WITH sel AS (
   SELECT chart_number, ...  -- other input values
   FROM   chart_gap
   WHERE  pg_try_advisory_xact_lock(chart_number)
   LIMIT  1
   FOR    UPDATE
   )
, ins AS (
   INSERT INTO charts (chart_number, ...)  -- other target columns
   TABLE sel
   RETURNING chart_number
   )
DELETE FROM chart_gap c
USING  ins i
WHERE  i.chart_number = c.chart_number;

Alternatively, Postgres 9.5 or later has the handy FOR UPDATE SKIP LOCKED to make this simpler and faster:

...
SELECT chart_number, ... -- other input values
FROM chart_gap
LIMIT 1
FOR UPDATE SKIP LOCKED
...

Detailed explanation:

  • Postgres UPDATE ... LIMIT 1

Check the result. Once all gaps are filled in, this returns 0 rows affected (in plpgsql you could check with IF NOT FOUND THEN ...). Then switch to a simple INSERT:

INSERT INTO charts (...)  -- don't list chart_number
VALUES (...);             -- don't provide chart_number
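
The "fill a gap if one is left, otherwise fall back to the sequence" logic could be wrapped in a plpgsql function along these lines. This is only a sketch and a variation on the CTE shown earlier, not the answer's own code: the function name insert_chart, the column charts.title, and the integer type of chart_number are all assumptions made for illustration, and it relies on FOR UPDATE SKIP LOCKED (Postgres 9.5 or later).

```sql
-- Hypothetical wrapper; insert_chart, charts.title and the int type are assumptions.
CREATE OR REPLACE FUNCTION insert_chart(_title text)
  RETURNS int
  LANGUAGE plpgsql AS
$func$
DECLARE
   _chart_number int;
BEGIN
   -- Claim one unused number, skipping rows locked by concurrent transactions.
   DELETE FROM chart_gap
   WHERE  chart_number = (
      SELECT chart_number
      FROM   chart_gap
      LIMIT  1
      FOR    UPDATE SKIP LOCKED
      )
   RETURNING chart_number INTO _chart_number;

   IF NOT FOUND THEN
      -- No free gap available: let the column default draw from the sequence.
      INSERT INTO charts (title)
      VALUES (_title)
      RETURNING chart_number INTO _chart_number;
   ELSE
      -- Reuse the claimed gap number explicitly.
      INSERT INTO charts (chart_number, title)
      VALUES (_chart_number, _title);
   END IF;

   RETURN _chart_number;
END
$func$;
```

A call such as SELECT insert_chart('some title'); would then return the number that was assigned. If the calling transaction rolls back, the DELETE is undone too, so the gap number goes back into chart_gap.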

How to fill in the holes in auto-increment fields?

What is the reason you need this functionality? Your db should be fine with the gaps, and if you're approaching the max size of your key, just make it unsigned or change the field type.

How to reset Postgres' primary key sequence when it falls out of sync?

-- Login to psql and run the following

-- What is the result?
SELECT MAX(id) FROM your_table;

-- Then run...
-- This should be higher than the last result.
-- (Note: calling nextval() itself consumes a value.)
SELECT nextval('your_table_id_seq');

-- If it's not higher, run this to set the sequence to just past your highest id.
-- (wise to run a quick pg_dump first...)

BEGIN;
-- protect against concurrent inserts while you update the counter
LOCK TABLE your_table IN EXCLUSIVE MODE;
-- Update the sequence
SELECT setval('your_table_id_seq', COALESCE((SELECT MAX(id)+1 FROM your_table), 1), false);
COMMIT;
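
If you would rather not burn a sequence value just to inspect it, the sequence can also be read like a table. A small sketch using the same placeholder names your_table and your_table_id_seq as above:

```sql
-- Inspect the sequence state without advancing it.
-- last_value: the value most recently handed out (or the start value);
-- is_called:  false means the next nextval() returns last_value itself.
SELECT last_value, is_called FROM your_table_id_seq;

-- Compare against the highest id actually stored.
SELECT max(id) FROM your_table;
```

If last_value is lower than max(id), the sequence is out of sync and the setval() block above applies.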

Source - Ruby Forum

How can we find gaps in sequential numbering in MySQL?

A better answer

ConfexianMJS provided a much better answer in terms of performance.

The (not as fast as possible) answer

Here's a version that works on a table of any size (not just on 100 rows):

SELECT (t1.id + 1) AS gap_starts_at,
       (SELECT MIN(t3.id) - 1 FROM arrc_vouchers t3 WHERE t3.id > t1.id) AS gap_ends_at
FROM arrc_vouchers t1
WHERE NOT EXISTS (SELECT t2.id FROM arrc_vouchers t2 WHERE t2.id = t1.id + 1)
HAVING gap_ends_at IS NOT NULL;

  • gap_starts_at - first id in current gap
  • gap_ends_at - last id in current gap
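
On databases with window functions (Postgres, or MySQL 8.0 and later), the same two columns can be produced in a single pass over the table. This is one common alternative, sketched here against the same arrc_vouchers table; it is not necessarily the approach of the answer referred to above:

```sql
-- Pair every id with the next existing id, then keep the pairs with a hole between them.
SELECT id + 1      AS gap_starts_at,
       next_id - 1 AS gap_ends_at
FROM  (
   SELECT id, lead(id) OVER (ORDER BY id) AS next_id
   FROM   arrc_vouchers
   ) t
WHERE  next_id - id > 1;
```

The last row has no successor, so its next_id is NULL and the WHERE clause drops it, which matches the HAVING gap_ends_at IS NOT NULL filter in the query above.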

