Why Postgres Returns Unordered Data in a Select Query After a Row Is Updated

Why does the order of data in a Postgres database change when one row is changed?

You should always use an order by clause in your query to list the jobs in a particular order. Otherwise, there is no guarantee that records will come back in the same order each time you query.

In your application, change your query to something like select title, field2 from jobs where client = 10 order by job_id. Then, loop through the jobs. When a job is altered, say job id 4 of 10, requery the data using the same order by clause. That gives the same ordering every time.

Postgres update rule returning number of rows affected

You can't. Unless you do it in a stored procedure and return the number of affected rows.
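As a rough sketch of the stored-procedure route, a PL/pgSQL function can run the update and read the row count with GET DIAGNOSTICS. The function name rename_client_jobs and its parameters are hypothetical, and the jobs table with its title and client columns is borrowed from the earlier example purely for illustration.

```sql
-- Hypothetical helper: updates the matching rows and returns how many were affected.
-- Assumes a table jobs(title, client, ...) exists; adjust names to your schema.
CREATE FUNCTION rename_client_jobs(p_client integer, p_title text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
    affected bigint;
BEGIN
    UPDATE jobs SET title = p_title WHERE client = p_client;
    -- ROW_COUNT holds the number of rows processed by the most recent SQL command.
    GET DIAGNOSTICS affected = ROW_COUNT;
    RETURN affected;
END;
$$;

-- Usage: the single returned value is the number of rows the UPDATE touched.
SELECT rename_client_jobs(10, 'Renamed job');
```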

The official documentation has a good explanation.

Ordering in selection result. Postgres

A SQL query -- like a SQL table -- represents an unordered set. There is no ordering, unless an ORDER BY is present for the outermost SELECT.

As an unordered set, the same query can return results in a different order each time it is run.

So, if you want results in a particular order, use ORDER BY.

I should add that if multiple rows have the same key, then these rows can appear in any order, even with an ORDER BY. In general, you should ensure that the keys in the ORDER BY uniquely define each row (say by including the primary key as the final key).
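For instance, here is a minimal sketch using the jobs table from the earlier answer. It assumes that title can repeat and that job_id is the primary key; the source names these columns but does not say which is unique.

```sql
-- Ties on title can come back in any order from run to run.
SELECT title, field2
FROM jobs
WHERE client = 10
ORDER BY title;

-- Adding the (assumed) primary key as the final sort key breaks every tie,
-- so the result order is fully determined.
SELECT title, field2
FROM jobs
WHERE client = 10
ORDER BY title, job_id;
```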

Wrong PostgreSQL query results with explicit locks and concurrent transaction

There are a couple of things going on here. First, this is documented behavior. Second, you don't see the whole story, because you didn't try to update anything in session "B".

This seems like a violation of transaction isolation.

Depends on what isolation level you're running at. PostgreSQL's default transaction isolation level is READ COMMITTED.

This is documented behavior in PostgreSQL.

It is possible for a SELECT command running at the READ COMMITTED
transaction isolation level and using ORDER BY and a locking clause to
return rows out of order. This is because ORDER BY is applied first.
The command sorts the result, but might then block trying to obtain a
lock on one or more of the rows. Once the SELECT unblocks, some of the
ordering column values might have been modified, leading to those rows
appearing to be out of order (though they are in order in terms of the
original column values).

One workaround (also documented, same link) is to move the FOR UPDATE into a subquery, but this locks every row the subquery reads rather than only the rows finally returned.
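A sketch of that subquery form, written against the test table created further down. The alias locked is arbitrary. Treat this as an illustration of the documented pattern, not a drop-in replacement for the original query.

```sql
-- The inner query takes the row locks first; the outer query sorts afterwards,
-- so the ORDER BY sees the column values as they are once the locks are held.
-- Trade-off: all rows with value = 'A' get locked, not just the three returned.
SELECT *
FROM (SELECT * FROM test WHERE value = 'A' FOR UPDATE) AS locked
ORDER BY created_at
LIMIT 3;
```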

To see what PostgreSQL really does in this situation, run an update in session "B".

create table test (
    id integer primary key,
    value char(1) not null,
    created_at timestamp not null
);
insert into test values
    (1, 'A', '2014-01-01 00:00:00'),
    (2, 'A', '2014-01-02 00:00:00'),
    (3, 'B', '2014-01-03 00:00:00'),
    (4, 'B', '2014-01-04 00:00:00'),
    (5, 'A', '2014-01-05 00:00:00'),
    (6, 'B', '2014-01-06 00:00:00'),
    (7, 'A', '2014-01-07 00:00:00'),
    (8, 'B', '2014-01-08 00:00:00');

A: begin; /* Begin transaction A */
B: begin; /* Begin transaction B */
A: select * from test where id = 1 for update; /* Lock one row */
B: select * from test where value = 'B' order by created_at limit 3 for update; /* This query returns immediately since it does not need to return row with id=1 */
B: select * from test where value = 'A' order by created_at limit 3 for update; /* This query blocks because row id=1 is locked by transaction A */
A: update test set created_at = '2014-01-09 00:00:00' where id = 1; /* Modify the locked row */
A: commit;
B: update test set value = 'C' where id in (select id from test where value = 'A' order by created_at limit 3); /* Updates 3 rows */
B: commit;

Now, look at the table.


scratch=# select * from test order by id;
 id | value |     created_at
----+-------+---------------------
  1 | A     | 2014-01-09 00:00:00
  2 | C     | 2014-01-02 00:00:00
  3 | B     | 2014-01-03 00:00:00
  4 | B     | 2014-01-04 00:00:00
  5 | C     | 2014-01-05 00:00:00
  6 | B     | 2014-01-06 00:00:00
  7 | C     | 2014-01-07 00:00:00
  8 | B     | 2014-01-08 00:00:00

Session "A" succeeded in updating the row with id 1 to '2014-01-09'. Session "B" succeeded in updating the three remaining rows whose value was 'A'. The update statement obtained locks on ids 2, 5, and 7; we know that because those were the rows actually updated. The earlier select statement locked different rows: 1, 2, and 5.

You can block session B's update if you start a third terminal session, and lock row 7 for update.


