Postgresql Incorrect Sorting

Postgresql - Incorrect Sorting

You seem to have your database created with the C collation, which sorts according to code point.

But you can specify other collations, for example

ORDER BY ll.name COLLATE "en_US" DESC

You have to use a collation that is defined in your database, see pg_collation for the list.

Incorrect ORDER BY PostgreSQL

Presumably, the values are strings. You can sort them in one of two ways. The first is the "string" way:

 order by length(company_id), company_id

The second is to convert to a number:

order by company_id::numeric

Incorrect sort/collation/order with spaces in Postgresql 9.4

On Unix/Linux SE, a friendly expert explained that what you see is the proper way to sort Unicode. Basically, the standard is trying to sort:

di Silva Fred                  di Silva Fred
di Silva John diSilva Fred
diSilva Fred disílva Fred
diSilva John -> di Silva John
disílva Fred diSilva John
disílva John disílva John

Now if spaces were as important as letters, the sort could not separate the various identical spellings of Fred and John. So what happens is that it first sorts without spaces. Then in a second pass, strings that are the same without whitespace are sorted. (This is a simplification, the real algorithm looks fairly complex, assigning whitespace, accents and non-printable characters various levels of precedence.)

You can bypass the Unicode collation by setting:

export LC_ALL=C

Or in Postgres by casting to byte array for sorting:

order by name::bytea

Or (from Kiln's answer) by specifying the C collation:

order by name collate "C"

Or by altering the default collation for the column:

alter table products alter column name type text collate "C";

Postgresql ORDER BY not working as expected

Indeed, I've tried @jjanes's suggestion to use the C collation and the output is the one I would expect:

SELECT
id, word
FROM testing1
ORDER BY word collate "C" ;

Sample Image

How weird, I have been using postgresql for some years now and I never noticed that behaviour.

Relevant section from the docs:

23.2.2.1. Standard Collations

On all platforms, the collations named default, C, and POSIX are available. > Additional collations may be available depending on operating system
support. The default collation selects the LC_COLLATE and LC_CTYPE values
specified at database creation time. The C and POSIX collations both
specify “traditional C” behavior, in which only the ASCII letters “A”
through “Z” are treated as letters, and sorting is done strictly by
character code byte values.

Postgres order by in wrong order

Assuming that you're using UTF8 encoding, specifying the collation instead of accepting the default should fix your immediate problem. Whether this is the right thing to do is application-dependent.

There are several different ways to specify the collation. You can specify it when the database cluster is initialized, when the database is created, when you run a query, etc. See Collation support in the docs for much more detail.

CREATE TABLE test_post (
"id" serial NOT NULL PRIMARY KEY,
"title" text NOT NULL,
"url" text collate ucs_basic NOT NULL,
"created" timestamp with time zone NOT NULL
);

INSERT INTO test_post (title, url, created) VALUES
('Aging Is', 'http://nautil.us/issue/70/variables/aging-is-a-communication-breakdown', NOW()) ON CONFLICT DO NOTHING;
INSERT INTO test_post (title, url, created) VALUES
('Untrusted – a user', 'https://github.com/felixse/FluentTerminal', NOW()) ON CONFLICT DO NOTHING;
INSERT INTO test_post (title, url, created) VALUES
('Artyping (1939)', 'http://www.bbc.com/future/story/20160408-the-ancient-peruvian-mystery-solved-from-space', NOW()) ON CONFLICT DO NOTHING;
INSERT INTO test_post (title, url, created) VALUES
(' Applying the Universal', 'http://www.graffathon.fi/2016/presentations/additive_slides.pdf', NOW()) ON CONFLICT DO NOTHING;

SELECT (url) FROM test_post ORDER BY url;

http://nautil.us/issue/70/variables/aging-is-a-communication-breakdown
http://www.bbc.com/future/story/20160408-the-ancient-peruvian-mystery-solved-from-space
http://www.graffathon.fi/2016/presentations/additive_slides.pdf
https://github.com/felixse/FluentTerminal

Postgres: incorrect sorting if timestamp in smtp format

PostgreSQL is surprisingly flexible in the input it accepts for timestamps.

SELECT 'Fri, 18 Aug 17 14:15:26 UTC'::timestamptz;

timestamptz
------------------------
2017-08-18 16:15:26+02
(1 row)

So you could just

ORDER BY smtp_time::timestamptz


Related Topics



Leave a reply



Submit