Alphanumeric Sorting with Postgresql

Postgresql sorting mixed alphanumeric data

You can simply cast name column to bytea data type allowing collate-agnostic ordering:

SELECT name
FROM folders
ORDER BY name::bytea;

Result:

     name     
--------------
a test
alphanumeric
test 1
test 10
test 19
test 20
(6 rows)

Alphanumeric sorting with PostgreSQL

The ideal way would be to normalize your design and split the two components of the column into two separate columns. One of type integer, one text.

With the current table, you could:

SELECT col
FROM tbl
ORDER BY (substring(col, '^[0-9]+'))::int -- cast to integer
, substring(col, '[^0-9_].*$'); -- works as text

The same substring() expressions can be used to split the column.

These regular expressions are somewhat fault tolerant:

The first regex picks the longest numeric string from the left, NULL if no digits are found, so the cast to integer can't go wrong.

The second regex picks the rest of the string from the first character that is not a digit or '_'.

If the underscore (_) is an unambiguous separator, split_part() is faster:

SELECT col
FROM tbl
ORDER BY split_part(col, '_', 1)::int
, split_part(col, '_', 2);

db<>fiddle here

See:

  • Split comma separated column data into additional columns

Alphanumeric Sorting in PostgreSQL

When sorting character data types, collation rules apply - unless you work with locale "C" which sorts characters by there byte values. Applying collation rules may or may not be desirable. It makes sorting more expensive in any case. If you want to sort without collation rules, don't cast to bytea, use COLLATE "C" instead:

SELECT * FROM table ORDER BY column COLLATE "C";

However, this does not yet solve the problem with numbers in the string you mention. Split the string and sort the numeric part as number.

SELECT *
FROM table
ORDER BY split_part(column, '-', 2)::numeric;

Or, if all your numbers fit into bigint or even integer, use that instead (cheaper).

I ignored the leading part because you write:

... the basis for ordering is the last whole number of the string, regardless of what the character before that number is.

Related:

  • Alphanumeric sorting with PostgreSQL
  • Split comma separated column data into additional columns
  • What is the impact of LC_CTYPE on a PostgreSQL database?

Typically, it's best to save distinct parts of a string in separate columns as proper respective data types to avoid any such confusion.

And if the leading string is identical for all columns, consider just dropping the redundant noise. You can always use a VIEW to prepend a string for display, or do it on-the-fly, cheaply.

How to sort Alphanumeric using Postgresql (specific data)?

I think you can do that, some thing like this:

   WITH x(t) AS (
VALUES
('GPS-10')
,('GPS-1')
,('GPS-2')
,('GPS-8B')
,('GPS-8A')
,('GPS-14')
,('SPS-2')
,('SPS-14')
)
SELECT t
FROM x
ORDER BY substring(t, 1,3), (substring(substring(t, 5, length(t)), '^[0-9]+'))::int


Related Topics



Leave a reply



Submit