Postgresql sorting mixed alphanumeric data
You can simply cast name
column to bytea
data type allowing collate-agnostic ordering:
SELECT name
FROM folders
ORDER BY name::bytea;
Result:
name
--------------
a test
alphanumeric
test 1
test 10
test 19
test 20
(6 rows)
Alphanumeric sorting with PostgreSQL
The ideal way would be to normalize your design and split the two components of the column into two separate columns. One of type integer
, one text
.
With the current table, you could:
SELECT col
FROM tbl
ORDER BY (substring(col, '^[0-9]+'))::int -- cast to integer
, substring(col, '[^0-9_].*$'); -- works as text
The same substring()
expressions can be used to split the column.
These regular expressions are somewhat fault tolerant:
The first regex picks the longest numeric string from the left, NULL
if no digits are found, so the cast to integer
can't go wrong.
The second regex picks the rest of the string from the first character that is not a digit or '_'.
If the underscore (_
) is an unambiguous separator, split_part()
is faster:
SELECT col
FROM tbl
ORDER BY split_part(col, '_', 1)::int
, split_part(col, '_', 2);
db<>fiddle here
See:
- Split comma separated column data into additional columns
Alphanumeric Sorting in PostgreSQL
When sorting character data types, collation rules apply - unless you work with locale "C" which sorts characters by there byte values. Applying collation rules may or may not be desirable. It makes sorting more expensive in any case. If you want to sort without collation rules, don't cast to bytea
, use COLLATE "C"
instead:
SELECT * FROM table ORDER BY column COLLATE "C";
However, this does not yet solve the problem with numbers in the string you mention. Split the string and sort the numeric part as number.
SELECT *
FROM table
ORDER BY split_part(column, '-', 2)::numeric;
Or, if all your numbers fit into bigint
or even integer
, use that instead (cheaper).
I ignored the leading part because you write:
... the basis for ordering is the last whole number of the string, regardless of what the character before that number is.
Related:
- Alphanumeric sorting with PostgreSQL
- Split comma separated column data into additional columns
- What is the impact of LC_CTYPE on a PostgreSQL database?
Typically, it's best to save distinct parts of a string in separate columns as proper respective data types to avoid any such confusion.
And if the leading string is identical for all columns, consider just dropping the redundant noise. You can always use a VIEW
to prepend a string for display, or do it on-the-fly, cheaply.
How to sort Alphanumeric using Postgresql (specific data)?
I think you can do that, some thing like this:
WITH x(t) AS (
VALUES
('GPS-10')
,('GPS-1')
,('GPS-2')
,('GPS-8B')
,('GPS-8A')
,('GPS-14')
,('SPS-2')
,('SPS-14')
)
SELECT t
FROM x
ORDER BY substring(t, 1,3), (substring(substring(t, 5, length(t)), '^[0-9]+'))::int
Related Topics
Representing Ecommerce Products and Variations Cleanly in the Database
Query Runs Slow with Date Expression, But Fast with String Literal
How to Use an Alias in a Postgresql Order by Clause
SQL Server 2008 - Help Writing Simple Insert Trigger
How to Split String Using Delimiter Char Using T-Sql
Insert Select Statement in Oracle 11G
MySQL #1140 - Mixing of Group Columns
Split Words with a Capital Letter in SQL
Is There a Product Function Like There Is a Sum Function in Oracle SQL
Get Month from Datetime in SQLite
Pivoting of Data Using Two Columns
Hierarchical Queries in SQL Server 2005
Why Do We Need Group by with Aggregate Functions
SQL Server 2005 Row_Number() Without Order By
Pass String Variable in R Script to Use It in SQL Statement