Natural Sorting

what is natural ordering when we talk about sorting?

Natural ordering is a kind of alphanumerical sort which seems natural to humans.

In a classical alphanumerical sort we will have something like :

1 10 11 12 2 20 21 3 4 5 6 7

If you're using Natural ordering, it will be :

1 2 3 4 5 6 7 10 11 12 20 21

Depending on the language, natural ordering sometimes ignore Capital letters and accentuated one (ie all accentuated letters are treated like their non-accentuated counterpart).

Many languages have a function to order a String naturally. However, an Employee is too "high level" for the language, you must decide what it means for you to order them naturally and create the according function.

In my point of view, ordering Employee will start by ordering them by name using a natural sort, then age and finally date of joining.

According to statistics there are two types of categorical variables. Variables having categories without a numerical ordering (nominal) and those which do have ordered categories (ordinal). The example of an Employee's name, age and date of joining is actually considered a nominal variable so there can be no sorting by natural ordering. Natural ordering could exist for example in age had you categorized it in levels of child, teenager, adult, in which one can observe an ascending type of sorting.

Is there a built in function for string natural sort?

There is a third party library for this on PyPI called natsort (full disclosure, I am the package's author). For your case, you can do either of the following:

>>> from natsort import natsorted, ns
>>> x = ['Elm11', 'Elm12', 'Elm2', 'elm0', 'elm1', 'elm10', 'elm13', 'elm9']
>>> natsorted(x, key=lambda y: y.lower())
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> natsorted(x, alg=ns.IGNORECASE) # or alg=ns.IC
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']

You should note that natsort uses a general algorithm so it should work for just about any input that you throw at it. If you want more details on why you might choose a library to do this rather than rolling your own function, check out the natsort documentation's How It Works page, in particular the Special Cases Everywhere! section.


If you need a sorting key instead of a sorting function, use either of the below formulas.

>>> from natsort import natsort_keygen, ns
>>> l1 = ['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> l2 = l1[:]
>>> natsort_key1 = natsort_keygen(key=lambda y: y.lower())
>>> l1.sort(key=natsort_key1)
>>> l1
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> natsort_key2 = natsort_keygen(alg=ns.IGNORECASE)
>>> l2.sort(key=natsort_key2)
>>> l2
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']

Update November 2020

Given that a popular request/question is "how to sort like Windows Explorer?" (or whatever is your operating system's file system browser), as of natsort version 7.1.0 there is a function called os_sorted to do exactly this. On Windows, it will sort in the same order as Windows Explorer, and on other operating systems it should sort like whatever is the local file system browser.

>>> from natsort import os_sorted
>>> os_sorted(list_of_paths)
# your paths sorted like your file system browser

For those needing a sort key, you can use os_sort_keygen (or os_sort_key if you just need the defaults).

Caveat - Please read the API documentation for this function before you use to understand the limitations and how to get best results.

Natural sorting of a list in Python3

Your code is correct, but your data look incorrect: all the entries have a leading whitespace, which implies they are "before" the one you identify as least, that actually have no leading whitespace.

If the data is fine as they are I suggest you to revise the code to ignore leading whitespaces (check this: How do I remove leading whitespace in Python?).

How to natural-sort a collection of objects in JavaScript?

Ascending and Descending order

const arr = [{name: "John", size: "100"},{name: "John", size: "80"},{name: "John", size: "82"},{name: "John", size: "110"},{name: "John", size: "70"},{name: "John", size: "M"},{name: "John", size: "S"},{name: "John", size: "XS"},{name: "John", size: "L"},{name: "John", size: "XL"}],      handler = (a, b, desc) => {        if (isNaN(+a) && isNaN(+b)) return (desc ? b.localeCompare(a) : a.localeCompare(b));        else if (!isNaN(+a) && !isNaN(+b)) return (desc ? (+b - +a) : (+a - +b));        else if (isNaN(+a)) return (desc ? -1 : 1);         else return (desc ? 1 : -1);      },      descending = arr.sort(({size: a}, {size: b}) => handler(a, b, true)),      ascending = [...arr].sort(({size: a}, {size: b}) => handler(a, b));
console.log("Ascending")console.log(ascending);console.log("Descending")console.log(descending);
.as-console-wrapper { max-height: 100% !important; top: 0; }

Natural sorting

Google: Python natural sorting.

Result 1: The page you linked to.

But don't stop there!

Result 2: Jeff Atwood's blog that explains how to do it properly.

Result 3: An answer I posted based on Jeff Atwood's blog.

Here's the code from that answer:

import re

def natural_sort(l):
convert = lambda text: int(text) if text.isdigit() else text.lower()
alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
return sorted(l, key=alphanum_key)

Results for your data:


PresserInc-1.jpg
PresserInc-1_10.jpg
PresserInc-1_11.jpg
PresserInc-2.jpg
PresserInc-3.jpg
etc...

See it working online: ideone

Natural Sort Order in C#

The easiest thing to do is just P/Invoke the built-in function in Windows, and use it as the comparison function in your IComparer:

[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);

Michael Kaplan has some examples of how this function works here, and the changes that were made for Vista to make it work more intuitively. The plus side of this function is that it will have the same behaviour as the version of Windows it runs on, however this does mean that it differs between versions of Windows so you need to consider whether this is a problem for you.

So a complete implementation would be something like:

[SuppressUnmanagedCodeSecurity]
internal static class SafeNativeMethods
{
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
public static extern int StrCmpLogicalW(string psz1, string psz2);
}

public sealed class NaturalStringComparer : IComparer<string>
{
public int Compare(string a, string b)
{
return SafeNativeMethods.StrCmpLogicalW(a, b);
}
}

public sealed class NaturalFileInfoNameComparer : IComparer<FileInfo>
{
public int Compare(FileInfo a, FileInfo b)
{
return SafeNativeMethods.StrCmpLogicalW(a.Name, b.Name);
}
}

Natural sorting in pandas

Use str.extract, sort_values, then use the index to reindex df.

idx = (df.assign(ID2=df.ID.str.extract(r'(\d+)$').astype(int))
.sort_values(['ID2', 'Time'])
.index)

df.iloc[idx]

ID Time oneMissing singleValue empty oneEmpty
0 CS1-1 1 10000.0 NaN None 0.0
2 CS1-1 2 30000.0 NaN None 0.0
3 CS1-2 1 10000.0 NaN None NaN
1 CS1-2 2 20000.0 0.0 None 0.0
5 CS1-2 3 30000.0 NaN None NaN
4 CS1-11 1 NaN 0.0 None NaN

This is under the assumption that your ID column follows the pattern "XXX-NUMBER".


A fool-proof solution will involve the use of the natsort module, which excels at fast natural sorting. With a little elbow-grease, we can argsort your data.

from natsort import natsorted
idx, *_ = zip(*natsorted(
zip(df.index, df.ID, df.Time), key=lambda x: (x[1], x[2])))

df.iloc[list(idx)]

ID Time oneMissing singleValue empty oneEmpty
0 CS1-1 1 10000.0 NaN None 0.0
2 CS1-1 2 30000.0 NaN None 0.0
3 CS1-2 1 10000.0 NaN None NaN
1 CS1-2 2 20000.0 0.0 None 0.0
5 CS1-2 3 30000.0 NaN None NaN
4 CS1-11 1 NaN 0.0 None NaN

Use PyPi to install: pip install natsort.

Natural sort order string comparison in Java - is one built in?

In java the "natural" order meaning is "lexicographical" order, so there is no implementation in the core like the one you're looking for.

There are open source implementations.

Here's one:

NaturalOrderComparator.java

Make sure you read the:

Cougaar Open Source License

I hope this helps!



Related Topics



Leave a reply



Submit