Python Analog of PHP's Natsort Function (Sort a List Using a "Natural Order" Algorithm)

Python analog of PHP's natsort function (sort a list using a natural order algorithm)

From my answer to Natural Sorting algorithm:

import re

def natural_key(string_):
    """See https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/"""
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_)]

Example:

>>> L = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
>>> sorted(L)
['image1.jpg', 'image12.jpg', 'image15.jpg', 'image3.jpg']
>>> sorted(L, key=natural_key)
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']

To support Unicode strings, .isdecimal() should be used instead of .isdigit(). See example in @phihag's comment. Related: How to reveal Unicodes numeric value property.

.isdigit() may also fail (return True for a value that int() does not accept) for a bytestring on Python 2 in some locales, e.g., '\xb2' ('²') in the cp1252 locale on Windows.
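A minimal sketch of the .isdecimal() variant mentioned above, assuming Python 3 str input. The name natural_key_decimal is made up for this illustration and is not part of the original answer.

```python
import re

def natural_key_decimal(string_):
    # Hypothetical variant of natural_key: str.isdecimal() is True only for
    # decimal digit characters, so a token such as '²' is kept as text
    # instead of being passed to int().
    return [int(s) if s.isdecimal() else s for s in re.split(r'(\d+)', string_)]

# '²' counts as a digit for isdigit() but not for isdecimal().
print('²'.isdigit(), '²'.isdecimal())
print(natural_key_decimal('image²12.jpg'))
```

With .isdigit(), a string consisting only of '²' would reach int('²') and raise ValueError; the variant above leaves it as a plain string.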

Is there a built in function for string natural sort?

There is a third party library for this on PyPI called natsort (full disclosure, I am the package's author). For your case, you can do either of the following:

>>> from natsort import natsorted, ns
>>> x = ['Elm11', 'Elm12', 'Elm2', 'elm0', 'elm1', 'elm10', 'elm13', 'elm9']
>>> natsorted(x, key=lambda y: y.lower())
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> natsorted(x, alg=ns.IGNORECASE) # or alg=ns.IC
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']

You should note that natsort uses a general algorithm so it should work for just about any input that you throw at it. If you want more details on why you might choose a library to do this rather than rolling your own function, check out the natsort documentation's How It Works page, in particular the Special Cases Everywhere! section.


If you need a sorting key instead of a sorting function, use either of the approaches below.

>>> from natsort import natsort_keygen, ns
>>> l1 = ['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> l2 = l1[:]
>>> natsort_key1 = natsort_keygen(key=lambda y: y.lower())
>>> l1.sort(key=natsort_key1)
>>> l1
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
>>> natsort_key2 = natsort_keygen(alg=ns.IGNORECASE)
>>> l2.sort(key=natsort_key2)
>>> l2
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']

Update November 2020

Given that a popular request/question is "how to sort like Windows Explorer?" (or whatever is your operating system's file system browser), as of natsort version 7.1.0 there is a function called os_sorted to do exactly this. On Windows, it will sort in the same order as Windows Explorer, and on other operating systems it should sort like whatever is the local file system browser.

>>> from natsort import os_sorted
>>> os_sorted(list_of_paths)
# your paths sorted like your file system browser

For those needing a sort key, you can use os_sort_keygen (or os_sort_key if you just need the defaults).

Caveat: please read the API documentation for this function before you use it, to understand its limitations and how to get the best results.

A way to sort in Python

Sort your list with a lambda as the key:

sorted(Lista,key=lambda x: int(x.split(".")[0]))

int(x.split(".")[0]) extracts the directory number, so for '1. Introducción' it is 1, and so on.

Or sort in place: Lista.sort(key=lambda x: int(x.split(".")[0]))

sorted creates a new list; list.sort sorts the original list in place and returns None.

A link to the docs that explains the difference between list.sort and sorted
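A small sketch of that difference, assuming made-up sample entries in the same "number. title" format. The helper name leading_number is introduced here only for illustration.

```python
# Hypothetical sample data in the "<number>. <title>" format used above.
Lista = ['10. Anexos', '2. Desarrollo', '1. Introducción']

def leading_number(name):
    # The text before the first "." is the directory number.
    return int(name.split(".")[0])

ordered = sorted(Lista, key=leading_number)  # builds a new list
first_before = Lista[0]                      # Lista itself is still unsorted here
result = Lista.sort(key=leading_number)      # reorders Lista itself
first_after = Lista[0]

print(ordered)
print(first_before, '|', first_after, '|', result)
```

Because list.sort returns None, writing Lista = Lista.sort(...) would discard the list.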

Taken from the docs:

lambda_expr ::= "lambda" [parameter_list]: expression

lambda_expr_nocond ::= "lambda" [parameter_list]: expression_nocond

Lambda expressions (sometimes called lambda forms) are used to create anonymous functions. The expression lambda arguments: expression yields a function object. The unnamed object behaves like a function object defined with

def <lambda>(arguments):
    return expression

A simple example:

lam = lambda x : x + 4

def foo(x):
    return x + 4

print("Calling foo: {}".format(foo(5)))
print("Calling lam: {}".format(lam(5)))
Calling foo: 9
Calling lam: 9

How to sort a list of strings by each element's numeric value

a = ["part 1", "part 3" , "part 10", "part 2"]
print(sorted(a, key=lambda x: int(x.split()[1])))

Output

['part 1', 'part 2', 'part 3', 'part 10']

If you want to sort in-place,

a.sort(key=lambda x: int(x.split()[1]))
print(a)

Reading files in directory in sorted order Python

Try sorting the files list using sorted.

Ex:

import os

root_dir = r'C:\Users\ab\pythonfiles\Compressed_images'
for root, dirs, files in os.walk(root_dir):
    # Sort by the integer before the first ".", e.g. '12.png' -> 12
    for file in sorted(files, key=lambda x: int(x.split(".")[0])):
        print('For file: \n', file)
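The same loop can be tried without touching a real folder. This is a sketch under the assumption that every file name is an integer followed by an extension; the temporary directory, the sample names, and the helper numeric_stem are all invented for the example.

```python
import os
import tempfile

def numeric_stem(filename):
    # Hypothetical helper: '12.png' -> 12. Raises ValueError when the part
    # before the first "." is not an integer.
    return int(filename.split(".")[0])

visited = []
with tempfile.TemporaryDirectory() as root_dir:
    # Create a few empty files with made-up names.
    for name in ("10.png", "2.png", "1.png"):
        with open(os.path.join(root_dir, name), "w"):
            pass
    # os.walk gives no ordering guarantee, so sort each directory's files.
    for root, dirs, files in os.walk(root_dir):
        for file in sorted(files, key=numeric_stem):
            visited.append(file)

print(visited)
```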

Python equivalent to PHP's natcasesort

import re

def atoi(text):
    return int(text) if text.isdigit() else text.lower()

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    return [atoi(c) for c in re.split(r'(\d+)', text)]

names = ('IMG0.png', 'img12.png', 'img10.png', 'img2.png', 'img1.png', 'IMG3.png')

Standard sorting:

print(sorted(names))
# ['IMG0.png', 'IMG3.png', 'img1.png', 'img10.png', 'img12.png', 'img2.png']

Natural order sorting (case-insensitive):

print(sorted(names, key = natural_keys))
# ['IMG0.png', 'img1.png', 'img2.png', 'IMG3.png', 'img10.png', 'img12.png']

Python sort files by filename with numbers as first letters

Assuming that there will always be an underscore _ as a separator:

files = ['0_sound.wav', '10_sound.wav', '15_sound.wav', '20_sound.wav', '5_sound.wav']
files.sort(key=lambda x: int(x.split('_')[0]))

Result:

['0_sound.wav', '5_sound.wav', '10_sound.wav', '15_sound.wav', '20_sound.wav']

Note: This will fail on filenames that lack this separator or that don't start with a number. If such names are possible, filter the list first so you don't get unexpected errors.
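One possible way to do that filtering, as a rough sketch: the helper leading_int and the two non-matching file names are made up for the example, and names that don't fit the pattern are simply dropped.

```python
# Hypothetical input mixing well-formed names with names that would break
# int(x.split('_')[0]).
files = ['0_sound.wav', '10_sound.wav', 'notes.txt', '5_sound.wav', 'intro_sound.wav']

def leading_int(name):
    # Return the integer before the first "_", or None when the name has
    # no underscore or does not start with a number.
    prefix, sep, _ = name.partition('_')
    return int(prefix) if sep and prefix.isdecimal() else None

# Compare with None explicitly so that a leading 0 is not filtered out.
numbered = [f for f in files if leading_int(f) is not None]
numbered.sort(key=leading_int)

print(numbered)
```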

Python sorting problem

There are two ways to approach this:

  1. Define your own comparison function cmp(x, y), where x and y are strings, and return -1 if the first should sort before the second, 1 if it should sort after, and 0 if they're the same. Then pass this function as the "cmp" argument to the built-in sort() function (Python 2 only; in Python 3, wrap it with functools.cmp_to_key and pass the result as "key").

  2. Convert all of the strings into a format where the "natural" sorting order is exactly what you want. For example you could just zero-pad them like "Season 03, Episode 07". Then you can sort them using sort().

Either way, I'd suggest using a simple regular expression to get the season and episode out of the string, something like:

import re

m = re.match(r'Season ([0-9]+), Episode ([0-9]+): .*', s)
(season, episode) = (int(m.group(1)), int(m.group(2)))
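Both approaches can be sketched around that regular expression. This assumes every string matches the "Season N, Episode M: title" pattern; the sample titles and the helper names season_episode and compare are invented here. Since Python 3's sort() has no cmp argument, the comparison function is adapted with functools.cmp_to_key, and a tuple key stands in for zero-padding.

```python
import re
from functools import cmp_to_key

PATTERN = re.compile(r'Season ([0-9]+), Episode ([0-9]+): .*')

def season_episode(s):
    # Extract (season, episode) as integers; assumes s matches PATTERN.
    m = PATTERN.match(s)
    return (int(m.group(1)), int(m.group(2)))

def compare(x, y):
    # Negative if x sorts before y, positive if after, 0 if equal.
    a, b = season_episode(x), season_episode(y)
    return (a > b) - (a < b)

# Hypothetical episode titles.
titles = [
    'Season 10, Episode 1: C',
    'Season 2, Episode 11: B',
    'Season 2, Episode 3: A',
]

by_cmp = sorted(titles, key=cmp_to_key(compare))  # approach 1: comparison function
by_key = sorted(titles, key=season_episode)       # approach 2: sortable key

print(by_cmp)
print(by_key)
```

The key-based version parses each string once, while the comparison function parses both arguments on every comparison.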

