Python - How to Sort a List of Alpha and Numeric Values

How to sort an alphanumeric set in Python

Short and sweet:

sorted(data, key=lambda item: (int(item.partition(' ')[0])
                               if item[0].isdigit() else float('inf'), item))

This version:

  • Works in Python 2 and Python 3, because:

    • It never compares strings with integers (which raises a TypeError in Python 3)
    • It doesn't use the cmp parameter to sorted (which doesn't exist in Python 3)
  • Will sort on the string part if the quantities are equal

If you want printed output exactly as described in your example, then:

data = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
r = sorted(data, key=lambda item: (int(item.partition(' ')[0])
                                   if item[0].isdigit() else float('inf'), item))
print(',\n'.join(r))
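For readability, the same key can be written as a named function. The following is a minimal sketch, not the answer's exact code: quantity_key is a hypothetical name, and it tests the whole first word with isdigit() rather than only the first character. That is an assumption about how inputs such as an empty string or '4x sheets' should be treated (here they sort last instead of raising an error).

```python
def quantity_key(item):
    """Sort key: leading integer quantity first, then the full string."""
    head = item.partition(' ')[0]
    # Items whose first word is not a plain integer get +inf, so they sort last.
    quantity = int(head) if head.isdigit() else float('inf')
    return (quantity, item)


data = {'booklet', '4 sheets', '48 sheets', '12 sheets'}
result = sorted(data, key=quantity_key)
print(',\n'.join(result))
```

Because the full string is the second element of the key, items with equal quantities still fall back to alphabetical order.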

How to sort a list containing alphanumeric values?

You want to use natural sort:

import re

_nsre = re.compile(r'([0-9]+)')

def natural_sort_key(s):
    return [int(text) if text.isdigit() else text.lower()
            for text in re.split(_nsre, s)]

Example usage:

>>> list1 = ["1", "100A", "342B", "2C", "132", "36", "302F"]
>>> list1.sort(key=natural_sort_key)
>>> list1
['1', '2C', '36', '100A', '132', '302F', '342B']

This works by splitting each element into a list of text and number chunks, so that the numbers are compared as integers instead of strings:

>>> natural_sort_key("100A")
['', 100, 'a']
>>> natural_sort_key("342B")
['', 342, 'b']
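To make the difference concrete, here is a small self-contained comparison of plain string sorting against the natural key, using the same list as above. The names plain and natural are just illustrative.

```python
import re

_nsre = re.compile(r'([0-9]+)')


def natural_sort_key(s):
    # Split into text and digit chunks; digit chunks become ints.
    return [int(text) if text.isdigit() else text.lower()
            for text in _nsre.split(s)]


list1 = ["1", "100A", "342B", "2C", "132", "36", "302F"]

plain = sorted(list1)                          # character-by-character order
natural = sorted(list1, key=natural_sort_key)  # digit chunks compared as ints

print(plain)    # ['1', '100A', '132', '2C', '302F', '342B', '36']
print(natural)  # ['1', '2C', '36', '100A', '132', '302F', '342B']
```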

Note that this only works in Python 3 if you are always comparing ints with ints and strings with strings; otherwise you get a TypeError ("unorderable types" in older 3.x releases, "'<' not supported between instances of ..." in newer ones).
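One common way to sidestep that TypeError is to tag every chunk with its type, so that an int and a str never meet in a comparison. The sketch below is an alternative to the key above, not part of the original answer; the name tagged_natural_key and the choice to sort numbers before text are assumptions.

```python
import re

_TOKEN = re.compile(r'[0-9]+|[^0-9]+')


def tagged_natural_key(s):
    """Natural key whose chunks are (flag, value) pairs."""
    key = []
    for token in _TOKEN.findall(s):
        if token[0] in '0123456789':
            key.append((0, int(token)))      # numeric chunk, compared as int
        else:
            key.append((1, token.lower()))   # text chunk, case-insensitive
    return key


mixed = ["abc", "12", "a1", "1a"]
result = sorted(mixed, key=tagged_natural_key)
print(result)  # ['1a', '12', 'a1', 'abc']
```

When two chunks at the same position have different types, the flags differ and decide the comparison before the values are ever looked at.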

Sort a list alphanumerically in Python

Split the item on whitespace, take the second half, convert it into an integer, and sort using that.

>>> showslist = ("test 2", "test 4", "test 1", "test 9", "test 10", "test 11", "test 6", "test 3")
>>> sorted(showslist, key=lambda item: int(item.split()[1]))
['test 1', 'test 2', 'test 3', 'test 4', 'test 6', 'test 9', 'test 10', 'test 11']

partition also works, but you're accessing the zeroth element of the return value ("test") instead of the second (the number).

>>> sorted(showslist, key=lambda item: int(item.partition(' ')[2]))
['test 1', 'test 2', 'test 3', 'test 4', 'test 6', 'test 9', 'test 10', 'test 11']
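A quick look at what the two methods return for a single item shows where the indices come from. The variable names here are only illustrative.

```python
item = "test 10"

by_split = item.split()             # ['test', '10']
by_partition = item.partition(' ')  # ('test', ' ', '10')

print(by_split[1])      # 10
print(by_partition[0])  # test
print(by_partition[2])  # 10
```

partition always returns three elements (head, separator, tail), so the number is at index 2, whereas split drops the separator and leaves it at index 1.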

It looks like your final conditional is trying to ensure that the string has a numerical component at all, which is a good idea, although checking that the 0th character of item is a digit won't do you much good here, since that's "t" for all of the items you've shown.

>>> showslist = ("test 2", "test 4", "oops no number here", "test 3")
>>> sorted(showslist, key=lambda item: int(item.partition(' ')[2]) if ' ' in item and item.partition(' ')[2].isdigit() else float('inf'))
['test 2', 'test 3', 'test 4', 'oops no number here']

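If you want to sort first by the textual component, and then the numerical component, you can write a function that takes an item and returns a (text, number) tuple, which Python will sort the way you want. In the version below the tuple is prefixed with a flag so that items without a number sort last and strings are never compared with numbers, which Python 3 does not allow.

def getValue(x):
    a, _, b = x.partition(" ")
    if not b.isdigit():
        # No numeric part: flag 1 sorts these after every numbered item.
        return (1, x, 0)
    # Flag 0, then the text, then the number.
    return (0, a, int(b))

showslist = ("Atest 2", "Atest 4", "Atest 1", "Atest 9", "Atest 10", "Btest 11", "Btest 6", "Ctest 3")
print(sorted(showslist, key=getValue))
#result: ['Atest 1', 'Atest 2', 'Atest 4', 'Atest 9', 'Atest 10', 'Btest 6', 'Btest 11', 'Ctest 3']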

This can be done in one line, although you lose more in readability than you gain in file size:

print(sorted(showslist, key=lambda x: (lambda a, _, b: (0, a, int(b)) if b.isdigit() else (1, x, 0))(*x.partition(" "))))

Python - How to sort a list of alpha and numeric values?

You can sort with a (Boolean, value) tuple:

L = ['J', 'E', 3, 7, 0]

res = sorted(L, key=lambda x: (isinstance(x, str), x))

# [0, 3, 7, 'E', 'J']
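To see why this works, it helps to print the keys themselves. The sketch below also shows the mirrored variant that puts the strings first; letters_first is a hypothetical name, and that variant is an extension, not part of the original answer.

```python
L = ['J', 'E', 3, 7, 0]

# Each element is paired with a flag; False sorts before True.
keys = [(isinstance(x, str), x) for x in L]
print(keys)  # [(True, 'J'), (True, 'E'), (False, 3), (False, 7), (False, 0)]

# Negating the flag puts the strings first instead.
letters_first = sorted(L, key=lambda x: (not isinstance(x, str), x))
print(letters_first)  # ['E', 'J', 0, 3, 7]
```

Since the flag separates the two groups, the second tuple element is only ever compared against a value of the same type.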

Naturally sort a list moving alphanumeric values to the end

You can actually perform this using natsorted and the correct choice of key (the examples appear to assume import natsort as ns).

>>> ns.natsorted(d, key=lambda x: (not x.isdigit(), x))
['0',
'1',
'2',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
'10',
'11',
'2Y',
'3Y',
'4Y',
'5Y',
'9Y']

The key returns a tuple with the original input as the second element. Strings that are digits get placed at the front, all others at the back, then the subsets are sorted individually.
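If natsort is not available, the same front/back grouping can be approximated with the standard library. This is only a sketch under assumptions: d is a small hypothetical sample (the original input is not shown), and digit_chunks_key is a simple regex-based stand-in, not natsort's algorithm.

```python
import re


def digit_chunks_key(s):
    # Simple natural key: digit runs become ints, other text is lower-cased.
    return [int(t) if t.isdigit() else t.lower()
            for t in re.split(r'([0-9]+)', s)]


d = ['4Y', '10', '2', '2Y', '1', '11', '9Y', '0']  # hypothetical sample

# Pure-digit strings get flag False and sort first; the rest follow.
result = sorted(d, key=lambda x: (not x.isdigit(), digit_chunks_key(x)))
print(result)  # ['0', '1', '2', '10', '11', '2Y', '4Y', '9Y']
```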

As a side note, Willem Van Onsem's solution uses natsort_key, which has been deprecated as of natsort version 3.0.4 (if you turn on DeprecationWarning in your interpreter you will see that, and the function is now undocumented). It is also pretty inefficient; it is preferred to use natsort_keygen, which returns a natural sorting key. natsort_key calls this under the hood, so for every input you are creating a new function and then calling it once.

Below I repeat the tests shown here, and I added my solution using the natsorted method as well as the timing of the other solutions using natsort_keygen instead of natsort_key.

In [13]: %timeit sorted(d, key=lambda x: (not x.isdigit(), ns.natsort_key(x)))
1 loop, best of 3: 33.3 s per loop

In [14]: natsort_key = ns.natsort_keygen()

In [15]: %timeit sorted(d, key=lambda x: (not x.isdigit(), natsort_key(x)))
1 loop, best of 3: 11.2 s per loop

In [16]: %timeit sorted(ns.natsorted(d), key=str.isdigit, reverse=True)
1 loop, best of 3: 9.77 s per loop

In [17]: %timeit ns.natsorted(d, key=lambda x: (not x.isdigit(), x))
1 loop, best of 3: 23.8 s per loop

Sorting alphanumeric values in Python

Use a lambda that splits on "_" as key:

out = dict(sorted(dct.items(), key=lambda x: int(x[0].split('_')[1])))

Output:

{'usr_1': '111',
'usr_2': '222',
'usr_8': '888',
'usr_10': '10101',
'usr_11': '11111',
'usr_22': '3333'}
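A self-contained version of the same idea, as a sketch: the input dct here is hypothetical (the original dictionary is not shown), and rsplit is used instead of split as a small defensive variation. It relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
# Hypothetical input; the original dct is not shown in the answer.
dct = {'usr_10': '10101', 'usr_2': '222', 'usr_22': '3333', 'usr_1': '111'}

# rsplit('_', 1) takes the text after the last underscore, so a prefix
# that itself contains underscores would still work.
out = dict(sorted(dct.items(), key=lambda kv: int(kv[0].rsplit('_', 1)[1])))
print(out)
# {'usr_1': '111', 'usr_2': '222', 'usr_10': '10101', 'usr_22': '3333'}
```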

Sort list with alphanumeric items by letter first

What you need is to prioritize the numeric part, so just create a key function yielding a tuple with the numeric part first, then the letter part, and let natural tuple ordering do the rest.

print(sorted(lst,key = lambda x : (x[1:],x[0])))

The numeric part doesn't need to be converted to an integer as long as every item has the same number of digits (zero-padded).

With such an input:

lst = ['A01', 'A02', 'A03',
       'B01', 'B02', 'B03',
       'C01', 'C02', 'C03']

you get:

['A01', 'B01', 'C01', 'A02', 'B02', 'C02', 'A03', 'B03', 'C03']

(If you want to protect your list against empty elements, use lambda x: (x[1:], x[0]) if x else tuple(), although that defeats the idea of a sorted list with formatted elements.)
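The zero-padding requirement is easy to check with a small experiment. The lists below are hypothetical examples; the last call shows the int conversion that is needed once the numbers are not padded to the same width.

```python
padded = ['A10', 'B02', 'A02']
unpadded = ['A10', 'B2', 'A2']

# Same width: comparing the digit strings gives the numeric order.
padded_sorted = sorted(padded, key=lambda x: (x[1:], x[0]))
print(padded_sorted)  # ['A02', 'B02', 'A10']

# Different widths: '10' < '2' as strings, so the order is wrong.
unpadded_as_text = sorted(unpadded, key=lambda x: (x[1:], x[0]))
print(unpadded_as_text)  # ['A10', 'A2', 'B2']

# Converting to int restores the numeric order.
unpadded_as_int = sorted(unpadded, key=lambda x: (int(x[1:]), x[0]))
print(unpadded_as_int)  # ['A2', 'B2', 'A10']
```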

In Python, how can I naturally sort a list of alphanumeric strings such that alpha characters sort ahead of numeric characters?

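import re

re_natural = re.compile('[0-9]+|[^0-9]+')

def natural_key(s):
    # The tie-breaker is wrapped in a tuple so that it compares cleanly with the
    # (flag, value) pairs in Python 3; the -1 flag makes shorter keys sort first.
    return [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in re_natural.findall(s)] + [(-1, s)]

# test_cases is defined in the question (not shown here); each case appears to be
# an (input list, expected result) pair.
for case in test_cases:
    print(case[1])
    print(sorted(case[0], key=natural_key))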

['a', 'b', 'c']
['a', 'b', 'c']
['A', 'b', 'C']
['A', 'b', 'C']
['a', 'B', 'r', '0', '9']
['a', 'B', 'r', '0', '9']
['a1', 'a2', 'a100', '1a', '10a']
['a1', 'a2', 'a100', '1a', '10a']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['A', 'a', 'b', 'r', '0', '9']
['A', 'a', 'b', 'r', '0', '9']
['ABc', 'Abc', 'abc']
['ABc', 'Abc', 'abc']

Edit: I decided to revisit this question and see if it would be possible to handle the bonus case. It requires being more sophisticated in the tie-breaker portion of the key. To match the desired results, the alpha parts of the key must be considered before the numeric parts. I also added a marker between the natural section of the key and the tie-breaker so that short keys always come before long ones.

def natural_key2(s):
    parts = re_natural.findall(s)
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    return natural + [(-1,)] + ties_alpha + ties_numeric

This generates identical results for the test cases above, plus the desired output for the bonus case:

['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a']
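The bonus case can be checked with a short self-contained script. This is a sketch that repeats natural_key2 with its indentation restored; the reversed starting order is an arbitrary choice, since the original test harness is not shown.

```python
import re

re_natural = re.compile('[0-9]+|[^0-9]+')


def natural_key2(s):
    parts = re_natural.findall(s)
    # Natural section: text chunks (flag 0) sort ahead of numeric chunks (flag 1).
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    # Tie-breakers: original-case text first, then the raw digit strings.
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    # The (-1,) marker makes a shorter natural section sort before a longer one.
    return natural + [(-1,)] + ties_alpha + ties_numeric


expected = ['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a']
shuffled = list(reversed(expected))  # arbitrary starting order

result = sorted(shuffled, key=natural_key2)
print(result == expected)  # True
```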

