How to sort alpha numeric set in python
Short and sweet:
sorted(data, key=lambda item: (int(item.partition(' ')[0])
if item[0].isdigit() else float('inf'), item))
This version:
- Works in Python 2 and Python 3, because:
  - It does not assume you compare strings and integers (which won't work in Python 3)
  - It doesn't use the cmp parameter to sorted (which doesn't exist in Python 3)
- Will sort on the string part if the quantities are equal
If you want printed output exactly as described in your example, then:
data = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
r = sorted(data, key=lambda item: (int(item.partition(' ')[0])
if item[0].isdigit() else float('inf'), item))
print(',\n'.join(r))
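To see why this ordering falls out, it can help to look at the tuples the key produces. The sketch below spells the lambda out as a named function (quantity_key is a made-up name, not part of the answer above) and prints each item next to its key.

```python
def quantity_key(item):
    # quantity_key is a hypothetical name for the lambda shown above.
    head = item.partition(' ')[0]
    # Items that do not start with a digit get infinity, so they sort last.
    quantity = int(head) if item[0].isdigit() else float('inf')
    # The item itself is the tie-breaker when quantities are equal.
    return (quantity, item)

data = {'booklet', '4 sheets', '48 sheets', '12 sheets'}
result = sorted(data, key=quantity_key)
for item in result:
    print(item, '->', quantity_key(item))
```

Because int and float compare cleanly with each other, the first tuple element never mixes numbers with strings.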
How to sort a list containing alphanumeric values?
You want to use natural sort:
import re
_nsre = re.compile('([0-9]+)')
def natural_sort_key(s):
    return [int(text) if text.isdigit() else text.lower()
            for text in re.split(_nsre, s)]
Example usage:
>>> list1 = ["1", "100A", "342B", "2C", "132", "36", "302F"]
>>> list1.sort(key=natural_sort_key)
>>> list1
['1', '2C', '36', '100A', '132', '302F', '342B']
This works by splitting each element into a list of text and number parts, so that the numbers are compared as integers instead of strings:
>>> natural_sort_key("100A")
['', 100, 'a']
>>> natural_sort_key("342B")
['', 342, 'b']
Note that this only works in Python 3 if you are always comparing ints with ints and strings with strings, otherwise you get a TypeError: unorderable types exception.
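A quick way to check that caveat against this particular key: re.split with a capturing group returns alternating text and digit runs, always starting with a (possibly empty) text piece. So, as far as I can tell, text lands at even positions and numbers at odd positions, and like is compared with like. The sample list below is made up for illustration.

```python
import re

_nsre = re.compile('([0-9]+)')

def natural_sort_key(s):
    return [int(text) if text.isdigit() else text.lower()
            for text in re.split(_nsre, s)]

# Made-up sample mixing digit-first, letter-first and pure-text strings.
mixed = ["a1", "10", "2b", "b", "1"]
keys = {s: natural_sort_key(s) for s in mixed}
result = sorted(mixed, key=natural_sort_key)
print(keys)
print(result)
```

If you build the key differently (for example with findall, which does not emit the empty leading piece), that guarantee is gone and the TypeError can show up.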
Sort list with alphanumeric items by letter first
What you need is to prioritize the numeric part, so create a key function that returns a tuple with the numeric part first and the letter part second, and let natural tuple ordering do the rest.
print(sorted(lst,key = lambda x : (x[1:],x[0])))
The numeric part doesn't need to be converted to an integer as long as every item has the same number of digits (zero-padded).
With such an input:
lst = ['A01', 'A02', 'A03',
'B01', 'B02', 'B03',
'C01', 'C02', 'C03']
you get:
['A01', 'B01', 'C01', 'A02', 'B02', 'C02', 'A03', 'B03', 'C03']
(If you want to protect your list against empty elements, use lambda x : (x[1:],x[0]) if x else tuple(), although that defeats the idea of a sorted list with formatted elements.)
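If the numbers are not zero-padded, the string comparison no longer matches numeric order. A minimal sketch of the difference, assuming every item is one letter followed by digits (the sample list is made up):

```python
# Made-up input without zero padding.
lst = ['A1', 'B10', 'A10', 'B2', 'A2']

# Comparing the digit part as a string: '10' sorts before '2'.
as_text = sorted(lst, key=lambda x: (x[1:], x[0]))

# Converting the digit part to int restores numeric order.
as_number = sorted(lst, key=lambda x: (int(x[1:]), x[0]))

print(as_text)
print(as_number)
```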
In Python, how can I naturally sort a list of alphanumeric strings such that alpha characters sort ahead of numeric characters?
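import re
re_natural = re.compile('[0-9]+|[^0-9]+')
def natural_key(s):
    # Note: in Python 3 the trailing [s] can end up compared with a tuple
    # (e.g. 'a' vs 'a1'), which raises TypeError; natural_key2 below avoids this.
    return [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in re_natural.findall(s)] + [s]
# test_cases is not shown here; case[0] is the list to sort, case[1] is printed above the result
for case in test_cases:
    print(case[1])
    print(sorted(case[0], key=natural_key))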
['a', 'b', 'c']
['a', 'b', 'c']
['A', 'b', 'C']
['A', 'b', 'C']
['a', 'B', 'r', '0', '9']
['a', 'B', 'r', '0', '9']
['a1', 'a2', 'a100', '1a', '10a']
['a1', 'a2', 'a100', '1a', '10a']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['A', 'a', 'b', 'r', '0', '9']
['A', 'a', 'b', 'r', '0', '9']
['ABc', 'Abc', 'abc']
['ABc', 'Abc', 'abc']
Edit: I decided to revisit this question and see if it would be possible to handle the bonus case. It requires being more sophisticated in the tie-breaker portion of the key. To match the desired results, the alpha parts of the key must be considered before the numeric parts. I also added a marker between the natural section of the key and the tie-breaker so that short keys always come before long ones.
def natural_key2(s):
    parts = re_natural.findall(s)
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    return natural + [(-1,)] + ties_alpha + ties_numeric
This generates identical results for the test cases above, plus the desired output for the bonus case:
['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a']
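For anyone who wants to reproduce the bonus case, here is a self-contained sketch. The shuffled input order is my own choice, since the original test input isn't shown; only the expected output comes from the line above.

```python
import re

re_natural = re.compile('[0-9]+|[^0-9]+')

def natural_key2(s):
    parts = re_natural.findall(s)
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    # The (-1,) marker sorts below every (0, ...) or (1, ...) tuple,
    # so a key that is a prefix of another always comes first.
    return natural + [(-1,)] + ties_alpha + ties_numeric

# Input order is arbitrary (an assumption made for this sketch).
bonus = ['00a', '0', 'a0', 'A', '0A', '00', 'a', '00A', 'A0', '0a']
result = sorted(bonus, key=natural_key2)
print(result)
```

The marker also keeps tuples lined up with tuples and strings with strings at every position, so this version runs under Python 3.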
Sort a list by alpha numeric in python
Split the item on whitespace, take the second half, convert it into an integer, and sort using that.
>>> showslist = ("test 2", "test 4", "test 1", "test 9", "test 10", "test 11", "test 6", "test 3")
>>> sorted(showslist, key=lambda item: int(item.split()[1]))
['test 1', 'test 2', 'test 3', 'test 4', 'test 6', 'test 9', 'test 10', 'test 11']
partition also works, but you're accessing the zeroth element of the return value ("test") instead of the second (the number).
>>> sorted(showslist, key=lambda item: int(item.partition(' ')[2]))
['test 1', 'test 2', 'test 3', 'test 4', 'test 6', 'test 9', 'test 10', 'test 11']
It looks like your final conditional is trying to ensure that the string has a numerical component at all, which is a good idea, although checking that the 0th character of item is a digit won't do you much good here, since that's "t" for all of the items you've shown.
>>> showslist = ("test 2", "test 4", "oops no number here", "test 3")
>>> sorted(showslist, key=lambda item: int(item.partition(' ')[2]) if ' ' in item and item.partition(' ')[2].isdigit() else float('inf'))
['test 2', 'test 3', 'test 4', 'oops no number here']
If you want to sort first by the textual component, and then the numerical component, you can write a function that takes an item and returns a (text, number) tuple, which Python will sort the way you want.
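def getValue(x):
    a,_,b = x.partition(" ")
    if not b.isdigit():
        # Caveat: in Python 3 this tuple can't be compared with a (str, int)
        # tuple, so input mixing both kinds raises TypeError.
        return (float("inf"), x)
    return (a, int(b))
showslist = ("Atest 2", "Atest 4", "Atest 1", "Atest 9", "Atest 10", "Btest 11", "Btest 6", "Ctest 3")
print(sorted(showslist, key=getValue))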
#result: ['Atest 1', 'Atest 2', 'Atest 4', 'Atest 9', 'Atest 10', 'Btest 6', 'Btest 11', 'Ctest 3']
This can be done in one line, although you lose more in readability than you gain in file size:
print(sorted(showslist, key= lambda x: (lambda a, _, b: (a, int(b)) if b.isdigit() else (float("inf"), x))(*x.partition(" "))))
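One thing to watch under Python 3: a key that returns (float("inf"), x) for some items and (text, number) for others cannot be compared when both kinds appear in the same list. A possible workaround, sketched here with a hypothetical get_value_py3 helper, is to put an explicit group flag first so that every key has the same shape.

```python
def get_value_py3(x):
    # get_value_py3 is a hypothetical variant of the (text, number) key idea.
    text, _, number = x.partition(" ")
    if number.isdigit():
        # Group 0: items with a numeric part, ordered by text, then number.
        return (0, text, int(number))
    # Group 1: everything else goes last, ordered by the whole string.
    return (1, x, 0)

# Made-up input mixing numbered and unnumbered items.
shows = ("Btest 6", "oops no number here", "Atest 10", "Atest 2")
result = sorted(shows, key=get_value_py3)
print(result)
```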
Naturally sort a list of alpha-numeric tuples by the tuple's first element in Python
Using the second answer from the other question, generalized to support any method on item as the basis for getting the key:
import re
from operator import itemgetter
def sorted_nicely(l, key):
    """ Sort the given iterable in the way that humans expect."""
    convert = lambda text: int(text) if text.isdigit() else text
    alphanum_key = lambda item: [ convert(c) for c in re.split('([0-9]+)', key(item)) ]
    return sorted(l, key = alphanum_key)
print(sorted_nicely([('b10', 0), ('0', 1), ('b9', 2)], itemgetter(0)))
This is exactly the same as that answer except generalized to use any callable as the operation on item. If you just wanted to do it on a string, you'd use lambda item: item, if you wanted to do it on a list, tuple, dict, or set, you'd use operator.itemgetter(key_or_index_you_want), or if you wanted to do it on a class instance you could use operator.attrgetter('attribute_name_you_want').
It gives
[('0', 1), ('b9', 2), ('b10', 0)]
for your example #2.
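A short sketch of the other two cases mentioned above, plain strings and attribute access. The Track namedtuple and its field names are made up for illustration.

```python
import re
from collections import namedtuple
from operator import attrgetter

def sorted_nicely(items, key):
    """Sort the given iterable in the way that humans expect."""
    convert = lambda text: int(text) if text.isdigit() else text
    alphanum_key = lambda item: [convert(c) for c in re.split('([0-9]+)', key(item))]
    return sorted(items, key=alphanum_key)

# Plain strings: the key callable just returns the item itself.
plain = sorted_nicely(['b10', '0', 'b9'], lambda item: item)

# Objects: Track is a hypothetical record type used only for this example.
Track = namedtuple('Track', ['name', 'plays'])
tracks = [Track('file10', 3), Track('file2', 7), Track('file1', 5)]
names = [t.name for t in sorted_nicely(tracks, attrgetter('name'))]

print(plain)
print(names)
```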
Naturally sort a list moving alphanumeric values to the end
You can actually perform this using natsorted and the correct choice of key (assuming natsort has been imported as ns).
>>> ns.natsorted(d, key=lambda x: (not x.isdigit(), x))
['0',
'1',
'2',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
'10',
'11',
'2Y',
'3Y',
'4Y',
'5Y',
'9Y']
The key returns a tuple with the original input as the second element. Strings that are digits get placed at the front, all others at the back, then the subsets are sorted individually.
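If you'd rather not depend on natsort, the same two-level idea can be sketched with the standard library: a boolean flag first, then a natural key. This is a swapped-in stand-in rather than what natsorted does internally, and the sample list is a shortened, made-up version of d.

```python
import re

def natural_key(s):
    # Hypothetical stdlib stand-in for a natural sort key:
    # text pieces at even positions, integers at odd positions.
    return [int(t) if t.isdigit() else t.lower()
            for t in re.split('([0-9]+)', s)]

# Shortened, made-up sample in the spirit of d.
d = ['4Y', '10', '2', '9Y', '11', '2Y', '0']

# False sorts before True, so pure-digit strings come first.
result = sorted(d, key=lambda x: (not x.isdigit(), natural_key(x)))
print(result)
```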
As a side note, Willem Van Onsem's solution uses natsort_key, which has been deprecated as of natsort version 3.0.4 (if you turn on DeprecationWarning in your interpreter you will see that, and the function is now undocumented). It's actually pretty inefficient. It is preferred to use natsort_keygen, which returns a natural sorting key. natsort_key calls this under the hood, so for every input you are creating a new function and then calling it once.
Below I repeat the tests shown here, and I added my solution using the natsorted method as well as the timing of the other solutions using natsort_keygen instead of natsort_key.
In [13]: %timeit sorted(d, key=lambda x: (not x.isdigit(), ns.natsort_key(x)))
1 loop, best of 3: 33.3 s per loop
In [14]: natsort_key = ns.natsort_keygen()
In [15]: %timeit sorted(d, key=lambda x: (not x.isdigit(), natsort_key(x)))
1 loop, best of 3: 11.2 s per loop
In [16]: %timeit sorted(ns.natsorted(d), key=str.isdigit, reverse=True)
1 loop, best of 3: 9.77 s per loop
In [17]: %timeit ns.natsorted(d, key=lambda x: (not x.isdigit(), x))
1 loop, best of 3: 23.8 s per loop
Perform a descending numerical and then ascending alphabetical sort from a list of alphanumeric strings in Python
In case the non-numeric part isn't always the same size (or even present):
import re
def na_split(s):
    # Split the string into leading numeric & rest
    n,s = re.fullmatch('^([0-9]+)(.*)$',s).groups()
    return (-int(n),s)
data = ['120d', '120a', '1080p', '1080a', '696p', '696z', '480', '480a', '480p']
print(sorted(data,key=lambda x:na_split(x)))
>> ['1080a', '1080p', '696p', '696z', '480', '480a', '480p', '120a', '120d']
python sort alphanumeric list without ordereddict
You have to specify a custom function as the sort key, which extracts the initial number from each string:
>>> apples.sort(key=lambda x: int(x.split()[0]))
>>> apples
['2 The Yellow Apples ', '7 The Red Apples ', '15 The Green Apples ', '43 The Blue Apples ', '178 The Purple Apples ']
>>>
Or using regex
>>> import re
>>> apples.sort(key=lambda x: int(re.findall(r'\d+', x)[0]))
>>> apples
['2 The Yellow Apples ', '7 The Red Apples ', '15 The Green Apples ', '43 The Blue Apples ', '178 The Purple Apples ']
>>>