Accessing the list while being sorted
Looking at the source code (of CPython, maybe different behaviour for other implementations) the strange output of your script becomes obvious:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
saved_ob_size = Py_SIZE(self);
saved_ob_item = self->ob_item;
saved_allocated = self->allocated;
Py_SET_SIZE(self, 0);
The comment says it all: When you begin sorting, the list is emptied. Well, it is "empty" in the eye of an external observer.
I quite like the term "core-dump factory".
Compare also:
b = ['b','e','f','d','c','g','a']
f = 'check this'
def m(i):
print i, b, f
return None
b = sorted(b, key= m)
print b
Should you sort a list when getting or setting it?
Sorting the list at every acccess is a bad idea. You have to have a flag which you set when the collection is modified. Only if this flag is set, you need to sort and then reset the flag.
But the best is if you have a data structure which is per definition always sorted. That means, if you insert a new element, the element is automatically inserted at the right index, thus keeping the collection sorted.
I don't know which platform / framework you are using. I know .NET provides a SortedList class which manages that kind of insertion-sort algorithm for you.
Sorting list based on values from another list
Shortest Code
[x for _, x in sorted(zip(Y, X))]
Example:
X = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
Y = [ 0, 1, 1, 0, 1, 2, 2, 0, 1]
Z = [x for _,x in sorted(zip(Y,X))]
print(Z) # ["a", "d", "h", "b", "c", "e", "i", "f", "g"]
Generally Speaking
[x for _, x in sorted(zip(Y, X), key=lambda pair: pair[0])]
Explained:
zip
the twolist
s.- create a new, sorted
list
based on thezip
usingsorted()
. - using a list comprehension extract the first elements of each pair from the sorted, zipped
list
.
For more information on how to set\use the key
parameter as well as the sorted
function in general, take a look at this.
List appears to be empty during sorting
From the listobject.c
source code:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
and from the Mutable Sequence Types documentation:
CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python 2.3 and newer makes the list appear empty for the duration, and raises
ValueError
if it can detect that the list has been mutated during a sort.
You could zip a
and b
instead:
b[:] = [bval for (aval, bval) in sorted(zip(a, b))]
Sort a list by the number of occurrences of the elements in the list
This is by design and intentional. CPython temporarily "disallows" access to the list while the list is being sorted in place, the behavior is documented here:
CPython implementation detail: While a list is being sorted, the
effect of attempting to mutate, or even inspect, the list is
undefined. The C implementation of Python makes the list appear empty
for the duration, and raises ValueError if it can detect that the list
has been mutated during a sort.
You can inspect that by printing A
inside the key function - you'll get an empty list:
In [2]: def key_function(x):
...: print(A, x)
...: return A.count(x)
...:
In [3]: A.sort(key=key_function)
([], 2)
([], 1)
([], 3)
([], 4)
([], 2)
([], 2)
([], 3)
But, if you do that for sorted()
:
In [4]: sorted(A, key=key_function)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 1)
([2, 1, 3, 4, 2, 2, 3], 3)
([2, 1, 3, 4, 2, 2, 3], 4)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 3)
Out[4]: [1, 4, 3, 3, 2, 2, 2]
It is also documented inside the sort()
implementation:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/.
Related Topics
Split a List into Nested Lists on a Value
Python Worker Failed to Connect Back
For Loops and Iterating Through Lists
Case Insensitive Flask-Sqlalchemy Query
Weird Behavior: Lambda Inside List Comprehension
Fix Not Load Dynamic Library for Tensorflow Gpu
What's a Good Equivalent to Subprocess.Check_Call That Returns the Contents of Stdout
Region: Ioerror: [Errno 22] Invalid Mode ('W') or Filename
Openpyxl 1.8.5: Reading the Result of a Formula Typed in a Cell Using Openpyxl
Most Efficient Way to Search the Last X Lines of a File
How to Grab Number After Word in Python
Check for Identical Rows in Different Numpy Arrays
Cannot List Ftp Directory Using Ftplib - But Ftp Client Works
How to Integrate Flask & Scrapy
Remove Duplicate Rows from Pandas Dataframe Where Only Some Columns Have the Same Value