How to Quickly Remove Items From a List
List isn't an efficient data structure when it comes to removal. You would do better to use a double linked list (LinkedList) as removal simply requires reference updates in the adjacent entries.
Fast way to remove a few items from a list/queue
The list comprehension is the asymptotically optimal solution:
somelist = [x for x in somelist if not determine(x)]
It only makes one pass over the list, so runs in O(n) time. Since you need to call determine() on each object, any algorithm will require at least O(n) operations. The list comprehension does have to do some copying, but it's only copying references to the objects not copying the objects themselves.
Removing items from a list in Python is O(n), so anything with a remove, pop, or del inside the loop will be O(n**2).
Also, in CPython list comprehensions are faster than for loops.
Removing items from a list
You need to use Iterator
and call remove()
on iterator
instead of using for
loop.
How to delete an item in a list if it exists?
1) Almost-English style:
Test for presence using the in
operator, then apply the remove
method.
if thing in some_list: some_list.remove(thing)
The remove
method will remove only the first occurrence of thing
, in order to remove all occurrences you can use while
instead of if
.
while thing in some_list: some_list.remove(thing)
- Simple enough, probably my choice.for small lists (can't resist one-liners)
2) Duck-typed, EAFP style:
This shoot-first-ask-questions-last attitude is common in Python. Instead of testing in advance if the object is suitable, just carry out the operation and catch relevant Exceptions:
try:
some_list.remove(thing)
except ValueError:
pass # or scream: thing not in some_list!
except AttributeError:
call_security("some_list not quacking like a list!")
Off course the second except clause in the example above is not only of questionable humor but totally unnecessary (the point was to illustrate duck-typing for people not familiar with the concept).
If you expect multiple occurrences of thing:
while True:
try:
some_list.remove(thing)
except ValueError:
break
- a little verbose for this specific use case, but very idiomatic in Python.
- this performs better than #1
- PEP 463 proposed a shorter syntax for try/except simple usage that would be handy here, but it was not approved.
However, with contextlib's suppress() contextmanager (introduced in python 3.4) the above code can be simplified to this:
with suppress(ValueError, AttributeError):
some_list.remove(thing)
Again, if you expect multiple occurrences of thing:
with suppress(ValueError):
while True:
some_list.remove(thing)
3) Functional style:
Around 1993, Python got lambda
, reduce()
, filter()
and map()
, courtesy of a Lisp hacker who missed them and submitted working patches*. You can use filter
to remove elements from the list:
is_not_thing = lambda x: x is not thing
cleaned_list = filter(is_not_thing, some_list)
There is a shortcut that may be useful for your case: if you want to filter out empty items (in fact items where bool(item) == False
, like None
, zero, empty strings or other empty collections), you can pass None as the first argument:
cleaned_list = filter(None, some_list)
- [update]: in Python 2.x,
filter(function, iterable)
used to be equivalent to[item for item in iterable if function(item)]
(or[item for item in iterable if item]
if the first argument isNone
); in Python 3.x, it is now equivalent to(item for item in iterable if function(item))
. The subtle difference is that filter used to return a list, now it works like a generator expression - this is OK if you are only iterating over the cleaned list and discarding it, but if you really need a list, you have to enclose thefilter()
call with thelist()
constructor. - *These Lispy flavored constructs are considered a little alien in Python. Around 2005, Guido was even talking about dropping
filter
- along with companionsmap
andreduce
(they are not gone yet butreduce
was moved into the functools module, which is worth a look if you like high order functions).
4) Mathematical style:
List comprehensions became the preferred style for list manipulation in Python since introduced in version 2.0 by PEP 202. The rationale behind it is that List comprehensions provide a more concise way to create lists in situations where map()
and filter()
and/or nested loops would currently be used.
cleaned_list = [ x for x in some_list if x is not thing ]
Generator expressions were introduced in version 2.4 by PEP 289. A generator expression is better for situations where you don't really need (or want) to have a full list created in memory - like when you just want to iterate over the elements one at a time. If you are only iterating over the list, you can think of a generator expression as a lazy evaluated list comprehension:
for item in (x for x in some_list if x is not thing):
do_your_thing_with(item)
- See this Python history blog post by GvR.
- This syntax is inspired by the set-builder notation in math.
- Python 3 has also set and dict comprehensions.
Notes
- you may want to use the inequality operator
!=
instead ofis not
(the difference is important) - for critics of methods implying a list copy: contrary to popular belief, generator expressions are not always more efficient than list comprehensions - please profile before complaining
Best way to remove elements from a list
Use a list comprehension:
Scenario 1:
[item for item in my_list if 1 <= item <=5 ]
Scenario 2:
to_be_removed = {'a', '1', 2}
[item for item in my_list if item not in to_be_removed ]
Scenario 3:
[item for item in my_list if some_condition()]
Scenario 4(Nested list comprehension):
[[item for item in seq if some_condition] for seq in my_list]
Note that if you want to remove just one item then list.remove
, list.pop
and del
are definitely going to be very fast, but using these methods while iterating over the the list can result in unexpected output.
Related: Loop “Forgets” to Remove Some Items
How to remove items from a list while iterating?
You can use a list comprehension to create a new list containing only the elements you don't want to remove:
somelist = [x for x in somelist if not determine(x)]
Or, by assigning to the slice somelist[:]
, you can mutate the existing list to contain only the items you want:
somelist[:] = [x for x in somelist if not determine(x)]
This approach could be useful if there are other references to somelist
that need to reflect the changes.
Instead of a comprehension, you could also use itertools
. In Python 2:
from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)
Or in Python 3:
from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)
How to remove multiple items from a list in just one statement?
In Python, creating a new object e.g. with a list comprehension is often better than modifying an existing one:
item_list = ['item', 5, 'foo', 3.14, True]
item_list = [e for e in item_list if e not in ('item', 5)]
... which is equivalent to:
item_list = ['item', 5, 'foo', 3.14, True]
new_list = []
for e in item_list:
if e not in ('item', 5):
new_list.append(e)
item_list = new_list
In case of a big list of filtered out values (here, ('item', 5)
is a small set of elements), using a set
is faster as the in
operation is O(1) time complexity on average. It's also a good idea to build the iterable you're removing first, so that you're not creating it on every iteration of the list comprehension:
unwanted = {'item', 5}
item_list = [e for e in item_list if e not in unwanted]
A bloom filter is also a good solution if memory is not cheap.
Most efficient way to remove several items in several lists in python?
https://wiki.python.org/moin/TimeComplexity
remove
is O(n) because it first searches linearly through the list and then, if it finds it, every element after the removed object gets shifted one position to the left in memory. Because of this remove
is quite an expensive operation.
Hence remove M
items from a list of length N
be comes O(N*M)
in
on lists is also O(n)
because we need to search through the whole list in order. Hence building a new list with a filter is also O(N*M)
. However, in
on sets is O(1)
due to hashing making our filter O(N)
Hence the best solution is (I'm just going to use a flat list for simplicity, not nested)
def remove_kill_from_data(data, kill):
s = set(kill)
return [i for i in data if i not in kill]
If you don't care about keeping the order, this would be even faster (due to being done at the C level, it's still O(N)
)
def remove_kill_from_data_unordered(data, kill):
s = set(kill)
d = set(data)
return d - s
Applying to your list of lists
kill_set = set(kill)
[remove_kill_from_data(d, kill_set) for d in data]
Some timings (each copies from a static data
first)
%timeit method1(data, kill)
210 ms ± 769 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit method2(data, kill)
208 ms ± 2.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit method3(data, kill)
272 ms ± 1.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit method4(data, kill) # using remove_kill_from_data
69.6 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit method5(data, kill) # using remove_kill_from_data_unordered
59.5 ms ± 3.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Related Topics
Parallel.For(): Update Variable Outside of Loop
Which Files in a Visual C# Studio Project Don't Need to Be Versioned
Get the Object Is Null Using JSON in Wcf Service
Combining Datatemplates at Runtime
Transparent Window Layer That Is Click-Through and Always Stays on Top
Determine Client's Computer Name
Set Content-Type of Media Files Stored on Blob
Is Graphics.Drawimage Too Slow for Bigger Images
How to Programmatically Limit Bandwidth Usage of My C# Application
How to Add Code Outside the Scope of Main When Using C# 9 Top Level Statements
C#: Getting Names of Properties in a Chain from Lambda Expression
Get Iprincipal from Oauth Bearer Token in Owin
How to Capitalize First Letter of First Name and Last Name in C#
How to Select a Random Value from an Enumeration
Conversion of a Decimal to Double Number in C# Results in a Difference
Why "Decimal" Is Not a Valid Attribute Parameter Type
Invalid Cast from 'System.Int32' to 'System.Nullable'1[[System.Int32, Mscorlib]]