How to Quickly Remove Items from a List

How to Quickly Remove Items From a List

List isn't an efficient data structure when it comes to removal. You would do better to use a double linked list (LinkedList) as removal simply requires reference updates in the adjacent entries.

Fast way to remove a few items from a list/queue

The list comprehension is the asymptotically optimal solution:

somelist = [x for x in somelist if not determine(x)]

It only makes one pass over the list, so runs in O(n) time. Since you need to call determine() on each object, any algorithm will require at least O(n) operations. The list comprehension does have to do some copying, but it's only copying references to the objects not copying the objects themselves.

Removing items from a list in Python is O(n), so anything with a remove, pop, or del inside the loop will be O(n**2).

Also, in CPython list comprehensions are faster than for loops.

Removing items from a list

You need to use Iterator and call remove() on iterator instead of using for loop.

How to delete an item in a list if it exists?

1) Almost-English style:

Test for presence using the in operator, then apply the remove method.

if thing in some_list: some_list.remove(thing)

The removemethod will remove only the first occurrence of thing, in order to remove all occurrences you can use while instead of if.

while thing in some_list: some_list.remove(thing)    
  • Simple enough, probably my choice.for small lists (can't resist one-liners)

2) Duck-typed, EAFP style:

This shoot-first-ask-questions-last attitude is common in Python. Instead of testing in advance if the object is suitable, just carry out the operation and catch relevant Exceptions:

try:
some_list.remove(thing)
except ValueError:
pass # or scream: thing not in some_list!
except AttributeError:
call_security("some_list not quacking like a list!")

Off course the second except clause in the example above is not only of questionable humor but totally unnecessary (the point was to illustrate duck-typing for people not familiar with the concept).

If you expect multiple occurrences of thing:

while True:
try:
some_list.remove(thing)
except ValueError:
break
  • a little verbose for this specific use case, but very idiomatic in Python.
  • this performs better than #1
  • PEP 463 proposed a shorter syntax for try/except simple usage that would be handy here, but it was not approved.

However, with contextlib's suppress() contextmanager (introduced in python 3.4) the above code can be simplified to this:

with suppress(ValueError, AttributeError):
some_list.remove(thing)

Again, if you expect multiple occurrences of thing:

with suppress(ValueError):
while True:
some_list.remove(thing)

3) Functional style:

Around 1993, Python got lambda, reduce(), filter() and map(), courtesy of a Lisp hacker who missed them and submitted working patches*. You can use filter to remove elements from the list:

is_not_thing = lambda x: x is not thing
cleaned_list = filter(is_not_thing, some_list)

There is a shortcut that may be useful for your case: if you want to filter out empty items (in fact items where bool(item) == False, like None, zero, empty strings or other empty collections), you can pass None as the first argument:

cleaned_list = filter(None, some_list)
  • [update]: in Python 2.x, filter(function, iterable) used to be equivalent to [item for item in iterable if function(item)] (or [item for item in iterable if item] if the first argument is None); in Python 3.x, it is now equivalent to (item for item in iterable if function(item)). The subtle difference is that filter used to return a list, now it works like a generator expression - this is OK if you are only iterating over the cleaned list and discarding it, but if you really need a list, you have to enclose the filter() call with the list() constructor.
  • *These Lispy flavored constructs are considered a little alien in Python. Around 2005, Guido was even talking about dropping filter - along with companions map and reduce (they are not gone yet but reduce was moved into the functools module, which is worth a look if you like high order functions).

4) Mathematical style:

List comprehensions became the preferred style for list manipulation in Python since introduced in version 2.0 by PEP 202. The rationale behind it is that List comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.

cleaned_list = [ x for x in some_list if x is not thing ]

Generator expressions were introduced in version 2.4 by PEP 289. A generator expression is better for situations where you don't really need (or want) to have a full list created in memory - like when you just want to iterate over the elements one at a time. If you are only iterating over the list, you can think of a generator expression as a lazy evaluated list comprehension:

for item in (x for x in some_list if x is not thing):
do_your_thing_with(item)
  • See this Python history blog post by GvR.
  • This syntax is inspired by the set-builder notation in math.
  • Python 3 has also set and dict comprehensions.

Notes

  1. you may want to use the inequality operator != instead of is not (the difference is important)
  2. for critics of methods implying a list copy: contrary to popular belief, generator expressions are not always more efficient than list comprehensions - please profile before complaining

Best way to remove elements from a list

Use a list comprehension:

Scenario 1:

[item for item in my_list if 1 <= item <=5 ]

Scenario 2:

to_be_removed = {'a', '1', 2}
[item for item in my_list if item not in to_be_removed ]

Scenario 3:

[item for item in my_list if some_condition()]

Scenario 4(Nested list comprehension):

[[item for item in seq if some_condition] for seq in my_list]

Note that if you want to remove just one item then list.remove, list.pop and del are definitely going to be very fast, but using these methods while iterating over the the list can result in unexpected output.

Related: Loop “Forgets” to Remove Some Items

How to remove items from a list while iterating?

You can use a list comprehension to create a new list containing only the elements you don't want to remove:

somelist = [x for x in somelist if not determine(x)]

Or, by assigning to the slice somelist[:], you can mutate the existing list to contain only the items you want:

somelist[:] = [x for x in somelist if not determine(x)]

This approach could be useful if there are other references to somelist that need to reflect the changes.

Instead of a comprehension, you could also use itertools. In Python 2:

from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)

Or in Python 3:

from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)

How to remove multiple items from a list in just one statement?

In Python, creating a new object e.g. with a list comprehension is often better than modifying an existing one:

item_list = ['item', 5, 'foo', 3.14, True]
item_list = [e for e in item_list if e not in ('item', 5)]

... which is equivalent to:

item_list = ['item', 5, 'foo', 3.14, True]
new_list = []
for e in item_list:
if e not in ('item', 5):
new_list.append(e)
item_list = new_list

In case of a big list of filtered out values (here, ('item', 5) is a small set of elements), using a set is faster as the in operation is O(1) time complexity on average. It's also a good idea to build the iterable you're removing first, so that you're not creating it on every iteration of the list comprehension:

unwanted = {'item', 5}
item_list = [e for e in item_list if e not in unwanted]

A bloom filter is also a good solution if memory is not cheap.

Most efficient way to remove several items in several lists in python?

https://wiki.python.org/moin/TimeComplexity

remove is O(n) because it first searches linearly through the list and then, if it finds it, every element after the removed object gets shifted one position to the left in memory. Because of this remove is quite an expensive operation.

Hence remove M items from a list of length N be comes O(N*M)

in on lists is also O(n) because we need to search through the whole list in order. Hence building a new list with a filter is also O(N*M). However, in on sets is O(1) due to hashing making our filter O(N)

Hence the best solution is (I'm just going to use a flat list for simplicity, not nested)

def remove_kill_from_data(data, kill):
s = set(kill)
return [i for i in data if i not in kill]

If you don't care about keeping the order, this would be even faster (due to being done at the C level, it's still O(N))

def remove_kill_from_data_unordered(data, kill):
s = set(kill)
d = set(data)
return d - s

Applying to your list of lists

kill_set = set(kill)
[remove_kill_from_data(d, kill_set) for d in data]

Some timings (each copies from a static data first)

%timeit method1(data, kill)
210 ms ± 769 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit method2(data, kill)
208 ms ± 2.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit method3(data, kill)
272 ms ± 1.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit method4(data, kill) # using remove_kill_from_data
69.6 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit method5(data, kill) # using remove_kill_from_data_unordered
59.5 ms ± 3.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


Related Topics



Leave a reply



Submit