Removing Item from List - During Iteration - What's Wrong with This Idiom

Removing Item From List - during iteration - what's wrong with this idiom?

Some answers explain why this happens and some explain what you should've done. I'll shamelessly put the pieces together.



What's the reason for this?

Because the Python language is designed to handle this use case differently. The documentation makes it clear:

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.

Emphasis mine. See the linked page for more -- the documentation is copyrighted and all rights are reserved.

You could easily understand why you got what you got, but it's basically undefined behavior that can easily change with no warning from build to build. Just don't do it.

It's like wondering why i += i++ + ++i does whatever the hell it is it that line does on your architecture on your specific build of your compiler for your language -- including but not limited to trashing your computer and making demons fly out of your nose :)



How it could this be re-written to remove every item?

  • del letters[:] (if you need to change all references to this object)
  • letters[:] = [] (if you need to change all references to this object)
  • letters = [] (if you just want to work with a new object)

Maybe you just want to remove some items based on a condition? In that case, you should iterate over a copy of the list. The easiest way to make a copy is to make a slice containing the whole list with the [:] syntax, like so:

#remove unsafe commands
commands = ["ls", "cd", "rm -rf /"]
for cmd in commands[:]:
if "rm " in cmd:
commands.remove(cmd)

If your check is not particularly complicated, you can (and probably should) filter instead:

commands = [cmd for cmd in commands if not is_malicious(cmd)]

How to remove items from a list while iterating?

You can use a list comprehension to create a new list containing only the elements you don't want to remove:

somelist = [x for x in somelist if not determine(x)]

Or, by assigning to the slice somelist[:], you can mutate the existing list to contain only the items you want:

somelist[:] = [x for x in somelist if not determine(x)]

This approach could be useful if there are other references to somelist that need to reflect the changes.

Instead of a comprehension, you could also use itertools. In Python 2:

from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)

Or in Python 3:

from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)

Why does removing items from list while iterating over it skip every other item

What happens here is the following:

You start with the following list:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Then, in the first iteration of the for loop, you get the first item of the list which is 0, and remove it from the list. The updated list now looks like:
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Then, at the second iteration of the loop, you get the second item in the updated list. Here, that second item is not 1 anymore but it is 2 (because you look at the updated list), so you remove 2 from the list. The updated list now looks like:
[1, 3, 4, 5, 6, 7, 8, 9]

And it goes on... until you get:
[1, 3, 5, 7, 9]

How to remove all items in a list that are in another list in Python 3?

Use the filter function to filter out any items present in both lists like this:

def array_diff(a, b):    
return list(filter(lambda item: item not in ls2, ls1))

ls1 = [1, 2, 3, 4, 5]
ls2 = [2, 5]
print(array_diff(ls1, ls2))

Remove elements from collection while iterating

Let me give a few examples with some alternatives to avoid a ConcurrentModificationException.

Suppose we have the following collection of books

List<Book> books = new ArrayList<Book>();
books.add(new Book(new ISBN("0-201-63361-2")));
books.add(new Book(new ISBN("0-201-63361-3")));
books.add(new Book(new ISBN("0-201-63361-4")));

Collect and Remove

The first technique consists in collecting all the objects that we want to delete (e.g. using an enhanced for loop) and after we finish iterating, we remove all found objects.

ISBN isbn = new ISBN("0-201-63361-2");
List<Book> found = new ArrayList<Book>();
for(Book book : books){
if(book.getIsbn().equals(isbn)){
found.add(book);
}
}
books.removeAll(found);

This is supposing that the operation you want to do is "delete".

If you want to "add" this approach would also work, but I would assume you would iterate over a different collection to determine what elements you want to add to a second collection and then issue an addAll method at the end.

Using ListIterator

If you are working with lists, another technique consists in using a ListIterator which has support for removal and addition of items during the iteration itself.

ListIterator<Book> iter = books.listIterator();
while(iter.hasNext()){
if(iter.next().getIsbn().equals(isbn)){
iter.remove();
}
}

Again, I used the "remove" method in the example above which is what your question seemed to imply, but you may also use its add method to add new elements during iteration.

Using JDK >= 8

For those working with Java 8 or superior versions, there are a couple of other techniques you could use to take advantage of it.

You could use the new removeIf method in the Collection base class:

ISBN other = new ISBN("0-201-63361-2");
books.removeIf(b -> b.getIsbn().equals(other));

Or use the new stream API:

ISBN other = new ISBN("0-201-63361-2");
List<Book> filtered = books.stream()
.filter(b -> b.getIsbn().equals(other))
.collect(Collectors.toList());

In this last case, to filter elements out of a collection, you reassign the original reference to the filtered collection (i.e. books = filtered) or used the filtered collection to removeAll the found elements from the original collection (i.e. books.removeAll(filtered)).

Use Sublist or Subset

There are other alternatives as well. If the list is sorted, and you want to remove consecutive elements you can create a sublist and then clear it:

books.subList(0,5).clear();

Since the sublist is backed by the original list this would be an efficient way of removing this subcollection of elements.

Something similar could be achieved with sorted sets using NavigableSet.subSet method, or any of the slicing methods offered there.

Considerations:

What method you use might depend on what you are intending to do

  • The collect and removeAl technique works with any Collection (Collection, List, Set, etc).
  • The ListIterator technique obviously only works with lists, provided that their given ListIterator implementation offers support for add and remove operations.
  • The Iterator approach would work with any type of collection, but it only supports remove operations.
  • With the ListIterator/Iterator approach the obvious advantage is not having to copy anything since we remove as we iterate. So, this is very efficient.
  • The JDK 8 streams example don't actually removed anything, but looked for the desired elements, and then we replaced the original collection reference with the new one, and let the old one be garbage collected. So, we iterate only once over the collection and that would be efficient.
  • In the collect and removeAll approach the disadvantage is that we have to iterate twice. First we iterate in the foor-loop looking for an object that matches our removal criteria, and once we have found it, we ask to remove it from the original collection, which would imply a second iteration work to look for this item in order to remove it.
  • I think it is worth mentioning that the remove method of the Iterator interface is marked as "optional" in Javadocs, which means that there could be Iterator implementations that throw UnsupportedOperationException if we invoke the remove method. As such, I'd say this approach is less safe than others if we cannot guarantee the iterator support for removal of elements.

for loop, not all items that fulfill the condition are removed

Its a bad idea to change the list while iterating over it. Instead you can create a new list and add to it only the elements in l that do not contain 0, as follows:

res=[]

for e in l:
if '0' not in e:
res.append(e)

You can also do it with list comprehension:

res=[e for e in l if '0' not in e]

Remove String From List

The easy way would be to use list comprehensions:

filtered = [ v for v in address if not v.startswith('10.') ]


Related Topics



Leave a reply



Submit