Iterating with for .. in on a Changing Collection

What is the best way to modify a list in a 'foreach' loop?

The collection used in a foreach loop cannot be modified while the loop is running. This is very much by design.

As it says on MSDN:

"The foreach statement is used to iterate through the collection to get the information that you want, but can not be used to add or remove items from the source collection to avoid unpredictable side effects. If you need to add or remove items from the source collection, use a for loop."

The post in the link provided by Poko indicates that this is allowed in the new concurrent collections.
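The "use a for loop" advice is not specific to C#. As a rough Java analogue (the class and method names here are made up for illustration, not taken from the question), an index-based loop that walks backwards can remove items without involving an iterator at all:

```java
import java.util.ArrayList;
import java.util.List;

class IndexLoopRemovalDemo {
    // Hypothetical helper: returns a copy of the input without negative numbers.
    static List<Integer> removeNegatives(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        // Walk backwards so that removing index i never shifts
        // an element we have not looked at yet.
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) < 0) {
                list.remove(i); // remove(int index), not remove(Object)
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeNegatives(List.of(3, -1, 4, -2))); // [3, 4]
    }
}
```

Walking forwards also works, but then the index must not be advanced after a removal, which is easy to get wrong.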

Iterating with for .. in on a changing collection

The documentation for IteratorProtocol says "whenever you use a for-in loop with an array, set, or any other collection or sequence, you’re using that type’s iterator." So, we are guaranteed that a for-in loop is going to be using .makeIterator() and .next(), which are defined most generally on Sequence and IteratorProtocol respectively.

The documentation for Sequence says that "the Sequence protocol makes no requirement on conforming types regarding whether they will be destructively consumed by iteration." As a consequence, this means that an iterator for a Sequence is not required to make a copy, and so I do not think that modifying a sequence while iterating over it is, in general, safe.

This same caveat does not occur in the documentation for Collection, but I also don't think there is any guarantee that the iterator makes a copy, and so I do not think that modifying a collection while iterating over it is, in general, safe.

But, most collection types in Swift are structs with value semantics or copy-on-write semantics. I'm not really sure where the documentation for this is, but this link does say that "in Swift, Array, String, and Dictionary are all value types... You don’t need to do anything special — such as making an explicit copy — to prevent other code from modifying that data behind your back." In particular, this means that for Array, .makeIterator() cannot hold a reference to your array because the iterator for Array does not have to "do anything special" to prevent other code (i.e. your code) from modifying the data it holds.

We can explore this in more detail. The Iterator type of Array is defined as IndexingIterator<Array<Element>>. The documentation for IndexingIterator says that it is the default implementation of the iterator for collections, so we can assume that most collections will use this. We can see in the source code for IndexingIterator that it holds a copy of its collection:

@frozen
public struct IndexingIterator<Elements: Collection> {
    @usableFromInline
    internal let _elements: Elements
    @usableFromInline
    internal var _position: Elements.Index

    @inlinable
    @inline(__always)
    /// Creates an iterator over the given collection.
    public /// @testable
    init(_elements: Elements) {
        self._elements = _elements
        self._position = _elements.startIndex
    }
    ...
}

and that the default .makeIterator() simply creates this copy:

extension Collection where Iterator == IndexingIterator<Self> {
    /// Returns an iterator over the elements of the collection.
    @inlinable // trivial-implementation
    @inline(__always)
    public __consuming func makeIterator() -> IndexingIterator<Self> {
        return IndexingIterator(_elements: self)
    }
}

Although you might not want to trust this source code, the documentation for library evolution claims that "the @inlinable attribute is a promise from the library developer that the current definition of a function will remain correct when used with future versions of the library", and the @frozen attribute also means that the members of IndexingIterator cannot change.

Altogether, this means that any collection type with value semantics and an IndexingIterator as its Iterator must make a copy when used in a for-in loop (at least until the next ABI break, which should be a long way off). Even then, I don't think Apple is likely to change this behavior.

In Conclusion

I don't know of any place that it is explicitly spelled out in the docs "you can modify an array while you iterate over it, and the iteration will proceed as if you made a copy" but that's also the kind of language that probably shouldn't be written down as writing such code could definitely confuse a beginner.

However, there is enough documentation lying around which says that a for-in loop just calls .makeIterator() and that for any collection with value semantics and the default iterator type (for example, Array), .makeIterator() makes a copy and so cannot be influenced by code inside the loop. Further, because Array and some other types like Set and Dictionary are copy-on-write, modifying these collections inside a loop will have a one-time copy penalty, as the body of the loop will not have a unique reference to its storage (because the iterator holds one too). This is the exact same penalty that modifying the collection outside the loop will have if you don’t have a unique reference to the storage.

Without these assumptions, you aren't guaranteed safety, but you might have it anyway in some circumstances.

Edit:

I just realized we can create some cases where this is unsafe for sequences.

import Foundation

/// This is clearly fine and works as expected.
print("Test normal")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x {
        print(i)
    }
}

/// This is also okay. Reassigning `x` does not mutate the reference that the iterator holds.
print("Test reassignment")
for _ in 0...10 {
    var x: NSMutableArray = [0,1,2,3]
    for i in x {
        x = []
        print(i)
    }
}

/// This crashes. The iterator assumes that the last index it used is still valid, but after removing the objects, there are no valid indices.
print("Test removal")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x {
        x.removeAllObjects()
        print(i)
    }
}

/// This also crashes. `.enumerated()` gets a reference to `x` which it expects will not be modified behind its back.
print("Test removal enumerated")
for _ in 0...10 {
    let x: NSMutableArray = [0,1,2,3]
    for i in x.enumerated() {
        x.removeAllObjects()
        print(i)
    }
}

The fact that this is an NSMutableArray is important because this type has reference semantics. Since NSMutableArray conforms to Sequence, we know that mutating a sequence while iterating over it is not safe, even when using .enumerated().

Can an iterator change the collection it is iterating over? Java

The Iterator simply provides an interface into some sort of stream, therefore not only is it perfectly possible for next() to destroy data in some way, but it's even possible for the data in an Iterator to be unique and irreplaceable.

We could come up with more direct examples, but an easy one is the Iterator in DirectoryStream. While a DirectoryStream is technically Iterable, it only allows one Iterator to be constructed, so if you tried to do the following:

Path dir = ...
try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
    int count = length(stream.iterator());
    for (Path entry: stream) {
        ...
    }
}

You would get an exception in the foreach block, because the stream can only be iterated once. So in summary, it is possible for your length() method to change objects and lose data.
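The one-iterator restriction can be checked in isolation. The sketch below is only an illustration (the class and method names are invented here): it asks a DirectoryStream for its iterator twice and reports whether the second request is rejected with the IllegalStateException that the DirectoryStream Javadoc describes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class DirectoryStreamOnceDemo {
    // Hypothetical helper: true if asking for a second iterator is rejected.
    static boolean secondIteratorThrows(Path dir) {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            stream.iterator(); // first iterator: allowed
            try {
                stream.iterator(); // second iterator on the same stream
                return false;
            } catch (IllegalStateException e) {
                return true;
            }
        } catch (IOException e) {
            // Opening or closing the directory failed; surface it unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Uses the current working directory purely as an example.
        System.out.println(secondIteratorThrows(Paths.get(".")));
    }
}
```

A for-each loop over the stream calls iterator() behind the scenes, which is why the loop in the snippet above fails once an iterator has already been handed out.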

Furthermore, there's no reason an Iterator has to be associated with some separate data-store. Take for example an answer I gave a few months ago providing a clean way to select n random numbers. By using an infinite Iterator we are able to provide, filter, and pass around arbitrarily large amounts of random data lazily, no need to store it all at once, or even compute them until they're needed. Because the Iterator doesn't back any data structure, querying it is obviously destructive.

Now that said, these examples don't make your method bad. Notice that the Guava library (which everyone should be using) provides an Iterators class with exactly the behavior you detail above, called size() to conform with the Collections Framework. The burden is then on the user of such methods to be aware of what sort of data they're working with, and avoid making careless calls such as trying to count the number of results in an Iterator that they know cannot be replaced.
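To make the "counting consumes the iterator" point concrete, here is a minimal sketch. The length() name is borrowed from the snippet above, but its body is an assumption about what such a method would look like, not the asker's actual code:

```java
import java.util.Iterator;
import java.util.List;

class IteratorCountDemo {
    // Assumed implementation: counts the remaining elements by stepping
    // through the iterator, which leaves it exhausted.
    static int length(Iterator<?> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Iterator<String> it = List.of("a", "b", "c").iterator();
        System.out.println(length(it));   // 3
        System.out.println(it.hasNext()); // false: counting used the iterator up
    }
}
```

With a List this is harmless because a fresh iterator is one call away. With a one-shot source the elements that were counted are simply gone.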

How to iterate through changing collection

If you index into the Dependencies collection there wouldn't be an Enumerator created and you can loop through as the collection is modified. This approach could easily cause headaches if the new items are not appended to the end of the list or items are removed.

for (int i = 0; i < mgr.Dependencies.Count; i++)
{
    var item = mgr.Dependencies[i];
    if (item.depends.Length > 0)
    {
        // code unchanged
    }
}

A safe approach would be to use a Queue and populate it with the initial items from mgr.Dependencies and then Enqueue any additional items you want to process.

var toBeProcessed = new Queue<Dependency>(mgr.Dependencies);
while (toBeProcessed.Count > 0)
{
    var item = toBeProcessed.Dequeue();

    // loop

    // if a new dependency gets added that needs processing, just add it to the queue.
    toBeProcessed.Enqueue(newissue1);

}
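The same work-queue idea carries over to other languages. Below is a small Java sketch of it; the "halve the number" rule is an invented stand-in for "processing an item may produce new items", and the class and method names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class WorklistDemo {
    // Processes every item in the queue; an item greater than 1 schedules
    // item / 2 as follow-up work. Returns the items in processing order.
    static List<Integer> process(List<Integer> initial) {
        Deque<Integer> toBeProcessed = new ArrayDeque<>(initial);
        List<Integer> visited = new ArrayList<>();
        while (!toBeProcessed.isEmpty()) {
            int item = toBeProcessed.removeFirst();
            visited.add(item);
            if (item > 1) {
                // New work goes to the back of the queue; no iterator is
                // involved, so there is nothing to invalidate.
                toBeProcessed.addLast(item / 2);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(process(List.of(4, 3))); // [4, 3, 2, 1, 1]
    }
}
```

The loop ends only when the queue drains, so the rule that generates new work has to terminate eventually, as the halving does here.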

How to modify a Collection while iterating using for-each loop without ConcurrentModificationException?

Use Iterator#remove.

This is the only safe way to modify a collection during iteration. For more information, see The Collection Interface tutorial.

If you also need the ability to add elements while iterating, use a ListIterator.
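A short sketch of both calls, assuming an ArrayList of integers (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

class IteratorRemoveDemo {
    // Removes odd numbers through the iterator itself.
    static List<Integer> removeOdd(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 != 0) {
                it.remove(); // removes the element just returned by next()
            }
        }
        return list;
    }

    // Inserts a 0 after every element through a ListIterator.
    static List<Integer> padWithZeros(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            it.next();
            it.add(0); // inserted behind the cursor, so next() will not return it
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeOdd(List.of(1, 2, 3, 4, 5))); // [2, 4]
        System.out.println(padWithZeros(List.of(1, 2, 3)));    // [1, 0, 2, 0, 3, 0]
    }
}
```

Note that the enhanced for loop hides its iterator, so both patterns need an explicit while loop.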

Concepts of modifying the element in collection while iterating?


Question 1: So can I say the enhanced for loop also uses the fail-fast iterator internally? Though when I execute the code below it works fine.

Yes, that's right. Have a look at the compiled code with the javap command to verify this if you like.

My guess is that modification here means removal or addition, not updating an element inside the collection, for the List interface, while for the Set interface it also includes modification of an element. Right? At least that is the case in the programs I tried.

That's right, if you do emp1.setEmpId(2) or something similar, the iteration will not fail.

...it should throw the concurrent modification exception but it did not. Not sure why?

It only throws the exception if you modify the list itself. Keep in mind that the list contains references to objects. If you modify the objects, the references do not change, thus the list does not change.
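The difference can be seen side by side in a small sketch. The Employee class below is a guess at the asker's type, built around the setEmpId call mentioned above. The fail-fast check is documented as best-effort, but for an ArrayList modified like this it triggers on the next iteration step:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {
    // Hypothetical element type, modelled on emp1.setEmpId(2) from the question.
    static class Employee {
        int empId;
        Employee(int empId) { this.empId = empId; }
        void setEmpId(int empId) { this.empId = empId; }
    }

    static List<Employee> sample() {
        return new ArrayList<>(List.of(new Employee(1), new Employee(2), new Employee(3)));
    }

    // Adding to the list inside the enhanced for loop is a structural change.
    static boolean structuralChangeThrows() {
        List<Employee> list = sample();
        try {
            for (Employee e : list) {
                list.add(new Employee(99));
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    // Changing a field of an element leaves the list's references untouched.
    static int firstIdAfterUpdate() {
        List<Employee> list = sample();
        for (Employee e : list) {
            e.setEmpId(e.empId + 10);
        }
        return list.get(0).empId;
    }

    public static void main(String[] args) {
        System.out.println(structuralChangeThrows()); // true
        System.out.println(firstIdAfterUpdate());     // 11
    }
}
```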

How to use foreach if iterating collection is changed in this loop?

You can force the duplication of the collection, so that a new iterator is created on the duplicate instead of the original.
The easiest way to duplicate the collection is the extension method ToArray(), which creates a generic array out of the list. This will not copy the objects but only the references to them.

foreach(myClass obj in myList.ToArray())
{
...
}

This means that you will always loop over the items that were in the collection before any changes happened: nothing added inside the loop is visited, and nothing removed inside the loop is skipped.
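The snapshot trick is not tied to C#. A rough Java equivalent (names invented for illustration) copies the list with the ArrayList copy constructor and iterates the copy while modifying the original:

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotIterationDemo {
    // Appends an "!"-suffixed copy of every original item to the list.
    static List<String> duplicateAll(List<String> items) {
        List<String> list = new ArrayList<>(items);
        // Iterate over a shallow copy: only the references are copied,
        // so the loop sees the items as they were before any change.
        for (String s : new ArrayList<>(list)) {
            list.add(s + "!");
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(duplicateAll(List.of("a", "b"))); // [a, b, a!, b!]
    }
}
```

The items added inside the loop are never visited, which is exactly why the loop terminates.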

Why can't you modify a collection in a for each loop

The best answer would be that List has some kind of tracking over its items and can update them as you ask, but a plain IEnumerable does not, so it will not allow you to change them.


