Explain Swift Iterators
Yes, the next()
method of AnyIterator
calls the given closure.
And in your code, that closure returns the same first element on each call, because it does not remember what elements have been returned already.
If Swift had a yield
statement like Python or C# then things would be easier: you could yield segment
or yield i
and are done.
But – unfortunately? – Swift has no yield
statement, which means
that the closure must explicitly manage some state in order to resume the iteration
with the next element on each call.
One possibility would be to maintain two indices, one for
the current segment and one for the current element within
a segment if that is an array:
func chain(_ segments: Any...) -> AnyIterator<Int> {
var currentSegment = 0 // index of current segment
var currentElement = 0 // index of current element within current segment
return AnyIterator<Int> {
while currentSegment < segments.count {
let next = segments[currentSegment]
switch next {
case let value as Int:
currentSegment += 1
return value
case let segment as [Int]:
if currentElement < segment.count {
let val = segment[currentElement]
currentElement += 1
return val
}
currentSegment += 1
currentElement = 0
default:
return nil
}
}
return nil
}
}
This can be generalized to arbitrarily nested arrays:
func chain(_ segments: Any...) -> AnyIterator<Int> {
var stack: [(Any, Int)] = [(segments, 0)]
return AnyIterator<Int> {
while let (next, idx) = stack.popLast() {
switch next {
case let value as Int:
return value
case let segments as [Any]:
if idx < segments.count {
stack.append((segments, idx + 1))
stack.append((segments[idx], 0))
}
default:
return nil
}
}
return nil
}
}
The still-to-be-processed arrays are on the stack together with
their current index. The arrays themselves are not modified,
so that the copy is cheap.
Example:
let G = chain([1, 2, [3]], 4, [5, 6, [], 7])
while let g = G.next() {
print(g)
}
// 1 2 3 4 5 6 7
See also Implementing recursive generator for simple tree structure in Swift for more
approaches to recursively enumerate a tree-like structure.
What is iterator.element in swift?
Iterator.Element
is the easiest to understand here. The generic parameter S
must be a type that conforms to Sequence
, as you've specified here:
func computeScoreIncrement<S : Sequence>(_ pastTurnsReversed: S) -> Int
// ^^^^^^^^^^^^^^
Therefore, S.Iterator.Element
refers to the type of sequence that S
is. If, say, S
is inferred to be [Int]
, then S.Iterator.Element
is Int
- [Int]
is a sequence of Int
.
Now onto the where Turn == S.Iterator.Element
part.
As mentioned above, S
must be a type that conforms to Sequence
, but that's not all of the constraints! S.Iterator.Element
must also be the same type as Turn
. You didn't show how Turn
is defined. It may be a generic parameter of the enclosing class, a class, struct or enum.
Thus, I can pass a [Turn]
to this method, to instances of some other type that is a sequence of Turn
s.
Iterator Int in Swift
The equivalent of Iterator is Sequence.
func put<S: Sequence>(path: S) where S.Element == Int
Of course you don't need the where
clause if you don't care about the type of the elements.
Typically a trie would itself be generic over its own Element, so it'd be something like
struct Trie<Element: Comparable> {
func put<S: Sequence>(path: S) where S.Element == Element { ... }
}
If you really want to accept iterators directly, you can, but this is pretty rare. It would be along these lines:
func put<I: IteratorProtocol>(path: inout I) where I.Element == Int
To directly make use of an iterator, you need to modify it, and that means you need it to be inout
(or you could return a copy of it, but can be dangerous since advancing multiple copies of an iterator is undefined). This is really unusual, though. You likely mean Sequence
.
Why are Swift iterators slower than array building?
You asked "why are its [Swift] generators so slow, even slower than Python’s in some cases?"
My answer is that I do not think that they are nearly as slow as your results may indicate. In particular, I will try to demonstrate that looping through an iterator should be faster than constructing an array for all your test cases.
In earlier work (see a related blog post at http://lemire.me/blog/2016/09/22/swift-versus-java-the-bitset-performance-test/), I found that Swift iterators were about half as fast as the equivalent in Java when working over a bitset class. That's not great but Java is very efficient in this respect. Meanwhile, Go does worse. I submit to you that Swift iterators are probably not ideally efficient, but they are probably within a factor of two of what is possible with raw C code. And the performance gap probably has to do with insufficient function inlining in Swift.
I see that you are using an AnyIterator
. I suggest deriving a struct
from IteratorProtocol
instead which has the benefit of insuring that there does not have to be any dynamic dispatch. Here is a relatively efficient possibility:
public struct FastFlattenIterator: IteratorProtocol {
let segments: [Any]
var i = 0 // top-level index
var j = 0 // second-level index
var jmax = 0 // essentially, this is currentarray.count, but we buffer it
var currentarray : [Int]! // quick reference to an int array to be flatten
init(_ segments: [Any]) {
self.segments = segments
}
public mutating func next() -> Int? {
if j > 0 { // we handle the case where we iterate within an array separately
let val = currentarray[j]
j += 1
if j == jmax {
j = 0
i += 1
}
return val
}
while i < segments.count {
switch segments[i] {
case let e as Int: // found an integer value
i += 1
return e
case let E as [Int]: // first encounter with an array
jmax = E.count
currentarray = E
if jmax > 0 {
j = 1
return E[0]
}
i += 1
default:
return nil
}
}
return nil
}
}
With this class, I am getting the following numbers. For each test case, the first four approaches are taken from your code sample whereas the last two (fast iterator) are build using the new struct. Notice that "Looping through a fast iterator" is always fastest.
test case 1 (small mixed input)
Filling an array : 0.0073099999999999997 ms
Filling an array, and looping through it : 0.0069870000000000002 ms
Looping through a generator : 0.18385799999999999 ms
Materializing the generator to an array : 0.18745700000000001 ms
Looping through a fast iterator : 0.005372 ms
Materializing the fast iterator : 0.015883999999999999 ms
test case 2 (large input, [Int] s)
Filling an array : 2.125931 ms
Filling an array, and looping through it : 2.1169820000000001 ms
Looping through a generator : 15.064767 ms
Materializing the generator to an array : 15.45152 ms
Looping through a fast iterator : 1.572919 ms
Materializing the fast iterator : 1.964912 ms
test case 3 (large input, Int s)
Filling an array : 2.9140269999999999 ms
Filling an array, and looping through it : 2.9064290000000002 ms
Looping through a generator : 9.8297640000000008 ms
Materializing the generator to an array : 9.8297640000000008 ms
Looping through a fast iterator : 1.978038 ms
Materializing the fast iterator : 2.2565339999999998 ms
You will find my complete code sample on GitHub: https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/iterators
Difference between Enumerator and Iterator in Swift or Objective C
The distinction between the two in programming is not always large and you'll find cases where the words are used interchangeably, however they do describe something different. You wrote:
As per my understating iterator performs iteration/looping over collection. e.g forEach, for loop, while, do while etc. But I do not what is enumerator.
The key point of iteration is just "iteration/looping" – simply repeating something, e.g. the Newton-Raphson method is an iterative algorithm for approximating a square-root and is typically implemented in a programming language using a loop.
Enumeration is listing things out, that's where your "over collection" comes from. So the for
/in
loop (an iterative construct) enumerates a collection. The standard collections all provide enumerators which return NSEnumerator
objects, and you typically, but not always, would use these in an iterative construct to process the collection.
Enumerated types, enum
's, are so called because they list out all the values, i.e. enumerate them.
So your interviewer might have been expecting you to state that iteration is supported in Swift/Objective-C by the iterative statements of the languages (for
, while
etc.), while enumeration is provided by objects/methods which either return each item in turn (e.g. NSArray
's objectEnumerator
) or which process each element in turn (e.g. NSArray
's enumerateObjectsUsingBlock:
). The for
/in
combines these two, it is an iterative construct which directly uses an object enumerator rather than requiring the programmer to use the enumerator directly.
HTH
Java-like Iterator in Swift
A SequenceType
(which CollectionType
, and thus all Swift collections including array, conform to) is pretty simple. It requires you provide a generate()
function that returns a type conforming to GeneratorType
.
GeneratorType
in turn only needs to provide one method: a next()
that returns each element until the elements are exhausted. It returns an optional, returning nil
after the last element is returned. This makes them pretty similar to Java iterators, only with next
and hasNext
combined into one via use of optionals.
Swift’s for…in
is really syntactic sugar for a combination of getting a generator and then repeatedly calling next
on it:
let a = [1, 2, 3]
for i in a { print(i) }
// is equivalent to:
var g = a.generate()
// the generator, being stateful, must be declared with var
while let i = g.next() { print(i) }
If using generators like this, take note of the comment above the definition of GeneratorType
in the std lib doc:
Encapsulates iteration state and interface for iteration over a
sequence.
- Note: While it is safe to copy a generator, advancing one
copy may invalidate the others.
Since writing a generator for a collection often involves a lot of boiler plate, there is a helper type, IndexingGenerator
, that can be used. This implements a generator that starts at startIndex
, and returns the value at that index and increments the index each time. A generate()
that returns an IndexingGenerator
is provided as the default implementation for CollectionType
, which means if this is good enough for your purposes, you don’t need to implement generate
when implementing a collection:
struct Bitfield: CollectionType {
let data: UInt
var startIndex: UInt { return 0 }
var endIndex: UInt { return UInt(sizeofValue(data)*8) }
subscript(idx: UInt) -> Bit {
return (data >> idx) & 1 == 0
? Bit.Zero : Bit.One
}
// no need to implement generate()
}
This default was added in Swift 2.0. Prior to that, you had to provide a minimal generator that just returned an IndexingGenerator(self)
.
Iterating with for .. in on a changing collection
The documentation for IteratorProtocol says "whenever you use a for-in loop with an array, set, or any other collection or sequence, you’re using that type’s iterator." So, we are guaranteed that a for in
loop is going to be using .makeIterator()
and .next()
which is defined most generally on Sequence
and IteratorProtocol
respectively.
The documentation for Sequence says that "the Sequence
protocol makes no requirement on conforming types regarding whether they will be destructively consumed by iteration." As a consequence, this means that an iterator for a Sequence
is not required to make a copy, and so I do not think that modifying a sequence while iterating over it is, in general, safe.
This same caveat does not occur in the documentation for Collection, but I also don't think there is any guarantee that the iterator makes a copy, and so I do not think that modifying a collection while iterating over it is, in general, safe.
But, most collection types in Swift are struct
s with value semantics or copy-on-write semantics. I'm not really sure where the documentation for this is, but this link does say that "in Swift, Array
, String
, and Dictionary
are all value types... You don’t need to do anything special — such as making an explicit copy — to prevent other code from modifying that data behind your back." In particular, this means that for Array
, .makeIterator()
cannot hold a reference to your array because the iterator for Array
does not have to "do anything special" to prevent other code (i.e. your code) from modifying the data it holds.
We can explore this in more detail. The Iterator
type of Array
is defined as type IndexingIterator<Array<Element>>
. The documentation IndexingIterator says that it is the default implementation of the iterator for collections, so we can assume that most collections will use this. We can see in the source code for IndexingIterator
that it holds a copy of its collection
@frozen
public struct IndexingIterator<Elements: Collection> {
@usableFromInline
internal let _elements: Elements
@usableFromInline
internal var _position: Elements.Index
@inlinable
@inline(__always)
/// Creates an iterator over the given collection.
public /// @testable
init(_elements: Elements) {
self._elements = _elements
self._position = _elements.startIndex
}
...
}
and that the default .makeIterator()
simply creates this copy.
extension Collection where Iterator == IndexingIterator<Self> {
/// Returns an iterator over the elements of the collection.
@inlinable // trivial-implementation
@inline(__always)
public __consuming func makeIterator() -> IndexingIterator<Self> {
return IndexingIterator(_elements: self)
}
}
Although you might not want to trust this source code, the documentation for library evolution claims that "the @inlinable
attribute is a promise from the library developer that the current definition of a function will remain correct when used with future versions of the library" and the @frozen
also means that the members of IndexingIterator
cannot change.
Altogether, this means that any collection type with value semantics and an IndexingIterator
as its Iterator
must make a copy when using using for in
loops (at least until the next ABI break, which should be a long-way off). Even then, I don't think Apple is likely to change this behavior.
In Conclusion
I don't know of any place that it is explicitly spelled out in the docs "you can modify an array while you iterate over it, and the iteration will proceed as if you made a copy" but that's also the kind of language that probably shouldn't be written down as writing such code could definitely confuse a beginner.
However, there is enough documentation lying around which says that a for in
loop just calls .makeIterator()
and that for any collection with value semantics and the default iterator type (for example, Array
), .makeIterator()
makes a copy and so cannot be influenced by code inside the loop. Further, because Array
and some other types like Set
and Dictionary
are copy-on-write, modifying these collections inside a loop will have a one-time copy penalty as the body of the loop will not have a unique reference to its storage (because the iterator will). This is the exact same penalty that modifying the collection outside the loop with have if you don’t have a unique reference to the storage.
Without these assumptions, you aren't guaranteed safety, but you might have it anyway in some circumstances.
Edit:
I just realized we can create some cases where this is unsafe for sequences.
import Foundation
/// This is clearly fine and works as expected.
print("Test normal")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x {
print(i)
}
}
/// This is also okay. Reassigning `x` does not mutate the reference that the iterator holds.
print("Test reassignment")
for _ in 0...10 {
var x: NSMutableArray = [0,1,2,3]
for i in x {
x = []
print(i)
}
}
/// This crashes. The iterator assumes that the last index it used is still valid, but after removing the objects, there are no valid indices.
print("Test removal")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x {
x.removeAllObjects()
print(i)
}
}
/// This also crashes. `.enumerated()` gets a reference to `x` which it expects will not be modified behind its back.
print("Test removal enumerated")
for _ in 0...10 {
let x: NSMutableArray = [0,1,2,3]
for i in x.enumerated() {
x.removeAllObjects()
print(i)
}
}
The fact that this is an NSMutableArray
is important because this type has reference semantics. Since NSMutableArray
conforms to Sequence
, we know that mutating a sequence while iterating over it is not safe, even when using .enumerated()
.
How safe are swift collections when used with invalidated iterators / indices?
As you say, #1 is not an issue. You do not have a pointer to the object in Swift. You either have its value or a reference to it. If you have its value, then it's a copy. If you have a reference, then it's protected. So there's no issue here.
But let's consider the second and experiment, be surprised, and then stop being surprised.
var xs = [1,2,3,4]
for x in xs { // (1)
if x == 2 {
xs.removeAll() // (2)
}
print(x) // Prints "1\n2\n3\n\4\n"
}
xs // [] (3)
Wait, how does it print all the values when we blow away the values at (2). We are very surprised now.
But we shouldn't be. Swift arrays are values. The xs
at (1) is a value. Nothing can ever change it. It's not "a pointer to memory that includes an array structure that contains 4 elements." It's the value [1,2,3,4]
. At (2), we don't "remove all elements from the thing xs
pointed to." We take the thing xs is, create an array that results if you remove all the elements (that would be []
in all cases), and then assign that new array to xs
. Nothing bad happens.
So what does the documentation mean by "invalidates all indices?" It means exactly that. If we generated indices, they're no good anymore. Let's see:
var xs = [1,2,3,4]
for i in xs.indices {
if i == 2 {
xs.removeAll()
}
print(xs[i]) // Prints "1\n2\n" and then CRASH!!!
}
Once xs.removeAll()
is called, there's no promise that the old result of xs.indices
means anything anymore. You are not permitted to use those indices safely against the collection they came from.
"Invalidates indices" in Swift is not the same as C++'s "invalidates iterators." I'd call that pretty safe, except the fact that using collection indices is always a bit dangerous and so you should avoid indexing collections when you can help it; iterate them instead. Even if you need the indexes for some reason, use enumerate
to get them without creating any of the danger of indexing.
(Side note, dict["key"]
is not indexing into dict
. Dictionaries are a little confusing because their key is not their index. Accessing dictionaries by their DictionaryIndex
index is just as dangerous as accessing arrays by their Int
index.)
Note also that the above doesn't apply to NSArray
. If you modify NSArray
while iterating it, you'll get a "mutated collection while iterating" error. I'm only discussing Swift data types.
EDIT: for-in
is very explicit in how it works:
The generate() method is called on the collection expression to obtain a value of a generator type—that is, a type that conforms to the GeneratorType protocol. The program begins executing a loop by calling the next() method on the stream. If the value returned is not None, it is assigned to the item pattern, the program executes the statements, and then continues execution at the beginning of the loop. Otherwise, the program does not perform assignment or execute the statements, and it is finished executing the for-in statement.
The returned Generator
is a struct
and contains a collection value. You would not expect any changes to some other value to modify its behavior. Remember: [1,2,3]
is no different than 4
. They're both values. When you assign them, they make copies. So when you create a Generator over a collection value, you're going to snapshot that value, just like if I created a Generator over the number 4. (This raises an interesting problem, because Generators aren't really values, and so really shouldn't be structs. They should be classes. Swift stdlib has been fixing that. See the new AnyGenerator
for instance. But they still contain an array value, and you would never expect changes to some other array value to impact them.)
See also "Structures and Enumerations Are Value Types" which goes into more detail on the importance of value types in Swift. Arrays are just structs.
Yes, that means there's logically copying. Swift has many optimizations to minimize actual copying when it's not needed. In your case, when you mutate the dictionary while it's being iterated, that will force a copy to happen. Mutation is cheap if you're the only consumer of a particular value's backing storage. But it's O(n) if you're not. (This is determined by the Swift builtin isUniquelyReferenced()
.) Long story short: Swift Collections are Copy-on-Write, and simply passing an array does not cause real memory to be allocated or copied.
You don't get COW for free. Your own structs are not COW. It's something that Swift does in stdlib. (See Mike Ash's great discussion of how you would recreate it.) Passing your own custom structs causes real copies to happen. That said, the majority of the memory in most structs is stored in collections, and those collections are COW, so the cost of copying structs is usually pretty small.
The book doesn't spend a lot of time drilling into value types in Swift (it explains it all; it just doesn't keep saying "hey, and this is what that implies"). On the other hand, it was the constant topic at WWDC. You may be interested particularly in Building Better Apps with Value Types in Swift which is all about this topic. I believe Swift in Practice also discussed it.
EDIT2:
@KarlP raises an interesting point in the comments below, and it's worth addressing. None of the value-safety promises we're discussing are related to for-in
. They're based on Array
. for-in
makes no promises at all about what would happen if you mutated a collection while it is being iterated. That wouldn't even be meaningful. for-in
doesn't "iterate over collections," it calls next()
on Generators
. So if your Generator
becomes undefined if the collection is changed, then for-in
will blow up because the Generator
blew up.
That means that the following might be unsafe, depending on how strictly you read the spec:
func nukeFromOrbit<C: RangeReplaceableCollectionType>(var xs: C) {
var hijack = true
for x in xs {
if hijack {
xs.removeAll()
hijack = false
}
print(x)
}
}
And the compiler won't help you here. It'll work fine for all of the Swift collections. But if calling next()
after mutation for your collection is undefined behavior, then this is undefined behavior.
My opinion is that it would be poor Swift to make a collection that allows its Generator
to become undefined in this case. You could even argue that you've broken the Generator
spec if you do (it offers no UB "out" unless the generator has been copied or has returned nil). So you could argue that the above code is totally within spec and your generator is broken. Those arguments tend to be a bit messy with a "spec" like Swift's which doesn't dive into all the corner cases.
Does this mean you can write unsafe code in Swift without getting a clear warning? Absolutely. But in the many cases that commonly cause real-world bugs, Swift's built-in behavior does the right thing. And in that, it is safer than some other options.
Trying to get generic code from working Iterator pattern
Solution found!
struct Whatevers<T> {
let whatevers: [T]
}
extension Whatevers: Sequence {
func makeIterator() -> WhateversIterator<T> {
return WhateversIterator(sequence: whatevers, current: 0)
}
}
struct WhateversIterator<T>: IteratorProtocol {
let sequence: [T]
var current = 0
mutating func next() -> T? {
defer { current += 1 }
return sequence.count > current ? sequence[current] : nil
}
}
All the mistakes were about returning types from functions makeIterator and next.
Hope somebody will find it helpful!
Related Topics
Swift: Oslog/Os_Log Not Showing Up in Console App
Xcode 9 Fails to Build Swift 4 Project with Pod
Error with Parse Query Findobjectsinbackgroundwithblock
Swift Spritekit I Detect a Collison But It Reads the Collision Mulitple Times
Displaying State of an Async API Call in Swiftui
Decode Dictionary with Random Initial Key
Cannot Call Value of Non-Function Type 'Nshttpurlresponse' Alamofire Objectmapper
Swift Seems to Be Slower as Objective-C in Loops
How to Encode and Decode the Closures
Swiftui Pick a Value from a List with Ontap Gesture
Connecting Hc-05 with iPhone Se iOS(V11.0)
Cabasicanimation Creates Empty Default Value Copy of Calayer
Why Doesn't Swift Force My Designated Initializer to Call Super
Concatenate Literal with Optional String
Swift, for Some Uiviews to Their Overall Controller When Clicked