Why Does Filters in Swift Iterate the Collection Twice

Why does filters in Swift iterate the collection twice?

Most probably filter is implemented to first count the number of elements it needs to store and then, after using that number to size the allocation of storage for a new array, looping again to copy the ones he needs to keep.

The fact that it loops only once if you always return false means that it optimizes away the second cycle iff the result is empty.

You may want to radar this as a bug but it probably is "working as designed": arrays are not lists, after all.

Filtering multiple times in Swift

You certainly can chain together filters. From your comments, I don't think you need the any references to self in the second or third filters:

@objc private func startSearch()
{
// You don't need [weak self]
bluetoothProvider.startScanning()
.filter
{ newPeripheral in
return !self.scannedPeripherals.contains {
$0.peripheral.identifier == newPeripheral.peripheral.identifier
}
}.filter { $0.rssi.intValue >= 0 }
.filter { $0.peripheral.identifier.uuidString.contains("0863") }
}

Since you're not composing the filters dynamically and not mapping to different types in between, you'll get slightly better performance by condensing them into one filter.

@objc private func startSearch()
{
bluetoothProvider.startScanning().filter
{ newPeripheral in
!self.scannedPeripherals.contains {
$0.peripheral.identifier == newPeripheral.peripheral.identifier
}
&& newPeripheral.rssi >= 0
&& peripheral.identifier.uuidString.contains("0863")
}
}

Since you're new to Swift, I'd like to make a recommendation for learning closures. Start by writing a named function that you pass to filter (or other methods that take a closure parameter). Get that to work. Once you have the named function working, then turn it into a closure.

You could even take it a step further, by writing out a loop that does what you want it to do. The body of the loop is essentially what will go into your closure. Then move the loop's body into a named function, so your loop just calls it for each element. When that works, change the loop into a call to filter (or map or forEach), and pass your named function to it. When that works, change your function into a closure.

Over time you'll become comfortable enough with closures that the intermediate refractoring won't be needed. But at first it's helpful because it starts with something you already understand and works toward the new thing.

Just in case you haven't picked up on it yet, I'll also mention that where you see $0, that's a reference to the first parameter passed to a closure when the closure's definition doesn't assign a name to it. In your the first closure, ignoring the unnecessary [weak self], you start with newPeripheral in. That gives a name to the parameter for that closure, so you can refer to it by newPeripheral, but then when you call contains you're passing another closure nested inside the first one. That nested closure doesn't name its parameter, so it uses $0 to refer to it. For closures that take more than one parameter, there can also be $1, $2, etc...

Also, if you're familiar with the idea of "callback" functions in C, that's basically what closures are often used for, except unlike C function pointers, they don't have to refer to a named function, but can be defined on the fly, and closures can capture variables from the surrounding scope.

weak

Although your question wasn't actually about weak and guard you mention that you didn't understand them so I thought I'm explain them briefly.

[weak self] is "capture syntax" - basically it's a way of telling the compiler how to import symbols from the surrounding context into the closure you're passing to filter. In this case you're capturing self as a weak reference.

To understand what a weak reference is, it's useful to know what a reference is. A reference is Swift's way to referring to something that's dynamically allocated in the heap. In that way, it's like the main use of pointers in C. All of the primitive types, plus Array, Dictionary, String, Set and any struct or enum you define are all value types, so it doesn't make sense to take a weak reference to them. In fact, the only reference that occurs for them is when they're passed as an inout parameter.

Unless you venture into the realm of the UnsafePointer family of types, the only reference types you're likely to use in Swift are some kind of class.

A normal reference, often called a strong reference, increases a reference count for the thing being referred to. When it goes out of scope it decrements the reference count. When the reference count goes to 0, the thing it refers to is deinitialized and deallocated. That's the heart of Swift's memory management.

However, there are circumstances where you can create strong reference cycles. Imagine a tree data structure where children keep a reference to their parents. In that case, if the root node goes out of scope, the tree, if it has more than just a root node, would never be deallocated, because the reference counts never go to 0, because the children hold strong references to the parent.

weak references are the way to solve that problem, because they don't increase the reference count, so in the tree data structure, you'd want each node to have strong references to its children, but a weak reference to its parent. That way, when the root of the tree goes out of scope, the whole tree is deallocated. Same idea would be true for a doubly-linked list. Each node would have a strong reference for its next node, and weak reference for its previous node (or the other way around, as long as there's only one strong reference).

weak references use the Optional syntax because the thing they refer to might not exist when you want to access it.

Strong reference cycles can also happen with a type of closure called an "escaping" closure. Sometimes you'll see @escaping in the declaration for functions that take closure parameters, often for completion handlers. An "escaping" closure is one that is stored away to be called later. In that case any strong references to self inside the closure could prevent the object its referring to from being deinitialized and deallocated as long as the closure is being kept somewhere. Basically a memory leak.

filter doesn't take an @escaping closure. It doesn't store the closure, it calls it before returning, and then it's done with it. So its declaration doesn't include @escaping, and there's no need to capture a weak reference to self in this case.

I didn't mention unowned references, but they're kind of like weak references that are implicitly unwrapped.

guard

guard is basically an inverted if statement with the added requirement that its else block must return, throw or call a function that returns Never, which is how a function indicates that it never returns, either because it terminates the program, like fatalError() does, or because it enters an infinite loop for some reason. In loops, guard's else block can also break or continue, and in switch statements, it can break or fallthrough.

You mention in comments that you've done embedded C. C programs often have "guard" ifs to exit early for error conditions, or for easy cases to move them out of the main logic.

int find(char x, const char* s)
{
if (s == NULL) return -1; // This is a guard "if" in C

for(int i = 0; s[i] != 0; ++i) {
if (s[i] == x) return i;
}

return -1;
}

Basically

guard condition else { return }

is equivalent to

if !condition { return }

Think of guard as saying that its condition must be true in order to get to the main code that follows, otherwise control passes to the else block, which has to somehow get out of the current context.

Often you could use either one with equal ease, but if you combine guard or if with optional binding, there is a difference.

guard let x = foo() else { return }

// x is defined from here on

if let y = foo() {
// y is defined here
}

// y is not defined here

Why does filter(_:)’s predicate get called so many times when evaluating it lazily?

The Problems

The first culprit here is the slicing of the lazy filter collection through the use of prefix(_:) – which, in this case, returns a BidirectionalSlice of the LazyFilterBidirectionalCollection.

In general, the slicing of a Collection entails the storing of the base collection, along with the range of indices that are valid for the slice to ‘view’. Therefore, in order to create a slice of a LazyFilterBidirectionalCollection to view the first 5 elements, the range of indices stored must be startIndex ..< indexAfterTheFifthElement.

In order to get indexAfterTheFifthElement, the LazyFilterBidirectionalCollection must iterate through the base collection (numbers) in order to find the 6th element that meets the predicate (you can see the exact implementation of the indexing here).

Therefore all the elements in the range 0...15 from the above example need to be checked against the predicate simply in order to create a slice of the lazy filter collection.

The second culprit is Array’s init(_:), which accepts a Sequence of elements of the same type as the array’s Element type. The implementation of this initialiser calls _copyToContiguousArray() on the sequence, which, for most sequences, forwards the call to this function:

internal func _copySequenceToContiguousArray<S : Sequence>
(_ source: S) -> ContiguousArray<S.Iterator.Element> {

let initialCapacity = source.underestimatedCount // <- problem here
var builder =
_UnsafePartiallyInitializedContiguousArrayBuffer<S.Iterator.Element>(
initialCapacity: initialCapacity)

var iterator = source.makeIterator()

// FIXME(performance): use _copyContents(initializing:).
// Add elements up to the initial capacity without checking for regrowth.
for _ in 0..<initialCapacity {
builder.addWithExistingCapacity(iterator.next()!)
}

// Add remaining elements, if any.
while let element = iterator.next() {
builder.add(element)
}

return builder.finish()
}

The problem here is underestimatedCount. For plain sequences, this just has a default implementation that returns 0 – however, for collections, it has a default implementation which gets the count of the collection (I go into this in more detail here).

The default implementation for the count of a Collection (which BidirectionalSlice will use here) is simply:

public var count: IndexDistance {
return distance(from: startIndex, to: endIndex)
}

Which, for our slice, will walk through the indices up to indexAfterTheFifthElement, thus re-evaluating the elements in the range 0...15, again.

Finally, an iterator of the slice is made, and is iterated through, adding each element in the sequence to the new array’s buffer. For a BidirectionalSlice, this will use an IndexingIterator, which simply works by advancing indices and outputting the element for each index.

The reason why this walk over the indices doesn’t re-evaluate the elements up to the first element of the result (note in the question’s example, 0 is evaluated one time less) is due to the fact that it doesn’t directly access the startIndex of the LazyFilterBidirectionalCollection, which has to evaluate all elements up to the first element in the result. Instead, the iterator can work from the start index of the slice itself.


The Solution

The simple solution is to avoid slicing the lazy filter collection in order to get its prefix, but instead apply the prefixing lazily.

There are actually two implementations of prefix(_:). One is provided by Collection, and returns a SubSequence (which is some form of slice for most standard library collections).

The other is provided by Sequence, that returns an AnySequence – which, under the hood uses a base sequence of _PrefixSequence, which simply takes an iterator and allows iteration through it until a given number of elements have been iterated over – then just stops returning elements.

For lazy collections, this implementation of prefix(_:) is great, as it doesn’t require any indexing – it just lazily applies the prefixing.

Therefore, if you say:

let result : AnySequence = numbers.lazy
.filter {
// gets called 1x per element :)
print("Calling filter for: \($0)")
return $0 % 3 == 0
}
.prefix(5)

The elements of numbers (up to the 5th match) will only be evaluated once by filter(_:)’s predicate when passed to Array's initialiser, as you’re forcing Swift to use Sequence’s default implementation of prefix(_:).

The foolproof way of preventing all indexing operations on a given lazy filter collection is to simply use a lazy filter sequence instead – this can be done by just wrapping the collection you wish to perform lazy operations on in an AnySequence:

let numbers = Array(0 ..< 50)

let result = AnySequence(numbers).lazy
.filter {
// gets called 1x per element :)
print("Calling filter for: \($0)")
return $0 % 3 == 0
}
.dropFirst(5) // neither of these will do indexing,
.prefix(5) // but instead return a lazily evaluated AnySequence.

print(Array(result)) // [15, 18, 21, 24, 27]

However note that for bidirectional collections, this may have an adverse effect for operations on the end of the collection – as then the entire sequence will have to be iterated through in order to reach the end.

For such operations as suffix(_:) and dropLast(_:), it may well be more efficient to work with a lazy collection over a sequence (at least for small inputs), as they can simply index from the end of the collection.

Although, as with all performance related concerns, you should first check to see if this is even a problem in the first place, and then secondly run your own tests to see which method is better for your implementation.


Conclusion (TL;DR)

So, after all of that – the lesson to learn here is that you should be wary of the fact that slicing a lazy filter collection can re-evaluate every element of the base collection up to the end index which the slice can ‘view’.

Often it’s more desirable to treat a lazy filter collection as a sequence instead, which cannot be indexed, therefore meaning that lazy operations cannot evaluate any elements (doing so would risk destructively iterating over them) until an eager operation happens.

However, you should then be wary of the fact that you’re potentially sacrificing being able to index the collection from the end, which is important for operations such as suffix(_:).

Finally, it’s worth noting that this isn't a problem with lazy views such as LazyMapCollection, as their elements don’t depend on the ‘results’ of previous elements – therefore they can be indexed in constant time if their base collection is a RandomAccessCollection.

Remove duplicates from array of Int in Swift (for-in-loop)

The problematic part is to iterate over an array and update that array at the same time. In this case, removing an element while iterating.

Removing an element decreases array length (count) and also changes indices. Therefore in

for j in i + 1 ..< array.count  {
if array[i] == array[j] {
newArray.remove(at: j)
}
}

After removing the first index, your other indices become invalid. Note that count is always read only once, before the actual iteration.

This is one of the reasons why removing elements during iteration is dangerous and complicated. You could fix it by maintaining the number of removed elements and update indices accordingly. Or you can iterate backwards:

var newArray = array

for i in (0 ..< newArray.count - 1).reversed() {
for j in (i + 1 ..< newArray.count).reversed() {
if newArray[i] == newArray[j] {
newArray.remove(at: j)
}
}
}
return newArray

You are still changing indices and count but since you are iterating backwards, you only change the indices that have been already used.

In general it's simple and safer to build a new array instead of updating the current one:

var newArray: [Int] = []

for value in array {
if !newArray.contains(value) {
newArray.append(value)
}
}

return newArray

which can be very reduced in complexity (performance) by using a Set to keep added elements:

var newArray: [Int] = []
var foundElements: Set<Int> = []

for value in array {
if foundElements.insert(value).inserted {
newArray.append(value)
}
}

return newArray

Which can be simplified using filter:

var foundElements: Set<Int> = []
return array.filter { foundElements.insert($0).inserted }

How to loop through an optional collection

You have a couple of options. You can provide an empty array as the default value for elements in a for ... in loop:

let elements: [Element]?

for element in elements ?? [] {

}

Or you can use forEach and optional chaining on elements

elements?.forEach { element in

}

Bear in mind that optional collections are an anti-pattern in Swift. You can represent the lack of value using an empty collection, so wrapping a collection in an optional doesn't provide any extra value, while complicating the interface.

Sum Double from filtered fetch request?

It seems that you have incorrectly written the code above. In your ForEach, you have a variable value which must be a collection (and I presume the result of your FetchRequest), but it is not defined anywhere. For the sake of this answer, I am going to assume that the first use of value is actually supposed to be singer.

@Fogmeister is correct as to the problem you are having with regard to the error messages. You have to have a View and your code did not supply one. However, you remain with another, deeper problem with trying to use reduce()

The problem you are having using reduce() is the way you have your CoreData model set up. You have an Entity entitled FeedList that has attributes of date and feedquant.

When you get in to your ForEach, singer is a collection, but value is an individual managed object that is simply a String and a Double, but you have to use reduce on a collection, not a Double.

You need to rethink your CoreData implementation. Personally, I would have two entities with a relationship. The first would be Date and the second would be FeedQuant. FeedQuant would have an attribute value which was a Double. I would then make a one to many relationship between Date and Feedquant. From Date I would call it hasFeedquant, and the inverse would be hasDate.

Your FetchRequest would remain the same except the entity would be a Date entity, but now you would get a relationship hasFeedquant that is a collection, and you could then use reduce on that.

Your code would then be:

struct FilteredList: View {
var fetchRequest: FetchRequest<FeedList>
var feedDate: FetchedResults<Date> {
fetchRequest.wrappedValue
}
var total: Double = 0.0

var body: some View {
ForEach(feedDate, id: \.self) {value in
// value.hasFeedquant is a Set<Feedquant> and so is reducible
Text("Feed quantity is: \(value.hasFeedquant.reduce(0) { $0 + $1.value)") }
}
}

init(filter: String) {
fetchRequest = FetchRequest<Date>(
entity: FeedList.entity(), sortDescriptors: [NSSortDescriptor(keyPath: \Date.compdate, ascending: false)],
predicate: NSPredicate(format: "compdate == %@", filter)
)
}
} //: View

How to escape quotation ' in object filter Swift 4

Try filtering using an NSPredicate object, and also I think BEGINSWITH would be easier to use than LIKE in this case. See the Realm documentation for examples and links to additional documentation and cheatsheets for NSPredicate use.

let predicate = NSPredicate(format: "word BEGINSWITH %@", Phrase)
let Words = realm.objects(Word.self).filter(predicate).sorted(byKeyPath: "word", ascending: true)

Filter vs For, which one is cheaper to use?

you already answer you're question.
if you do this split by filter its take more time because every item in array will be checked. I recommend u to read about "Time complexity".
imagine that you have an array of a million or more elements, which means that when using the filter, the processor will go through 2 million operations, and the usual run is only 1 million. and if the condition is more complicated then you can square the number of operations
so the best way in this case will be

customArray.forEach{$0 % 2 == 0 ? evenArrayV1.append($0) : oddArrayV1.append($0)}

and be like swift, use forEach instead of forin



Related Topics



Leave a reply



Submit