Dictionary in Swift With Mutable Array as Value Is Performing Very Slow? How to Optimize or Construct Properly

Dictionary in Swift with Mutable Array as value is performing very slow? How to optimize or construct properly?

Copy on write is a tricky thing, and you need to think carefully about how many things are sharing a structure that you're trying to modify. The culprit is here.

countToColorMap[colorCount]?.append(CountedColor(color: color as! UIColor, colorCount: colorCount))

This is generating a temporary value that is modified and put back into the dictionary. Since two "things" are looking at the same underlying data structure (the dictionary, and append), it forces a copy-on-write.

The secret to fixing this is to make sure that there's only one copy when you modify it. How? Take it out of the dictionary. Replace this:

if countToColorMap[colorCount] != nil {
countToColorMap[colorCount]?.append(CountedColor(color: color as! UIColor, colorCount: colorCount))
} else {
countToColorMap[colorCount] = [CountedColor(color: color as! UIColor, colorCount: colorCount)]
}

which has a runtime of:

Elapsed Time: 74.2517465990022
53217

with this:

var countForColor = countToColorMap.removeValue(forKey: colorCount) ?? []
countForColor.append(CountedColor(color: color as! UIColor, colorCount: colorCount))
countToColorMap[colorCount] = countForColor

which has a runtime of:

Elapsed Time: 0.370953808000195
53217

Swift enormous dictionary of arrays, very slow

This is fairly common performance trap, as also observed in:

  • Dictionary in Swift with Mutable Array as value is performing very slow? How to optimize or construct properly?
  • Swift semantics regarding dictionary access

The issue stems from the fact that the array you're mutating in the expression self.map[term]!.append(...) is a temporary mutable copy of the underlying array in the dictionary's storage. This means that the array is never uniquely referenced and so always has its buffer re-allocated.

This situation will fixed in Swift 5 with the unofficial introduction of generalised accessors, but until then, one solution (as mentioned in both the above Q&As) is to use Dictionary's subscript(_:default:) which from Swift 4.1 can mutate the value directly in storage.

Although your case isn't quite a straightforward case of applying a single mutation, so you need some kind of wrapper function in order to allow you to have scoped access to your mutable array.

For example, this could look like:

class X {

private var map: [String: [Posting]] = [:]

private func withPostings<R>(
forTerm term: String, mutations: (inout [Posting]) throws -> R
) rethrows -> R {
return try mutations(&map[term, default: []])
}

func addTerm(_ term: String, withId id: Int, atPosition position: Int) {

withPostings(forTerm: term) { postings in
if let posting = postings.last, posting.documentId == id {
posting.addPosition(position)
} else {
postings.append(Posting(withId: id, atPosition: position, forTerm: term))
}
}

}
// ...
}

Why is Swift Dictionary MUCH Slower than Array during Memoization

Part of your problem is that your dictionary implementation wipes out previous values that exist for a given amount n every time you add a new value with that amount. You should be adding the value to the inner dictionary instead of replacing the inner dictionary with a new one containing only the new value.

Try this:

func memoize(result:Int, amount n:Int, numberOfCoins size:Int) {
// Get inner dictionary or use an empty one if there isn't one yet
var inner = memDict[n] ?? [:]

// Add the value to the inner dictionary
inner[size] = result

// Replace inner dictionary with the updated one
memDict[n] = inner
}

Alternatively, you can do it like this:

func memoize(result:Int, amount n:Int, numberOfCoins size:Int) {
if memDict[n] != nil {
memDict[n]![size] = result
} else {
memDict[n] = [size : result]
}
}

It is left as an exercise to the reader to figure out which one is faster.

Swift arrays extremely slow (when used in Dictionaries)

Well this issue is mainly cause because of the .append function that it has to create the location and then fill it up,

you can make that slightly faster if you have an idea of the size of the array by giving it a size and that will allocate the space for it, instead of creating allocating and then filling it up
trying this code gave me a slightly faster result.

var arr = [Int].init(repeating: 0, count: 1_000_000)

for i in 0..<1_000_000 {
arr[i] = i
}

This was using a playground, however playground are extremely slow performance compared to a command-line tool or an actual project,
but however this code is slightly faster on playground.

Swift's Dictionary is slow even with -Ofast

TL;DR It's Beta.

I would think that the answer right now is just that Swift is in beta, the tools are in beta, and a lot of optimizations are yet to be done. Replicating your "Holder" class example in Obj-C shows that even it is quite a bit faster at the same -Ofast level.

@import Foundation;

@interface Holder : NSObject

@property NSMutableDictionary *dictionary;
- (void)storeValue:(NSInteger)value forKey:(NSString *)key;

@end

@implementation Holder

- (instancetype)init {
self = [self initWithDict];
return self;
}


- (instancetype)initWithDict {
if (!self) {
self = [super init];
_dictionary = [NSMutableDictionary dictionary];
}

return self;
}

- (void)storeValue:(NSInteger)value forKey:(NSString *)key {
[self.dictionary setValue:@(value) forKey:key];
}

@end

int main(int argc, const char * argv[]) {

Holder *holder = [Holder new];

for (NSInteger i = 0; i < 5000; i++) {
[holder storeValue:i forKey:[NSString stringWithFormat:@"%ld", i]];
}

}

The Obj-C is fast out of the gate.

time ./loop 

real 0m0.013s
user 0m0.006s
sys 0m0.003s

The similarities in time to the NoHolder example you give is a good indication at how much optimization the Obj-C compiler is doing.

Taking a look at the assembly for the -O3 and -Ofast levels in Swift show there is a big difference in the amount of safety checking done. Looking at the Obj-C assembly shows that, well, there is just a lot less of it to be performed. Since the key to making a program fast is to make it not need to do much…

OS-X-Dos-Equis:~ joshwisenbaker$ wc -l objc.txt 
159 objc.txt
OS-X-Dos-Equis:~ joshwisenbaker$ wc -l oFast.txt
3749 oFast.txt

(Edit: Update with results of finalizing the Holder class.)

So another interesting wrinkle is the use of the @final decoration on the class definition. If you know that your class is never going to be subclassed then try adding the keyword like this: @final class Holder

As you can see it also normalizes the performance when compiled the same way.

OS-X-Dos-Equis:~ joshwisenbaker$ swift -sdk $(xcrun --show-sdk-path --sdk macosx) -Ofast bench.swift && time ./bench

real 0m0.013s
user 0m0.007s
sys 0m0.003s

Even using just -O3 the @final works magic.

OS-X-Dos-Equis:~ joshwisenbaker$ swift -sdk $(xcrun --show-sdk-path --sdk macosx) -O3  bench.swift && time ./bench

real 0m0.015s
user 0m0.009s
sys 0m0.003s

Again, I think the differences you are seeing in performance is probably down to the current optimization levels when compiled.

How to use a mutable array as dictionary value in a performant way?

Note that both Swift array and NSMutableArray under the hood kinda do the same things: when they have no more allocated space for new items, they allocate a new buffer. And the new buffer allocation logic should be fairly similar in both Swift and Objective-C.

The problem with your Swift array code, as Alexander pointed out in his comment, is the fact that you create a local copy of the array that is stored into the dictionary. And as soon as you mutate that local array, the copy-on-write mechanism triggers a copy of the array buffer, which results into another heap allocation. And the bigger the array, the more it takes to duplicate the buffer.

The nonlinear growth you see is due to the fact that the array buffer increases don't occur that often, as the array implementations have a pretty good heuristic for the capacity increase, in order to avoid allocations every time a new element is added. However the buffer duplication due to copy-on-write happens at every iteration.

You will get better results with the Swift arrays if you mutate them in place:

for i in 0..<1_000_000 {
let key = i % 7
dict[key, default: [Int]()].append(i)
}

Both measurements and Instruments show a dramatically improved performance with the above change, with the Swift code being faster than the Objective-C one.

Swiftier Swift for 'add to array, or create if not there...'

Swift 4 update:

As of Swift 4, dictionaries have a subscript(_:default:) method, so that

dict[key, default: []].append(newElement)

appends to the already present array, or to an empty array. Example:

var dict: [String: [Int]] = [:]
print(dict["foo"]) // nil

dict["foo", default: []].append(1)
print(dict["foo"]) // Optional([1])

dict["foo", default: []].append(2)
print(dict["foo"]) // Optional([1, 2])

As of Swift 4.1 (currently in beta) this is also fast,
compare Hamish's comment here.


Previous answer for Swift <= 3: There is – as far as I know – no way to "create or update" a dictionary
value with a single subscript call.

In addition to what you wrote, you can use the nil-coalescing operator

dict[key] = (dict[key] ?? []) + [elem]

or optional chaining (which returns nil if the append operation
could not be performed):

if dict[key]?.append(elem) == nil {
dict[key] = [elem]
}

As mentioned in SE-0154 Provide Custom Collections for Dictionary Keys and Values and also by @Hamish in the comments, both methods
make a copy of the array.

With the implementation of SE-0154 you will be able to mutate
a dictionary value without making a copy:

if let i = dict.index(forKey: key) {
dict.values[i].append(elem)
} else {
dict[key] = [key]
}

At present, the most efficient solution is given by Rob Napier
in Dictionary in Swift with Mutable Array as value is performing very slow? How to optimize or construct properly?:

var array = dict.removeValue(forKey: key) ?? []
array.append(elem)
dict[key] = array

A simple benchmark confirms that "Rob's method" is the fastest:

let numKeys = 1000
let numElements = 1000

do {
var dict: [Int: [Int]] = [:]

let start = Date()
for key in 1...numKeys {
for elem in 1...numElements {
if dict.index(forKey: key) == nil {
dict[key] = []
}
dict[key]!.append(elem)

}
}
let end = Date()
print("Your method:", end.timeIntervalSince(start))
}

do {
var dict: [Int: [Int]] = [:]

let start = Date()
for key in 1...numKeys {
for elem in 1...numElements {
dict[key] = (dict[key] ?? []) + [elem]
}
}
let end = Date()
print("Nil coalescing:", end.timeIntervalSince(start))
}


do {
var dict: [Int: [Int]] = [:]

let start = Date()
for key in 1...numKeys {
for elem in 1...numElements {
if dict[key]?.append(elem) == nil {
dict[key] = [elem]
}
}
}
let end = Date()
print("Optional chaining", end.timeIntervalSince(start))
}

do {
var dict: [Int: [Int]] = [:]

let start = Date()
for key in 1...numKeys {
for elem in 1...numElements {
var array = dict.removeValue(forKey: key) ?? []
array.append(elem)
dict[key] = array
}
}
let end = Date()
print("Remove and add:", end.timeIntervalSince(start))
}

Results (on a 1.2 GHz Intel Core m5 MacBook) for 1000 keys/1000 elements:


Your method: 0.470084965229034
Nil coalescing: 0.460215032100677
Optional chaining 0.397282958030701
Remove and add: 0.160293996334076

And for 1000 keys/10,000 elements:


Your method: 14.6810429692268
Nil coalescing: 15.1537700295448
Optional chaining 14.4717089533806
Remove and add: 1.54668599367142

How to change value of an element in dictionary of arrays?

If you know the index of the number you would like to change out of the three, you can change the number 3 directly using the subscripts ["xx"]?["x1"]?[2].

var myArray = [
"xx": [
"x1": [1, 2, 3],
"x2": [4, 5, 6],
"x3": [7, 8, 9]
],
"yy": [
"y1": [10, 11, 12],
"y2": [13, 14, 15],
"y3": [16, 17, 18]
]
]

array["xx"]?["x1"]?[2] = 4


Related Topics



Leave a reply



Submit