Swift Semantics Regarding Dictionary Access

dict["key"]?.change() // Copy

is semantically equivalent to:

if var value = dict["key"] {
    value.change() // Copy
    dict["key"] = value
}

The value is pulled out of the dictionary, unwrapped into a temporary, mutated, and then placed back into the dictionary.

Because there are now two references to the underlying buffer (one from our local temporary value, and one from the COWStruct instance in the dictionary itself), the mutation forces a copy of the underlying Buffer instance, as it's no longer uniquely referenced.
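
To see this in action, here is a minimal sketch. The question's COWStruct and Buffer definitions aren't reproduced here, so the versions below (including the copyCount counter) are assumptions made purely for illustration:

```swift
// Hypothetical stand-ins for the question's types (assumed, not the original code).
final class Buffer {
    var storage: [Int]
    init(_ storage: [Int] = []) { self.storage = storage }
}

struct COWStruct {
    private var buffer = Buffer()
    // Illustration only: counts how often change() had to copy the buffer.
    private(set) var copyCount = 0

    mutating func change() {
        // Copy the buffer first if anything else still references it.
        if !isKnownUniquelyReferenced(&buffer) {
            buffer = Buffer(buffer.storage)
            copyCount += 1
        }
        buffer.storage.append(1)
    }
}

var dict = ["key": COWStruct()]

// The expanded form: 'value' and the dictionary both reference the same Buffer.
if var value = dict["key"] {
    value.change() // the buffer is shared at this point, so it gets copied
    dict["key"] = value
}

print(dict["key"]!.copyCount) // 1
```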

So, why doesn't

array[0].change() // No Copy

do the same thing? Surely the element should be pulled out of the array, mutated and then stuck back in, replacing the previous value?

The difference is that, unlike Dictionary's subscript, which consists of a getter and a setter, Array's subscript consists of a getter and a special accessor called mutableAddressWithPinnedNativeOwner.

What this special accessor does is return a pointer to the element in the array's underlying buffer, along with an owner object to ensure that the buffer isn't deallocated from under the caller. Such an accessor is called an addressor, as it deals with addresses.

Therefore when you say:

array[0].change()

you're mutating the element directly in the array's storage, rather than a temporary copy of it.
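
A quick way to check this, again using hypothetical stand-ins for COWStruct and Buffer (assumed definitions, with a copyCount counter added only for illustration):

```swift
// Hypothetical stand-ins for the question's types (assumed, not the original code).
final class Buffer {
    var storage: [Int]
    init(_ storage: [Int] = []) { self.storage = storage }
}

struct COWStruct {
    private var buffer = Buffer()
    private(set) var copyCount = 0 // illustration only

    mutating func change() {
        if !isKnownUniquelyReferenced(&buffer) {
            buffer = Buffer(buffer.storage)
            copyCount += 1
        }
        buffer.storage.append(1)
    }
}

var array = [COWStruct()]

// The element is mutated in place, so its Buffer stays uniquely referenced.
array[0].change()

print(array[0].copyCount) // 0
```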

Such an addressor cannot be directly applied to Dictionary's subscript because it returns an Optional, and the underlying value isn't stored as an optional. So it currently has to be unwrapped with a temporary, as we cannot return a pointer to the value in storage.

In Swift 3, you can avoid copying your COWStruct's underlying Buffer by removing the value from the dictionary before mutating the temporary:

if var value = dict["key"] {
    dict["key"] = nil
    value.change() // No Copy
    dict["key"] = value
}

This works because only the temporary now has a view onto the underlying Buffer instance.

And, as @dfri points out in the comments, this can be reduced down to:

if var value = dict.removeValue(forKey: "key") {
    value.change() // No Copy
    dict["key"] = value
}

saving on a hashing operation.
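
As a rough check of the removal trick, the sketch below uses assumed definitions of COWStruct and Buffer (the copyCount counter is hypothetical and only there to make the copy observable):

```swift
// Hypothetical stand-ins for the question's types (assumed, not the original code).
final class Buffer {
    var storage: [Int]
    init(_ storage: [Int] = []) { self.storage = storage }
}

struct COWStruct {
    private var buffer = Buffer()
    private(set) var copyCount = 0 // illustration only

    mutating func change() {
        if !isKnownUniquelyReferenced(&buffer) {
            buffer = Buffer(buffer.storage)
            copyCount += 1
        }
        buffer.storage.append(1)
    }
}

var dict = ["key": COWStruct()]

// Once the value has been removed, 'value' is the only owner of its Buffer.
if var value = dict.removeValue(forKey: "key") {
    value.change() // unique reference, so no copy is needed
    dict["key"] = value
}

print(dict["key"]!.copyCount) // 0
```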

Additionally, for convenience, you may want to consider making this into an extension method:

extension Dictionary {
    mutating func withValue<R>(
        forKey key: Key, mutations: (inout Value) throws -> R
    ) rethrows -> R? {
        guard var value = removeValue(forKey: key) else { return nil }
        defer {
            updateValue(value, forKey: key)
        }
        return try mutations(&value)
    }
}

// ...

dict.withValue(forKey: "key") {
    $0.change() // No copy
}

In Swift 4, you should be able to use the values property of Dictionary in order to perform a direct mutation of the value:

if let index = dict.index(forKey: "key") {
    dict.values[index].change()
}

This is possible because the values property now returns a special Dictionary.Values mutable collection that has a subscript with an addressor (see SE-0154 for more info on this change).

However, currently (with the version of Swift 4 that ships with Xcode 9 beta 5), this still makes a copy. This is due to the fact that both the Dictionary and Dictionary.Values instances have a view onto the underlying buffer – as the values computed property is just implemented with a getter and setter that passes around a reference to the dictionary's buffer.

So calling the addressor triggers a copy of the dictionary's buffer, which leaves two views onto COWStruct's Buffer instance and in turn triggers a copy of it when change() is called.

I have filed a bug over this here. (Edit: This has now been fixed on master with the unofficial introduction of generalised accessors using coroutines, so will be fixed in Swift 5 – see below for more info).


In Swift 4.1, Dictionary's subscript(_:default:) now uses an addressor, so we can efficiently mutate values so long as we supply a default value to use in the mutation.

For example:

dict["key", default: COWStruct()].change() // No copy

The default: parameter uses @autoclosure such that the default value isn't evaluated if it isn't needed (such as in this case where we know there's a value for the key).
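
The lazy evaluation of the default can be observed with a small sketch like the following, where makeDefault is a hypothetical helper that counts how often the default value is actually built:

```swift
func demo() -> (dict: [String: [Int]], evaluations: Int) {
    var evaluations = 0

    // Hypothetical helper: records each time the default value is constructed.
    func makeDefault() -> [Int] {
        evaluations += 1
        return []
    }

    var dict = ["key": [1, 2]]
    dict["key", default: makeDefault()].append(3)   // key exists: default not evaluated
    dict["other", default: makeDefault()].append(4) // key missing: default evaluated once
    return (dict, evaluations)
}

let result = demo()
print(result.evaluations)                         // 1
print(result.dict["key"]!, result.dict["other"]!) // [1, 2, 3] [4]
```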


Swift 5 and beyond

With the unofficial introduction of generalised accessors in Swift 5, two new underscored accessors have been introduced, _read and _modify which use coroutines in order to yield a value back to the caller. For _modify, this can be an arbitrary mutable expression.

The use of coroutines is exciting because it means that a _modify accessor can now perform logic both before and after the mutation. This allows them to be much more efficient when it comes to copy-on-write types, as they can for example deinitialise the value in storage while yielding a temporary mutable copy of the value that's uniquely referenced to the caller (and then reinitialising the value in storage upon control returning to the callee).

The standard library has already updated many previously inefficient APIs to make use of the new _modify accessor – this includes Dictionary's subscript(_:) which can now yield a uniquely referenced value to the caller (using the deinitialisation trick I mentioned above).

The upshot of these changes is that:

dict["key"]?.change() // No copy

will be able to mutate the value without having to make a copy in Swift 5 (you can even try this out for yourself with a master snapshot).
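
To give a feel for the "deinitialise, yield, reinitialise" idea, here is a rough sketch of a _modify accessor on a hypothetical Box type. Box is not a standard library type, and _modify is an underscored, unofficial feature that could change, so treat this as an illustration of the shape rather than how the standard library actually implements it:

```swift
// Hypothetical single-value container, for illustration only.
struct Box<Value> {
    private var storage: Value?

    init(_ value: Value) { storage = value }

    var value: Value {
        get { return storage! }
        _modify {
            // Move the value out of storage so that 'temp' is its only holder...
            var temp = storage!
            storage = nil
            // ...and put it back once the caller has finished mutating it.
            defer { storage = temp }
            yield &temp
        }
    }
}

var box = Box([1, 2, 3])
box.value.append(4) // goes through _modify, mutating 'temp' in place
print(box.value)    // [1, 2, 3, 4]
```

The logic before the yield runs before the caller's mutation, and the defer block runs after it, which is what a plain getter and setter pair can't express in a single access.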

Swift dictionary wrong access

A dictionary lookup always returns an optional, so you have to unwrap it before using:

print(a?.email)
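
The question's own types aren't shown here, so the following sketch uses a hypothetical User struct and users dictionary to illustrate the usual ways of unwrapping a lookup result:

```swift
// Hypothetical model and data, assumed for illustration.
struct User {
    let email: String
}

let users = ["a": User(email: "a@example.com")]

// Optional binding: the body only runs if the key exists.
if let a = users["a"] {
    print(a.email) // a@example.com
}

// Optional chaining combined with a fallback for a missing key.
let missing = users["b"]?.email ?? "no such user"
print(missing) // no such user
```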

Suggested reading: Optionals and Dictionaries

Make a Swift dictionary where the key is Type ?

Unfortunately, it's currently not possible for metatype types to conform to protocols (see this related question on the matter) – so CellThing.Type does not, and cannot, currently conform to Hashable. This therefore means that it cannot be used directly as the Key of a Dictionary.

However, you can create a wrapper for a metatype, using ObjectIdentifier in order to provide the Hashable implementation. For example:

/// Hashable wrapper for a metatype value.
struct HashableType<T> : Hashable {

    static func == (lhs: HashableType, rhs: HashableType) -> Bool {
        return lhs.base == rhs.base
    }

    let base: T.Type

    init(_ base: T.Type) {
        self.base = base
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(base))
    }
    // Pre Swift 4.2:
    // var hashValue: Int { return ObjectIdentifier(base).hashValue }
}
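
For context on why ObjectIdentifier works here: it is itself Hashable and can be built from any metatype. The sketch below (the counts dictionary is just an assumed example) shows it being compared and used directly as a key, which is a simpler option if you don't need the key to carry the static type:

```swift
class CellThing {}
class A: CellThing {}
class B: CellThing {}

// The same metatype always yields an equal identifier; different ones don't.
let sameType = ObjectIdentifier(A.self) == ObjectIdentifier(A.self)
let differentType = ObjectIdentifier(A.self) == ObjectIdentifier(B.self)
print(sameType, differentType) // true false

// Using ObjectIdentifier directly as a dictionary key.
var counts: [ObjectIdentifier: Int] = [:]
counts[ObjectIdentifier(A.self), default: 0] += 1
counts[ObjectIdentifier(A.self), default: 0] += 1
counts[ObjectIdentifier(B.self), default: 0] += 1
print(counts[ObjectIdentifier(A.self)]!) // 2
```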

You can then also provide a convenience subscript on Dictionary that takes a metatype and wraps it in a HashableType for you:

extension Dictionary {
    subscript<T>(key: T.Type) -> Value? where Key == HashableType<T> {
        get { return self[HashableType(key)] }
        set { self[HashableType(key)] = newValue }
    }
}

which you could then use like so:

class CellThing {}
class A : CellThing {}
class B : CellThing {}

var recycle: [HashableType<CellThing>: [CellThing]] = [:]

recycle[A.self] = [A(), A(), A()]
recycle[B.self] = [B(), B()]

print(recycle[A.self]!) // [A, A, A]
print(recycle[B.self]!) // [B, B]

This should also work fine for generics, you would simply subscript your dictionary with T.self instead.


Unfortunately one disadvantage of using a subscript with a get and set here is that you'll incur a performance hit when working with dictionary values that are copy-on-write types such as Array (such as in your example). I talk about this issue more in this Q&A.

A simple operation like:

recycle[A.self]?.append(A())

will trigger an O(N) copy of the array stored within the dictionary.

Generalised accessors, which have been implemented as an unofficial language feature in Swift 5, aim to solve this problem. If you are comfortable using an unofficial language feature that could break in a future version (not really recommended for production code), then you could implement the subscript as:

extension Dictionary {
    subscript<T>(key: T.Type) -> Value? where Key == HashableType<T> {
        get { return self[HashableType(key)] }
        _modify {
            yield &self[HashableType(key)]
        }
    }
}

which solves the performance problem, allowing an array value to be mutated in-place within the dictionary.

Otherwise, a simple alternative is to not define a custom subscript, and instead just add a convenience computed property on your type to let you use it as a key:

class CellThing {
    // Convenience static computed property to get the wrapped metatype value.
    static var hashable: HashableType<CellThing> { return HashableType(self) }
}

class A : CellThing {}
class B : CellThing {}

var recycle: [HashableType<CellThing>: [CellThing]] = [:]

recycle[A.hashable] = [A(), A(), A()]
recycle[B.hashable] = [B(), B()]

print(recycle[A.hashable]!) // [A, A, A]
print(recycle[B.hashable]!) // [B, B]

How to prevent from modifying a copy of dictionary

The issue is that MyClass is a reference type. When you copy the dictionary, it does, truly, make a new copy of the dictionary, but the new copy has references to the same instances of MyClass that the original dictionary has. Changes made to a copy of a reference to an instance of MyClass anywhere, whether it is inside a dictionary or any other value type, will be reflected in any other reference to that same instance of MyClass.

Basically, the dictionary is a value type, which means it has value semantics. But the values in the dictionary are reference types, so they have reference semantics. The only way around this is to create a dictionary with new instances of MyClass for every key in the dictionary. Or, as @EricD suggested, use structs instead of classes to get the value semantics that you want.
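
A small sketch of both behaviours, using a hypothetical MyClass with a single count property (the real class from the question isn't shown here):

```swift
// Hypothetical reference type, assumed for illustration.
final class MyClass {
    var count: Int
    init(count: Int) { self.count = count }
}

let original = ["a": MyClass(count: 1)]

// Copying the dictionary copies the references, not the instances.
let shallow = original
shallow["a"]?.count = 2
print(original["a"]!.count) // 2 (same instance)

// Building new instances for every key gives an independent copy.
let deep = original.mapValues { MyClass(count: $0.count) }
deep["a"]?.count = 99
print(original["a"]!.count) // 2 (unaffected)
```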

How to adapt dictionary's values for reading in swift?

According to your last edit:

struct AdaptedDict<Key: Hashable, Value> {

    private var origin: UnsafeMutablePointer<[Key: Value]>
    private let transform: (Value) -> Value

    init(_ origin: inout [Key: Value], transform: @escaping (Value) -> Value) {
        self.origin = UnsafeMutablePointer(&origin)
        self.transform = transform
    }

    subscript(_ key: Key) -> Value? {
        if let value = origin.pointee[key] {
            return transform(value)
        }
        return nil
    }

}

var origin = ["A": 10, "B": 20]
var adaptedDict = AdaptedDict(&origin) { $0 * 2 }
print(origin["A"], adaptedDict["A"])
origin["A"] = 20
print(origin["A"], adaptedDict["A"])

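So basically you store a pointer to the original dictionary rather than a copy of it. Note that a pointer obtained by passing &origin is only guaranteed to be valid for the duration of that call, so holding on to it relies on behaviour that Swift doesn't guarantee.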
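
If storing a raw pointer is a concern, one alternative is to read the dictionary through a closure each time instead. This is only a sketch of that idea, not the approach from the answer above: ClosureAdaptedDict and Store are hypothetical names, and it assumes the dictionary lives somewhere with reference semantics (here, a class property):

```swift
// Hypothetical owner of the original dictionary.
final class Store {
    var values: [String: Int]
    init(values: [String: Int]) { self.values = values }
}

// Hypothetical read-only adapter that looks up the current dictionary on each access.
struct ClosureAdaptedDict<Key: Hashable, Value> {
    private let source: () -> [Key: Value]
    private let transform: (Value) -> Value

    init(source: @escaping () -> [Key: Value],
         transform: @escaping (Value) -> Value) {
        self.source = source
        self.transform = transform
    }

    subscript(key: Key) -> Value? {
        // Fetch the latest dictionary, then adapt the value if the key exists.
        return source()[key].map(transform)
    }
}

let store = Store(values: ["A": 10, "B": 20])
let adapted = ClosureAdaptedDict<String, Int>(
    source: { store.values },
    transform: { $0 * 2 }
)

let before = adapted["A"]
store.values["A"] = 20
let after = adapted["A"]
print(before!, after!) // 20 40
```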

Swift enormous dictionary of arrays, very slow

This is a fairly common performance trap, as also observed in:

  • Dictionary in Swift with Mutable Array as value is performing very slow? How to optimize or construct properly?
  • Swift semantics regarding dictionary access

The issue stems from the fact that the array you're mutating in the expression self.map[term]!.append(...) is a temporary mutable copy of the underlying array in the dictionary's storage. This means that the array is never uniquely referenced and so always has its buffer re-allocated.

This situation will be fixed in Swift 5 with the unofficial introduction of generalised accessors, but until then, one solution (as mentioned in both the above Q&As) is to use Dictionary's subscript(_:default:) which from Swift 4.1 can mutate the value directly in storage.

Your case isn't quite a straightforward case of applying a single mutation, though, so you need some kind of wrapper function in order to allow you to have scoped access to your mutable array.

For example, this could look like:

class X {

    private var map: [String: [Posting]] = [:]

    private func withPostings<R>(
        forTerm term: String, mutations: (inout [Posting]) throws -> R
    ) rethrows -> R {
        return try mutations(&map[term, default: []])
    }

    func addTerm(_ term: String, withId id: Int, atPosition position: Int) {

        withPostings(forTerm: term) { postings in
            if let posting = postings.last, posting.documentId == id {
                posting.addPosition(position)
            } else {
                postings.append(Posting(withId: id, atPosition: position, forTerm: term))
            }
        }

    }
    // ...
}

Does adding a key value pair to a dictionary change the pointer's address?

Dictionaries in Swift are value types, not reference types like NSDictionary. Updates to the copy won't be reflected in the original. Here's a minimal example:

var a = ["name": "John", "location": "Chicago"]
var b = a

b["title"] = "developer"

print(a) // a does not contain the 'title' key
print(b)

You need to update the original after updating the copy. You can delve into things like UnsafeMutablePointer<T>, but it's a dark road down there.
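
Two safe ways of getting a change back into the original are sketched below. The addTitle function is a hypothetical helper used only to show the inout form:

```swift
// Hypothetical helper: mutates the caller's dictionary through inout.
func addTitle(to person: inout [String: String]) {
    person["title"] = "developer"
}

var a = ["name": "John", "location": "Chicago"]

// Option 1: pass the original with & so it is the one being mutated.
addTitle(to: &a)
print(a["title"]!) // developer

// Option 2: mutate a copy, then assign it back to the original.
var b = a
b["location"] = "Boston"
a = b
print(a["location"]!) // Boston
```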


