Swift: Always Copies on Inout

Swift: always copies on inout?

I'm posting this on behalf of Joe Groff, a Swift compiler developer, who kindly replied on Twitter to my tweet mentioning this question.

He says:

Inout has value-result semantics. The didSet receives the modified value at the end of the inout. It only optimizes to pass-by-reference if the difference is unobservable (modulo invalid aliasing). The Swift book is supposed to be updated with this info too.
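A minimal sketch of the behavior described in the quote. The `Model` class and the two helper functions are hypothetical, not from the original question. It suggests that the observer fires once, when the inout access ends, whether the callee assigns several times or not at all:

```swift
// Hypothetical model type used only for illustration.
final class Model {
    var didSetCount = 0
    var value = 0 {
        didSet { didSetCount += 1 }
    }
}

// Mutates its argument twice.
func bumpTwice(_ x: inout Int) {
    x += 1
    x += 1
}

// Never touches its argument.
func leaveAlone(_ x: inout Int) {}

let model = Model()

bumpTwice(&model.value)
print(model.value, model.didSetCount)  // 2 1

leaveAlone(&model.value)
print(model.didSetCount)               // 2
```

Two assignments inside the callee produce a single didSet, and a callee that assigns nothing still produces one.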

Swift inout how to not copy back property when not changed, to not trigger objects setters

This is how inout works. You can't change that. inout literally means "copy the value into the function at the start and copy the value out of the function at the end." It doesn't do any analysis to decide whether the value was touched at runtime.

One solution is to check for trivial sets in the observer, for example:

var someAttr: String? {
    didSet {
        guard someAttr != oldValue else { return }
        ...
    }
}

As another approach, I suggest keypaths. Assuming that the database object is a reference type (class), I believe the following will do what you want:

func importStringAttribute(_ json: JSON, _ key: String, db: Database,
                           attr: ReferenceWritableKeyPath<Database, String?>) {
    if !json[key].exists() {
        return
    }
    if let v = json[key].string, v != db[keyPath: attr] {
        db[keyPath: attr] = v
    }
}

The call is slightly longer because you need to pass the database itself:

importStringAttribute(json, "someAttr", db: myDBObject, attr: \.someAttr)

That could be made a little prettier by attaching the method to the database (though you still have to pass the database, just as self):

extension Database {
    func importStringAttribute(_ json: JSON, _ key: String,
                               _ attr: ReferenceWritableKeyPath<Database, String?>) {
        if !json[key].exists() {
            return
        }
        if let v = json[key].string, v != self[keyPath: attr] {
            self[keyPath: attr] = v
        }
    }
}

myDBObject.importStringAttribute(json, "someAttr", \.someAttr)

To your question about making this generic over types, that's very straightforward (I just added <Obj: AnyObject> and changed the references to "db" to "obj"):

func importStringAttribute<Obj: AnyObject>(_ json: JSON, _ key: String, obj: Obj,
                           attr: ReferenceWritableKeyPath<Obj, String?>) {
    if !json[key].exists() {
        return
    }
    if let v = json[key].string, v != obj[keyPath: attr] {
        obj[keyPath: attr] = v
    }
}
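The same idea can be tried without a JSON library. This is only a sketch: `Record`, `assignIfChanged` and `observerCalls` are hypothetical names, and the value is assumed to be Equatable. It shows the key-path write being skipped when the value is unchanged, so the setter (and its observer) does not run:

```swift
// Hypothetical record type standing in for the database object.
final class Record {
    var observerCalls = 0
    var name: String? {
        didSet { observerCalls += 1 }
    }
}

// Writes through the key path only when the new value differs from the current one.
func assignIfChanged<Obj: AnyObject, Value: Equatable>(
    _ newValue: Value,
    to obj: Obj,
    at keyPath: ReferenceWritableKeyPath<Obj, Value>
) {
    guard obj[keyPath: keyPath] != newValue else { return }
    obj[keyPath: keyPath] = newValue
}

let record = Record()
let incoming: String? = "Ada"

assignIfChanged(incoming, to: record, at: \.name)  // value differs: setter runs
assignIfChanged(incoming, to: record, at: \.name)  // same value: setter skipped
print(record.observerCalls)  // 1
```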

Swift memory address of outside instance of inout parameter same as copied instance

What's happening is that you're always printing the address of the copy variable inside the printAddress() function. You're not printing the address of the argument you passed in, even though that is what you intended.

The address of the copy variable is always some constant fixed offset past the stack pointer that is current when printAddress() is entered, but the stack pointer changes depending on how deeply nested your code is when printAddress() is called.

To see yet another value, make a function foo() that calls printAddress(), and call foo() from verify().

Again, it's always the memory address of the copy variable you see at the point in time that print() is called.

If you want to print the memory address of the thing passed to printAddress(), you'll need to get rid of the temporary:

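func printAddress<T>(anyObj: inout T, message: String = "") {
    withUnsafePointer(to: &anyObj) {
        // Read the value through the pointer; touching anyObj directly here
        // would overlap with the inout access held by withUnsafePointer.
        print("\(message) value \($0.pointee) has memory address of: \($0)")
    }
}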

Now call:

printAddress(anyObj: &acct, message: "message")

Wherever you call it from, you'll see the same address.

When does the copying take place for swift value types

TL;DR:

So does it mean that the copying actually only takes place when the passed value type is modified?

Yes!

Is there a way to demonstrate that this is actually the underlying behavior?

See the first example in the section on the copy-on-write optimization.

Should I just use NSArray in this case or would the Swift Array work fine
as long as I do not try to manipulate the passed in Array?

If you pass your array as inout, it will typically be passed by reference under the hood,
hence avoiding unnecessary copies.
If you pass your array as a normal parameter,
then the copy-on-write optimization will kick in and you shouldn't notice any performance drop
while still benefiting from more type safety than what you'd get with an NSArray.

Now as long as I do not explicitly make the variables in the function editable
by using var or inout, then the function can not modify the array anyway.
So does it still make a copy?

You will get a "copy", in the abstract sense.
In reality, the underlying storage will be shared, thanks to the copy-on-write mechanism,
hence avoiding unnecessary copies.
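One way to check this for yourself, as a rough sketch (the helper `storageAddress(of:)` is a hypothetical name): compare the address of the array's element buffer at the call site with the address seen inside the callee.

```swift
// Returns the address of the array's element storage as seen inside the callee.
func storageAddress(of array: [Int]) -> UnsafeRawPointer? {
    return array.withUnsafeBytes { $0.baseAddress }
}

let numbers = Array(0..<1_000)
let addressAtCallSite = numbers.withUnsafeBytes { $0.baseAddress }
let addressInCallee = storageAddress(of: numbers)

// Same buffer on both sides of the call: no element was copied.
print(addressAtCallSite == addressInCallee)  // true
```

The pointers are only compared, never dereferenced, since they are not valid outside the withUnsafeBytes closure.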

If the original array is immutable and the function is not using var or inout,
there is no point in Swift creating a copy. Right?

Exactly, hence the copy-on-write mechanism.

So what does Apple mean by the phrase above?

Essentially, Apple means that you shouldn't worry about the "cost" of copying value types,
as Swift optimizes it for you behind the scenes.

Instead, you should just think about the semantics of value types,
which is that you get a copy as soon as you assign them or use them as parameters.
What's actually generated by Swift's compiler is the compiler's business.

Value types semantics

Swift does indeed treat arrays as value types (as opposed to reference types),
along with structures, enumerations and most other built-in types
(i.e. those that are part of the standard library and not Foundation).
At the memory level, these types are actually immutable plain old data objects (POD),
which enables interesting optimizations.
Indeed, they are typically allocated on the stack rather than the heap ^[1]
(https://en.wikipedia.org/wiki/Stack-based_memory_allocation).
This allows the CPU to very efficiently manage them,
and to automatically deallocate their memory as soon as the function exits ^[2],
without the need for any garbage collection strategy.

Values are copied whenever assigned or passed to a function.
This semantics has various advantages,
such as avoiding the creation of unintended aliases,
and making it easier for the compiler to guarantee the lifetime of values
stored in another object or captured by a closure.
We can think about how hard it can be to manage good old C pointers to understand why.

One may think it's an ill-conceived strategy,
as it involves copying every single time a variable is assigned or a function is called.
But as counterintuitive as it may be,
copying small types is usually quite cheap if not cheaper than passing a reference.
After all, a pointer is usually the same size as an integer...

Concerns are however legitimate for large collections (i.e. arrays, sets and dictionaries),
and very large structures to a lesser extent ^[3].
But the compiler has a trick to handle these, namely copy-on-write (see later).
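The size argument can be checked with MemoryLayout. This is a small sketch in which `Point` is a hypothetical struct. It suggests that a small struct is only a couple of machine words, and that the Array value itself is a single pointer-sized handle to its buffer, which is what makes the copy-on-write trick cheap:

```swift
// Hypothetical small value type.
struct Point {
    var x: Double
    var y: Double
}

// An Int is pointer-sized, so copying it costs the same as passing a reference.
print(MemoryLayout<Int>.size == MemoryLayout<UnsafeRawPointer>.size)    // true

// Two Doubles: 16 bytes.
print(MemoryLayout<Point>.size)                                         // 16

// The array value is one reference to heap storage, however many elements it holds.
print(MemoryLayout<[Int]>.size == MemoryLayout<UnsafeRawPointer>.size)  // true
```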

What about `mutating`

Structures can define mutating methods,
which are allowed to mutate the fields of the structure.
This doesn't contradict the fact that value types are nothing more than immutable PODs,
as in fact calling a mutating method is merely syntactic sugar
for reassigning a variable to a brand new value that's identical to the previous one,
except for the fields that were mutated.
The following example illustrates this semantic equivalence:

struct S {
  var foo: Int
  var bar: Int
  mutating func modify() {
    foo = bar
  }
}

var s1 = S(foo: 0, bar: 10)
s1.modify()

// The two lines above do the same as the two lines below:
var s2 = S(foo: 0, bar: 10)
s2 = S(foo: s2.bar, bar: s2.bar)
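The "reassignment" view can also be observed from the outside. In this sketch, `Holder` is a hypothetical class added only to watch writes: calling a mutating method on a stored struct triggers the property observer, just as a plain assignment would, and a `let` constant cannot call the method at all.

```swift
struct S {
    var foo: Int
    var bar: Int
    mutating func modify() { foo = bar }
}

// Hypothetical holder class, used only to observe writes to `s`.
final class Holder {
    var writes = 0
    var s = S(foo: 0, bar: 10) {
        didSet { writes += 1 }
    }
}

let holder = Holder()
holder.s.modify()
print(holder.s.foo, holder.writes)  // 10 1

let frozen = S(foo: 0, bar: 10)
// frozen.modify()  // does not compile: 'frozen' is a 'let' constant
print(frozen.foo)   // 0
```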

Reference types semantics

Unlike value types, reference types are essentially pointers to the heap at the memory level.
Their semantics is closer to what we would get in reference-based languages,
such as Java, Python or JavaScript.
This means they do not get copied when assigned or passed to a function; their address is.
Because the CPU is no longer able to manage the memory of these objects automatically,
Swift uses a reference counter to handle garbage collection behind the scenes
(https://en.wikipedia.org/wiki/Reference_counting).

Such semantics has the obvious advantage of avoiding copies,
as everything is assigned or passed by reference.
The drawback is the danger of unintended aliases,
as in almost any other reference-based language.
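A short sketch of the aliasing difference, using two hypothetical types (`RefCounter` and `ValCounter`) that differ only in being a class or a struct:

```swift
final class RefCounter { var count = 0 }
struct ValCounter { var count = 0 }

// Reference type: both names designate the same object.
let ref1 = RefCounter()
let ref2 = ref1
ref2.count = 5
print(ref1.count)  // 5

// Value type: the assignment produces an independent value.
let val1 = ValCounter()
var val2 = val1
val2.count = 5
print(val1.count)  // 0
```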

What about `inout`

An inout parameter behaves much like a read-write pointer to the expected type
(formally it is copy-in copy-out, which the compiler implements by passing an address
whenever the difference is unobservable).
In the case of value types, it means the function typically won't get a copy of the value,
but a pointer to it,
so mutations inside the function will affect the value passed by the caller (hence the inout keyword).
In other words, this gives value-type parameters a reference semantics in the context of the function:

func f(x: inout [Int]) {
  x.append(12)
}

var a = [0]
f(x: &a)

// Prints '[0, 12]'
print(a)

In the case of reference types, it will make the reference itself mutable,
pretty much as if the passed argument was the address of the address of the object:

import Foundation

func f(x: inout NSArray) {
  x = [12]
}

var a: NSArray = [0]
f(x: &a)

// Prints an NSArray whose only element is 12
print(a)

Copy-on-write

Copy-on-write (https://en.wikipedia.org/wiki/Copy-on-write) is an optimization technique that
can avoid unnecessary copies of mutable variables,
which is implemented on all Swift's built-in collections (i.e. arrays, sets and dictionaries).
When you assign an array (or pass it to a function),
Swift doesn't make a copy of the said array and actually uses a reference instead.
The copy will take place as soon as your second array is mutated.
This behavior can be demonstrated with the following snippet (Swift 4.1):

let array1 = [1, 2, 3]
var array2 = array1

// Will print the same address twice.
array1.withUnsafeBytes { print($0.baseAddress!) }
array2.withUnsafeBytes { print($0.baseAddress!) }

array2[0] = 1

// Will print a different address.
array2.withUnsafeBytes { print($0.baseAddress!) }

Indeed, array2 doesn't get a copy of array1 immediately,
as shown by the fact it points to the same address.
Instead, the copy is triggered by the mutation of array2.

This optimization also happens deeper in the structure,
meaning that if for instance your collection is made of other collections,
the latter will also benefit from the copy-on-write mechanism,
as demonstrated by the following snippet (Swift 4.1):

var array1 = [[1, 2], [3, 4]]
var array2 = array1

// Will print the same address twice.
array1[1].withUnsafeBytes { print($0.baseAddress!) }
array2[1].withUnsafeBytes { print($0.baseAddress!) }

array2[0] = []

// Will print the same address as before.
array2[1].withUnsafeBytes { print($0.baseAddress!) }

Replicating copy-on-write

It is in fact rather easy to implement the copy-on-write mechanism in Swift,
as some of its reference-counting API is exposed to the user.
The trick consists of wrapping a reference (e.g. a class instance) within a structure,
and checking whether that reference is uniquely referenced before mutating it.
When that's the case, the wrapped value can be safely mutated,
otherwise it should be copied:

final class Wrapped<T> {
  init(value: T) { self.value = value }
  var value: T
}

struct CopyOnWrite<T> {
  init(value: T) { self.wrapped = Wrapped(value: value) }
  var wrapped: Wrapped<T>
  var value: T {
    get { return wrapped.value }
    set {
      if isKnownUniquelyReferenced(&wrapped) {
        wrapped.value = newValue
      } else {
        wrapped = Wrapped(value: newValue)
      }
    }
  }
}

var a = CopyOnWrite(value: SomeLargeObject())

// This line doesn't copy anything.
var b = a
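To watch the copy actually happen, here is a self-contained sketch of the same technique with a concrete payload. The names `Storage`, `COWBox` and `sharesStorage(with:)` are hypothetical and added only for observation; they are not part of the snippet above.

```swift
// Heap-allocated storage shared between copies of the box.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Hypothetical copy-on-write wrapper, mirroring the technique described above.
struct COWBox<T> {
    private var storage: Storage<T>

    init(_ value: T) { storage = Storage(value) }

    var value: T {
        get { return storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                storage.value = newValue     // sole owner: mutate in place
            } else {
                storage = Storage(newValue)  // shared: allocate fresh storage
            }
        }
    }

    // True when both boxes still point at the same heap object.
    func sharesStorage(with other: COWBox<T>) -> Bool {
        return storage === other.storage
    }
}

let a = COWBox([1, 2, 3])
var b = a
print(a.sharesStorage(with: b))  // true

b.value = [9]
print(a.sharesStorage(with: b))  // false
print(a.value, b.value)          // [1, 2, 3] [9]
```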

However, there is an important caveat here!
Reading the documentation for isKnownUniquelyReferenced we get this warning:

If the instance passed as object is being accessed by multiple threads simultaneously,
this function may still return true.
Therefore, you must only call this function from mutating methods
with appropriate thread synchronization.

This means the implementation presented above isn't thread safe,
as we may encounter situations where it would wrongly assume the wrapped object can be safely mutated,
while in fact such a mutation would break an invariant in another thread.
Yet this doesn't mean Swift's copy-on-write is inherently flawed in multithreaded programs.
The key is to understand what "accessed by multiple threads simultaneously" really means.
In our example, this would happen if the same instance of CopyOnWrite was shared across multiple threads,
for instance as part of a shared global variable.
The wrapped object would then have a thread safe copy-on-write semantics,
but the instance holding it would be subject to a data race.
The reason is that Swift must establish unique ownership
to properly evaluate isKnownUniquelyReferenced ^[4],
which it can't do if the owner of the instance is itself shared across multiple threads.

Value types and multithreading

It is Swift's intention to alleviate the burden of the programmer
when dealing with multithreaded environments, as stated on Apple's blog
(https://developer.apple.com/swift/blog/?id=10):

One of the primary reasons to choose value types over reference types
is the ability to more easily reason about your code.
If you always get a unique, copied instance,
you can trust that no other part of your app is changing the data under the covers.
This is especially helpful in multi-threaded environments
where a different thread could alter your data out from under you.
This can create nasty bugs that are extremely hard to debug.

Ultimately, the copy-on-write mechanism is a resource management optimization that,
like any other optimization technique,
one shouldn't think about when writing code ^[5].
Instead, one should think in more abstract terms
and consider values to be effectively copied when assigned or passed as arguments.

^[1]
This holds only for values used as local variables.
Values used as fields of a reference type (e.g. a class) are also stored in the heap.

^[2]
One could get confirmation of that by checking the LLVM IR that's produced
when dealing with value types rather than reference types,
but since the Swift compiler is very eager to perform constant propagation,
building a minimal example is a bit tricky.

^[3]
Swift doesn't allow structures to reference themselves,
as the compiler would be unable to compute the size of such type statically.
Therefore, it is not very realistic to think of a structure that is so large
that copying it would become a legitimate concern.

^[4]
This is, by the way, the reason why isKnownUniquelyReferenced accepts an inout parameter,
as it's currently Swift's way to establish ownership.

^[5]
Although passing copies of value-type instances should be safe,
there's an open issue that suggests some problems with the current implementation
(https://bugs.swift.org/browse/SR-6543).

When to use inout parameters?

inout means that modifying the local variable will also modify the passed-in argument. Without it, the passed-in argument keeps its value. It helps to think of a parameter as behaving like a reference type when you use inout, and like a value type when you don't.

For example:

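var num1: Int = 1
var char1: Character = "a"

// Without inout the parameter is an immutable copy; shadow it to mutate locally.
func changeNumber(_ num: Int) {
    var num = num
    num = 2
    print(num) // 2
}
changeNumber(num1)
print(num1) // 1: the caller's value is untouched

func changeChar(_ char: inout Character) {
    char = "b"
    print(char) // b
}
changeChar(&char1)
print(char1) // b: the change was written back to the caller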

A good use case is a swap function, which modifies the passed-in arguments.

Swift 3+ Note: Starting in Swift 3, the inout keyword must come after the colon and before the type. For example, Swift 3+ now requires func changeChar(char: inout Character).
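As a sketch of the swap use case in Swift 3+ syntax (the name `swapValues` is hypothetical; the standard library already provides `swap(_:_:)` for real code):

```swift
// Exchanges the two caller-side variables through inout parameters.
func swapValues<T>(_ a: inout T, _ b: inout T) {
    let temporary = a
    a = b
    b = temporary
}

var x = 1
var y = 2
swapValues(&x, &y)
print(x, y)  // 2 1
```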

Why can't we pass const values by reference to inout functions in swift?

To my understanding,

Swift's 'inout' does not actually send a reference to
the property to the function. The value is copied in when the
function is called, and when the function returns, Swift writes the
(possibly modified) value back to the actual property. So the only
thing 'inout' does in Swift is change the value of the actual
property. Even if you were allowed to pass a constant as 'inout',
you couldn't make any use of it, because of the very nature of a
constant: you just can't change its value, because it's immutable.

Reference Link: https://docs.swift.org/swift-book/LanguageGuide/Functions.html

Edit:
As mentioned by @SouravKannanthaB in the comment above, Swift can optimize inout by passing by reference under the hood, but I don't believe one can force that optimization to take place.
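A small sketch of the practical consequence (the function name `double` and the variables are hypothetical): the constant has to be copied into a variable before it can be passed, and the constant itself is left untouched.

```swift
func double(_ n: inout Int) {
    n *= 2
}

let constant = 21
// double(&constant)  // does not compile: a 'let' constant cannot be passed as inout

var copy = constant   // make a mutable copy instead
double(&copy)
print(constant, copy) // 21 42
```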

Inout in swift and reference type

Suppose we wrote the hypothetical function you're talking about:

class C {}

func swapTwoC(_ lhs: C, rhs: C) {
    let originalLHS = lhs
    lhs = rhs
    rhs = originalLHS
}

The immediate problem is that lhs and rhs are immutable. To mutate them, we would need to make mutable copies:

func swapTwoC(_ lhs: C, rhs: C) {
    var lhs = lhs; var rhs = rhs
    let originalLHS = lhs
    lhs = rhs
    rhs = originalLHS
}

But now the problem is that we're mutating our copies, and not the original references our caller gave us.

More fundamentally, the issue is that when you pass a reference (to an instance of a class, which we call an object) to a function, the reference itself is copied (it behaves like a value type). If that function changes the value of the reference, it's only mutating its own local copy, as we saw.

When you have an inout C, and you pass in &myObject, what you're actually passing in is a reference to your reference to myObject. When the function arguments are copied, what's copied is this "ref to a ref". The function can then use that "ref to a ref" to assign a new value to the reference myObject that the caller holds.
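For comparison, here is a sketch of a version that does work once both parameters are inout. The `name` property is a hypothetical addition so the result can be printed:

```swift
final class C {
    let name: String
    init(_ name: String) { self.name = name }
}

// With inout, assigning to lhs/rhs rebinds the caller's references.
func swapTwoC(_ lhs: inout C, _ rhs: inout C) {
    let originalLHS = lhs
    lhs = rhs
    rhs = originalLHS
}

var first = C("first")
var second = C("second")
swapTwoC(&first, &second)
print(first.name, second.name)  // second first
```

The objects themselves are never copied or modified; only the caller's two references trade places.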

Swift closure capture and inout variables

It makes sense that this wouldn't update your success variable because your inout parameter is a parameter of foo, not of the closure itself. You get the desired behavior if you make the inout parameter a parameter of the closure:

var success = false
let closure = { (flag: inout Bool) -> () in
    flag = true
    print(flag)
}

closure(&success)  //prints "true"
print(success)     //prints "true"

This pattern works with the function too, as long as you keep the inout parameter a parameter of the closure:

func foo() -> ((inout Bool)->()) {
    return { flag in
        flag = true
        print (flag)
    }
}

var success = false
let closure = foo()

closure(&success)  //prints "true"
print(success)     //prints "true"

You also get the desired behavior if you use a reference type:

class BooleanClass: CustomStringConvertible {
    var value: Bool

    init(value: Bool) {
        self.value = value
    }

    var description: String { return "\(value)" }
}

func foo(flag: BooleanClass) -> (()->()) {
    return {
        flag.value = true
        print (flag)
    }
}

let success = BooleanClass(value: false)
let closure = foo(flag: success)

closure()          //prints "true"
print(success)     //prints "true"

Is it possible to point class instance variables to `inout` parameters in Swift?

Short Answer

No.

Long Answer

There are other approaches that can satisfy this.

Binding Variables

In SwiftUI we use Binding variables to do stuff like this. When the Binding variable updates, it also updates the bound variable. I'm not sure if it will work in SpriteKit.

import SwiftUI

class X {
    var mySwitch: Binding<Bool>

    init(_ someSwitch: Binding<Bool>) {
        self.mySwitch = someSwitch
    }

    func toggle() { mySwitch.wrappedValue.toggle() }
}

struct Y {
    @State var mySwitch: Bool = false
    lazy var switchHandler = X($mySwitch)
}

Callbacks

We can add a callback to X and call it on didSet of the boolean.

class X {
    var mySwitch: Bool {
        didSet { self.callback(mySwitch) } // hands the new value back to the call site in Y
    }

    let callback: (Bool) -> Void

    init(_ someSwitch: Bool, _ callback: @escaping (Bool) -> Void) {
        self.mySwitch = someSwitch
        self.callback = callback
    }

    func toggle() { mySwitch = !mySwitch } // explicitly set to trigger didSet
}

class Y {
    var mySwitch: Bool = false
    // Capture self weakly: Y owns X, and X owns this closure.
    lazy var switchHandler = X(mySwitch) { [weak self] in
        self?.mySwitch = $0 // this is where we update the local value
    }
}
