When Creating Thread Safe Reads in Swift, Why Is a Variable Create Outside The Concurrent Queue

There is no difference. There are two DispatchQueue.sync methods:

public func sync(execute block: () -> Swift.Void)
public func sync<T>(execute work: () throws -> T) rethrows -> T

and in your first example the second one is used: the closure can return a value, which then becomes the return
value of the sync call. Therefore, in

get {
    return self.concurrentQueue.sync { return self._name }
}

the return value of sync { ... } is self._name and that is returned
from the getter method. This is equivalent to (but simpler than) storing the
value in a temporary variable (and here the closure returns Void):

get {
    var result: String!
    self.concurrentQueue.sync { result = self._name }
    return result
}

Of course that works only with synchronously dispatched closures,
not with asynchronous calls. These are stored for later execution and must
return Void:

public func async(..., execute work: @escaping @convention(block) () -> Swift.Void)
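Putting both pieces together, a common pattern (a sketch with assumed names, not code from the question) reads with sync, whose return value flows straight out of the getter, and writes with an asynchronous barrier block, which returns Void:

```swift
import Dispatch

final class ThreadSafeName {
    private let concurrentQueue = DispatchQueue(label: "name.queue", attributes: .concurrent)
    private var _name = "initial"

    var name: String {
        // sync { } returns the closure's value, so no temporary variable is needed
        get { concurrentQueue.sync { _name } }
        // the barrier waits for in-flight reads to finish, then writes exclusively
        set { concurrentQueue.async(flags: .barrier) { self._name = newValue } }
    }
}

let holder = ThreadSafeName()
holder.name = "updated"
print(holder.name) // "updated" -- the sync read is queued behind the barrier write
```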

Swift weird behavior with multithreading dispatch queues

This is not strange or weird; it is intended behavior.
A closure captures its surrounding variables by reference, even if they are value types.
What happens is:

  • Before the first closure is executed, multiple closures are created and added by queue.async.
  • Each closure references the same index variable.
  • When (after some milliseconds) the closures are executed on the queue, the then-current value of index is printed.

If you do not want this behaviour, you could either

  • copy the index value to a local variable outside the closure and use that, or
  • use a capture list like:

queue.async { [index] in
    fun(index)
    dispatchGroup.leave()
}

See e.g. https://www.marcosantadev.com/capturing-values-swift-closures/
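A minimal sketch (with hypothetical names) that contrasts the two captures, using an explicit var so the difference is visible:

```swift
var index = 0
var byReference: [() -> Int] = []
var byCopy: [() -> Int] = []

for i in 1...3 {
    index = i
    byReference.append { index }        // captures the variable itself
    byCopy.append { [index] in index }  // capture list copies the current value
}

print(byReference.map { $0() }) // [3, 3, 3] -- every closure sees the final value
print(byCopy.map { $0() })      // [1, 2, 3] -- each closure kept its own snapshot
```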

Swift - Is setting variable property concurrency or multithreading safe?

First off

Class values are stored in heap memory while struct/enum values are stored in stack memory, and the compiler will try to allocate that memory at compile time (according to my first reference and many online answers). You can check this with the following code:

import Dispatch

class MemTest {}
class Node {
    var data = MemTest()
}
let node = Node()
let concurrentQueue = DispatchQueue(label: "queue", attributes: .concurrent)

for index in 0...100000 {
    concurrentQueue.async {
        node.data = MemTest()
        withUnsafePointer(to: &node.data) {
            print("Node data @ \($0)")
        }
        withUnsafePointer(to: node.data) {
            print("Node data value no. \(index) @ \($0)")
        }
    }
}
How to: run it twice and check the memory addresses around the time the value has changed 500 times; switching MemTest between class and struct will show the difference. The struct shows the same address every time, while the class shows a different address each time.
So changing a value type just changes the contents at an address, and the memory blocks are not freed, whereas changing a reference type not only changes the address but also frees and allocates memory blocks, which can cause the program to break.

Secondly

But if we run the first code block of @trungduc with withUnsafePointer, it shows us that the index variable of the for loop is allocated on the fly, in heap memory. So why is that?
As mentioned before, the compiler will only try to allocate the memory if the value is computable at compile time. If it is not computable, the value will be allocated in heap memory and stay there until the end of the scope (according to my second reference). So maybe the explanation here is that the system frees the allocated memory only after everything is done, at the end of the scope (of this I'm not very sure). In that case the code will not produce a crash, as we observed.
My conclusion: a reference type variable's memory is allocated and freed with no restriction in this case, whereas a value type variable's memory is allocated and removed only when the system enters and exits the scope which contains said variable.

Thanks

  • @Gokhan Topcu — check the section "Memory Management"
  • @Bruno Rocha — check the section "Heap Allocated Value Types"

Some words

My answer is not very solid and might contain grammar and spelling errors. All updates are appreciated. Thanks in advance.

Update

For the part I was not very sure about: the variable index is copied to a new memory address by the = operator, so it doesn't matter where the scope ends; the stack is released after the for loop.
After some digging: in @trungduc's code, with a reference type variable, three things happen:

  1. Allocate new memory for the class Data
  2. Reduce the reference count of the old Data stored in node.data, freeing the old Data if it is no longer referenced
  3. Point node.data to the new Data

While for a value type only one thing happens:

  1. Point node.data to the Integer in stack memory

The major difference is in step 2, where there is a chance the old Data's memory is freed. With a reference type, a scenario like the following is possible:
________Task 1________|________Task 2________
Allocate new Data #1  |
                      | Allocate new Data #2
Load pointer to       |
old Data              |
Reduce reference count|
to old Data           |
                      | Load pointer to
                      | old Data
Free old Data         |
                      | Reduce reference count
                      | to old Data (!)
                      | Free old Data (!)
Reference new Data #1 |
                      | Reference new Data #2

while with a value type, this will happen:

________Task 1________|________Task 2________
Reference to Integer 1|
                      | Reference to Integer 2

In the first case there are various possible interleavings, but in most of them we get a segmentation fault because Task 2 tries to dereference the pointer after Task 1 has freed it. There may be other issues, such as memory leaks: as the diagram shows, Task 2 might not decrement the reference count of Data #1 correctly.
In the second case, it's just changing a pointer.

Note

In the second case, this will never cause a crash on Intel CPUs, but that is not guaranteed on other architectures, because many CPUs do not promise that such an access cannot crash.

How can we make 'static' variables Thread-Safe in swift?

Initialization of a static variable is thread-safe. But if the object itself is not thread-safe, you must synchronize your interaction with it from multiple threads (as you must with any non-thread-safe object, whether static or not).

At the bare minimum, you can make your exposed property a computed property that synchronizes access to some private property. For example:

class MyClass {
    private static let lock = NSLock()
    private static var _name: String = "Hello"

    static var name: String {
        get { lock.withCriticalSection { _name } }
        set { lock.withCriticalSection { _name = newValue } }
    }
}

Where

extension NSLocking {
    func withCriticalSection<T>(block: () throws -> T) rethrows -> T {
        lock()
        defer { unlock() }
        return try block()
    }
}

Or you can use a GCD serial queue, a reader-writer pattern, or a variety of other mechanisms to synchronize, too. The basic idea would be the same, though.
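For instance, a serial-queue variant of the same property might look like this (a sketch; the Settings type and queue label are assumed names):

```swift
import Dispatch

enum Settings {
    // a serial queue (the default) admits one block at a time
    private static let queue = DispatchQueue(label: "settings.serial")
    private static var _name = "Hello"

    static var name: String {
        get { queue.sync { _name } }
        set { queue.sync { _name = newValue } }
    }
}
```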

That having been said, it’s worth noting that this sort of property accessor synchronization is insufficient for mutable types. A higher level of synchronization is needed.

Consider:

let group = DispatchGroup()

DispatchQueue.global().async(group: group) {
    for _ in 0 ..< 100_000 {
        MyClass.name += "x"
    }
}

DispatchQueue.global().async(group: group) {
    for _ in 0 ..< 100_000 {
        MyClass.name += "y"
    }
}

group.notify(queue: .main) {
    print(MyClass.name.count)
}

You’d think that because we have thread-safe accessors, everything is OK. But it’s not. This will not add 200,000 characters to the name. You’d have to do something like:

class MyClass {
    private static let lock = NSLock()
    private static var _name: String = ""

    static var name: String {
        get { lock.withCriticalSection { _name } }
    }

    static func appendString(_ string: String) {
        lock.withCriticalSection {
            _name += string
        }
    }
}

And then the following works:

let group = DispatchGroup()

DispatchQueue.global().async(group: group) {
    for _ in 0 ..< 100_000 {
        MyClass.appendString("x")
    }
}

DispatchQueue.global().async(group: group) {
    for _ in 0 ..< 100_000 {
        MyClass.appendString("y")
    }
}

group.notify(queue: .main) {
    print(MyClass.name.count)
}

The other classic example is where you have two properties that are related to each other, for example firstName and lastName. You cannot just make each of the two properties thread-safe; rather, you need to make the single task of updating both properties thread-safe.

These are silly examples, but they illustrate that sometimes a higher level of abstraction is needed. But for simple applications, synchronizing the computed properties’ accessor methods may be sufficient.
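A sketch of that firstName/lastName case (the Person type and method names here are hypothetical): instead of locking each property separately, the update of both fields is one critical section:

```swift
import Foundation

final class Person {
    private let lock = NSLock()
    private var _firstName = ""
    private var _lastName = ""

    // read both fields under one lock so a caller never sees a half-applied update
    var fullName: String {
        lock.lock()
        defer { lock.unlock() }
        return "\(_firstName) \(_lastName)"
    }

    // write both fields as a single critical section
    func rename(firstName: String, lastName: String) {
        lock.lock()
        defer { lock.unlock() }
        _firstName = firstName
        _lastName = lastName
    }
}
```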


As a point of clarification, while statics, like globals, are instantiated lazily, standard stored properties bearing the lazy qualifier are not thread-safe. As The Swift Programming Language: Properties warns us:

If a property marked with the lazy modifier is accessed by multiple threads simultaneously and the property hasn’t yet been initialized, there’s no guarantee that the property will be initialized only once.

Swift: thread-safe initialization of the reference type stored property

Is there a difference between initializing property inline vs. in the init?

No, there's no meaningful difference between assigning to a property inside of an init or providing it a default value outside of an init. Default property values are assigned immediately before initializers are called, so

class X {
    var y = Y()
    var z: Z

    init(z: Z) {
        self.z = z
    }
}

is conceptually equivalent to

class X {
    var y: Y
    var z: Z

    func _assignDefaultValues() {
        y = Y()
    }

    init(z: Z) {
        _assignDefaultValues()
        self.z = z
    }
}

which is equivalent to

class X {
    var y: Y
    var z: Z

    init(z: Z) {
        y = Y()
        self.z = z
    }
}

In other words, by the time the end of an init(...) is reached, all stored properties must be fully initialized, and there is no difference between having initialized them with a default value, or explicitly.



Is the init of the property itself thread-safe?

Teasing this apart, I believe there are two components to this question:

  1. "By the time init() returns, is b.a guaranteed to be assigned to?", and
  2. "If so, is the assignment guaranteed to be done in a way that other threads reading the value will be guaranteed to read a value that matches the assigned value, and that matches what other threads see?", i.e., reading the value without tearing?

The answer to (1) is yes. The Swift language guide covers the specifics in great detail, but has this to say specifically:

Classes and structures must set all of their stored properties to an appropriate initial value by the time an instance of that class or structure is created. Stored properties can’t be left in an indeterminate state.

This means that by the time you are able to read a b out of

let b = B()

b.a must have been assigned a valid A value.

The answer to (2) is a bit more nuanced. Typically, Swift does not guarantee thread-safe or atomic behavior in the default case, and there is no documentation that I could find, or references in the Swift source code, which indicate that Swift makes any promises as to atomic assignment to Swift properties during initialization. Although it's impossible to prove a negative, I think it's relatively safe to say that Swift does not guarantee that you get consistent behavior across threads without explicit synchronization.

However, what is guaranteed is that for the lifetime of b, it has a stable address in memory, and at that, b.a will have a stable address as well. At least part of the reason that your original code snippet appears to work in this specific case is that

  1. All threads are reading from the same address in memory,
  2. On many (most?) platforms that Swift supports, word-size (32 bits on 32-bit platforms; 64 bits on 64-bit platforms) reads and writes are atomic, and not susceptible to tearing (reading only part of a value out of a variable before another part is written to it) — and pointers in Swift are word sized. This does not guarantee that reads and writes will be synchronized across threads as you expect, but you won't get invalid addresses this way. But,
  3. Your code creates and assigns b.a before the other threads are ever spawned, which means that the assignment to b.a is much more likely to "go through" before they ever read from that memory

If you were to start assigning to b.a after spawning the concurrentPerform(iterations:), then all bets would be off because you'd have unsynchronized reads and writes interleaving in unexpected ways.

In general:

  1. Creating read-only data and passing it off to multiple threads isn't safe, but will typically work as expected in practice (but should not be relied upon!),
  2. Creating read-write data and passing off references to multiple threads isn't safe, and concurrent mutations also will not work as expected, and
  3. If you need a guarantee of safe atomic handling of variables and synchronization, it's recommended you use synchronization mechanisms like locks or atomics (e.g. from the official swift-atomics package)
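As a sketch of the third option, assuming the swift-atomics package has been added as a dependency, a shared counter can be incremented safely from many threads:

```swift
import Atomics
import Dispatch

let counter = ManagedAtomic<Int>(0)

// 1,000 concurrent increments; atomic operations guarantee none are lost
DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
    counter.wrappingIncrement(ordering: .relaxed)
}

print(counter.load(ordering: .relaxed)) // 1000
```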

When in doubt, too, it's recommended you run your code through the sanitizer tools offered by LLVM through Xcode — specifically, the Address Sanitizer to catch any memory-related issues, and in this case, the Thread Sanitizer as well, to help capture synchronization issues and race conditions. While not perfect, these tools can help give confidence that your code is correct.

is GCD really Thread-Safe?

You said:

I have studied GCD and Thread-Safe. In apple document, GCD is Thread-Safe that means multiple thread can access. And I learned meaning of Thread-Safe that always give same result whenever multiple thread access to some object.

They are saying the same thing. A block of code is thread-safe only if it is safe to invoke it from different threads at the same time (and this thread safety is achieved by making sure that the critical portion of code cannot run on one thread at the same time as another thread).

But let us be clear: Apple is not saying that if you use GCD, that your code is automatically thread-safe. Yes, the dispatch queue objects, themselves, are thread-safe (i.e. you can safely dispatch to a queue from whatever thread you want), but that doesn’t mean that your own code is necessarily thread-safe. If one’s code is accessing the same memory from multiple threads concurrently, one must provide one’s own synchronization to prevent writes simultaneous with any other access.

In the Threading Programming Guide: Synchronization, which predates GCD, Apple outlines various mechanisms for synchronizing code. You can also use a GCD serial queue for synchronization. If you use a concurrent queue, you achieve thread safety by using a “barrier” for write operations. See the latter part of this answer for a variety of ways to achieve thread safety.

But be clear, Apple is not introducing a different definition of “thread-safe”. As they say in that aforementioned guide:

When it comes to thread safety, a good design is the best protection you have. Avoiding shared resources and minimizing the interactions between your threads makes it less likely for those threads to interfere with each other. A completely interference-free design is not always possible, however. In cases where your threads must interact, you need to use synchronization tools to ensure that when they interact, they do so safely.

And in the Concurrency Programming Guide: Migrating Away from Threads: Eliminating Lock-Based Code, which was published when GCD was introduced, Apple says:

For threaded code, locks are one of the traditional ways to synchronize access to resources that are shared between threads. ... Instead of using a lock to protect a shared resource, you can instead create a queue to serialize the tasks that access that resource.

But they are not saying that you can just use GCD concurrent queues and automatically achieve thread-safety, but rather that with careful and proper use of GCD queues, one can achieve thread-safety without using locks.


By the way, Apple provides tools to help you diagnose whether your code is thread-safe, namely the Thread Sanitizer (TSAN). See Diagnosing Memory, Thread, and Crash Issues Early.

Serial vs concurrent blocking main queue in similar fashion

let queue = DispatchQueue(label: "queue_label", attributes: .concurrent) // concurrent queue

If this queue has multiple items queued, then it may use multiple threads to run them in parallel. It does not promise that it will, but it may. And it will only do so if it has multiple items queued at the same time.

for i in 1..<11 {
    queue.sync { ... }
}

This loop queues a single item, blocks until the item is scheduled and completed, and then queues another item. If this is all the code, then at no point are there multiple items on the queue. Of course if there is other code running in parallel, that enqueues items on queue, then there may be parallel items running.

As written, this code is legal, but seems pretty useless. If printNumber is time consuming, and this is on the main queue, it could crash the app (or at least beachball it on Mac).

Nothing you've done here is a "read-write" problem, so I don't think that's related. Queues can be used to do all kinds of things. In this particular example, there doesn't seem to be any reason to use the queue at all, so I would delete that code. If you have a different problem, you can open a question asking about that.
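A quick sketch (hypothetical code, not from the question) showing the difference: sync in a loop serializes the work even on a concurrent queue, while async lets items overlap:

```swift
import Dispatch

let queue = DispatchQueue(label: "queue_label", attributes: .concurrent)

// sync: each block finishes before the next is enqueued, so there is never
// more than one item on the queue and the order is deterministic
var order: [Int] = []
for i in 1..<6 {
    queue.sync { order.append(i) }
}
print(order) // [1, 2, 3, 4, 5]

// async: items are enqueued immediately and may run in parallel
let group = DispatchGroup()
for i in 1..<6 {
    queue.async(group: group) { _ = i * i } // some independent work
}
group.wait() // completion order across items is not guaranteed
```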

dispatch_async update variable crash - thread safety

Yes, Array is not thread-safe, so writes to the array must be made atomic.

You can add a high-performance lock: dispatch_semaphore_t.

func method(completion: (inner: () throws -> String) -> Void) {
    // add lock
    let lock: dispatch_semaphore_t = dispatch_semaphore_create(1)
    let group: dispatch_group_t = dispatch_group_create()
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
    var res = Array<Int>()
    for i in 0 ..< 5 {
        dispatch_group_async(group, queue) {
            // lock
            dispatch_semaphore_wait(lock, DISPATCH_TIME_FOREVER)
            res.append(i)
            // unlock
            dispatch_semaphore_signal(lock)
            var s = 0
            for k in 0 ..< 1000 {
                s = 2 + 3
            }
        }
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER)

    print("All background tasks are done!!")
    print(res)
}

But be careful: if your async task is not a time-consuming operation like the one above, don't use multithreading, because thread scheduling has its own overhead and can lead to a performance loss.
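For reference, a modern (Swift 3+) sketch of the same idea, with assumed names, using DispatchSemaphore to protect the shared array:

```swift
import Dispatch

func collectValues(completion: ([Int]) -> Void) {
    let lock = DispatchSemaphore(value: 1) // binary semaphore used as a lock
    let group = DispatchGroup()
    let queue = DispatchQueue.global()
    var results: [Int] = []

    for i in 0..<5 {
        queue.async(group: group) {
            lock.wait()           // acquire
            results.append(i)     // the only access to shared state
            lock.signal()         // release
        }
    }

    group.wait() // block until all five tasks have finished
    completion(results.sorted()) // sorted, since completion order is nondeterministic
}

collectValues { print($0) } // [0, 1, 2, 3, 4]
```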


