Process Array in Parallel Using GCD

This is a slightly different take on the approach in @Eduardo's answer, using the Array type's withUnsafeMutableBufferPointer<R>(body: (inout UnsafeMutableBufferPointer<T>) -> R) -> R method. That method's documentation states:

Call body(p), where p is a pointer to the Array's mutable contiguous storage. If no such storage exists, it is first created.

Often, the optimizer can eliminate bounds- and uniqueness-checks within an array algorithm, but when that fails, invoking the same algorithm on body's argument lets you trade safety for speed.

That second paragraph seems to be exactly what's happening here, so using this method might be more "idiomatic" in Swift, whatever that means:

func calcSummary() {
    let group = dispatch_group_create()
    let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)

    self.summary.withUnsafeMutableBufferPointer {
        summaryMem -> Void in
        for i in 0 ..< 10 {
            dispatch_group_async(group, queue, {
                let base = i * 50000
                for x in base ..< base + 50000 {
                    summaryMem[i] += self.array[x]
                }
            })
        }
    }

    dispatch_group_notify(group, queue, {
        println(self.summary)
    })
}
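
For comparison, here is a hedged sketch of the same per-chunk summation in current Swift (the DispatchQueue API that replaced the dispatch_* functions). The 10 chunks of 50 000 elements are carried over from the code above; calcSummaryModern is just a name invented for the sketch.

import Dispatch

// Sketch only: assumes `array` holds 500_000 Ints, as in the example above,
// and sums it in 10 chunks of 50_000 on concurrent threads.
func calcSummaryModern(array: [Int]) -> [Int] {
    let chunkCount = 10
    let chunkSize = 50_000
    var summary = [Int](repeating: 0, count: chunkCount)

    summary.withUnsafeMutableBufferPointer { summaryMem in
        // concurrentPerform is the modern replacement for dispatch_apply;
        // it returns only after every iteration has finished.
        DispatchQueue.concurrentPerform(iterations: chunkCount) { i in
            let base = i * chunkSize
            var partial = 0
            for x in base ..< base + chunkSize {
                partial += array[x]
            }
            summaryMem[i] = partial   // each iteration writes only its own slot
        }
    }
    return summary
}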

How to process an array IN PARALLEL

If you can target iOS 4+ or Mac OS X 10.6+, use Grand Central Dispatch (and I'm using a category because I think they're cool):

#import <dispatch/dispatch.h>

@interface NSArray (DDAsynchronousAdditions)

- (void) makeObjectsPerformSelectorAsynchronously:(SEL)selector;
- (void) makeObjectsPerformSelectorAsynchronously:(SEL)selector withObject:(id)object;

@end

@implementation NSArray (DDAsynchronousAdditions)

- (void) makeObjectsPerformSelectorAsynchronously:(SEL)selector {
    [self makeObjectsPerformSelectorAsynchronously:selector withObject:nil];
}

- (void) makeObjectsPerformSelectorAsynchronously:(SEL)selector withObject:(id)object {
    for (id element in self) {
        dispatch_async(dispatch_get_global_queue(0, 0), ^{
            [element performSelector:selector withObject:object];
        });
    }
}

@end

//elsewhere:

[myArray makeObjectsPerformSelectorAsynchronously:@selector(doFoo)];

Or, if you don't want to use a category...

[myArray enumerateObjectsUsingBlock:^(id obj, NSUInteger index, BOOL *stop) {
    dispatch_async(dispatch_get_global_queue(0, 0), ^{
        [obj performSelector:@selector(doFoo)];
    });
}];
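
Because the rest of this page is Swift-centric, here is a hedged sketch of the same "fan each element's work out to a global queue" pattern using the modern DispatchQueue API; Worker and doFoo() are hypothetical stand-ins for the objects in myArray.

import Dispatch

// `Worker` and `doFoo()` are hypothetical stand-ins for the objects in myArray.
final class Worker {
    func doFoo() {
        // long-running work for this element
    }
}

func performConcurrently(on workers: [Worker]) {
    let group = DispatchGroup()
    let queue = DispatchQueue.global(qos: .userInitiated)

    for worker in workers {
        // Fan each element's work out to the global concurrent queue.
        queue.async(group: group) {
            worker.doFoo()
        }
    }

    // Block the caller until every element has finished.
    group.wait()
}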

Using Grand Central Dispatch in Swift to parallelize and speed up "for" loops?

A "multi-threaded iteration" can be done with dispatch_apply():

let outerCount = 100    // # of concurrent block iterations
let innerCount = 10000  // # of iterations within each block

let the_queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_apply(UInt(outerCount), the_queue) { outerIdx -> Void in
    for innerIdx in 1 ... innerCount {
        // ...
    }
}

(You have to figure out the best relation between outer and inner counts.)

There are two things to notice:

  • arc4random() uses an internal mutex, which makes it extremely slow when called
    from several threads in parallel; see Performance of concurrent code using
    dispatch_group_async is MUCH slower than single-threaded version. From the answers
    given there, rand_r() (with a separate seed for each thread) seems to be a faster
    alternative.

  • The result variable winner must not be modified from multiple threads simultaneously.
    You can use an array instead, where each thread updates its own element, and the
    results are added afterwards. A thread-safe method is described in
    https://stackoverflow.com/a/26790019/1187415.

Then it would roughly look like this:

let outerCount = 100    // # of concurrent block iterations
let innerCount = 10000  // # of iterations within each block

let the_queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

var winners = [Int](count: outerCount, repeatedValue: 0)
winners.withUnsafeMutableBufferPointer { winnersPtr -> Void in

    dispatch_apply(UInt(outerCount), the_queue) { outerIdx -> Void in
        var seed = arc4random() // seed for rand_r() in this "thread"

        for innerIdx in 1 ... innerCount {
            var points = 0
            var ability = 500

            for i in 1 ... 1000 {
                let chance = Int(rand_r(&seed) % 1001)
                if chance < (ability - points) { ++points }
                else { points = points - 1 }
            }
            if points > 0 {
                winnersPtr[Int(outerIdx)] += 1
            }
        }
    }
}

// Add results:
let winner = reduce(winners, 0, +)
println(winner)
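
The snippet above uses Swift 1.x spellings (dispatch_apply, ++, the free-standing reduce, println). A hedged translation to current Swift, keeping the same per-slot accumulation and the per-iteration rand_r() seed, might look roughly like this:

import Foundation

let outerCount = 100     // # of concurrent iterations
let innerCount = 10_000  // # of iterations within each

var winners = [Int](repeating: 0, count: outerCount)

winners.withUnsafeMutableBufferPointer { winnersPtr in
    DispatchQueue.concurrentPerform(iterations: outerCount) { outerIdx in
        var seed = arc4random()   // per-iteration seed for rand_r()

        for _ in 1 ... innerCount {
            var points = 0
            let ability = 500

            for _ in 1 ... 1000 {
                let chance = Int(rand_r(&seed) % 1001)
                if chance < (ability - points) { points += 1 }
                else { points -= 1 }
            }
            if points > 0 {
                winnersPtr[outerIdx] += 1   // each iteration touches only its own slot
            }
        }
    }
}

let winner = winners.reduce(0, +)
print(winner)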

Adding to array in parallel

As you have discovered, Swift may relocate the array's storage for memory efficiency, and when you run multiple appends concurrently that is bound to happen. My guess is that appending to the array is a very low-cost operation compared to your transformation, so you can create a serial queue just to extend the dst array:

let src = Array(0..<1000)
var dst = [UInt32]()

let queue = dispatch_queue_create("myqueue", DISPATCH_QUEUE_CONCURRENT)
let serialQueue = dispatch_queue_create("mySerialQueue", DISPATCH_QUEUE_SERIAL)
let serialGroup = dispatch_group_create()

dispatch_apply(src.count, queue) { i in
    let result = arc4random_uniform(UInt32(i)) // Your long-running transformation here
    dispatch_group_async(serialGroup, serialQueue) {
        dst.append(result)
    }
}

// Wait until all append operations are complete
dispatch_group_wait(serialGroup, DISPATCH_TIME_FOREVER)
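
Under the modern Dispatch API, the same "compute concurrently, append serially" pattern could be sketched roughly as follows; the arc4random_uniform call again just stands in for the long-running transformation.

import Foundation

let src = Array(0..<1000)
var dst = [UInt32]()

// Serial queue whose only job is to mutate `dst`.
let serialQueue = DispatchQueue(label: "mySerialQueue")
let group = DispatchGroup()

DispatchQueue.concurrentPerform(iterations: src.count) { i in
    // Stand-in for the long-running transformation of src[i].
    let result = arc4random_uniform(UInt32(i + 1))
    serialQueue.async(group: group) {
        dst.append(result)
    }
}

// Wait until all append operations are complete.
group.wait()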

Objective C — What is the fastest and most efficient way to enumerate an array?

The fastest code is the code that reaches the market first.

Seriously -- unless you have a measurable performance problem, this particular choice should occupy no more of your time than it takes to answer "which of these patterns fits most naturally with my project's style?"

Note: addressing a performance problem by moving from serial to concurrent execution usually results in having two problems: performance and concurrency.

Swift Parallelism: Array vs UnsafeMutablePointer in GCD

Arrays are not thread-safe and, although they are bridged to Objective-C objects, they behave as value types with COW (copy-on-write) logic. COW on an array makes a copy of the whole array when any element changes and the reference count is greater than 1 (conceptually; the actual implementation is a bit more subtle).

Your thread that makes changes to the array will trigger a memory copy whenever the main thread happens to reference an element. The main thread also makes changes, so it will cause COW as well. What you end up with is the state of the last modified copy used by either thread. This will randomly leave some of the changes in limbo and explains the "missed" items.

To avoid this you would need to perform all changes on a specific thread and use sync() to ensure that COW on the array is only performed by that thread (this may actually reduce the number of memory copies and give better performance for very large arrays). There will be overhead and potential contention with this approach, though; it is the price to pay for thread safety.

The other way to approach this would be to use an array of objects (reference types). This makes your array a simple list of pointers that are not actually changed by modifying data in the referenced objects. Although, in real programs, you would need to mind thread safety within each object instance, there is far less interference (and overhead) than with arrays of value types.
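
As a rough illustration of the first approach, here is a minimal sketch (the type name and queue label are invented for the example) in which every read and write of the array goes through one dedicated serial queue, so only that queue ever triggers COW on the storage:

import Dispatch

// Minimal sketch: all access to `storage` is funnelled through `isolationQueue`,
// so copy-on-write is only ever triggered from that queue.
final class SynchronizedIntArray {
    private var storage: [Int] = []
    private let isolationQueue = DispatchQueue(label: "array.isolation")

    func append(_ value: Int) {
        isolationQueue.sync { storage.append(value) }
    }

    subscript(index: Int) -> Int {
        isolationQueue.sync { storage[index] }
    }

    var count: Int {
        isolationQueue.sync { storage.count }
    }
}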

Puzzled by when copy of vars to func params happens when using GCD

Other than making the local copy ..., is there a preferred method of handling this?

The simplest way to give that other thread its own local copy is through a capture list:

queue.async(group: group) { [array] in
    checkSquares(array)
}
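
A small self-contained sketch of why the capture list helps: the value of array is copied when the closure is created, so a later mutation on the calling thread does not affect what the background block sees (checkSquares here is just a placeholder for the work in the question).

import Dispatch

// Placeholder for whatever work the background queue actually does.
func checkSquares(_ values: [Int]) {
    print(values.map { $0 * $0 })
}

var array = [1, 2, 3]
let group = DispatchGroup()
let queue = DispatchQueue.global()

queue.async(group: group) { [array] in
    // `array` here is the copy taken when the closure was created: [1, 2, 3]
    checkSquares(array)
}

array.append(4)   // does not affect the copy captured above

group.wait()      // prints [1, 4, 9]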

