Portable Compare And Swap (atomic operations) C/C++ library?
Intel Threading Building Blocks has a nice portable atomic<T>
template which does what you want. But whether it is a small library or not can of course be debated..
How can I implement a portable pointer compare and swap?
I would use a Hardware Abstraction Layer, (HAL) that allows generic code to be common - and any portable source can be included and build for each platform.
In my opinion, this allows for better structured and more readable source.
To allow you to better understand this process I would suggest Google for finding examples and explanations.
Hopefully this brief answer helps.
[EDIT] I will attempt a simple example for Bionix, to show how to implement a HAL system...
- Mr A wants his application to run on his 'Tianhe-2' and also his 'Amiga 500'. He has the cross compilers etc and will build both binaries on his PC. He want to read keys and print to the screen.
mrAMainApplication.c contains the following...
#include "hal.h"
// This gets called every time around the main loop ...
void mainProcessLoop( void )
{
unsigned char key = 0;
// scan key ...
key = hal_ReadKey();
if ( key != 0 )
{
hal_PrintChar( key );
}
}
He then creates a header file (Remember - this is an example, not working code! )...
He creates hal.h ...
#ifndef _HAL_H_
#define _HAL_H_
unsigned char hal_ReadKey( void );
unsigned char hal_PrintChar( unsigned char pKey );
#endif // _HAL_H_
Now Mr A needs two separate source files, one for his 'Tianhe-2' system and another for his Amiga 500...
hal_A500.c
void hal_ReadKey( void )
{
// Amiga related code for reading KEYBOARD
}
void hal_PrintChar( unsigned char pKey )
{
// Amiga related code for printing to a shell...
}
hal_Tianhe2_VERYFAST.c
void hal_ReadKey( void )
{
// Tianhe-2 related code for reading KEYBOARD
}
void hal_PrintChar( unsigned char pKey )
{
// Tianhe-2 related code for printing to a shell...
}
Mr A then - when building for the Amiga - builds mrAmainApplication.c and hal_A500.c
When building for the Tianhe-2 - he uses hal_Tianhe2_VERYFAST.c instead of hal_A500.c
Right - I've written this example with some humour, this is not ear-marked at anyone, just I feel it makes the example more interesting and hopefully aids in understanding.
Neil
How to handle integer overflows with atomic increment and compare-and-swap operations?
You can build everything from compare-and-swap. The general algorithm is
int DoSomeAtomicCalculation()
{
static int value;
int capture, newValue;
do {
capture = value;
newValue = some_function_based_on(capture);
} while (!compare_and_swap(value, capture, newValue));
return newValue;
}
(I'm assuming that compare_and_swap
takes a variable, a comparison value, and a swap value, and it returns true
if the comparison succeeded (and the swap occurred).
UNIX Portable Atomic Operations
As of C11 there is an optional Atomic library which provides atomic operations. This is portable to whatever platform that has a C11 compiler (like gcc-4.9) with this optional feature.
The presence of the atomic can be checked with __STDC_NO_ATOMICS__
and the presence of <stdatomic.h>
atomic.c
#include <stdio.h>
#include <stdlib.h>
#ifndef __STDC_NO_ATOMICS__
#include <stdatomic.h>
#endif
int main(int argc, char**argv) {
_Atomic int a;
atomic_init(&a, 42);
atomic_store(&a, 5);
int b = atomic_load(&a);
printf("b = %i\n", b);
return EXIT_SUCCESS;
}
Compiler invocations
clang -std=c11 atomic.c
gcc -std=c11 atomic.c
Atomic Operations in C on Linux
Well, nothing is there to stop you from using OSAtomic
operations on other platforms. The sources for OSAtomic
operations for ARM, x86 and PPC are a part of Apple's libc which is opensource. Just make sure you are not using OSSpinLock
as that is specific to Mac OS X, but this can be easily replaced by Linux futexes.
See these:
http://opensource.apple.com/source/Libc/Libc-594.1.4/i386/sys/OSAtomic.s
http://opensource.apple.com/source/Libc/Libc-594.1.4/ppc/sys/OSAtomic.s
http://opensource.apple.com/source/Libc/Libc-594.1.4/arm/sys/OSAtomic.s
Alternatively, you can use the sync_*
family, which I believe should work on most platforms, which I believe are described here: http://gcc.gnu.org/wiki/Atomic
Atomic operations - C
Atomic operations are very special and provide only limited support. Applying them to two variables sounds impossible for me.
Note, that it is even not garanteed that an atomic operation is really done with a resp. actomic operation (i.e. machine code command).
Out of the gcc doc. 5.47 Built-in functions for atomic memory access:
Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix '_n' where n is the size of the data type.
The external function probably emulates the atomic operation using a mutex.
But I guess, it would be possible with a "dirty hack" and only with certain limitations:
If 16 bit unsigned counters are sufficient, you could put two of them in one 32 bit variable where c1c2 += 0x00000001
increments one, c1c2 += 0x00010000
increments the other, and c1c2 += 0x00010001
increments both
or using the atomic operations:
/* combined counters c1 and c2 */
static uint32_t c1c2 = 0;
/* count c1 atomically */
__sync_fetch_and_add(&c1c2, 0x00000001);
/* count c2 atomically */
__sync_fetch_and_add(&c1c2, 0x00010000);
/* count c1 AND c2 atomically */
__sync_fetch_and_add(&c1c2, 0x00010001);
This has to be combined with the appropriate bit shifting and masking to access the invidiual counter values.
Of course, counter overflow could be an issue. The same might work for two 32 bit counters on a 64 bit platform (considering that atomic operations are usually only available for "machine word" width).
Btw. while googling for background info, I stumbled over this: Why does __sync_add_and_fetch work for a 64 bit variable on a 32 bit system?. I found the hint that atomic operations may require sufficient alignment of variables to work properly (which I found worth to mention).
This might be the reason why the C11 Atomic Library provides dedicated types for atomic variables (e.g. atomic_uint_least32_t
).
C99 atomic load in baremetal portable library
Are you running on any systems that have uint32_t
larger than a single assembly instruction word read/write size? If not, the IO to memory should be a single instructions and therefore atomic (assuming the bus is also word sized...) You get in trouble when the compiler breaks it up into multiple smaller read/writes. Otherwise, I've always had to resort to DI/EI. You could have the user configure your library such that it has information if atomic instructions or minimum 32-bit word size are available to prevent interrupt twiddling. If you have these guarantees, you don't need to verification code.
To answer the question though, on a system that must split the read/writes, your code is not safe. Imagine a case where you read your value in correctly in the "do" part, but the value gets split during the "while" part check. Further, in an extreme case, this is an infinite loop. For complete safety, you'd need a retry count and error condition to prevent that. The loop case is extreme for sure, but I'd want it just in case. That of course makes the run time longer.
Let's show a failure case for examples - will use 16-bit numbers on a machine that reads 8-bit values at a time to make it easier to follow:
- Value to read from memory *var is 0x1234
- Read 8-bit 0x12
- *var becomes 0x5678
- Read 8-bit 0x78 - value is now 0x1278 (invalid)
- *var becomes 0x1234
- Verification step reads 8-bit 0x12
- *var becomes 0x5678
- Verification reads 8-bit 0x78
Value confirmed correct 0x1278, but this is an error as *var was only 0x1234 and 0x5678.
Another failure case would be when *var just happens to change at the same frequency as your code is running, which could lead to an infinite loop as each verification fails. Or even if it did break out eventually, this would be a very hard to track performance bug.
Related Topics
Undefined Symbols for Architecture X86_64: Compiling Problems
How to Validate That a String Is a Valid Ipv4 Address in C++
Exception Running Boost Asio Ssl Example
What's an Expression and Expression Statement in C++
Copying Non Null-Terminated Unsigned Char Array to Std::String
Where in Memory Is Vtable Stored
Multiple Inheritance + Virtual Function Mess
How to Define Sealed Class in C++
Why Do Linked Lists Use Pointers Instead of Storing Nodes Inside of Nodes
How to Use the _Attribute_((Visibility("Default")))
Window C/C++ Crypto API Examples and Tips
Fatal Error: iOStream.H No Such File or Directory
Default Move Constructor/Assignment and Deleted Copy Constructor/Assignment