What Does "Atomic" Mean in Programming

What does atomic mean in programming?

Here's an example: Suppose foo is a variable of type long, then the following operation is not an atomic operation (in Java):

foo = 65465498L;

Indeed, the variable may be written using two separate operations: one that writes the first 32 bits, and a second one that writes the last 32 bits (the Java Language Specification permits this for non-volatile long and double). That means that another thread might read the value of foo and see the intermediate state.

Making the operation atomic means using synchronization mechanisms to make sure that the operation is seen, from any other thread, as a single, atomic (i.e. not splittable into parts) operation. Once the operation is atomic, any other thread will see either the value of foo before the assignment or the value after it, but never an intermediate value.

A simple way of doing this is to make the variable volatile:

private volatile long foo;

Or to synchronize every access to the variable:

public synchronized void setFoo(long value) {
    this.foo = value;
}

public synchronized long getFoo() {
    return this.foo;
}
// no other use of foo outside of these two methods, unless also synchronized

Or to replace it with an AtomicLong:

private final AtomicLong foo = new AtomicLong();
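As a rough illustration of the AtomicLong option, the sketch below has one thread flip foo between two bit patterns while another thread checks that it never sees a mixture of the two. The class name TornReadDemo, the method countTornReads, and the two test values are made up for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; names and test values are illustrative only.
final class TornReadDemo {
    // Every bit differs between the two values, so a half-written long matches neither.
    static final long ALL_ZEROS = 0L;
    static final long ALL_ONES = -1L;

    // Returns how many reads saw a value that is neither ALL_ZEROS nor ALL_ONES.
    static long countTornReads(int iterations) {
        AtomicLong foo = new AtomicLong(ALL_ZEROS);
        AtomicLong torn = new AtomicLong();

        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                foo.set((i % 2 == 0) ? ALL_ONES : ALL_ZEROS); // atomic 64-bit write
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                long seen = foo.get(); // atomic 64-bit read
                if (seen != ALL_ZEROS && seen != ALL_ONES) {
                    torn.incrementAndGet();
                }
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn.get();
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + countTornReads(1_000_000));
    }
}
```

Because get() and set() on an AtomicLong are each atomic, the reader can only ever observe one of the two complete values.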

What are atomic operations for newbies?

Pretty much, yes. "Atom" comes from Greek "atomos" = "uncuttable", and has been used in the sense "indivisible smallest unit" for a very long time (till physicists found that, in fact, there are smaller things than atoms). In concurrent programming, it means that nothing else can observe or interfere with the operation partway through: from the outside, an atomic command either has not happened yet or has completed in full.

An example: a web poll, open-ended questions, but we want to sum up how many people give the same answer. You have a database table where you insert answers and counts of that answer. The code is straightforward:

get the row for the given answer
if the row didn't exist:
  create the row with answer and count 1
else:
  increment count
  update the row with new count

Or is it? See what happens when multiple people do it at the same time:

user A answers "ham and eggs"       user B answers "ham and eggs"
get the row: count is 1             get the row: count is 1
okay, we're updating!               okay, we're updating!
count is now 2                      count is now 2
store 2 for "ham and eggs"          store 2 for "ham and eggs"

"Ham and eggs" only jumped by 1 even though 2 people voted for it! This is clearly not what we wanted. If only there were an atomic operation "increment if it exists or make a new record"... for brevity, let's call it "upsert" (for "update or insert").

user A answers "ham and eggs"       user B answers "ham and eggs"
upsert by incrementing count        upsert by incrementing count

Here, each upsert is atomic: the first one left count at 2, the second one left it at 3. Everything works.

Note that "atomic" is contextual: in this case, the upsert operation only needs to be atomic with respect to operations on the answers table in the database; the computer is free to do other things as long as they don't affect (and aren't affected by) the result of what upsert is trying to do.
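To make the upsert idea concrete, here is a minimal in-memory sketch. It swaps the database table for a ConcurrentHashMap, whose merge method performs "insert 1 or add 1 to the existing count" as one atomic step per key. The class PollCounter and its method names are hypothetical, and a real database would use its own upsert facility instead.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the answers table.
final class PollCounter {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Atomic "upsert": stores 1 if the answer is new, otherwise adds 1 to the stored count.
    void vote(String answer) {
        counts.merge(answer, 1, Integer::sum);
    }

    int countFor(String answer) {
        return counts.getOrDefault(answer, 0);
    }

    // Lets several voter threads cast the same answer concurrently and returns the final count.
    static int simulate(int voters, int votesEach) {
        PollCounter poll = new PollCounter();
        Thread[] threads = new Thread[voters];
        for (int i = 0; i < voters; i++) {
            threads[i] = new Thread(() -> {
                for (int v = 0; v < votesEach; v++) {
                    poll.vote("ham and eggs");
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return poll.countFor("ham and eggs");
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, 1)); // two voters, one vote each
    }
}
```

With the read and the write folded into a single atomic call, no vote can be lost between "get the row" and "store the new count".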

Are atomic and synchronous synonyms in programming?

Not quite the same. An atomic operation is one that can't be subdivided into smaller parts. So, in Java, assigning to an int is atomic: nothing can interrupt it, it either completes or doesn't.

A synchronized operation is one that is made to behave atomically through a programming mechanism, which in Java you invoke using the synchronized keyword. The implementation of that can vary. In a synchronized block, the runtime system enforces what's called a critical region, which only one thread of control can be inside at a time.
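A small sketch of such a critical region, under made-up names (PairedBalance, moveToSavings, total): two fields are updated together, and because both methods are synchronized on the same object, an observer can never see one field changed and the other not.

```java
// Hypothetical example: two fields whose sum must stay constant.
final class PairedBalance {
    private long checking = 1_000;
    private long savings = 0;

    // Both updates happen inside one critical region guarded by this object's lock.
    synchronized void moveToSavings(long amount) {
        checking -= amount;
        savings += amount;
    }

    // Takes the same lock, so it cannot run in the middle of moveToSavings.
    synchronized long total() {
        return checking + savings;
    }

    // Returns how many times a concurrent observer saw a total other than 1_000.
    static long countBrokenTotals(int transfers) {
        PairedBalance account = new PairedBalance();
        long[] broken = {0}; // written only by the auditor thread, read after join()

        Thread mover = new Thread(() -> {
            for (int i = 0; i < transfers; i++) {
                account.moveToSavings(1);
            }
        });
        Thread auditor = new Thread(() -> {
            for (int i = 0; i < transfers; i++) {
                if (account.total() != 1_000) {
                    broken[0]++;
                }
            }
        });

        mover.start();
        auditor.start();
        try {
            mover.join();
            auditor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return broken[0];
    }

    public static void main(String[] args) {
        System.out.println("broken totals seen: " + countBrokenTotals(100_000));
    }
}
```

Neither statement inside moveToSavings is special on its own; it is the lock that makes the pair look like one indivisible step to other synchronized code.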

In C#, what does atomic mean?

Atomic operations are ones that cannot be interrupted partway through, for example by another thread. Take for instance the statement

_value++;

If you have two threads executing this code at once with a starting value of 0, you may have the following

Thread A reads _value, 0
Thread A adds 1, 1
Thread B reads _value, 0
Thread B adds 1, 1
Thread A assigns to _value, 1
Thread B assigns to _value, 1

So now, even though we've called increment twice, the final value in _value is 1, not the expected 2. This is because the increment operator is not atomic: it is a read, an add, and a write.

The function Interlocked.Increment, however, is atomic, so replacing the above code with

Interlocked.Increment(ref _value);

would solve the given race condition.
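For readers more at home in Java, a rough counterpart of Interlocked.Increment is AtomicInteger.incrementAndGet. The sketch below is an analogue rather than the C# code itself, and the class name AtomicIncrementDemo and its helper method are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java analogue of the Interlocked.Increment fix.
final class AtomicIncrementDemo {
    // Runs two threads that each increment a shared counter and returns the final value.
    static int incrementFromTwoThreads(int incrementsEach) {
        AtomicInteger value = new AtomicInteger(0);

        Runnable work = () -> {
            for (int i = 0; i < incrementsEach; i++) {
                value.incrementAndGet(); // read, add, and write as one atomic step
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return value.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementFromTwoThreads(1)); // both increments are counted
    }
}
```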

EDIT: As a point of etymology, "atomic" usually means "indivisible" - the chemistry term we're familiar with is a misnomer held over from the belief that atoms were indivisible, only for later discoveries to break them down further into subatomic, quark, and quanta levels.

What exactly is std::atomic?

Each instantiation and full specialization of std::atomic<> represents a type whose instances different threads can operate on simultaneously without causing undefined behavior:

Objects of atomic types are the only C++ objects that are free from data races; that is, if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined.

In addition, accesses to atomic objects may establish inter-thread synchronization and order non-atomic memory accesses as specified by std::memory_order.

std::atomic<> wraps operations that, in pre-C++11 times, had to be performed using (for example) interlocked functions with MSVC or atomic builtins in the case of GCC.

Also, std::atomic<> gives you more control by allowing various memory orders that specify synchronization and ordering constraints. If you want to read more about C++ 11 atomics and memory model, these links may be useful:

C++ atomics and memory ordering
Comparison: Lockless programming with atomics in C++ 11 vs. mutex and RW-locks
C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?
Concurrency in C++11

Note that, for typical use cases, you would probably use the overloaded arithmetic operators or another set of overloaded operators:

std::atomic<long> value(0);
value++; //This is an atomic op
value += 5; //And so is this

Because operator syntax does not allow you to specify the memory order, these operations will be performed with std::memory_order_seq_cst, as this is the default order for all atomic operations in C++ 11. It guarantees sequential consistency (total global ordering) between all atomic operations.

In some cases, however, this may not be required (and nothing comes for free), so you may want to use a more explicit form:

std::atomic<long> value {0};
value.fetch_add(1, std::memory_order_relaxed); // Atomic, but there are no synchronization or ordering constraints
value.fetch_add(5, std::memory_order_release); // Atomic, performs 'release' operation

Now, your example:

a = a + 12;

will not evaluate to a single atomic op: it will result in a.load() (which is atomic itself), then the addition of 12 to that value, and finally a.store() (also atomic) of the result. As I noted earlier, std::memory_order_seq_cst will be used here.

However, if you write a += 12, it will be an atomic operation (as I noted before) and is roughly equivalent to a.fetch_add(12, std::memory_order_seq_cst).
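The same distinction can be sketched in Java, which has no operator overloading, so both forms are spelled as method calls on AtomicLong. This is an analogue of the C++ behavior, not the C++ code itself, and the class AddTwelve and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of "load, add, store" versus a single fetch-and-add.
final class AddTwelve {
    // Two separate atomic steps: another thread can update `a` between get() and set(),
    // and that update is then overwritten. Shown for contrast only.
    static void addInTwoSteps(AtomicLong a) {
        a.set(a.get() + 12);
    }

    // One atomic read-modify-write, comparable in spirit to fetch_add in C++.
    static void addAtomically(AtomicLong a) {
        a.addAndGet(12);
    }

    // Runs several threads that all use the atomic form and returns the final value.
    static long runAtomic(int threadCount, int addsEach) {
        AtomicLong a = new AtomicLong(0);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < addsEach; n++) {
                    addAtomically(a);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.get();
    }

    public static void main(String[] args) {
        System.out.println(runAtomic(4, 10_000));
    }
}
```

Swapping addAtomically for addInTwoSteps inside runAtomic may lose updates under contention, even though each individual call in it is atomic.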

As for your comment:

A regular int has atomic loads and stores. What's the point of wrapping it with atomic<>?

Your statement is only true for architectures that provide such a guarantee of atomicity for stores and/or loads. There are architectures that do not do this. Also, it is usually required that operations be performed on a word-/dword-aligned address to be atomic. std::atomic<> is something that is guaranteed to be atomic on every platform, without additional requirements. Moreover, it allows you to write code like this:

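#include <atomic>
#include <cassert>
#include <thread>

// generateData() and processData() are assumed to be defined elsewhere.
void* sharedData = nullptr;
std::atomic<int> ready_flag{0}; // brace-init: "= 0" does not compile before C++17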

// Thread 1
void produce()
{
    sharedData = generateData();
    ready_flag.store(1, std::memory_order_release);
}

// Thread 2
void consume()
{
    while (ready_flag.load(std::memory_order_acquire) == 0)
    {
        std::this_thread::yield();
    }

    assert(sharedData != nullptr); // will never trigger
    processData(sharedData);
}

Note that the assertion condition will always be true (and thus, will never trigger), so you can always be sure that the data is ready after the while loop exits. That is because:

store() to the flag is performed after sharedData is set (we assume that generateData() always returns something useful, in particular, never returns NULL) and uses std::memory_order_release order:

memory_order_release

A store operation with this memory order performs the release
operation: no reads or writes in the current thread can be reordered
after this store. All writes in the current thread are visible in
other threads that acquire the same atomic variable

sharedData is used after the while loop exits, and thus after load() from the flag has returned a non-zero value. load() uses std::memory_order_acquire order:

std::memory_order_acquire

A load operation with this memory order performs the acquire operation
on the affected memory location: no reads or writes in the current
thread can be reordered before this load. All writes in other threads
that release the same atomic variable are visible in the current
thread.

This gives you precise control over the synchronization and allows you to explicitly specify how your code may/may not/will/will not behave. This would not be possible if the only guarantee were atomicity itself, especially when it comes to very interesting sync models like the release-consume ordering.
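For comparison, the same release/acquire hand-off can be sketched in Java, assuming Java 9 or later, where AtomicInteger offers setRelease and getAcquire. This is an analogue of the C++ example above rather than a translation of the C++ memory model, and the class ReleaseAcquireHandoff and its members are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java 9+ analogue of the produce()/consume() pair.
final class ReleaseAcquireHandoff {
    private long[] sharedData; // plain, non-volatile field
    private final AtomicInteger readyFlag = new AtomicInteger(0);

    // Thread 1: write the data, then publish it with a release store.
    void produce() {
        sharedData = new long[] {65465498L};
        readyFlag.setRelease(1);
    }

    // Thread 2: spin on an acquire load, then read the data.
    long consume() {
        while (readyFlag.getAcquire() == 0) {
            Thread.onSpinWait();
        }
        return sharedData[0]; // the write made before setRelease is visible here
    }

    // Runs one producer and one consumer and returns what the consumer read.
    static long handOff() {
        ReleaseAcquireHandoff h = new ReleaseAcquireHandoff();
        long[] result = new long[1]; // written by the consumer, read after join()

        Thread consumer = new Thread(() -> result[0] = h.consume());
        Thread producer = new Thread(h::produce);
        consumer.start();
        producer.start();
        try {
            consumer.join();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff());
    }
}
```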
