Are C++ Reads and Writes of an Int Atomic

Are C++ Reads and Writes of an int Atomic?

At first one might think that reads and writes of the native machine size are atomic, but there are a number of issues to deal with, including cache coherency between processors/cores. Use atomic operations like Interlocked* on Windows and the equivalent on Linux. C++0x (since standardized as C++11) adds an "atomic" template, std::atomic, that wraps these in a nice cross-platform interface. If you cannot use it and you are using a platform abstraction layer, it may provide these functions. ACE does; see the class template ACE_Atomic_Op.
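As a rough sketch of what the standard "atomic" template looks like in use (C++11 or later assumed): a writer flips a std::atomic<int> between two bit patterns while a reader checks that it never observes anything in between. The function name and the test values are made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical demo: a writer flips a shared std::atomic<int> between 0 and -1
// (all bits clear / all bits set) while a reader checks that it only ever
// observes one of those two values. Returns true if no torn value was seen.
bool reader_never_sees_torn_value(int iterations) {
    std::atomic<int> shared{0};
    std::atomic<bool> done{false};
    bool ok = true;  // written only by the reader, read after both joins

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i) {
            shared.store((i % 2 == 0) ? -1 : 0);  // atomic write
        }
        done.store(true);
    });

    std::thread reader([&] {
        while (!done.load()) {
            int v = shared.load();  // atomic read
            if (v != 0 && v != -1) ok = false;
        }
    });

    writer.join();
    reader.join();
    return ok;
}
```

With std::atomic<int> the function returns true on any conforming implementation; the same code with a plain int would be a data race as far as the standard is concerned.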

Are Reads and Writes of an int in C++ Atomic on x86-64 multi-core machine

The other question talks about variables "properly aligned". If it crosses a cache-line, the variable is not properly aligned. An int will not do that unless you specifically ask the compiler to pack a struct, for example.

You also assume that using volatile int is better than atomic<int>. If volatile int is the perfect way to sync variables on your platform, surely the library implementer would also know that and store a volatile x inside atomic<x>.

There is no requirement that atomic<int> has to be extra slow just because it is standard. :-)

gcc atomic read and writes

Define "safely".

If you just use a regular read, on x86, for naturally aligned 32-bit or smaller data, the read is atomic, so you will always read a valid value rather than one containing some bytes written by one thread and some by another. If any of those things are not true (not x86, not naturally aligned, larger than 32 bits...) all bets are off.

That said, you have no guarantee whatsoever that the value read will be particularly fresh, or that the sequence of values seen over multiple reads will be in any particular order. I have seen naive code that used volatile to stop the compiler from optimising the read away entirely, but used no other synchronisation mechanism, literally never see an updated value due to CPU caching.

If any of these things matter to you, and they really should, you should explicitly make the read atomic and use the appropriate memory barriers. The intrinsics you refer to take care of both of these things for you: you could call one of the atomic intrinsics in such a way that there is no side effect other than returning the value:

__sync_val_compare_and_swap(ptr, 0, 0)

__sync_add_and_fetch(ptr, 0)

__sync_sub_and_fetch(ptr, 0)

or whatever
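Wrapped up as functions, the intrinsics above might be used like this. This is only a sketch: it assumes GCC or Clang (the __sync_* builtins are compiler extensions, not standard C++), and the helper names are invented here.

```cpp
// Assumes GCC or Clang; __sync_* builtins are not part of standard C++.

// Hypothetical helper: atomic read by adding zero.
int atomic_read_via_add(int* ptr) {
    // Adding 0 leaves *ptr unchanged; the builtin returns the resulting
    // value and is documented as a full memory barrier.
    return __sync_add_and_fetch(ptr, 0);
}

// Hypothetical helper: atomic read by a no-op compare-and-swap.
int atomic_read_via_cas(int* ptr) {
    // If *ptr is 0 it is replaced by 0 (no visible change); either way the
    // value that was in *ptr before the operation is returned.
    return __sync_val_compare_and_swap(ptr, 0, 0);
}
```

Both helpers return the current value of *ptr without changing it. Newer GCC versions also offer __atomic_load_n(ptr, __ATOMIC_SEQ_CST), which expresses the same intent more directly.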

Do I need an atomic if a value is only written?

According to §5.1.2.4 ¶25 and ¶4 of the ISO C11 standard, two different threads writing to the same memory location using non-atomic operations in an unordered fashion causes undefined behavior. The ISO C standard makes no exception to this rule if all threads are writing the same value.

Although writing a 32-bit integer to a 4-byte aligned address is guaranteed to be atomic by the Intel/AMD specifications for x86/x64 CPUs, such an operation is not guaranteed to be atomic by the ISO C standard, unless you are using a data type that is guaranteed to be atomic by the ISO C standard (such as atomic_int_least32_t). Therefore, even if your threads write a value of type int32_t to a 4-byte aligned address, according to the ISO C standard, your program will still cause undefined behavior.

However, for practical purposes, it is probably safe to assume that the compiler is generating assembly instructions that perform the operation atomically, provided that the alignment requirements are met.

Even if the memory writes were not aligned and the CPU wouldn't execute the write instructions atomically, it is likely that your program will still work as intended. It should not matter if a write operation is split up into two write operations, because all threads are writing the exact same value.

If you decide not to use an atomic variable, then you should at least declare the variable as volatile. Otherwise, the compiler may emit assembly instructions that cause the variable to be only stored in a CPU register, so that the other CPUs may never see any changes to that variable.

So, to answer your question: It is probably not necessary to declare your variable as atomic. However, it is still highly recommended. Generally, all operations on variables that are accessed by several threads should either be atomic or be protected by a mutex. The only exception to this rule is if all threads are performing read-only operations on this variable.
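For what it's worth, making the write atomic does not have to cost much. Below is a minimal sketch in C++ (std::atomic rather than the C11 types discussed above), with several threads storing the same value using relaxed ordering. The variable and function names are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical shared flag that several threads set to the same value.
std::atomic<int> g_flag{0};

// Starts `threads` threads that all write 1 to g_flag, then returns the
// value observed after every thread has been joined.
int set_flag_from_threads(int threads) {
    g_flag.store(0, std::memory_order_relaxed);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([] {
            // With a plain int this concurrent write would be a data race;
            // with std::atomic it is well defined.
            g_flag.store(1, std::memory_order_relaxed);
        });
    }
    for (auto& th : pool) th.join();
    return g_flag.load(std::memory_order_relaxed);
}
```

set_flag_from_threads(8) returns 1. On x86/x64 a relaxed atomic store of an int typically compiles to an ordinary mov, so the defined behavior usually comes at no extra runtime cost there.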

Playing around with undefined behavior can be dangerous and is generally not recommended. In particular, if the compiler detects code that causes undefined behavior, it is allowed to treat that code as unreachable and optimize it away. In certain situations, some compilers actually do that. See this very interesting post by Microsoft Blogger Raymond Chen for more information.

Also, beware that several threads writing to the same location (or even the same cache line) can disrupt the CPU pipeline, because the x86/x64 architecture guarantees strong memory ordering which must be enforced. If the CPU's cache coherency protocol detects a possible memory order violation due to another CPU writing to the same cache line, the whole CPU pipeline may have to be cleared. For this reason, it may be more efficient for all threads to write to different memory locations (in different cache lines, at least 64 bytes apart) and to analyze the written data after all threads have been synchronized.
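The "different cache lines" idea could be sketched like this in C++. The 64-byte figure is the assumption from the paragraph above, the names are made up, and C++17 is assumed so that std::vector honors the over-alignment.

```cpp
#include <thread>
#include <vector>

// One slot per thread, padded and aligned to an assumed 64-byte cache line
// so that no two threads write to the same line.
struct alignas(64) PaddedSlot {
    long value = 0;
};

// Each thread increments only its own slot; the slots are summed after all
// threads have been joined (i.e. after synchronization).
long sum_with_padded_slots(int threads, long per_thread) {
    std::vector<PaddedSlot> slots(threads);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&slots, t, per_thread] {
            for (long i = 0; i < per_thread; ++i) {
                slots[t].value += 1;  // only thread t touches slot t
            }
        });
    }
    for (auto& th : pool) th.join();

    long total = 0;
    for (const auto& s : slots) total += s.value;
    return total;
}
```

No atomics are needed here because each memory location is written by exactly one thread and only read by the caller after join().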

ARM: Is writing/reading from int atomic?

It should be atomic, EXCEPT if that int is stored on a non-aligned address.

Are reads and writes to properties atomic in C#?

You need to distinguish between "atomic" and "thread-safe" more closely. As you say, writes are atomic for most built-in value types and references.

However, that doesn't mean they're thread-safe. It just means that if values "A" and "B" are both written, a thread will never see something in between. (e.g. a change from 1 to 4 will never show 5, or 2, or any value other than 1 or 4.) It doesn't mean that one thread will see value "B" as soon as it's been written to the variable. For that, you need to look at the memory model in terms of volatility. Without memory barriers, usually obtained through locking and/or volatile variables, writes to main memory may be delayed and reads may be advanced, effectively assuming that the value hasn't changed since the last read.

If you had a counter and you asked it for its latest value but never received the latest value because of a lack of memory barriers, I don't think you could reasonably call that thread-safe even though each operation may well be atomic.

This has nothing to do with properties, however - properties are simply methods with syntactic sugar around them. They make no extra guarantees around threading. The .NET 2.0 memory model does have more guarantees than the ECMA model, and it's possible that it makes guarantees around method entry and exit. Those guarantees should apply to properties as well, although I'd be nervous around the interpretation of such rules: it can be very difficult to reason about memory models sometimes.

Atomic reads in C

In general, a simple atomic fetch isn't provided by atomic operations libraries because it's rarely used; you read the value and then do something with it, and the lock needs to be held during that something so that you know that the value you read hasn't changed. So instead of an atomic read, there is an atomic read-modify-write of some kind (e.g. gcc's __sync_lock_test_and_set() or __sync_fetch_and_add()) which performs the lock; you then perform normal unsynchronized reads while you hold the lock.
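To make the "take the lock atomically, then read normally" pattern concrete, here is a small sketch in C++ using std::atomic_flag as the test-and-set primitive. The SpinLock class and the function name are invented for illustration; a real program would normally just use std::mutex.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock built on an atomic test-and-set.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous
        // value; spin until this thread is the one that found it clear.
        while (flag_.test_and_set(std::memory_order_acquire)) {
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// The counter is a plain long: it is only read and written while the lock
// is held, so ordinary unsynchronized accesses are enough.
long increment_under_lock(int threads, int per_thread) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                lock.lock();
                long seen = counter;  // normal read, valid while locked
                counter = seen + 1;   // the "something" done with the value
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

increment_under_lock(4, 10000) returns 40000, because the value read cannot change between the read and the write that depends on it.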

The exception is device drivers where you may have to actually lock the system bus to get atomicity with respect to other devices on the bus, or when implementing the locking primitives for atomic operations libraries; these are inherently machine-specific, and you'll have to delve into assembly language. On x86 processors, there are various atomic instructions, plus a lock prefix that can be applied to most operations that access memory and that holds a bus lock for the duration of the operation; other platforms (SPARC, MIPS, etc.) have similar mechanisms, but often the fine details differ. You will have to know the CPU you're programming for and quite probably have to know something about the machine's bus architecture in this case. And libraries for this rarely make sense, because you can't hold bus or memory locks across function entry/exit, and even with a macro library one has to be careful because of the implication that one could intersperse normal operations between macro calls when in fact that could break locking. It's almost always better to just code the entire critical section in assembly language.

Are C/C++ fundamental types atomic?

No, fundamental data types (e.g., int, double) are not atomic; see std::atomic.

Instead you can use std::atomic<int> or std::atomic<double>.

Note: std::atomic was introduced with C++11 and my understanding is that prior to C++11, the C++ standard didn't recognize the existence of multithreading at all.

As pointed out by @Josh, std::atomic_flag is an atomic boolean type. It is guaranteed to be lock-free, unlike the std::atomic specializations.
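If you want to check what you actually get on your platform, something along these lines should work (a sketch; the function names are made up). is_lock_free() is answered by the implementation at run time, so its result is platform-dependent, whereas std::atomic_flag is always lock-free.

```cpp
#include <atomic>

// Reports whether std::atomic<int> is lock-free on this implementation.
// The answer is platform-dependent; the standard does not promise true.
bool atomic_int_is_lock_free() {
    std::atomic<int> a{0};
    return a.is_lock_free();
}

// Demonstrates std::atomic_flag: the first test_and_set finds the flag
// clear, the second finds it already set. Returns true if that is what
// was observed.
bool atomic_flag_behaves_as_expected() {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    bool first = flag.test_and_set();   // previous state: clear -> false
    bool second = flag.test_and_set();  // previous state: set   -> true
    return !first && second;
}
```

On mainstream desktop targets atomic_int_is_lock_free() is expected to return true, but that is a property of the platform, not something the language guarantees.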

The quoted documentation is from: http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4567.pdf. I'm pretty sure the standard is not free and therefore this isn't the final/official version.

1.10 Multi-threaded executions and data races

Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one reads or modifies the same memory location.

The library defines a number of atomic operations (Clause 29) and operations on mutexes (Clause 30) that are specially identified as synchronization operations. These operations play a special role in making assignments in one thread visible to another. A synchronization operation on one or more memory locations is either a consume operation, an acquire operation, a release operation, or both an acquire and release operation. A synchronization operation without an associated memory location is a fence and can be either an acquire fence, a release fence, or both an acquire and release fence. In addition, there are relaxed atomic operations, which are not synchronization operations, and atomic read-modify-write operations, which have special characteristics.

Two actions are potentially concurrent if

(23.1) — they are performed by different threads, or

(23.2) — they are unsequenced, and at least one is performed by a signal handler.

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
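My reading of the quoted paragraphs, as a small sketch (names are hypothetical): a release store paired with an acquire load makes the plain write to payload happen before the plain read, so there is no data race even though payload itself is not atomic.

```cpp
#include <atomic>
#include <thread>

// Publishes a plain int from one thread to another using a release store
// and an acquire load on a separate atomic flag. Returns the value the
// consumer observed.
int publish_and_consume() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // synchronization variable
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // release operation
    });

    std::thread consumer([&] {
        // Acquire operation: once it reads true, the write to payload
        // happens before everything that follows in this thread.
        while (!ready.load(std::memory_order_acquire)) {
        }
        seen = payload;                                 // plain read, no race
    });

    producer.join();
    consumer.join();
    return seen;
}
```

publish_and_consume() returns 42. If ready were a plain bool instead, both the flag and the payload accesses would be conflicting, potentially concurrent, non-atomic actions, which is exactly the data race the standard text describes.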

29.5 Atomic types

There shall be explicit specializations of the atomic template for the integral types char, signed char, unsigned char, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long, char16_t, char32_t, wchar_t, and any other types needed by the typedefs in the header <cstdint>. For each integral type integral, the specialization atomic<integral> provides additional atomic operations appropriate to integral types. There shall be a specialization atomic<bool> which provides the general atomic operations as specified in 29.6.1.

There shall be pointer partial specializations of the atomic class template. These specializations shall have standard layout, trivial default constructors, and trivial destructors. They shall each support aggregate initialization syntax.

29.7 Flag type and operations

Operations on an object of type atomic_flag shall be lock-free. [ Note: Hence the operations should also be address-free. No other type requires lock-free operations, so the atomic_flag type is the minimum hardware-implemented type needed to conform to this International standard. The remaining types can be emulated with atomic_flag, though with less than ideal properties. — end note ]
