How-To Ensure That Compiler Optimizations Don't Introduce a Security Risk

How-to ensure that compiler optimizations don't introduce a security risk?

Yes, your concerns are legitimate. You need to use a specifically designed function like SecureZeroMemory() to prevent optimizations from changing your code's behaviour.
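For instance, on Windows (a minimal sketch; SecureZeroMemory is declared via <windows.h>):

    #include <windows.h>

    void handle_password()
    {
        char password[64] = {};
        // ... read and use the password ...

        // Unlike a plain memset, SecureZeroMemory is documented not to
        // be optimized away as a dead store.
        SecureZeroMemory(password, sizeof(password));
    }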

Don't forget that the string class itself should be specifically designed for handling passwords. For example, if the class reallocates the buffer to hold a longer string, it has to erase the old buffer before returning it to the memory allocator. I'm not sure, but it's likely std::string doesn't do that (at least not by default). Using an unsuitable string-handling class makes all your other precautions worthless - you'll have the password copied all over the program's memory before you even know it.

Frame pointer omission: any risk?

If your software produces stack traces when it crashes, omitting the frame pointer can prevent that from working.
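If stack traces matter, you can keep frame pointers explicitly (assuming gcc or clang, where higher optimization levels enable -fomit-frame-pointer on some targets):

    gcc -O2 -fno-omit-frame-pointer -g myprogram.c -o myprogram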

Is it possible to guarantee code doing memory writes is not optimized away in C++?

There is no portable solution. The compiler is free to make copies of your data in multiple places in memory while you are using it, and any zeroing function might clear only the one copy it happens to be working with at the time. Whatever solution you choose will be non-portable.
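A widely used, but still not standard-guaranteed, idiom is to write the zeros through a pointer to volatile (a sketch):

    #include <cstddef>

    void secure_zero(void* p, std::size_t n)
    {
        // Writing through a volatile lvalue discourages the compiler from
        // treating these stores as dead, but the standard does not strictly
        // require this for a non-volatile object.
        volatile unsigned char* vp = static_cast<volatile unsigned char*>(p);
        while (n--)
            *vp++ = 0;
    }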

Are compiler optimizations safe?

I don't have any data (and haven't heard of anyone that does ...) but ...

I'd choose which compiler I would use before I'd choose to disable optimizations. In other words, I wouldn't use any compiler I couldn't trust the optimizations on.

The Linux kernel is compiled with -Os. That's a lot more convincing to me than any bugzilla analysis.

Personally, I'd be okay with any version of gcc Linux is okay with.

As another data point, Apple has been converting from gcc to llvm, with and without clang. llvm has traditionally had issues with some C++, and while llvm-gcc is now a lot better, there still seem to be issues with clang++. But that just kind of proves the pattern: while Apple (purportedly) now compiles OS X and iOS with clang, they don't use much if any C++ or Objective-C++. So for pure C and Objective-C, I'd trust clang, but I still don't yet trust clang++.

Can compiler optimization introduce bugs?

Compiler optimizations can introduce bugs or undesirable behaviour. That's why you can turn them off.

One example: a compiler can optimize the read/write access to a memory location, doing things like eliminating duplicate reads or duplicate writes, or re-ordering certain operations. If the memory location in question is only used by a single thread and is actually ordinary memory, that may be OK. But if the memory location is a hardware device I/O register, then re-ordering or eliminating writes may be completely wrong. In this situation you normally have to write code knowing that the compiler might "optimize" it, and thus knowing that the naive approach doesn't work.
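A sketch of the device-register case (the address is made up for illustration):

    #include <cstdint>

    // Hypothetical memory-mapped status register.
    volatile std::uint32_t* const STATUS_REG =
        reinterpret_cast<volatile std::uint32_t*>(0x40000000);

    void wait_until_ready()
    {
        // Because the pointee is volatile, the compiler must re-read the
        // register on every iteration instead of hoisting the load.
        while ((*STATUS_REG & 0x1u) == 0)
        {
            // spin
        }
    }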

Update: As Adam Robinson pointed out in a comment, the scenario I describe above is more of a programming error than an optimizer error. But the point I was trying to illustrate is that some programs, which are otherwise correct, and some optimizations, which otherwise work properly, can introduce bugs when combined. In some cases the language specification says "You must do things this way because these kinds of optimizations may occur and your program will fail", in which case it's a bug in the code. But sometimes a compiler has a (usually optional) optimization feature that can generate incorrect code because the compiler is trying too hard to optimize the code or can't detect that the optimization is inappropriate. In this case the programmer must know when it is safe to turn on the optimization in question.

Another example:
The Linux kernel had a bug where a potentially NULL pointer was being dereferenced before a test for that pointer being null. However, in some cases it was possible to map memory to address zero, thus allowing the dereferencing to succeed. The compiler, upon noticing that the pointer was dereferenced, assumed that it couldn't be NULL, then removed the NULL test later and all the code in that branch. This introduced a security vulnerability into the code, as the function would proceed to use an invalid pointer containing attacker-supplied data. For cases where the pointer was legitimately null and the memory wasn't mapped to address zero, the kernel would still OOPS as before. So prior to optimization the code contained one bug; after optimization it contained two, and one of them allowed a local root exploit.
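A simplified sketch of that pattern (modeled on the 2009 tun driver bug; names abbreviated, not the actual kernel code):

    struct sock* sk = tun->sk;  // dereference happens first...
    if (!tun)                   // ...so the compiler infers tun != NULL
        return POLLERR;         // and deletes this check and branch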

CERT has a presentation called "Dangerous Optimizations and the Loss of Causality" by Robert C. Seacord which lists a lot of optimizations that introduce (or expose) bugs in programs. It discusses the various kinds of optimizations that are possible, from "doing what the hardware does" to "trap all possible undefined behaviour" to "do anything that's not disallowed".

Some examples of code that's perfectly fine until an aggressively-optimizing compiler gets its hands on it (safer rewrites of the first and last follow the list):

  • Checking for overflow

    // fails because the overflow test gets removed
    if (ptr + len < ptr || ptr + len > max) return EINVAL;
  • Relying on overflow arithmetic at all:

    // The compiler optimizes this to an infinite loop
    for (i = 1; i > 0; i += i) ++j;
  • Clearing memory of sensitive information:

    // the compiler can remove these "useless writes"
    memset(password_buffer, 0, sizeof(password_buffer));
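For comparison, here are safer rewrites of the first and last examples (a sketch; explicit_bzero is a glibc/BSD extension, not standard C):

    // Overflow check without forming an out-of-range pointer:
    // compare the remaining room instead (assumes ptr <= max).
    if (len > (size_t)(max - ptr)) return EINVAL;

    // Clearing sensitive memory with a call the compiler must keep:
    explicit_bzero(password_buffer, sizeof(password_buffer));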

The problem here is that compilers were, for decades, less aggressive in optimization, and so generations of C programmers learned and internalized things like fixed-size two's complement addition and how it overflows. Then the C language standard is amended by compiler developers, and the subtle rules change, despite the hardware not changing. The C language spec is a contract between the developers and compilers, but the terms of the agreement are subject to change over time, and not everyone understands every detail, or agrees that the details are even sensible.

This is why most compilers offer flags to turn off (or turn on) optimizations. Is your program written with the understanding that integers might overflow? Then you should turn off overflow optimizations, because they can introduce bugs. Does your program strictly avoid aliasing pointers? Then you can turn on the optimizations that assume pointers are never aliased. Does your program try to clear memory to avoid leaking information? Oh, in that case you're out of luck: you either need to turn off dead-code-removal or you need to know, ahead of time, that your compiler is going to eliminate your "dead" code, and use some work-around for it.
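Concretely, with gcc or clang the flags for the first two scenarios look like this (a sketch):

    # keep wrap-around semantics for signed integer overflow
    gcc -O2 -fwrapv program.c

    # type-based aliasing optimizations are on by default at -O2;
    # turn them off if your code aliases through incompatible types
    gcc -O2 -fno-strict-aliasing program.c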

Secure deallocation of boost::asio::const_buffer

Short answer: Asio buffers don't own their memory, so they should not be responsible for disposing of it either.

First off, you should not use

std::memset(p, 0, n * sizeof(T));

Use a function like SecureZeroMemory instead: How-to ensure that compiler optimizations don't introduce a security risk?

I realize you had volatile there for this reason, but it might not always be honoured like you expect:

Your secure_memset function might not be sufficient. According to http://open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf there are optimizing compilers that will only zero the first byte – Daniel Trebbien Nov 9 '12 at 12:50

Background reading:

  • https://cryptocoding.net/index.php/Coding_rules#Clean_memory_of_secret_data
  • http://blog.quarkslab.com/a-glance-at-compiler-internals-keep-my-memset.html

On to ASIO

Make sure you fully realize that Boost Asio buffers have no ownership semantics. They only ever reference data owned by another object.

More importantly than the question posed, you might want to check that you keep around the buffer data long enough. A common pitfall is to pass a local as a buffer:

std::string response = "OK\r\n\r\n";
asio::async_write(sock_, asio::buffer(response), ...); // OOOPS!!!

This leads to undefined behaviour: async_write returns immediately, so the asynchronous operation reads from response after the local variable has been destroyed.
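One way to fix it (a sketch) is to tie the buffer's lifetime to the completion handler, e.g. via a shared_ptr:

    auto response = std::make_shared<std::string>("OK\r\n\r\n");
    boost::asio::async_write(sock_, boost::asio::buffer(*response),
        [response](boost::system::error_code /*ec*/, std::size_t /*n*/) {
            // 'response' is kept alive until this handler has run
        });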

IOW const_buffer is a concept. There are a gazillion ways to construct it on top of (your own) objects:

documentation

A buffer object represents a contiguous region of memory as a 2-tuple consisting of a pointer and size in bytes. A tuple of the form {void*, size_t} specifies a mutable (modifiable) region of memory. Similarly, a tuple of the form {const void*, size_t} specifies a const (non-modifiable) region of memory. These two forms correspond to the classes mutable_buffer and const_buffer, respectively.

So, let's assume you have your buffer type

#include <array>
#include <cstddef>

struct SecureBuffer
{
    ~SecureBuffer() { shred(); }
    size_t size() const { return length_; }
    char const* data() const { return data_.data(); }

    // ...
private:
    void shred(); // uses SecureZeroMemory etc.

    std::array<char, 1024> data_ = {0};
    size_t length_ = 0u;
};

Then you can simply pass it where you want to use it:

SecureBuffer secret; // member variable (lifetime exceeds the async operation)
// ... set data
boost::asio::async_write(sock_,
    boost::asio::buffer(secret.data(), secret.size()),
    /*...*/
);

Is using an outdated C compiler a security risk?

Actually I would argue the opposite.

There are a number of cases where behaviour is undefined by the C standard but where it is obvious what would happen with a "dumb compiler" on a given platform. Cases like allowing a signed integer to overflow, or accessing the same memory through variables of two different types.

Recent versions of gcc (and clang) have started treating these cases as optimisation opportunities, not caring whether they change how the binary behaves under the "undefined behaviour" condition. This is very bad if your codebase was written by people who treated C like a "portable assembler". As time has gone on, the optimisers have looked at larger and larger chunks of code when doing these optimisations, increasing the chance that the binary will end up doing something other than what "a binary built by a dumb compiler" would do.
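A classic illustration of the signed-overflow case (a sketch):

    // With optimizations on, gcc and clang assume signed overflow cannot
    // happen, so this may be folded to "return 1" - even though a "dumb
    // compiler" returns 0 when x == INT_MAX.
    int always_positive(int x)
    {
        return x + 1 > x;
    }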

There are compiler switches to restore "traditional" behaviour (-fwrapv and -fno-strict-aliasing for the two cases I mentioned above), but first you have to know about them.

While in principle a compiler bug could turn compliant code into a security hole, I would consider the risk of this to be negligible in the grand scheme of things.

Does using pointer to volatile prevent compiler optimizations at all times?

The compiler is free to optimize your code out because buffer is not a volatile object.

The Standard only requires a compiler to strictly adhere to semantics for volatile objects. Here is what C++03 says

The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous evaluations are complete and
    subsequent evaluations have not yet occurred.
    [...]

and

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and
calls to library I/O functions

In your example, what you have are reads and writes using volatile lvalues to non-volatile objects. C++0x removed the second text I quoted above, because it's redundant. C++0x just says

The least requirements on a conforming implementation are:

  • Access to volatile objects are evaluated strictly according to the rules of the abstract machine.[...]

These collectively are referred to as the observable behavior of the program.

While one may argue that "volatile data" could maybe mean "data accessed by volatile lvalues", which would still be quite a stretch, the C++0x wording removed all doubts about your code and clearly allows implementations to optimize it away.

But as people have pointed out to me, it probably does not matter in practice. A compiler that optimizes such a thing away would most probably go against the programmer's intention (why else would someone have a pointer to volatile?) and so would probably contain a bug. Still, I have experienced compiler vendors that cited these paragraphs when they were faced with bug reports about their over-aggressive optimizations. In the end, volatile is inherently platform-specific and you are supposed to double-check the result anyway.
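A minimal sketch of the distinction the standard draws:

    char buffer[16];            // non-volatile object
    volatile char* p = buffer;
    *p = 0;                     // volatile lvalue, but the object isn't
                                // volatile: the write may legally be
                                // optimized away (per the C++0x wording)

    volatile char vbuffer[16];  // volatile object
    vbuffer[0] = 0;             // must be performed according to the
                                // rules of the abstract machine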

How-to write a password-safe class?

Yes, first define a custom allocator:

#include <algorithm>
#include <memory>

template <class T> class SecureAllocator : public std::allocator<T>
{
public:
    // Pull the dependent typedefs out of the base class so the
    // deallocate signature below compiles.
    typedef typename std::allocator<T>::pointer pointer;
    typedef typename std::allocator<T>::size_type size_type;

    template <class U> struct rebind { typedef SecureAllocator<U> other; };

    SecureAllocator() throw() {}
    SecureAllocator(const SecureAllocator&) throw() {}
    template <class U> SecureAllocator(const SecureAllocator<U>&) throw() {}

    void deallocate(pointer p, size_type n)
    {
        // Zero through a volatile pointer so the stores are not
        // eliminated as dead writes before the memory is released.
        std::fill_n((volatile char*)p, n * sizeof(T), 0);
        std::allocator<T>::deallocate(p, n);
    }
};

This allocator zeros the memory before deallocating. Now you typedef:

typedef std::basic_string<char, std::char_traits<char>, SecureAllocator<char>> SecureString;

However, there is a small problem: std::string may use the small string optimization and store some data inside itself, without dynamic allocation. So you must explicitly clear it on destruction, or allocate it on the heap with our custom allocator:

int main(int, char**)
{
    using boost::shared_ptr;
    using boost::allocate_shared;

    // allocate_shared places the SecureString object itself (including
    // any in-object SSO buffer) in memory from SecureAllocator.
    shared_ptr<SecureString> str =
        allocate_shared<SecureString>(SecureAllocator<SecureString>(), "aaa");
}

This guarantees that all the data is zeroed before deallocation, including the size of the string, for example.


