Constants and Compiler Optimization in C++

What kind of optimization does const offer in C/C++?

Case 1:

When you declare a const in your program,

int const x = 2;

The compiler can optimize away this const by not providing storage for the variable; instead, the value can be recorded in the symbol table. A subsequent read then needs only a lookup into the symbol table rather than instructions to fetch the value from memory.

Note: If you do something like:

const int x = 1;
const int* y = &x;

Then this forces the compiler to allocate space for x, so that degree of optimization is not possible in this case.
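A minimal sketch of both behaviors, assuming an optimizing compiler such as GCC or Clang (the function names are illustrative):

const int x = 2;

int read_x() {
    return x;        // the value 2 can be folded in; no load from memory
}

const int* take_address() {
    return &x;       // taking the address forces x to occupy storage
}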

For function parameters, const means that the parameter is not modified in the function. As far as I know, there is no substantial performance gain from using const there; rather, it is a means to ensure correctness.



Case 2:

"Does declaring the parameter and/or the return value as const help the compiler to generate more optimal code?"

const Y& f( const X& x )
{
    // ... do something with x and find a Y object ...
    return someY;
}

What could the compiler do better? Could it avoid a copy of the parameter or the return value?

No, because the argument is already passed by reference.

Could it put a copy of x or someY into read-only memory?

No, as both x and someY live outside f()'s scope and come from and/or are given to the outside world. Even if someY is dynamically allocated on the fly within f() itself, it and its ownership are given up to the caller.

What about possible optimizations of code that appears inside the body of f()? Because of the const, could the compiler somehow improve the code it generates for the body of f()?

Even when you call a const member function, the compiler can't assume that the bits of object x or object someY won't be changed. Further, there are additional problems (unless the compiler performs global optimization): the compiler may not know for sure that no other code holds a non-const reference aliasing the same object as x and/or someY, nor whether any such non-const references to the same object might get used incidentally during the execution of f(). And the compiler may not even know whether the real objects, to which x and someY are merely references, were actually declared const in the first place.
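A small sketch of the aliasing problem (the global g and both functions are illustrative):

int g = 0;

void f(const int& x) {
    int before = x;    // suppose the compiler caches this read
    g = 42;            // if x aliases g, this changes the referenced object
    int after = x;     // so x must be re-read here; the cache would be stale
    (void)before; (void)after;
}

int main() {
    f(g);              // x and g really do refer to the same int
}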



Case 3:

void f( const Z z )
{
    // ...
}

Will there be any optimization in this?

Yes. Because the compiler knows that z truly is a const object, it can perform some useful optimizations even without global analysis. For example, if the body of f() contains a call like g( &z ), the compiler can be sure that the non-mutable parts of z do not change during the call to g().
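A minimal sketch of that guarantee (struct Z and g() are illustrative; imagine g is defined in another translation unit):

struct Z { int a; };

void g(const Z* p);        // opaque to the optimizer; cannot legally modify *p

int f(const Z z)           // z is a genuinely const object local to f
{
    int before = z.a;
    g(&z);                 // modifying z inside g would be undefined behavior,
                           // so the compiler may assume z is unchanged
    return before + z.a;   // z.a need not be reloaded after the call
}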

Constants and compiler optimization in C++

Let's disregard methods and look only at const objects; the compiler has much more opportunity for optimization here. If an object is declared const, then (ISO/IEC 14882:2003 7.1.5.1(4)):

Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime (3.8) results in undefined behavior.

Let's disregard objects that may have mutable members: the compiler is free to assume that the object will not be modified, so it can produce significant optimizations. These can include things like the following (a sketch follows the list):

  • incorporating the object's value directly into the machine's instruction opcodes
  • complete elimination of code that can never be reached because the const object is used in a conditional expression that is known at compile time
  • loop unrolling if the const object is controlling the number of iterations of a loop
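A minimal sketch showing all three, assuming an optimizing build (e.g. gcc -O2); the function is illustrative:

const int N = 4;                  // a genuine const object

int sum(const int* a) {
    int s = 0;
    for (int i = 0; i < N; ++i)   // trip count is known at compile time,
        s += a[i];                //   so the loop can be fully unrolled
    if (N > 8)                    // known false at compile time, so this
        s = -s;                   //   branch can be eliminated entirely
    return s;                     // N itself can be encoded as an immediate
}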

Note that this applies only if the actual object is const. It does not apply to objects that are accessed through const pointers or references, because those access paths can lead to objects that are not const (it is even well-defined to change objects through const pointers/references, as long as the actual object is non-const and you cast away the constness of the access path to the object).
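A minimal sketch of that last parenthetical (the names are illustrative):

int real = 1;                     // the underlying object is NOT const

void legal_write() {
    const int* p = &real;         // const access path to a non-const object
    *const_cast<int*>(p) = 2;     // well-defined: only the path was const
}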

In practice, I don't think there are compilers out there that perform significant optimizations for all kinds of const objects, but for objects of primitive types (ints, chars, etc.) compilers can be quite aggressive in optimizing their use.

Does the compiler optimize references to constant variables?

It mostly depends on the level of optimization and which compiler you are using.

With maximum optimizations, the compiler will probably just replace your whole code with char erm = '3';. GCC does this at -O3 anyway.

But it also depends on what you do with that variable. The compiler might not even allocate the variable, and instead use the raw number directly in each operation where the variable occurs.
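A small sketch of that constant propagation (assuming an optimizing build):

const int k = 3;

int scaled(int n) {
    return n * k;   // typically compiled as a multiply by the immediate 3;
                    // k itself may never be allocated or read from memory
}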

const char compiler optimization

This kind of optimization is called string interning.

GCC sets the -fmerge-constants flag by default:

Attempt to merge identical constants (string constants and floating-point constants) across compilation units.

This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

Let's make an executable with a third file, f.c, that references the strings:

#include <stdio.h>

// For proposition#1
extern const char foo1[], foo2[];

// For proposition#2
//extern const char *foo1, *foo2;

int main(void)
{
    printf("%s\n", foo1);
    printf("%s\n", foo2);

    return 0;
}

When you define the following respectively in f1.c and f2.c (proposition#1):

const char foo1[] = "SAME_VALUE";
const char foo2[] = "SAME_VALUE";

This results in two different memory locations into which the string "SAME_VALUE" is stored, so the string is duplicated:

$ gcc f.c f1.c f2.c
$ objdump -D a.out
[...]
0000000000001060 <main>:
1060: f3 0f 1e fa endbr64
1064: 48 83 ec 08 sub $0x8,%rsp
1068: 48 8d 3d 99 0f 00 00 lea 0xf99(%rip),%rdi <-- foo1@2008
106f: e8 dc ff ff ff callq 1050 <puts@plt>
1074: 48 8d 3d 9d 0f 00 00 lea 0xf9d(%rip),%rdi <-- foo2@2018
107b: e8 d0 ff ff ff callq 1050 <puts@plt>
[...]
0000000000002008 <foo1>:
2008: 53 'S' <-- 1 string @ 2008
2009: 41 'A'
200a: 4d 'M'
200b: 45 5f 'E' '_'
200d: 56 'V'
200e: 41 'A'
200f: 4c 55 'L' 'U'
2011: 45 'E'
...

0000000000002018 <foo2>:
2018: 53 'S' <-- Another string @ 2018
2019: 41 'A'
201a: 4d 'M'
201b: 45 5f 'E' '_'
201d: 56 'V'
201e: 41 'A'
201f: 4c 55 'L' 'U'
2021: 45 'E'

But if you define the following respectively in f1.c and f2.c (proposition#2):

const char *foo1 = "SAME_VALUE";
const char *foo2 = "SAME_VALUE";

You define two pointers which point to the same string. In this case, "SAME_VALUE" will likely not be duplicated. In the following raw disassembly, the string is at address 2004 and both foo1 and foo2 point to it:

$ gcc f.c f1.c f2.c
$ objdump -D a.out
[...]
2004: 53 'S' <-- 1 string @ 2004
2005: 41 'A'
2006: 4d 'M'
2007: 45 5f 'E' '_'
2009: 56 'V'
200a: 41 'A'
200b: 4c 55 'L' 'U'
200d: 45 'E'
[...]
0000000000001060 <main>:
1060: f3 0f 1e fa endbr64
1064: 48 83 ec 08 sub $0x8,%rsp
1068: 48 8b 3d a1 2f 00 00 mov 0x2fa1(%rip),%rdi <-- 106f+2fa1=foo1@4010
106f: e8 dc ff ff ff callq 1050 <puts@plt>
1074: 48 8b 3d 9d 2f 00 00 mov 0x2f9d(%rip),%rdi <-- 107b+2f9d=foo2@4018
[...]
0000000000004010 <foo1>:
4010: 04 20 <-- foo1 = @2004
[...]
0000000000004018 <foo2>:
4018: 04 20 <-- foo2 = @2004

To avoid the duplication with proposition#1, GCC provides -fmerge-all-constants:

Attempt to merge identical constants and identical variables.

This option implies -fmerge-constants. In addition to -fmerge-constants this considers e.g. even constant initialized arrays or initialized constant variables with integral or floating-point types. Languages like C or C++ require each variable, including multiple instances of the same variable in recursive calls, to have distinct locations, so using this option results in non-conforming behavior.

Let's rebuild proposition#1 with this flag. We can see that foo2 is optimized out; only foo1 is kept and referenced:

$ gcc -fmerge-all-constants f.c f1.c f2.c
$ objdump -D a.out
[...]
0000000000001149 <main>:
1149: f3 0f 1e fa endbr64
114d: 55 push %rbp
114e: 48 89 e5 mov %rsp,%rbp
1151: 48 8d 3d b0 0e 00 00 lea 0xeb0(%rip),%rdi <-- 1158(RIP) + eb0 = 2008 <foo1>
1158: e8 f3 fe ff ff callq 1050 <puts@plt>
115d: 48 8d 3d a4 0e 00 00 lea 0xea4(%rip),%rdi <-- 1164(RIP) + ea4 = 2008 <foo1>
1164: e8 e7 fe ff ff callq 1050 <puts@plt>
1169: b8 00 00 00 00 mov $0x0,%eax
[...]
0000000000002008 <foo1>:
2008: 53 'S' <--- foo2 optimized out, only foo1 defined
2009: 41 'A'
200a: 4d 'M'
200b: 45 5f 'E' '_'
200d: 56 'V'
200e: 41 'A'
200f: 4c 55 'L' 'U'
2011: 45 'E'

Does compiler optimize operation on const variable and literal const number?

(This question was the subject of my blog in October 2015; thanks for the interesting question!)

You have some good answers already that answer your factual question: No, the C# compiler does not generate the code to do a single multiplication by 86. It generates a multiplication by 43 and a multiplication by 2.

There are some subtleties here that no one has gone into though.

Multiplication is "left associative" in C#. That is,

x * y * z

must be computed as

(x * y) * z

And not

x * (y * z)

Now, is it the case that you ever get different answers for those two computations? If the answer is "no" then the operation is said to be an "associative operation" -- that is, it does not matter where we put the parentheses, and therefore the compiler can do optimizations that put the parentheses in the best place. (Note: I made an error in a previous edit of this answer where I said "commutative" when I meant "associative" -- a commutative operation is one where x * y is equal to y * x.)

In C#, string concatenation is an associative operation. If you say

myString + "hello" + "world" + myString

then you get the same result for

((myString + "hello") + "world") + myString

And

(myString + ("hello" + "world")) + myString

and therefore the C# compiler can do an optimization here; it can do a computation at compile time and generate the code as though you had written

(myString + "helloworld") + myString

which is in fact what the C# compiler does. (Fun fact: implementing that optimization was one of the first things I did when I joined the compiler team.)

Is a similar optimization possible for multiplication? Only if multiplication is associative. But it is not! There are a few ways in which it is not.

Let's look at a slightly different case. Suppose we have

x * 0.5 * 6.0

Can we just say that

(x * 0.5) * 6.0

is the same as

x * (0.5 * 6.0)

and generate a multiplication by 3.0? No. Suppose x is so small that x multiplied by 0.5 is rounded to zero. Then zero times 6.0 is still zero. So the first form can give zero, and the second form can give a non-zero value. Since the two operations give different results, the operation is not associative.
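A concrete sketch of that rounding effect, written in C++ for consistency with the rest of this page (the tiny value is an illustrative subnormal float):

#include <cstdio>

int main() {
    float x = 1e-45f;                   // a subnormal, very close to zero
    float a = (x * 0.5f) * 6.0f;        // x * 0.5f rounds to zero first
    float b = x * (0.5f * 6.0f);        // same as x * 3.0f
    std::printf("%g %g\n", a, b);       // prints two different values
}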

The C# compiler could have smarts added to it -- like I did for string concatenation -- to figure out in which cases multiplication is associative and do the optimization, but frankly it is simply not worth it. Saving on string concatenations is a huge win. String operations are expensive in time and memory. And it is very common for programs to contain very many string concatenations where constants and variables are mixed together. Floating point operations are very cheap in time and memory, it is hard to know which ones are associative, and it is rare to have long chains of multiplications in realistic programs. The time and energy it would take to design, implement and test that optimization would be better spent writing other features.

Is `if(CONSTANT) { ... }` optimized in C/C++?

This is not guaranteed but most compilers of good quality will do it.

C99 Rationale says in 6.4.9:

if (0) {
    /* code to be excluded */
}

Many modern compilers will generate no code for this if statement.

For example, with gcc (in C) an assembly dump shows that dead code under either if (0) ... else or if (1) ... else is optimized out even at -O0.
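A minimal sketch (the flag name is illustrative):

#include <cstdio>

constexpr bool kEnabled = false;     // value known at compile time

void report() {
    if (kEnabled) {                  // the condition folds to false...
        std::puts("debug output");   // ...so no code is emitted for this call
    }
    std::puts("always printed");
}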

Does the compiler really optimize to make these two functions the same assembly?

This function:

float a() {
    A a{55,67,12};
    return a.bar();
}

Has exactly the same observable behavior as this one:

float a() {
    return sqrt(55+67+12);
}

The same is true for b(). Further, sqrt(55+67+12) == sqrt(134) == 11.5758369028.

The binary representation of the IEEE-754 single-precision value 11.5758369028 is 01000001001110010011011010100001, and that bit pattern interpreted as an integer is 1094268577.

The compiler applied the so-called as-if rule to replace both functions with assembly that has exactly the same observable behavior as the original code: both functions return a float with value 11.5758369028.
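A quick standalone way to check that bit pattern (using memcpy for well-defined type punning):

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    float v = std::sqrt(134.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);             // reinterpret the float's bits
    std::printf("%.10f -> %u\n", v, (unsigned)bits); // 11.5758369... -> 1094268577
}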

C++ const compiler optimization

No, the compiler can't move it to global scope, because the variable isn't declared at global scope. Scope isn't the same thing as storage: scope denotes where the variable can be accessed from, and moving it to global scope would mean it could be accessed from anywhere, which isn't the case here.

The second part of the program exhibits undefined behavior: addrs contains dangling pointers after the function exits, and because std::set compares the existing pointers on insertion, that comparison is illegal. So I'd say yes, res2 can vary, but because of the UB, not for the reason you suspect.
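The original snippet isn't reproduced here, but a hypothetical reconstruction of the problem looks like this:

#include <set>

std::set<const int*> addrs;    // hypothetical global from the question

void f() {
    const int local = 5;       // lifetime ends when f() returns
    addrs.insert(&local);      // the stored pointer dangles afterwards; later
                               // insertions compare dangling pointers, which
                               // is undefined behavior
}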

C++ Can constant class data be optimized out of class by compiler?

The compiler is free to emit any code that preserves the "observable behavior" of the program (there is an exception for the copy constructor, which can be elided even if it has observable behavior, but that doesn't apply here). This is called the as-if rule.

struct X { int x; };

auto foo()
{
    X x{24};

    return x.x;
}

Any decent compiler will optimize the above to this:

foo():                                # @foo()
    mov eax, 24
    ret

As you can see, this doesn't have anything to do with constness (well, almost), just with observable behavior. You can try playing with the code, adding complexity, and see how smart the compiler is at figuring out that it can remove the code related to the class instance. Hint: it is very smart.


It's not clear to me what you mean by this:

is the compiler allowed to do this instead and make the class be sizeof(int) smaller?

I can tell you this: for a type X and an object x of that type, sizeof(x) is always equal to sizeof(X), regardless of instantiations of the class. In other words, the size of a class is determined when the class is defined, and as such it is not influenced by possible instantiations or the lack of them. The size of a class is the sum of the sizes of all its non-static data members, plus padding. The padding is implementation-defined and can usually be somewhat controlled (e.g. packed structs). So no, the size of a class can never be smaller than the sum of the sizes of all its non-static non-reference data members.
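A minimal sketch of that invariant (the struct is illustrative; the >= allows for implementation-defined padding):

struct C {
    const int k = 7;   // even a const non-static data member occupies storage
};

// the size is a property of the type, fixed at definition time:
static_assert(sizeof(C) >= sizeof(int), "k is part of every C object");

int main() {
    C c;
    return sizeof(c) == sizeof(C);   // always true, for any object of type C
}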


