How to Use C++20's Likely/Unlikely Attribute in If-Else Statement

Based on the example from the Jacksonville’18 ISO C++ Report, the syntax is correct, but it seems that it is not implemented yet:

if (a>b) [[likely]] {

10.6.6 Likelihood attributes [dcl.attr.likelihood] draft
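
For reference, here is a minimal sketch of the complete if-else form, with the attribute placed after the condition and before each branch's statement. The function name max_of and its body are hypothetical, chosen only so that the fragment compiles on its own. It assumes a compiler with C++20 attribute support.

```cpp
// Hypothetical helper: returns the larger of a and b.
// Assumption for this sketch: a > b is the common case at the call sites.
int max_of(int a, int b) {
    if (a > b) [[likely]] {
        return a;   // branch the compiler is told to expect
    } else [[unlikely]] {
        return b;   // branch the compiler is told is rare
    }
}
```

The attributes are only hints: the function returns the same value with or without them.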

Can a C++20 [[likely]] or [[unlikely]] attribute be used on the condition of a do-while loop?

[[likely]] cannot be applied to "conditions". It can be applied to labels and statements.
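
As a small sketch of the "labels" case, the attribute can precede a case or default label in a switch. The function dispatch and its return values are hypothetical, and the claim that opcode 1 dominates is an assumption made for illustration.

```cpp
// Hypothetical dispatcher.
// Assumption for this sketch: opcode 1 is by far the most common input.
int dispatch(int opcode) {
    switch (opcode) {
    [[likely]] case 1:        // attribute attached to a label
        return 10;
    case 2:                   // no hint for this label
        return 20;
    [[unlikely]] default:     // attribute attached to the default label
        return -1;
    }
}
```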

However this looks rather strange. Is this really the correct place for the attribute?

You've applied the attribute to the return statement. If we adjust whitespace a bit, then you'll see it better:

} while (i < 42);

[[likely]] return i;

You should apply the attribute to the block statement that is the loop body:

do [[likely]] {
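
A complete version of that placement might look like the sketch below. The function name count_up and the loop body are hypothetical; only the position of the attribute and the i < 42 condition come from the fragments above.

```cpp
// Hypothetical counter: increments i at least once, then until it reaches 42.
int count_up(int i) {
    do [[likely]] {       // the attribute applies to the loop body statement
        ++i;
    } while (i < 42);
    return i;             // plain return, no attribute attached to it
}
```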

Simple example where [[likely]] and [[unlikely]] affect program assembly?

There seems to be a bug in gcc: if you have two functions that are identical apart from their [[likely]] attributes, gcc incorrectly folds them together.

But if you use just one function, and switch between [[likely]]/[[unlikely]], assembly changes.

So, this function:

void bar(int i) {
    if(i) [[unlikely]] {
        true_path();
    } else [[likely]] {
        false_path();
    }
}

compiles to:

bar(int):
        test    edi, edi
        jne     .L4
        jmp     false_path()
.L4:
        jmp     true_path()

And this:

void bar(int i) {
    if(i) [[likely]] {
        true_path();
    } else [[unlikely]] {
        false_path();
    }
}

compiles to:

bar(int):
        test    edi, edi
        je      .L2
        jmp     true_path()
.L2:
        jmp     false_path()

Notice that the condition has changed: the first version jumps if i is non-zero, while the second one jumps if i is zero.

This is in agreement with the attributes: gcc generates code where the conditional jump is taken on the unlikely path.

Why isn’t my code with C++20 likely/unlikely attributes faster?

Per godbolt, the two functions generate identical assembly under MSVC:

        movsd   xmm1, QWORD PTR __real@4000000000000000
        comisd  xmm0, xmm1
        jbe     SHORT $LN2@calc
        xorps   xmm1, xmm1
        ucomisd xmm1, xmm0
        ja      SHORT $LN7@calc
        sqrtsd  xmm0, xmm0
        ret     0
$LN7@calc:
        jmp     sqrt
$LN2@calc:
        jmp     pow

Since MSVC is not open source, one can only guess why it ignores this optimization. Maybe it is because both branches are function calls (tail calls, hence jmp instead of call), and that is too costly for [[likely]] to make a difference.
But if clang is used, it's smart enough to optimize the power of 2 into x * x, so different code is generated. Following that lead, if your code is modified into

        double calc(double x) noexcept {
            if (x > 2)
                return x + 1;
            else
                return x - 2;
        }

MSVC would also output a different layout.
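
To try this on godbolt, the simplified calc can be written in two variants that differ only in the hint. This is a sketch: the names calc_expect_large and calc_expect_small are hypothetical, and whether a given compiler actually emits different layouts for them has to be checked in its output.

```cpp
// Variant hinting that x > 2 is the common case.
double calc_expect_large(double x) noexcept {
    if (x > 2) [[likely]]
        return x + 1;
    else
        return x - 2;
}

// Same logic, hinting that x > 2 is the rare case.
double calc_expect_small(double x) noexcept {
    if (x > 2) [[unlikely]]
        return x + 1;
    else
        return x - 2;
}
```

Both variants return the same values for every input; only the code layout the compiler picks may differ.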

How do the likely/unlikely macros in the Linux kernel work and what is their benefit?

They are a hint to the compiler to emit instructions that will cause branch prediction to favour the "likely" side of a jump instruction. This can be a big win: if the prediction is correct, the jump instruction is basically free and will take zero cycles. On the other hand, if the prediction is wrong, the processor pipeline needs to be flushed, which can cost several cycles. So long as the prediction is correct most of the time, this will tend to be good for performance.

Like all such performance optimisations, you should only do it after extensive profiling to ensure the code really is in a bottleneck and, given the micro nature of the change, that it is probably being run in a tight loop. Generally the Linux developers are pretty experienced, so I would imagine they have done that. They don't really care too much about portability as they only target gcc, and they have a very clear idea of the assembly they want it to generate.
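
For illustration, the macros are thin wrappers around __builtin_expect, along the lines of the sketch below. It only builds with compilers that provide that builtin (GCC, Clang). The function read_value is a hypothetical example of usage, not kernel code.

```cpp
// GNU C/C++ only: __builtin_expect is a compiler builtin, not standard C++.
// The !! normalises any truthy value to 1 so it can be compared to the
// expected value.
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

// Hypothetical function: returns -1 for a null pointer, else the pointed-to value.
int read_value(const int* p) {
    if (unlikely(p == nullptr)) {
        return -1;    // error path, assumed to be rare
    }
    return *p;        // common path
}
```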

What is the advantage of GCC's __builtin_expect in if else statements?

Imagine the assembly code that would be generated from:

if (__builtin_expect(x, 0)) {
    foo();
    ...
} else {
    bar();
    ...
}

I guess it should be something like:

  cmp   $x, 0
  jne   _foo
_bar:
  call  bar
  ...
  jmp   after_if
_foo:
  call  foo
  ...
after_if:

You can see that the instructions are arranged in such an order that the bar case precedes the foo case (as opposed to the C code). This can utilise the CPU pipeline better, since a jump thrashes the already fetched instructions.

Before the jump is executed, the instructions below it (the bar case) are pushed to the pipeline. Since the foo case is unlikely, jumping too is unlikely, hence thrashing the pipeline is unlikely.

Can I improve branch prediction with my code?

TL;DR: Yes, in C or C++ use a likely() macro, or C++20 [[likely]], to help the compiler make better asm. That's separate from influencing actual CPU branch-prediction, though. If writing in asm, lay out your code to minimize taken branches.

For most ISAs, there's no way in asm to hint the CPU whether a branch is likely to be taken or not. (Some exceptions include Pentium 4 (but not earlier or later x86), PowerPC, and some MIPS, which allow branch hints as part of conditional-branch asm instructions.)

Is it possible to tell the branch predictor how likely it is to follow the branch?

But not-taken straight-line code is cheaper than taken, so hinting the compiler in a high-level language to lay out code with the fast path contiguous doesn't help branch prediction accuracy, but can help (or hurt) performance. (I-cache locality, front-end bandwidth: remember code-fetch happens in contiguous 16 or 32-byte blocks, so a taken branch means a later part of that fetch block isn't useful. Also, branch prediction throughput: some CPUs, like Intel Skylake for example, can't handle a predicted-taken branch at more than 1 per 2 clocks, other than loop branches. That includes unconditional branches like jmp or ret.)

Taken branches are hard; not-taken branches keep the CPU on its toes, but if the prediction is accurate it's just a normal instruction for an execution unit (verifying the prediction), with nothing special for the front-end. See also Modern Microprocessors: A 90-Minute Guide!, which has a section on branch prediction. (And is overall excellent.)

What exactly happens when a skylake CPU mispredicts a branch?
Avoid stalling pipeline by calculating conditional early
How does the branch predictor know if it is not correct?

Many people misunderstand source-level branch hints as branch prediction hints. That could be one effect if compiling for a CPU that supports branch hints in asm, but for most the significant effect is in layout, and deciding whether to use branchless (cmov) or not; a [[likely]] condition also means it should predict well.

With some CPUs, especially older ones, the layout of a branch did sometimes influence runtime prediction: if the CPU didn't remember anything about the branch in its dynamic predictors, the standard static prediction heuristic is that forward conditional branches are not-taken and backward conditional branches are assumed taken (because that's normally the bottom of a loop). See the BTFNT section in https://danluu.com/branch-prediction/.

A compiler can lay out an if(c) x else y; either way: either matching the source, with a jump over x if !c as the first thing, or swapping the if and else blocks and using the opposite branch condition. Or it can put one block out-of-line (e.g. after the ret at the end of the function) so the fast path has no taken branches, conditional or otherwise, while the less likely path has to jump there and then jump back.

It's easy to do more harm than good with branch hints in high-level source, especially if surrounding code changes without paying attention to them, so profile-guided optimization is the best way for compilers to learn about branch predictability and likelihood. (e.g. gcc -O3 -fprofile-generate / run with some representative inputs that exercise code-paths in relevant ways / gcc -O3 -fprofile-use)
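
As a rough sketch of that workflow, the function below carries no source-level hint at all and leaves the decision to the profile. The file name hot.cpp, the function count_above, and the g++ spelling of the commands are assumptions for illustration. Only the -fprofile-generate and -fprofile-use flags come from the text above.

```cpp
#include <cstddef>
#include <vector>

// Assumed build steps for a hypothetical hot.cpp that contains this function
// plus a main() feeding it representative data:
//   g++ -O3 -fprofile-generate hot.cpp -o hot   (instrumented build)
//   ./hot                                        (run to record a profile)
//   g++ -O3 -fprofile-use hot.cpp -o hot         (rebuild using the profile)

// Counts the elements strictly greater than threshold.
std::size_t count_above(const std::vector<int>& values, int threshold) {
    std::size_t n = 0;
    for (int v : values) {
        if (v > threshold) {   // no [[likely]] here: the profile supplies the likelihood
            ++n;
        }
    }
    return n;
}
```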

But there are ways to hint in some languages, like C++20 [[likely]] and [[unlikely]], which are the portable version of GNU C likely() / unlikely() macros around __builtin_expect.

https://en.cppreference.com/w/cpp/language/attributes/likely C++20 [[likely]]
How to use C++20's likely/unlikely attribute in if-else statement syntax help
Is there a compiler hint for GCC to force branch prediction to always go a certain way? (to the literal question, no. To what's actually wanted, branch hints to the compiler, yes.)
How do the likely/unlikely macros in the Linux kernel work and what is their benefit? The GNU C macros using __builtin_expect, same effect but different syntax than C++20 [[likely]]
What is the advantage of GCC's __builtin_expect in if else statements? example asm output. (Also see CiroSantilli's answers to some of the other questions where he made examples.)
Simple example where [[likely]] and [[unlikely]] affect program assembly?

I don't know of ways to annotate branches for languages other than GNU C / C++, and ISO C++20.

Absent any hints or profile data

Without that, optimizing compilers have to use heuristics to guess which side of a branch is more likely. If it's a loop branch, they normally assume that the loop will run multiple times. On an if, they have some heuristics based on the actual condition and maybe what's in the blocks being controlled; IDK I haven't looked into what gcc or clang do.

I have noticed that GCC does care about the condition, though. It's not as naive as assuming that int values are uniformly randomly distributed, although I think it normally assumes that if (x == 10) foo(); is somewhat unlikely.

JIT compilers like in a JVM have an advantage here: they can potentially instrument branches in the early stages of running, to collect branch-direction information before making final optimized asm. OTOH they need to compile fast because compile time is part of total run time, so they don't try as hard to make good asm, which is a major disadvantage in terms of code quality.
