C++: Why Bool Is 8 Bits Long

C++: Why is bool 8 bits long?

Because every C++ data type must be addressable.

How would you create a pointer to a single bit? You can't. But you can create a pointer to a byte. So a boolean in C++ is typically byte-sized. (It may be larger as well; that's up to the implementation. The main thing is that it must be addressable, so no C++ data type can be smaller than a byte.)
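A minimal sketch of the addressability point, assuming nothing beyond the standard library. The helper name bool_stride is made up for illustration; it measures how far apart two neighbouring bool objects sit in memory.

```cpp
#include <cassert>
#include <cstddef>

// sizeof counts whole bytes, so no object type can report a size of 0.
static_assert(sizeof(bool) >= 1, "bool occupies at least one byte");

// Hypothetical helper: distance in bytes between two adjacent bool elements.
inline std::size_t bool_stride() {
    bool flags[2] = {false, true};
    // Each element has its own address, which is only possible because
    // each one occupies at least one whole byte.
    const char* first  = reinterpret_cast<const char*>(&flags[0]);
    const char* second = reinterpret_cast<const char*>(&flags[1]);
    return static_cast<std::size_t>(second - first);
}

// Usage: the stride between elements is exactly sizeof(bool).
inline void check_bool_stride() {
    assert(bool_stride() == sizeof(bool));
}
```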

C++: Is bool a 1-bit variable?

No, there's no such thing as a 1-bit variable.

The smallest unit that can be addressed in C++ is an unsigned char.

Is a bool[8] then 1 Byte long?

No.

Or 8 Bytes?

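Not necessarily. It is 8 * sizeof(bool) bytes, and sizeof(bool) is implementation-defined (typically 1). How many bits that adds up to depends on the number of bits the target machine uses for an unsigned char.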
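If you want to check this on your own compiler, something like the following should do. It is only a sketch, and bool_array_bytes is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cstddef>

// An array is just its elements laid end to end, so this holds everywhere.
static_assert(sizeof(bool[8]) == 8 * sizeof(bool),
              "no padding between array elements");

// Hypothetical helper: size in bytes of an array of n bools.
constexpr std::size_t bool_array_bytes(std::size_t n) {
    return n * sizeof(bool);
}

// Usage: bool[8] can never shrink to a single byte, whatever sizeof(bool) is.
inline void check_bool_array_size() {
    assert(bool_array_bytes(8) == sizeof(bool[8]));
    assert(bool_array_bytes(8) >= 8);
}
```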

But I don't want to waste space if a bool in C++ is 8 bits long...

You can avoid wasting space when dealing with bits by using std::bitset, or boost::dynamic_bitset if you need dynamic sizing.
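A small std::bitset sketch, for illustration only. The flag positions and the function name count_set_flags are arbitrary choices made up for this example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical example: track 64 flags in one bitset instead of a bool[64].
inline std::size_t count_set_flags() {
    std::bitset<64> flags;   // default-constructed: every bit is 0
    flags.set(3);            // turn bit 3 on
    flags[10] = true;        // operator[] returns a proxy object, because a
                             // single bit has no address to form a bool& from
    flags.flip(3);           // bit 3 back off
    assert(flags.test(10) && !flags.test(3));
    return flags.count();    // number of bits currently set
}
```

On mainstream implementations a std::bitset<64> takes 8 bytes where a bool[64] takes 64, but the exact size is up to the library, so check sizeof on your own toolchain.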

As pointed out by @zett42 in their comment, you can also address single bits with a bit-field struct (but for reasons of alignment this will probably use even more space):

struct S {
    // will usually occupy 4 bytes:
    unsigned b1 : 1, 
             b2 : 1,
             b3 : 1;
};
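To tie this back to the addressability argument: the individual bit-fields can be read and written, but you cannot take a pointer to one. A rough sketch follows, where the struct name Flags and the helper read_back_flags are hypothetical.

```cpp
#include <cassert>

struct Flags {               // hypothetical name; same members as the struct above
    unsigned b1 : 1,
             b2 : 1,
             b3 : 1;
};

// Hypothetical helper: set one bit-field and pack all three into an integer.
inline unsigned read_back_flags() {
    Flags f{};               // value-initialization zeroes all three bit-fields
    f.b2 = 1;
    // unsigned* p = &f.b2;  // would not compile: a bit-field has no address
    assert(f.b1 == 0 && f.b3 == 0);
    return f.b1 | (f.b2 << 1) | (f.b3 << 2);
}
```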

Boolean values as 8 bit in compilers. Are operations on them inefficient?

TL;DR: current compilers still have bool missed-optimizations when doing stuff like (a&&b) ? x : y. But the reason why is not that they don't assume 0/1, they just suck at this.

Many uses of bool are for locals, or inline functions, so booleanizing to a 0 / 1 can optimize away and branch (or cmov or whatever) on the original condition. Only worry about optimizing bool inputs / outputs when it does have to get passed/returned across something that doesn't inline, or really stored in memory.

Possible optimization guideline: combine bools from external sources (function args / memory) with bitwise operators, like a&b. MSVC and ICC do better with this. IDK if it's ever worse for local bools. Beware that a&b is only equivalent to a&&b for bool, not integer types. 2 && 1 is true, but 2 & 1 is 0 which is false. Bitwise OR doesn't have this problem.
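The int-vs-bool caveat can be checked with a few lines. This is just a sketch; the function names are made up for the example.

```cpp
#include <cassert>

// For integers the two operators can disagree.
inline bool logical_and(int a, int b) { return a && b; }        // both nonzero?
inline bool bitwise_and(int a, int b) { return (a & b) != 0; }  // share a set bit?

// For bool operands both forms agree on all four input combinations.
inline bool forms_agree_for_bool() {
    const bool values[] = {false, true};
    for (bool a : values)
        for (bool b : values)
            if ((a && b) != static_cast<bool>(a & b)) return false;
    return true;
}

// Usage: 2 && 1 is true, but 2 & 1 is 0.
inline void check_and_forms() {
    assert(logical_and(2, 1));
    assert(!bitwise_and(2, 1));
    assert(forms_agree_for_bool());
}
```

Also keep in mind that && short-circuits while & always evaluates both operands, so the swap only preserves behaviour when the right-hand side has no side effects and is safe to evaluate unconditionally.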

IDK if this guideline will ever hurt for locals that were set from a comparison within the function (or in something that inlined). E.g. it might lead the compiler to actually make integer booleans instead of just using comparison results directly when possible. Also note that it doesn't seem to help with current gcc and clang.

Yes, C++ implementations on x86 store bool in a byte that's always 0 or 1 (at least across function-call boundaries where the compiler has to respect the ABI / calling convention which requires this.)

Compilers do sometimes take advantage of this, e.g. for bool->int conversion even gcc 4.4 simply zero-extends to 32-bit (movzx eax, dil). Clang and MSVC do this, too. C and C++ rules require this conversion to produce 0 or 1, so this behaviour is only safe if it's always safe to assume that a bool function arg or global variable has a 0 or 1 value.

Even old compilers typically did take advantage of it for bool->int, but not in other cases. Thus, Agner is wrong about the reason when he says:

"The reason why the compiler doesn't make such an assumption is that the variables might have other values if they are uninitialized or come from unknown sources."

MSVC CL19 does make code that assumes bool function args are 0 or 1, so the Windows x86-64 ABI must guarantee this.

In the x86-64 System V ABI (used by everything other than Windows), the changelog for revision 0.98 says "Specify that _Bool (aka bool) is booleanized at the caller." I think even before that change, compilers were assuming it, but this just documents what compilers were already relying on. The current language in the x86-64 SysV ABI is:

3.1.2 Data Representation
Booleans, when stored in a memory object, are stored as single byte objects the value of which is always 0 (false) or 1 (true). When stored in integer registers (except for passing as arguments), all 8 bytes of the register are significant; any nonzero value is considered true.

The second sentence is nonsense: the ABI has no business telling compilers how to store things in registers inside a function, only at boundaries between different compilation units (memory / function args and return values). I reported this ABI defect a while ago on the GitHub page where it's maintained.

3.2.3 Parameter passing:
When a value of type _Bool is returned or passed in a register or on the stack, bit 0 contains the truth value and bits 1 to 7 shall be zero¹⁶.
(footnote 16): Other bits are left unspecified, hence the consumer side of those values can rely on it being 0 or 1 when truncated to 8 bit.

The language in the i386 System V ABI is the same, IIRC.

Any compiler that assumes 0/1 for one thing (e.g. conversion to int) but fails to take advantage of it in other cases has a missed optimization. Unfortunately such missed-optimizations still exist, although they are rarer than when Agner wrote that paragraph about compilers always re-booleanizing.

(Source + asm on the Godbolt compiler explorer for gcc4.6 / 4.7, and clang/MSVC. See also Matt Godbolt's CppCon2017 talk What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid)

bool logical_or(bool a, bool b) { return a||b; }

 # gcc4.6.4 -O3 for the x86-64 System V ABI
    test    dil, dil            # test a against itself (for non-zero)
    mov     eax, 1
    cmove   eax, esi            # return   a ? 1 : b;
    ret

So even gcc4.6 didn't re-booleanize b, but it did miss the optimization that gcc4.7 makes (as do clang and later compilers, as shown in other answers):

    # gcc4.7 -O3 to present: looks ideal to me.
    mov     eax, esi
    or      eax, edi
    ret

(Clang's or dil, sil / mov eax, edi is silly: it's guaranteed to cause a partial-register stall on Nehalem or earlier Intel when reading edi after writing dil, and it has worse code size from needing a REX prefix to use the low-8 part of edi. A better choice might be or dil,sil / movzx eax, dil if you want to avoid reading any 32-bit registers in case your caller left some arg-passing registers with "dirty" partial registers.)

MSVC emits this code that checks a then b separately, completely failing to take advantage of anything, and even using xor al,al instead of xor eax,eax. So it has a false dependency on the old value of eax on most CPUs (including Haswell/Skylake, which don't rename low-8 partial regs separately from the whole register, only AH/BH/...). This is just dumb. The only reason to ever use xor al,al is when you explicitly want to preserve the upper bytes.

logical_or PROC                     ; x86-64 MSVC CL19
    test     cl, cl                 ; Windows ABI passes args in ecx, edx
    jne      SHORT $LN3@logical_or
    test     dl, dl
    jne      SHORT $LN3@logical_or
    xor      al, al                 ; missed peephole: xor eax,eax is strictly better
    ret      0
$LN3@logical_or:
    mov      al, 1
    ret      0
logical_or ENDP

ICC18 also doesn't take advantage of the known 0/1 nature of the inputs, it just uses an or instruction to set flags according to the bitwise OR of the two inputs, and setcc to produce a 0/1.

logical_or(bool, bool):             # ICC18
    xor       eax, eax                                      #4.42
    movzx     edi, dil                                      #4.33
    movzx     esi, sil                                      #4.33
    or        edi, esi                                      #4.42
    setne     al                                            #4.42
    ret                                                     #4.42

ICC emits the same code even for bool bitwise_or(bool a, bool b) { return a|b; }. It promotes to int (with movzx), and uses or to set flags according to the bitwise OR. This is dumb compared to or dil,sil / setne al.

For bitwise_or, MSVC does just use an or instruction (after movzx on each input), but anyway doesn't re-booleanize.

Missed optimizations in current gcc/clang:

Only ICC/MSVC were making dumb code with the simple function above, but this function still gives gcc and clang trouble:

int select(bool a, bool b, int x, int y) {
    return (a&&b) ? x : y;
}

Source+asm on the Godbolt compiler explorer (Same source, different compilers selected vs. last time).

Looks simple enough; you'd hope that a smart compiler would do it branchlessly with one test/cmov. x86's test instruction sets flags according to a bitwise AND. It's an AND instruction that doesn't actually write the destination. (Just like cmp is a sub that doesn't write the destination).

# hand-written implementation that no compilers come close to making
select:
    mov     eax, edx      # retval = x
    test    edi, esi      # ZF =  ((a & b) == 0)
    cmovz   eax, ecx      # conditional move: return y if ZF is set
    ret

But even the daily builds of gcc and clang on the Godbolt compiler explorer make much more complicated code, checking each boolean separately. They know how to optimize bool ab = a&&b; if you return ab, but even writing it that way (with a separate boolean variable to hold the result) doesn't manage to hand-hold them into making code that doesn't suck.

Note that test same,same is exactly equivalent to cmp reg, 0, and is smaller, so it's what compilers use.

Clang's version is strictly worse than my hand-written version. (Note that it requires that the caller zero-extended the bool args to 32-bit, like it does for narrow integer types as an unofficial part of the ABI which it and gcc implement but only clang depends on).

select:  # clang 6.0 trunk 317877 nightly build on Godbolt
    test    esi, esi
    cmove   edx, ecx         # x = b ? y : x
    test    edi, edi
    cmove   edx, ecx         # x = a ? y : x
    mov     eax, edx         # return x
    ret

gcc 8.0.0 20171110 nightly makes branchy code for this, similar to what older gcc versions do.

select(bool, bool, int, int):   # gcc 8.0.0-pre   20171110
    test    dil, dil
    mov     eax, edx          # compiling with -mtune=intel or -mtune=haswell would keep test/jcc together for macro-fusion.
    je      .L8
    test    sil, sil
    je      .L8
    rep ret
.L8:
    mov     eax, ecx
    ret

MSVC x86-64 CL19 makes very similar branchy code. It's targeting the Windows calling convention, where integer args are in rcx, rdx, r8, r9.

select PROC
        test     cl, cl         ; a
        je       SHORT $LN3@select
        mov      eax, r8d       ; retval = x
        test     dl, dl         ; b
        jne      SHORT $LN4@select
$LN3@select:
        mov      eax, r9d       ; retval = y
$LN4@select:
        ret      0              ; 0 means rsp += 0 after popping the return address, not C return 0.
                                ; MSVC doesn't emit the `ret imm16` opcode here, so IDK why they put an explicit 0 as an operand.
select ENDP

ICC18 also makes branchy code, but with both mov instructions after the branches.

select(bool, bool, int, int):
        test      dil, dil                                      #8.13
        je        ..B4.4        # Prob 50%                      #8.13
        test      sil, sil                                      #8.16
        jne       ..B4.5        # Prob 50%                      #8.16
..B4.4:                         # Preds ..B4.2 ..B4.1
        mov       edx, ecx                                      #8.13
..B4.5:                         # Preds ..B4.2 ..B4.4
        mov       eax, edx                                      #8.13
        ret                                                     #8.13

Trying to help the compiler by using

int select2(bool a, bool b, int x, int y) {
    bool ab = a&&b;
    return (ab) ? x : y;
}

leads MSVC into making hilariously bad code:

;; MSVC CL19  -Ox  = full optimization
select2 PROC
    test     cl, cl
    je       SHORT $LN3@select2
    test     dl, dl
    je       SHORT $LN3@select2
    mov      al, 1              ; ab = 1

    test     al, al             ;; and then test/cmov on an immediate constant!!!
    cmovne   r9d, r8d
    mov      eax, r9d
    ret      0
$LN3@select2:
    xor      al, al            ;; ab = 0

    test     al, al            ;; and then test/cmov on another path with known-constant condition.
    cmovne   r9d, r8d
    mov      eax, r9d
    ret      0
select2 ENDP

The full version of this happens only with MSVC, although ICC18 has the same missed optimization of test/cmov on a register that was just set to a constant.

gcc and clang as usual don't make code as bad as MSVC; they make the same asm they do for select(), which is still not good but at least trying to help them doesn't make it worse like with MSVC.

Combining `bool` with bitwise operators helps MSVC and ICC

In my very limited testing, | and & seem to work better than || and && for MSVC and ICC. Look at the compiler output for your own code with your compiler + compile options to see what happens.

int select_bitand(bool a, bool b, int x, int y) {
    return (a&b) ? x : y;
}

Gcc still branches separately on separate tests of the two inputs, same code as the other versions of select. clang still does two separate test/cmov, same asm as for the other source versions.

MSVC comes through and optimizes correctly, beating all the other compilers (at least in the stand-alone definition):

select_bitand PROC            ;; MSVC
    test     cl, dl           ;; ZF =  !(a & b)
    cmovne   r9d, r8d
    mov      eax, r9d         ;; could have done the mov to eax in parallel with the test, off the critical path, but close enough.
    ret      0

ICC18 wastes two movzx instructions zero-extending the bools to int, but then makes the same code as MSVC:

select_bitand:          ## ICC18
    movzx     edi, dil                                      #16.49
    movzx     esi, sil                                      #16.49
    test      edi, esi                                      #17.15
    cmovne    ecx, edx                                      #17.15
    mov       eax, ecx                                      #17.15
    ret                                                     #17.15

Why isn't the size of a bool data type only 1 bit in C#?

Is it because the smallest 'addressable' size of a value is a byte

Yep, exactly the same thing. In order for the CLR to be efficient, it maps its data types to the native machine data types in much the same way as the compiler does in C++ (pretty much).

In C how much space does a bool (boolean) take up? Is it 1 bit, 1 byte or something else?

If you are referring to C99 _Bool try:

printf("%zu\n", sizeof(_Bool)); /* Typically 1. */

Note the standard says:

6.2.5

An object declared as type _Bool is large enough to store the values 0
and 1.

The size cannot be smaller than one byte. But it would be legal to be larger than one byte.

Why does the boolean data type need 8 bits?

I think it may need more than 8 bits; it depends on the JVM. In the Oracle JVM a primitive boolean needs 8 bits; the reason is limited support and lack of optimization.

Read also: What is the size of a boolean variable in Java?

From The Java Tutorials - Primitive Data Types:

boolean: The boolean data type has only two possible values: true and false. Use this data type for simple flags that track true/false conditions. This data type represents one bit of information, but its "size" isn't something that's precisely defined.

From The Java® Virtual Machine Specification:

Although the Java Virtual Machine defines a boolean type, it only provides
very limited support for it. There are no Java Virtual Machine instructions solely
dedicated to operations on boolean values. Instead, expressions in the Java
programming language that operate on boolean values are compiled to use values
of the Java Virtual Machine int data type.

In Oracle’s Java Virtual Machine implementation, boolean arrays in the Java
programming language are encoded as Java Virtual Machine byte arrays, using 8 bits per
boolean element.

For example, a Boolean object looks like this in memory:

header:   8 bytes 
value:    1 byte 
padding:  7 bytes
------------------
sum:      16 bytes

As an alternative to boolean[] you can use for example java.util.BitSet.

Why is it hard to store booleans as 1 bit? Read Vlad from Moscow's answer: you can't address one bit of memory.

Why is a boolean 1 byte and not 1 bit of size?

Because the CPU can't address anything smaller than a byte.

Is it possible to send a decimal value (8 bit) in a bool? If yes, then how?

Not in C++, no. A bool can hold true or false. There is no way to store 2 in a bool without first invoking undefined behaviour. Once you have invoked undefined behaviour, anything can happen. (Including what you expected except when demo'ing to important clients).

Also, a bool is not necessarily 8 bits long. It must be at least as large as a char (because sizeof(bool) must be at least 1), and the limits on the range of values which an unsigned char can hold mean that it must be at least 8 bits. OTOH, there is nothing to stop an implementation using a bool which is larger than char, and there actually are implementations where char is 32 or 64 bits (DSP chips in the main).
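Both lower bounds can be stated as compile-time checks. A minimal sketch, where bool_storage_bits is a hypothetical helper name:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// CHAR_BIT is the number of bits in a char; it is at least 8.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
// sizeof counts in units of char, so bool is at least one char wide.
static_assert(sizeof(bool) >= sizeof(char), "bool is at least one byte");

// Hypothetical helper: bits of storage a bool occupies on this implementation.
inline std::size_t bool_storage_bits() {
    return sizeof(bool) * CHAR_BIT;
}

// Usage: 8 is only a lower bound; a DSP with a 32-bit char would report 32 or more.
inline void check_bool_width() {
    assert(bool_storage_bits() >= 8);
}
```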

Why can bool and _Bool only store 0 or 1 if they occupy 1 byte in memory?

The C language limits what can be stored in a _Bool, even if it has the capacity to hold other values besides 0 and 1.

Section 6.3.1.2 of the C standard says the following regarding conversions to _Bool:

When any scalar value is converted to _Bool, the result is 0 if the value compares equal
to 0; otherwise, the result is 1.

The C++17 standard has similar language in section 7.14:

A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a
prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false;
any other value is converted to true. For direct-initialization (11.6), a prvalue of type std::nullptr_t can
be converted to a prvalue of type bool; the resulting value is false.

So even if you attempt to assign some other value to a _Bool the language will convert the value to either 0 or 1 for C and to true or false for C++. If you attempt to bypass this by writing to a _Bool via a pointer to a different type, you invoke undefined behavior.
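A quick way to see the conversion rule in action, as a sketch. The function name through_bool is made up; the static_cast performs the same conversion that an implicit assignment to bool would.

```cpp
#include <cassert>

// Hypothetical helper: push an int through bool and back.
inline int through_bool(int value) {
    bool b = static_cast<bool>(value);  // zero -> false, anything else -> true
    return b;                           // false -> 0, true -> 1
}

// Usage: whatever goes in, only 0 or 1 comes out.
inline void check_through_bool() {
    assert(through_bool(0) == 0);
    assert(through_bool(2) == 1);
    assert(through_bool(-7) == 1);
}
```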
