Setting Extra Bits in a Bool Makes It True and False at the Same Time

In C++ the bit representation (and even the size) of a bool is implementation-defined; generally it's implemented as a char-sized type taking 0 or 1 as its only valid values.

If you set its value to anything different from the allowed ones (in this specific case by aliasing a bool through a char and modifying its bit representation), you are breaking the rules of the language, so anything can happen. In particular, it's explicitly specified in the standard that a "broken" bool may behave as both true and false (or neither true nor false) at the same time:

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false

(C++11, [basic.fundamental], note 47)
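
For reference, here is a minimal sketch of the kind of code under discussion (hypothetical, not the OP's exact program): a bool is given an invalid bit pattern by aliasing it through a char, after which both branches appear to be taken:

    #include <cstdio>

    int main() {
        bool T = true;
        *reinterpret_cast<char*>(&T) = 3;           // invalid bit pattern for a bool: undefined behavior
        if (T)          std::puts("T is true");     // with gcc at -O0, this may print...
        if (T == false) std::puts("T is false");    // ...and so may this
    }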


In this particular case, you can see how it ended up in this bizarre situation: the first if gets compiled to

    movzx   eax, BYTE PTR [rbp-33]
    test    al, al
    je      .L22

which loads T in eax (with zero extension), and skips the print if it's all zero; the next if instead is

    movzx   eax, BYTE PTR [rbp-33]
    xor     eax, 1
    test    al, al
    je      .L23

The test if(T == false) is transformed to if(T^1), which flips just the low bit. This would be OK for a valid bool (whose value is 0 or 1), but for your "broken" one it doesn't cut it: 3^1 is 2, which is still nonzero, so this branch is taken as well.

Notice that this bizarre sequence is only generated at low optimization levels; at higher levels a check like this generally boils down to a zero/nonzero test, and a sequence like yours is likely to become a single test/conditional branch. You will still get bizarre behavior in other contexts, though, e.g. when adding bool values to other integers:

    int foo(bool b, int i) {
        return i + b;
    }

becomes

    foo(bool, int):
            movzx   edi, dil
            lea     eax, [rdi+rsi]
            ret

where dil is "trusted" to be 0/1.


If your program is all C++, then the solution is simple: don't break bool values this way. Avoid messing with their bit representation and everything will go well; in particular, even if you assign from an integer to a bool, the compiler will emit the code needed to make sure the resulting value is a valid bool. So your bool T = 3 is indeed safe, and T will end up with a true in its guts.
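
For instance, a quick sketch of that well-defined case:

    bool T = 3;    // integer-to-bool conversion: any nonzero value becomes true
    int  i = T;    // i is 1, not 3: the compiler normalizes the value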

If instead you need to interoperate with code written in other languages that may not share the same idea of what a bool is, just avoid bool for "boundary" code and marshal it as an appropriately-sized integer; it will work just fine in conditionals and the like.
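
A hedged sketch of what such boundary code might look like (the get_flag routine and its foreign-language counterpart are hypothetical):

    #include <cstdint>

    // Hypothetical routine implemented on the foreign-language side,
    // filling in a one-byte flag.
    extern "C" void get_flag(std::int8_t* flag);

    bool read_flag() {
        std::int8_t raw = 0;
        get_flag(&raw);
        return raw != 0;   // normalize explicitly instead of trusting the foreign bit pattern
    }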



Update about the Fortran/interoperability side of the issue

Disclaimer: all I know of Fortran is what I read this morning in the standard documents, plus the fact that I have some punched cards with Fortran listings that I use as bookmarks, so go easy on me.

First of all, this kind of language interoperability stuff isn't part of the language standards, but of the platform ABI. As we are talking about Linux x86-64, the relevant document is the System V x86-64 ABI.

Now, nowhere is it specified that the C _Bool type (which is defined to be the same as C++ bool at 3.1.2, note †) has any kind of compatibility with Fortran LOGICAL; in particular, table 9.2 at 9.2.2 specifies that "plain" LOGICAL is mapped to signed int. About TYPE*N types it says that

The “TYPE*N” notation specifies that variables or aggregate members of type TYPE shall occupy N bytes of storage.

(ibid.)

There's no equivalent type explicitly specified for LOGICAL*1, and that's understandable: it's not even standard Fortran. Indeed, if you try to compile a Fortran program containing a LOGICAL*1 in Fortran 95-compliant mode you get warnings about it, both by ifort

    ./example.f90(2): warning #6916: Fortran 95 does not allow this length specification.   [1]
        logical*1, intent(in) :: x
    ------------^

and by gfort

    ./example.f90:2:13:
        logical*1, intent(in) :: x
                   1
    Error: GNU Extension: Nonstandard type declaration LOGICAL*1 at (1)

So the waters are already muddled. Combining the two rules above, I'd go for signed char to be safe.
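
For illustration, a sketch of how such a boundary declaration might look on the C++ side, assuming a hypothetical Fortran subroutine taking a LOGICAL*1 argument and the usual gfortran conventions (lowercase name with a trailing underscore, arguments passed by reference):

    // Fortran side (hypothetical):  subroutine set_flag(x)  with  logical*1, intent(in) :: x
    extern "C" void set_flag_(const signed char* x);

    void call_set_flag(bool value) {
        signed char flag = value ? 1 : 0;   // stick to 0/1, matching the ABI's .FALSE./.TRUE.
        set_flag_(&flag);
    }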

However: the ABI also specifies:

The values for type LOGICAL are .TRUE. implemented as 1 and .FALSE. implemented as 0.

So, if you have a program that stores anything besides 1 and 0 in a LOGICAL value, you are already out of spec on the Fortran side! You say:

A fortran logical*1 has same representation as bool, but in fortran if bits are 00000011 it is true, in C++ it is undefined.

This last statement is not true: the Fortran standard is representation-agnostic, and the ABI explicitly says the contrary. Indeed, you can easily see this in action by checking the output of gfort for a LOGICAL comparison:

    integer function logical_compare(x, y)
        logical, intent(in) :: x
        logical, intent(in) :: y
        if (x .eqv. y) then
            logical_compare = 12
        else
            logical_compare = 24
        end if
    end function logical_compare

becomes

    logical_compare_:
            mov     eax, DWORD PTR [rsi]
            mov     edx, 24
            cmp     DWORD PTR [rdi], eax
            mov     eax, 12
            cmovne  eax, edx
            ret

You'll notice that there's a straight cmp between the two values, without normalizing them first (unlike ifort, which is more conservative in this regard).

Even more interesting: regardless of what the ABI says, ifort by default uses a nonstandard representation for LOGICAL; this is explained in the -fpscomp logicals switch documentation, which also specifies some interesting details about LOGICAL and cross-language compatibility:

Specifies that integers with a non-zero value are treated as true, integers with a zero value are treated as false. The literal constant .TRUE. has an integer value of 1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Intel Fortran releases before Version 8.0 and by Fortran PowerStation.

The default is fpscomp nologicals, which specifies that odd integer values (low bit one) are treated as true and even integer values (low bit zero) are treated as false.

The literal constant .TRUE. has an integer value of -1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Compaq Visual Fortran. The internal representation of LOGICAL values is not specified by the Fortran standard. Programs which use integer values in LOGICAL contexts, or which pass LOGICAL values to procedures written in other languages, are non-portable and may not execute correctly. Intel recommends that you avoid coding practices that depend on the internal representation of LOGICAL values.

(emphasis added)

Now, the internal representation of a LOGICAL normally shouldn't be a problem: from what I gather, if you play "by the rules" and don't cross language boundaries, you aren't going to notice. For a standard-compliant program there's no "straight conversion" between INTEGER and LOGICAL; the only ways I see to shove an INTEGER into a LOGICAL seem to be TRANSFER, which is intrinsically non-portable and gives no real guarantees, or the non-standard INTEGER <-> LOGICAL conversion on assignment.

The latter is documented by gfort to always result in nonzero -> .TRUE. and zero -> .FALSE., and you can see that in all cases code is generated to make this happen (even though it's convoluted code in the case of ifort with the legacy representation), so it seems you cannot shove an arbitrary integer into a LOGICAL this way either.

    logical*1 function integer_to_logical(x)
        integer, intent(in) :: x
        integer_to_logical = x
        return
    end function integer_to_logical

becomes

    integer_to_logical_:
            mov     eax, DWORD PTR [rdi]
            test    eax, eax
            setne   al
            ret

The reverse conversion for a LOGICAL*1 is a straight integer zero-extension (gfort), so, to honor the contract in the documentation linked above, it is clearly expecting the LOGICAL value to be 0 or 1.

But in general, the situation for these conversions is a bit of a mess, so I'd just stay away from them.


So, long story short: avoid putting INTEGER data into LOGICAL values (it is bad even in pure Fortran), and make sure to use the correct compiler flag to get the ABI-compliant representation for booleans; then interoperability with C/C++ should be fine. To be extra safe, I'd just use plain char on the C++ side.

Finally, from what I gather from the documentation, in ifort there is some builtin support for interoperability with C, including booleans; you may try to leverage it.
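
For what it's worth, one standard mechanism for this kind of interoperability (not specific to ifort) is the ISO_C_BINDING module, whose C_BOOL kind corresponds to C's _Bool. A minimal sketch of the C++ side, assuming a hypothetical is_ready function declared with BIND(C) on the Fortran side:

    // Fortran side (hypothetical):
    //   function is_ready() bind(c, name="is_ready")
    //     use iso_c_binding
    //     logical(c_bool) :: is_ready
    //   end function
    extern "C" bool is_ready();   // C_BOOL matches C's _Bool, i.e. C++'s bool on this ABI

    void check_ready() {
        if (is_ready()) {
            // ...
        }
    }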

How to make a bool variable only true if it's being set to true?

Since the question is open to multiple interpretations, let me first summarize the problem as I understood it from the description and the comments: the OP wants to set the variable global::Aimbot to true when some key is pressed, and set it back to false when said key is released.

Answering: when the function set_view_angles() is called, it changes the state of the global variable global::Aimbot to true, and it will remain true until something changes it back to false, so you may want a new function to do that job. Something like:

    inline void unset_view_angles()
    {
        global::Aimbot = false;
    }

C++ itself doesn't provide a way to know whether a key is pressed or not; you must rely on some external library for that. In the code where you poll or wait for keyboard events, check the state of the key and call either set_view_angles() or unset_view_angles() accordingly. For instance, if the key is "Q" and you are polling keyboard events with the SDL library, you would do:

    SDL_Event event;
    while (SDL_PollEvent(&event)) {
        if (event.type == SDL_KEYDOWN && event.key.keysym.sym == SDLK_q)
            set_view_angles();
        else if (event.type == SDL_KEYUP && event.key.keysym.sym == SDLK_q)
            unset_view_angles();
    }

How to set a bool variable to true when a key is pushed and false when it is pushed again?

I'm not sure whether I fully understand what you are trying to accomplish. Your code contains so many problematic parts that I chose to write completely new code:

    using System;
    using System.Runtime.InteropServices;

    class Program
    {
        const byte VK_F8 = 0x77;
        const byte VK_ESC = 0x1b;

        static bool globalAppState = false;

        static void Main(string[] args)
        {
            bool lastState = IsKeyPressed(VK_F8);
            while (!IsKeyPressed(VK_ESC))
            {
                bool newState = IsKeyPressed(VK_F8);
                if (lastState != newState)
                {
                    if (newState)
                    {
                        Console.WriteLine("F8: pressed");
                        globalAppState = !globalAppState;   // toggle on each new press
                    }
                    else
                        Console.WriteLine("F8: released");
                    lastState = newState;
                }
            }
        }

        static bool IsKeyPressed(byte keyCode)
        {
            // The high bit of GetAsyncKeyState indicates the key is currently down.
            return (GetAsyncKeyState(keyCode) & 0x8000) != 0;
        }

        [DllImport("user32.dll")]
        static extern short GetAsyncKeyState(int vKey);
    }

The function IsKeyPressed(VK_F8) always tells you the current state (pressed/released) of the specified key.

When you need to perform some action only on a change (from pressed to released, or from released to pressed), replace the console output calls with your specific task.

When you need some multi-threading, like processing the event in a new thread, that is a different question (outside the scope of this answer).

EDIT: Added toggling of the variable on each new key-pressed event. This is a dirty solution...

Why can bool and _Bool only store 0 or 1 if they occupy 1 byte in memory?

The C language limits what can be stored in a _Bool, even if it has the capacity to hold other values besides 0 and 1.

Section 6.3.1.2 of the C standard says the following regarding conversions to _Bool:

When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.

The C++17 standard has similar language in section 7.14:

A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true. For direct-initialization (11.6), a prvalue of type std::nullptr_t can be converted to a prvalue of type bool; the resulting value is false.

So even if you attempt to assign some other value to a _Bool, the language will convert it to either 0 or 1 for C, and to true or false for C++. If you attempt to bypass this by writing to a _Bool through a pointer to a different type, you invoke undefined behavior.
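
A short illustration of that conversion rule on the C++ side:

    #include <iostream>

    int main() {
        bool b = 42;              // nonzero value: converted to true
        std::cout << b << '\n';   // prints 1
        b = 0;                    // zero: converted to false
        std::cout << b << '\n';   // prints 0
    }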

One-byte bool. Why?

Why does a bool require one byte to store true or false where just one bit is enough

Because every object in C++ must be individually addressable* (that is, you must be able to have a pointer to it). You cannot address an individual bit (at least not on conventional hardware).

How much safer is it to use the following?

It's "safe", but it doesn't achieve much.

is the above field technique really going to help?

No, for the same reasons as above ;)

but still compiler generated code to access them is bigger and slower than the code generated to access the primitives.

Yes, this is true. On most platforms, this requires accessing the containing byte (or int or whatever), and then performing bit-shifts and bit-mask operations to access the relevant bit.
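
For illustration, here is a sketch of the kind of bit-field packing being discussed (hypothetical, since the original snippet isn't shown here); the packed flags are no longer individually addressable, and reads and writes compile to shift-and-mask code:

    struct PackedFlags {
        bool a : 1;   // packed into single bits within one byte
        bool b : 1;
        bool c : 1;
    };

    int main() {
        PackedFlags f{};
        f.b = true;          // compiles to a read-modify-write with masking
        // bool* p = &f.b;   // would not compile: a bit-field is not addressable
        return f.b ? 0 : 1;
    }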


If you're really concerned about memory usage, you can use a std::bitset in C++ or a BitSet in Java, which pack bits.
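
A minimal sketch of the std::bitset approach:

    #include <bitset>

    int main() {
        std::bitset<64> flags;        // 64 flags packed into 8 bytes
        flags.set(3);                 // turn flag 3 on
        bool third = flags.test(3);   // read it back as a bool
        return third ? 0 : 1;
    }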


* With a few exceptions.

Bool member variable in class is set to True when uninitialized

While bool conceptually contains only one bit of information, the requirements of the C++ standard mean that a bool object must take up at least eight bits. There are three main ways that a compiler might represent bool at a bitwise level:

  1. All bits zeroed for false; all bits one'd for true (0x00 versus 0xFF)
  2. All bits zeroed for false; the lowest bit one'd for true (0x00 versus 0x01)
  3. All bits zeroed for false; at least one bit one'd for true (0x00 versus anything else)

(Note that this choice of representation is not ordinarily visible in the effects of a program. Regardless of how the bits are represented, a bool becomes a 0 or 1 when cast to a wider integer type. It's only relevant to the machine code being generated.)

In practice, modern x86/x64 compilers go with option 2: there are instructions which make this straightforward, casting bool to int becomes trivial, and comparing bools works without additional effort.

A side effect is that if the bits making up a bool end up set to, say, 0x37, weird stuff can happen, because the executable code isn't expecting that. For instance, both branches of an if-statement might be taken. A good debugger should loudly yell at you when it sees a bool with an unexpected bit pattern, but in practice they tend to show the value as true.

The common theme of all those options is that most random bit patterns are not the bit pattern for false. So if the allocator really did set it to a "random" value, it almost certainly would be shown as true in the debugger.

Can I assume (bool)true == (int)1 for any C++ compiler?

According to the standard, you should be safe with that assumption. The C++ bool type has two values, true and false, with corresponding integer values 1 and 0.
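
In fact, the equivalence can be checked at compile time:

    static_assert(true == 1 && false == 0,
                  "bool values convert to the integers 1 and 0");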

The thing to watch out for is mixing bool expressions and variables with BOOL expressions and variables. The latter is defined as FALSE = 0 and TRUE != FALSE, which quite often in practice means that any value different from 0 is considered TRUE.

A lot of modern compilers will actually issue a warning for any code that implicitly tries to convert from BOOL to bool if the BOOL value is different from 0 or 1.
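
When you do have to mix the two, a common pattern is to normalize explicitly at the boundary. A small sketch (BOOL here is the usual Windows-style typedef for int):

    typedef int BOOL;   // Windows-style boolean; any nonzero value counts as TRUE

    bool to_bool(BOOL b) {
        return b != 0;    // explicit normalization, no truncation surprises
    }

    BOOL to_BOOL(bool b) {
        return b ? 1 : 0; // matches TRUE/FALSE as conventionally defined
    }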


