When to Use Bit-Fields in C

When to use bit-fields in C

Now I am curious, [are flags] the only way bitfields are used practically?

No, flags are not the only way bitfields are used. They can also be used to store values larger than one bit, although flags are more common. For instance:

typedef enum {
    NORTH = 0,
    EAST = 1,
    SOUTH = 2,
    WEST = 3
} directionValues;

struct {
    unsigned int alice_dir : 2;
    unsigned int bob_dir : 2;
} directions;

Do we need to use bitfields to save space?

Bitfields do save space. They also allow an easier way to set values that aren't byte-aligned. Rather than bit-shifting and using bitwise operations, we can use the same syntax as setting fields in a struct. This improves readability. With a bitfield, you could write

directions.alice_dir = WEST;
directions.bob_dir = SOUTH;

However, to store multiple independent values in the space of one int (or other type) without bitfields, you would need to write something like:

#define ALICE_OFFSET 0
#define BOB_OFFSET 2
directions &= ~(3<<ALICE_OFFSET); // clear Alice's bits
directions |= WEST<<ALICE_OFFSET; // set Alice's bits to WEST
directions &= ~(3<<BOB_OFFSET);   // clear Bob's bits
directions |= SOUTH<<BOB_OFFSET;  // set Bob's bits to SOUTH

The improved readability of bitfields is arguably more important than saving a few bytes here and there.
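To make the comparison concrete, here is a minimal sketch that puts both approaches side by side. The names `packed_dirs`, `DIR_MASK`, `set_dir` and `get_dir` are made up for this illustration and do not come from the question.

```c
typedef enum { NORTH = 0, EAST = 1, SOUTH = 2, WEST = 3 } directionValues;

/* Bit-field version: the compiler generates the shifting and masking. */
struct packed_dirs {
    unsigned int alice_dir : 2;
    unsigned int bob_dir   : 2;
};

/* Manual version: both values share one unsigned int. */
#define ALICE_OFFSET 0u
#define BOB_OFFSET   2u
#define DIR_MASK     3u   /* two bits per direction (hypothetical helper) */

/* Return `word` with the 2-bit slot at `offset` replaced by `d`. */
static unsigned int set_dir(unsigned int word, unsigned int offset,
                            directionValues d)
{
    word &= ~(DIR_MASK << offset);                   /* clear the slot */
    word |= ((unsigned int)d & DIR_MASK) << offset;  /* write the new value */
    return word;
}

/* Read the 2-bit slot at `offset` back out of `word`. */
static directionValues get_dir(unsigned int word, unsigned int offset)
{
    return (directionValues)((word >> offset) & DIR_MASK);
}
```

With the bit-field, `p.alice_dir = WEST;` is all you write. With the manual version, the same update is `w = set_dir(w, ALICE_OFFSET, WEST);`, but in exchange you decide exactly which bits are used.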

Why do we use int? How much space is occupied?

The space of an entire int is occupied. We use int because in many cases, it doesn't really matter. If, for a single value, you use 4 bytes instead of 1 or 2, your user probably won't notice. For some platforms, size does matter more, and you can use other data types which take up less space (char, short, uint8_t, etc.).

As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?

No, that is not correct. The entire unsigned int will exist, even if you're only using 1 of its bits.
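You can check this on your own compiler with sizeof. The following is only a sketch, and the struct and function names are invented for the example. The exact sizes are implementation-defined, although on common desktop compilers both structs are typically the size of one unsigned int.

```c
#include <stddef.h>

struct one_flag {
    unsigned int flag : 1;   /* uses 1 bit of its storage unit */
};

struct eight_flags {
    unsigned int a : 1, b : 1, c : 1, d : 1,
                 e : 1, f : 1, g : 1, h : 1;   /* 8 bits in total */
};

/* Size in bytes of each struct. The value depends on the implementation;
   the struct can never be smaller than one byte, however few bits it uses. */
static size_t one_flag_size(void)    { return sizeof(struct one_flag); }
static size_t eight_flags_size(void) { return sizeof(struct eight_flags); }
```

Printing both values with `printf("%zu\n", one_flag_size());` shows what your implementation actually allocates.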

Should I use bit-fields for mapping incoming serial data?

No. Bit-fields have a lot of implementation-defined behaviour that makes using them a nightmare.

Will data1 always represent the correct value as expected regardless of endianness?

Yes, but that is because uint8_t is the smallest possible addressable unit: a byte. For larger data types you need to take care of the byte endianness.

Could data2 and reserved be the wrong way around, with data2 representing the upper 4 bits instead of the lower 4 bits?

Yes. They could also be in different bytes. Also, a compiler doesn't have to support uint8_t as a bit-field type, even if it supports the type otherwise.

Is the bit endianness (generally) dependent on the byte endianness, or can they differ entirely?

The least significant bit will always be in the least significant byte, but it's impossible to determine in C where in the byte the bit will be.

Bit-shifting operators give a reliable abstraction of the order that is good enough: for the data type uint8_t, (1u << 0) is always the least significant bit and (1u << 7) the most significant bit, for all compilers and all architectures.
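As a small illustration of that abstraction, the sketch below splits a byte into its nibbles and tests individual bits using only shifts and masks. The function names are hypothetical; the result does not depend on the compiler or on how the hardware numbers its bits.

```c
#include <stdint.h>

/* Lower 4 bits (bits 0..3) of a byte. */
static uint8_t low_nibble(uint8_t byte)
{
    return (uint8_t)(byte & 0x0Fu);
}

/* Upper 4 bits (bits 4..7) of a byte, moved down to bits 0..3. */
static uint8_t high_nibble(uint8_t byte)
{
    return (uint8_t)((byte >> 4) & 0x0Fu);
}

/* 1 if bit n is set, 0 otherwise; n = 0 is the least significant bit. */
static int bit_is_set(uint8_t byte, unsigned int n)
{
    return (int)((byte >> n) & 1u);
}
```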

Bit-fields on the other hand are so poorly defined that you cannot determine the order of bits by the order of your defined fields.

Is the bit-endianness determined by the hardware or the compiler?

The compiler dictates how data types map to actual bits, but the hardware heavily influences it. For bit-fields, two different compilers for the same hardware can put fields in a different order.

Is there a simple way to determine in the compiler which way around it is, and reverse the bit-field entries if needed?

Not really. It depends on your compiler how to do it, if it's possible at all.

Although bit-fields are the neatest way, code-wise, to map the incoming data, I suppose I am just wondering if it's a lot safer to just abandon them, and use something like:

Definitely abandon bit-fields, but I would also recommend abandoning structures altogether for this purpose, because:

You need to use compiler extensions or manual work to handle byte order.
You need to use compiler extensions to disable padding to avoid gaps due to alignment restrictions. This affects member access performance on some systems.
You cannot have variable width or optional fields.
It's very easy to introduce strict aliasing violations if you are unaware of the issue. If you define a byte array for the data frame, cast it to a pointer to a structure, and then dereference that pointer, the behaviour is undefined in many cases.

Instead I recommend doing it manually. Define a byte array and then write each field into it manually, breaking the values apart with bit shifting and masking when necessary. You can write simple reusable conversion functions for the basic data types.
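A minimal sketch of that manual approach follows. The frame layout is an assumption made up for the example (one byte of data1, then a byte whose low nibble is data2 and whose high nibble is reserved, then a 16-bit big-endian value), and the names `frame`, `frame_encode`, `frame_decode`, `put_u16_be` and `get_u16_be` are hypothetical.

```c
#include <stdint.h>

/* Reusable helpers: 16-bit value, most significant byte first. */
static void put_u16_be(uint8_t *buf, uint16_t v)
{
    buf[0] = (uint8_t)(v >> 8);
    buf[1] = (uint8_t)(v & 0xFFu);
}

static uint16_t get_u16_be(const uint8_t *buf)
{
    return (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
}

/* In-memory form of the (assumed) frame; its layout is irrelevant
   because it is never cast to or from the wire bytes. */
struct frame {
    uint8_t  data1;   /* full byte */
    uint8_t  data2;   /* only the low 4 bits go on the wire */
    uint16_t value;   /* sent big-endian */
};

#define FRAME_SIZE 4

static void frame_encode(uint8_t out[FRAME_SIZE], const struct frame *f)
{
    out[0] = f->data1;
    out[1] = (uint8_t)(f->data2 & 0x0Fu);   /* reserved high nibble sent as 0 */
    put_u16_be(&out[2], f->value);
}

static void frame_decode(struct frame *f, const uint8_t in[FRAME_SIZE])
{
    f->data1 = in[0];
    f->data2 = (uint8_t)(in[1] & 0x0Fu);    /* ignore the reserved bits */
    f->value = get_u16_be(&in[2]);
}
```

Because every byte is written and read explicitly, the result does not depend on struct padding, host byte order, or bit-field allocation order.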

Can or should I make bools bit fields?

Can ... I make bools bit fields?

Yes. It is one of 3 well defined choices.

A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type. It is implementation-defined whether atomic types are permitted. C17dr § 6.7.2.1 5

.... should I make bools bit fields?

Yes, if it makes code more clear.

Note: this is one place not to use int x:1, because it is implementation-defined whether a plain int bit-field is signed, so x may hold [0,1] or [-1,0]. Use signed int x:1 for [-1,0], or unsigned x:1 or _Bool x:1 for [0,1].

For x:1, bool has a clearer functional specification than signed int when assigning an out-of-range value: any non-zero value becomes 1, whereas the result for signed int is implementation-defined. For unsigned, just the least significant bit is copied.
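The difference between a bool and an unsigned one-bit field when assigning an out-of-range value can be sketched as follows. The struct and function names are invented for the illustration.

```c
#include <stdbool.h>

struct flags {
    bool         ready   : 1;   /* conversion to _Bool: non-zero becomes 1 */
    unsigned int counter : 1;   /* unsigned: only the lowest bit is kept */
};

/* Store arbitrary integers into both one-bit fields. */
static struct flags make_flags(int ready, unsigned int counter)
{
    struct flags f;
    f.ready   = ready;     /* 0 stays 0, anything else is stored as 1 */
    f.counter = counter;   /* value is reduced modulo 2 */
    return f;
}
```

Assigning 2 to both fields therefore stores 1 in `ready` but 0 in `counter`, which is why bool is the safer choice for a true/false flag.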

Bit manipulations good practices

The problem with bit-fields is that the C standard does not dictate how the order in which they are declared maps to their position in memory. So you may not be setting the bits you think you are.

Section 6.7.2.1p11 of the C standard states:

An implementation may allocate any addressable storage unit large
enough to hold a bit-field. If enough space remains, a bit-field
that immediately follows another bit-field in a structure shall be
packed into adjacent bits of the same unit. If insufficient space
remains, whether a bit-field that does not fit is put into
the next unit or overlaps adjacent units is
implementation-defined. The order of allocation of bit-fields within
a unit (high-order to low-order or low-order to high-order) is
implementation-defined. The alignment of the addressable storage
unit is unspecified.

As an example, look at the definition of struct iphdr, which represents an IP header, from the /usr/include/netinet/ip.h file on Linux:

struct iphdr
  {
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned int ihl:4;
    unsigned int version:4;
#elif __BYTE_ORDER == __BIG_ENDIAN
    unsigned int version:4;
    unsigned int ihl:4;
#else
# error "Please fix <bits/endian.h>"
#endif
    u_int8_t tos;
    ...

You can see here that the bitfields are placed in a different order depending on the implementation. You also shouldn't use this specific check because this behavior is system dependent. It is acceptable for this file because it is part of the system. Other systems may implement this in different ways.

So don't use a bitfield.

The best way to do this is to set the required bits directly. It makes sense to define named constants for each bit and to perform a bitwise OR of the constants you want to set. For example:

const uint8_t BIT_BYTE =     0x1;
const uint8_t BIT_HW   =     0x2;
const uint8_t BIT_WORD =     0x4;
const uint8_t BIT_GO   =     0x8;
const uint8_t BIT_I_EN =     0x10;
const uint8_t BIT_REEN =     0x20;
const uint8_t BIT_WEEN =     0x40;
const uint8_t BIT_LEEN =     0x80;

DMA_base_ptr[DMA_CONTROL_OFFS] = BIT_LEEN | BIT_GO | BIT_WORD;
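The same constants also work for changing or testing individual bits later without disturbing the others. The helpers below are a sketch only: `reg_set`, `reg_clear` and `reg_test` are hypothetical names, and the three constants are repeated from the list above so that the block stands alone.

```c
#include <stdint.h>

static const uint8_t BIT_WORD = 0x4;
static const uint8_t BIT_GO   = 0x8;
static const uint8_t BIT_LEEN = 0x80;

/* Set every bit in `mask`, leaving the other bits untouched. */
static void reg_set(volatile uint8_t *reg, uint8_t mask)
{
    *reg = (uint8_t)(*reg | mask);
}

/* Clear every bit in `mask`, leaving the other bits untouched. */
static void reg_clear(volatile uint8_t *reg, uint8_t mask)
{
    *reg = (uint8_t)(*reg & (uint8_t)~mask);
}

/* 1 if any bit in `mask` is set, 0 otherwise. */
static int reg_test(const volatile uint8_t *reg, uint8_t mask)
{
    return (*reg & mask) != 0;
}
```

For example, `reg_clear(&DMA_base_ptr[DMA_CONTROL_OFFS], BIT_GO);` would clear only the GO bit, assuming the register is a readable uint8_t as in the assignment above.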

Does union of bit fields make any sense

Is the compiler required to use the same first 2 bits of the same unsigned for all fields (despite what the comments say)?

No. C 2018 6.7.2.1 says “An implementation may allocate any addressable storage unit large enough to hold a bit-field… The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined…”

It does not say the addressable storage unit will be the same for all bit-fields of the same size. If it did, then all the union bit-field members of the same size would have to use the same bits, and certainly any reasonable C implementation would do so.

However, consider bit-fields of different sizes. It is reasonable that a compiler would allocate a one-byte storage unit for a bit-field of 2 bits and a four-byte storage unit for a bit-field of 17 bits. If it is a little-endian system and puts the bits in high-order to low-order, then the 2-bit field would be in bits 2⁷ and 2⁶ of byte 0, and the 17-bit field would be in all bits of bytes 3 and 2 (bits 2³¹ to 2¹⁶ of the four-byte little-endian storage unit) and bit 2⁷ of byte 1 (bit 2¹⁵ of the storage unit). So there would be no overlap between these two union members.

Is there any real usage for a union of bit fields? I cannot think of any situation where this would make sense.

Sure, I might have some field in a data structure that sometimes needs to store a 17-bit fromitz number and other times needs to store a 13-bit gizmo number. Unions were originally for storing one thing or another, not for reinterpreting bits of one type as another type.
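A sketch of that "one thing or another" usage might look like this. All names are invented, a tag field records which member is currently valid, and the code assumes unsigned int is at least 17 bits wide (it is 32 bits on most current platforms).

```c
enum id_kind { ID_FROMITZ, ID_GIZMO };

struct part {
    enum id_kind kind;              /* which union member is valid */
    union {
        unsigned int fromitz : 17;  /* 17-bit fromitz number */
        unsigned int gizmo   : 13;  /* 13-bit gizmo number */
    } id;
};

static struct part make_fromitz(unsigned int n)
{
    struct part p;
    p.kind = ID_FROMITZ;
    p.id.fromitz = n;   /* values that do not fit are reduced modulo 2^17 */
    return p;
}

static struct part make_gizmo(unsigned int n)
{
    struct part p;
    p.kind = ID_GIZMO;
    p.id.gizmo = n;     /* values that do not fit are reduced modulo 2^13 */
    return p;
}

/* Read back whichever member was stored; never the other one. */
static unsigned int part_number(const struct part *p)
{
    return p->kind == ID_FROMITZ ? p->id.fromitz : p->id.gizmo;
}
```

Since each value is only ever read through the member it was written to, it does not matter whether the two bit-fields overlap in memory.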
