Practical Use of Zero-Length Bitfields

You use a zero-length bitfield as a hacky way to get your compiler to lay out a structure to match some external requirement, be it another compiler's or architecture's notion of the layout (cross-platform data structures, such as in a binary file format) or a bit-level standard's requirements (network packets or instruction opcodes).

A real-world example is when NeXT ported the xnu kernel from the Motorola 68000 (m68k) architecture to the i386 architecture. NeXT had a working m68k version of their kernel. When they ported it to i386, they found that the i386's alignment requirements differed from the m68k's in such a way that an m68k machine and an i386 machine did not agree on the layout of the NeXT vendor-specific BOOTP structure. In order to make the i386 structure layout agree with the m68k, they added an unnamed bitfield of length zero to force the NV1 structure/nv_U union to be 16-bit aligned.

Here are the relevant parts from the Mac OS X 10.6.5 xnu source code:

/* from xnu/bsd/netinet/bootp.h */
/*
 * Bootstrap Protocol (BOOTP). RFC 951.
 */
/*
 * HISTORY
 *
 * 14 May 1992 ? at NeXT
 *    Added correct padding to struct nextvend. This is
 *    needed for the i386 due to alignment differences wrt
 *    the m68k. Also adjusted the size of the array fields
 *    because the NeXT vendor area was overflowing the bootp
 *    packet.
 */
/* . . . */
struct nextvend {
    u_char nv_magic[4];     /* Magic number for vendor specificity */
    u_char nv_version;      /* NeXT protocol version */
    /*
     * Round the beginning
     * of the union to a 16
     * bit boundary due to
     * struct/union alignment
     * on the m68k.
     */
    unsigned short :0;
    union {
        u_char NV0[58];
        struct {
            u_char NV1_opcode;          /* opcode - Version 1 */
            u_char NV1_xid;             /* transcation id */
            u_char NV1_text[NVMAXTEXT]; /* text */
            u_char NV1_null;            /* null terminator */
        } NV1;
    } nv_U;
};
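
The effect of the padding member can be checked on a given compiler with offsetof. The following is a minimal sketch, not taken from the xnu source: the struct names plain_vend and padded_vend and the helper functions are hypothetical, and the members are reduced to plain unsigned char fields. How a zero-width bit-field affects a following non-bit-field member is implementation-defined, so the printed offsets may differ between compilers and targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout without the zero-width bit-field. */
struct plain_vend {
    unsigned char magic[4];
    unsigned char version;
    union {
        unsigned char raw[58];
    } u;
};

/* Same members, with an unnamed zero-width bit-field before the union. */
struct padded_vend {
    unsigned char magic[4];
    unsigned char version;
    unsigned short :0;      /* asks the compiler to start a new unsigned short unit */
    union {
        unsigned char raw[58];
    } u;
};

/* Byte offset of the union in each variant. */
size_t plain_union_offset(void)  { return offsetof(struct plain_vend, u); }
size_t padded_union_offset(void) { return offsetof(struct padded_vend, u); }

/* Print both offsets so the effect can be inspected on the current compiler. */
void report_offsets(void)
{
    size_t plain = plain_union_offset();
    size_t padded = padded_union_offset();

    /* The extra member can add padding but has no reason to remove any. */
    assert(padded >= plain);
    printf("without :0 -> union at offset %zu\n", plain);
    printf("with    :0 -> union at offset %zu\n", padded);
}
```

Calling report_offsets() from main shows whether the compiler in use moves the union to an even offset. If the two offsets are equal, that implementation ignores the zero-width bit-field for non-bit-field members and another technique (such as an explicit padding byte) would be needed.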

What is zero-width bit field

From this first hit on a Google search:

Bit fields with a length of 0 must be unnamed. Unnamed bit fields cannot be referenced or initialized. A zero-width bit field can cause the next field to be aligned on the next container boundary where the container is the same size as the underlying type of the bit field.

As for the second part of your question: you set some of the bitfields in your struct to all 1s, and because these fields are signed, they read back as negative values. You can see this more clearly if you set the entire struct to 1s and look at the values in both signed and unsigned representations, e.g.

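#include <stdio.h>
#include <string.h>

/* struct foo is the structure declared in the question */
int main(void)
{
    struct foo f;
    memset(&f, 0xff, sizeof(f)); // set every bit of the struct to 1
    printf("a=%d\nb=%d\nc=%d\nd=%d\n", f.a, f.b, f.c, f.d); // print fields as signed
    printf("a=%u\nb=%u\nc=%u\nd=%u\n", (unsigned)f.a, (unsigned)f.b,
           (unsigned)f.c, (unsigned)f.d); // print fields as unsigned
    return 0;
}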
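
Because the declaration of struct foo is not shown here, the following self-contained sketch uses a hypothetical stand-in, struct demo_fields, with assumed field widths. It illustrates the same point: an all-ones bit pattern reads as a negative number through a signed bit-field and as the maximum value through an unsigned one. The fields are declared signed int explicitly because the signedness of a plain int bit-field is implementation-defined.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the question's struct; widths are assumed. */
struct demo_fields {
    signed int a : 3;
    signed int b : 2;
    signed int   : 0;
    signed int c : 4;
    signed int d : 3;
};

/* A single unsigned 3-bit field, used to read the same bit pattern unsigned. */
struct demo_fields_u {
    unsigned int a : 3;
};

/* Returns field a after every bit of the struct has been set to 1. */
int all_ones_signed_a(void)
{
    struct demo_fields f;
    memset(&f, 0xff, sizeof f);
    return f.a;
}

/* Same all-ones pattern, read through an unsigned 3-bit field. */
unsigned all_ones_unsigned_a(void)
{
    struct demo_fields_u f;
    memset(&f, 0xff, sizeof f);
    return f.a;
}

/* Print both interpretations side by side. */
void print_all_ones(void)
{
    unsigned u = all_ones_unsigned_a();

    assert(u <= 7u);    /* a 3-bit unsigned field cannot exceed 7 */
    printf("signed a=%d, unsigned a=%u\n", all_ones_signed_a(), u);
}
```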

Different types for zero length bit fields in c?

Update, after reading the text in context:

The result of your example (corrected to use char):

struct bar {
    unsigned char x:5;
    unsigned int :0;
    unsigned char y:7;
};

would look like this (assuming 16-bit int):

 char pad pad      int boundary
  |    |   |        |
  v    v   v        v
  xxxxx000 00000000 yyyyyyy0

(I'm ignoring endian).

The zero-length bitfield causes the position to move to the next int boundary. You defined int to be 16-bit, so 16 minus 5 gives 11 bits of padding.

It does not insert an entire blank int. The example on the page you link demonstrates this (but using 32-bit integers).
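
One way to observe this on a real compiler is to set only y and look for the first nonzero byte of the object. The sketch below is an illustration under stated assumptions: the struct names packed_bar and split_bar and the helper functions are hypothetical, and both fields use unsigned int so that they could share a unit when no separator is present. The byte positions depend on the implementation's int size and bit ordering, so no specific output is claimed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* x and y may share one storage unit. */
struct packed_bar {
    unsigned int x : 5;
    unsigned int y : 7;
};

/* The zero-width bit-field closes x's unit before y is placed. */
struct split_bar {
    unsigned int x : 5;
    unsigned int   : 0;
    unsigned int y : 7;
};

/* Index of the first nonzero byte in an object, or n if every byte is zero. */
static size_t first_nonzero_byte(const void *obj, size_t n)
{
    const unsigned char *p = obj;

    assert(obj != NULL);
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0)
            return i;
    }
    return n;
}

/* Byte where y starts when x and y are allowed to pack together. */
size_t packed_y_byte(void)
{
    struct packed_bar b;
    memset(&b, 0, sizeof b);
    b.y = 0x7f;     /* all seven bits of y set */
    return first_nonzero_byte(&b, sizeof b);
}

/* Byte where y starts when a zero-width bit-field separates x and y. */
size_t split_y_byte(void)
{
    struct split_bar b;
    memset(&b, 0, sizeof b);
    b.y = 0x7f;
    return first_nonzero_byte(&b, sizeof b);
}
```

Comparing packed_y_byte() with split_y_byte() shows how far the separator pushed y. Comparing sizeof(struct split_bar) with the int size shows that only padding up to the boundary was added, not a whole extra int.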

What does an unnamed zero length bit-field mean in C?

First of all, let's see chapter §6.7.2.1, Structure and union specifiers, P11. It says,

An implementation may allocate any addressable storage unit large enough to hold a bit-field.
If enough space remains, a bit-field that immediately follows another bit-field in a
structure shall be packed into adjacent bits of the same unit.
[...]

But if we explicitly want two consecutive bit-field members, which might otherwise be packed into a single addressable storage unit, to reside in separate memory locations, a zero-width bit-field is the way to force it.

The next paragraph, P12, mentions,

A bit-field declaration with no declarator, but only a colon and a width, indicates an
unnamed bit-field. 126) As a special case, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.

Following your example, this makes sure that the two bit-field members surrounding the :0 reside in separate memory locations (not inside a single addressable storage unit, even if enough space remains to pack them into one). This has a similar effect to placing a non-bit-field member between two bit-fields to force them into separate memory locations.

Quoting C11, chapter §3.14, NOTE 2 (emphasis mine)

A bit-field and an adjacent non-bit-field member are in separate memory locations. The same
applies to two bit-fields, if one is declared inside a nested structure declaration and the other
is not, or if the two are separated by a zero-length bit-field declaration, or if they are
separated by a non-bit-field member declaration.

Also, regarding the usage ("why it is needed" part)

[...] The bit-fields b and c cannot be concurrently
modified, but b and a, for example, can be.
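
For context, the members a, b and c in that sentence come from the structure declared in the example of C11 §3.14. The sketch below reproduces that declaration. The tag name mem_loc_example, the comments and the helper function are additions for illustration and are not part of the standard's text.

```c
#include <assert.h>

/* Declaration from the C11 3.14 example; the tag name is added here. */
struct mem_loc_example {
    char a;                     /* its own memory location */
    int b : 5, c : 11,          /* b and c together form one memory location */
        : 0,                    /* zero-length bit-field: separates c from d */
        d : 8;                  /* its own memory location */
    struct { int ee : 8; } e;   /* e.ee is its own memory location */
};

/*
 * Writes every member once and returns the sum of the stored values.
 * In a threaded program, writes to b and c would need synchronization
 * with each other, while a, d and e.ee could each be updated by a
 * different thread.
 */
int touch_all_members(void)
{
    struct mem_loc_example s;

    s.a = 1;
    s.b = 2;
    s.c = 3;
    s.d = 4;
    s.e.ee = 5;
    assert(s.d == 4);   /* a small value survives the round trip */
    return s.a + s.b + s.c + s.d + s.e.ee;
}
```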


Addendum:

Regarding the concurrency part, from NOTE 1

Two threads of execution can update and access separate memory locations without interfering
with each other.

and, from chapter §5.1.2.4/P1,

Under a hosted implementation, a program can have more than one thread of execution
(or thread) running concurrently. [...]

So, this is a theoretically viable option, as per the standard.

What are the practical uses of bit-fields in language C?

Bit-fields are especially useful in hardware programming, where you want to model a register of a module.

A register contains many bit-fields, which can vary in size. So you create a structure to represent the register and its bit-fields. Registers in hardware are essentially structures that store information about the module.

For example, the registers inside a USB module store the status of the USB device, among many other things.

Limiting the data members of the struct to the bits they need, instead of reserving a full unsigned int (or any other primitive data type) for each field, lets the structure occupy much less memory.

Also, the dummy declaration unsigned int : 6; pads the structure so that structure objects, and accesses to them, are word aligned for the machine architecture. Accesses to register objects therefore don't take extra time, as they might if they were not aligned to the processor's word boundary. Basically, if a word, half-word, or byte sits at an address that is a multiple of the processor's word size, it can be accessed efficiently in a single operation.

For example, in your case the register is 16 bits wide and has three bit-fields: mask, privilege, and ov. The remaining 6 bits are reserved for future use.
Here is how the register looks:

bit-position     15  14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
                 ---------------------------------------
                 |ov |   reserved   | privilege  | mask|
                 ---------------------------------------

So, by making the structure exactly 16 bits in size, objects of this structure can be accessed easily on processors with 8-bit, 16-bit, 32-bit, or wider ALUs.
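
A register like this could be declared as sketched below. Only the 6 reserved bits and the field names come from the description above. The widths of mask (3 bits) and privilege (6 bits), the struct name status_reg and both helper functions are assumptions made for illustration. The order in which a compiler allocates bit-fields within a unit is implementation-defined, so the sketch also includes a shift-and-mask helper that fixes the bit positions explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit status register; mask and privilege widths are assumed. */
struct status_reg {
    unsigned int mask      : 3;
    unsigned int privilege : 6;
    unsigned int           : 6;  /* reserved for future use */
    unsigned int ov        : 1;
};

/* Build a register value through the bit-field members. */
struct status_reg make_status(unsigned mask, unsigned privilege, unsigned ov)
{
    struct status_reg r = {0};

    /* Values wider than a field are reduced modulo 2^width on assignment. */
    r.mask = mask;
    r.privilege = privilege;
    r.ov = ov;
    return r;
}

/*
 * Layout-independent alternative: place ov in bit 15, privilege in
 * bits 8..3 and mask in bits 2..0 with explicit shifts.
 */
uint16_t pack_status(unsigned mask, unsigned privilege, unsigned ov)
{
    assert(mask < 8u && privilege < 64u && ov < 2u);
    return (uint16_t)((ov << 15) | (privilege << 3) | mask);
}
```

The bit-field version is convenient to read and write, but its in-memory bit order and even its size (sizeof(struct status_reg) may be larger than 2) depend on the compiler. The shift-based helper is the safer choice when the value must match a hardware register bit for bit.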

Bitfields in C without variable name

This is called an unnamed bit-field. The C11 standard describes it and a possible use:

6.7.2.1 Structure and union specifiers

A bit-field declaration with no declarator, but only a colon
and a width, indicates an unnamed bit-field. 106) As a special
case, a bit-field structure member with a width of 0 indicates that no
further bit-field is to be packed into the unit in which the previous
bit-field, if any, was placed.

106) An unnamed bit-field structure member is useful for
padding to conform to externally imposed layouts.

Usability of bitfields

[...] bitfield layouts are implementation-defined.

Some aspects are implementation-defined. Others are unspecified, such as the size of the addressable storage unit reserved for a bitfield.

Is this practically a problem?

It depends on what you're trying to do. Many of the same issues that apply more broadly to structure types apply in microcosm to bitfields. Among them,

  • Like structures generally, structures containing bitfields will be interpreted consistently by any given implementation, but
  • Like structures generally, structures containing bitfields may be interpreted differently by different implementations -- possibly affecting only the bitfield members.
  • Like with structure member layout, implementations are afforded more freedom to choose bitfield layout than some programmers assume.

I've noticed the SysV ABI for x86-64,
for example, defines how bitfields should be laid out, so I suppose
using bitfields on this platform shouldn't be problematic even if I
mix object code generated by different compilers.

Using bitfields does not present an interoperability problem when mixing code that can be relied upon to produce and use identical bitfield layouts.

Using bitfields does not present a portability problem for code that avoids depending on details of bitfield layout.

Those are conflicting concerns, because interoperability requires consistent layout, but relying on layout details creates a portability problem.

Are bitfields similarly standardized on other platforms too? (I'm
mainly interested in Linux (SysV ABI), MacOs, and CygWin.)

Generally speaking, for hosted implementations (including all your examples) there will be a platform ABI defining bitfield layout, for software interoperability within a platform. ABI is not particularly relevant to standalone implementations, but many, if not all, such implementations do specify full details of bitfield layouts. If your concern is about whether you can link bitfield-using code compiled with different C implementations for the same platform and get a correctly-working program, then the answer is almost certainly "yes".


