memset() or Value Initialization to Zero Out a Struct

memset() or value initialization to zero out a struct?

Those two constructs are very different in their meaning. The first one uses the memset function, which is intended to set a buffer of memory to a certain value. The second one initializes an object. Let me explain it with a bit of code:

Let's assume you have a structure whose members are all of POD types ("Plain Old Data" - see What are POD types in C++?)

struct POD_OnlyStruct
{
int a;
char b;
};

POD_OnlyStruct t = {}; // OK

POD_OnlyStruct t;
memset(&t, 0, sizeof t); // OK as well

In this case, writing POD_OnlyStruct t = {} or POD_OnlyStruct t; memset(&t, 0, sizeof t) makes little difference. The only difference is that memset also sets the padding bytes to zero. Since you normally don't have access to those bytes, the difference doesn't matter to you.
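As a minimal sketch of the comparison above (the helper names make_value_initialized, make_memset_zeroed and same_members are made up for illustration), the two approaches can be checked member by member. Padding is deliberately left out of the comparison, since only memset is certain to clear it:

```cpp
#include <cassert>
#include <cstring>

struct POD_OnlyStruct
{
    int a;
    char b;
};

// Empty braces: every member is set to zero; padding bytes are not
// guaranteed to be cleared.
inline POD_OnlyStruct make_value_initialized()
{
    POD_OnlyStruct t = {};
    return t;
}

// memset: every byte of the object, padding included, is set to zero.
inline POD_OnlyStruct make_memset_zeroed()
{
    POD_OnlyStruct t;
    std::memset(&t, 0, sizeof t);
    return t;
}

// Compare the members only; a raw memcmp could be affected by padding
// and is deliberately avoided here.
inline bool same_members(const POD_OnlyStruct& x, const POD_OnlyStruct& y)
{
    return x.a == y.a && x.b == y.b;
}
```

Calling same_members(make_value_initialized(), make_memset_zeroed()) should report that both objects hold the same member values.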

On the other hand, since you've tagged your question as C++, let's try another example, with member types different from POD:

struct TestStruct
{
int a;
std::string b;
};

TestStruct t = {}; // OK

{
TestStruct t1;
memset(&t1, 0, sizeof t1); // ruins member 'b' of our struct
} // Application crashes here

In this case using an expression like TestStruct t = {} is good, while using memset on it leads to undefined behavior, which typically shows up as a crash. Here's what happens if you use memset: an object of type TestStruct is created, which also constructs an object of type std::string, since it's a member of our structure. Next, memset overwrites the memory where the object b is located with a certain value, say zero. Now, once our TestStruct object goes out of scope, it is going to be destroyed, and when the turn comes to its member std::string b you'll likely see a crash, as all of that object's internal structures were ruined by the memset.
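One way to catch this mistake at compile time, sketched here under the assumption that a small wrapper is acceptable (zero_bytes is a hypothetical helper, not a standard function), is to guard the memset with std::is_trivially_copyable:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

struct POD_OnlyStruct
{
    int a;
    char b;
};

struct TestStruct
{
    int a;
    std::string b;
};

// Hypothetical helper: zero an object's bytes with memset, but refuse to
// compile for types that are not trivially copyable (such as TestStruct).
template <typename T>
void zero_bytes(T& obj)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset must not be used on this type");
    std::memset(&obj, 0, sizeof obj);
}
```

With this in place, zero_bytes on a POD_OnlyStruct compiles and clears it, while zero_bytes on a TestStruct is rejected by the static_assert instead of corrupting the std::string at run time.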

So, the reality is, those things are very different, and although you sometimes need to memset a whole structure to zeroes in certain cases, it's always important to make sure you understand what you're doing, and not make a mistake as in our second example.

My vote - use memset on objects only if it is required, and use value initialization, x = {}, in all other cases.

How to initialize a struct to 0 in C++

Before we start:

  1. Let me point out that a lot of the confusion around this syntax comes because in C and C++ you can use the = {0} syntax to initialize all members of a C-style array to zero! See here: https://en.cppreference.com/w/c/language/array_initialization. This works:

    // z has type int[3] and holds all zeroes, as: `{0, 0, 0}`
    int z[3] = {0};

    But, that syntax does not work the same for structs, which are entirely different animals than C-style arrays.

  2. See also my follow-up question I asked after writing this answer below: Why doesn't initializing a C++ struct to = {0} set all of its members to 0?


Back to the answer:

I figured it out: to get it to compile, just delete the zero:

// does NOT work
myStruct _m1 = {0};

// works!
myStruct _m1 = {};

It now compiles. However, I ran a bunch of tests to check some things in my struct_initialization.cpp file in my eRCaGuy_hello_world repo, and that does NOT initialize all elements of the struct to zero! Rather, it initializes the struct to its default values. To run my tests and see for yourself, clone my repo above and run eRCaGuy_hello_world/cpp/run_struct_initialization.sh.

Assuming you have this struct:

typedef struct
{
int num1 = 100;
int num2 = -100;
int num3;
int num4 = 150;
} data_t;

Note: the typedef above is a carry-over from when I was testing this stuff in C instead of C++ (although the default struct values are not allowed in C, of course). For C++, this is preferred instead:

struct data_t
{
int num1 = 100;
int num2 = -100;
int num3;
int num4 = 150;
};

So please ignore it wherever I unnecessarily use typedef to define the structs below.

Anyway, if I declare one of the above data_t structs, and then do this:

data_t d2 = {};
printf("d2.num1 = %i\nd2.num2 = %i\nd2.num3 = %i\nd2.num4 = %i\n\n",
d2.num1, d2.num2, d2.num3, d2.num4);

...the output will be:

d2.num1 = 100
d2.num2 = -100
d2.num3 = 0
d2.num4 = 150

Note that d2.num3 is zero here because it really was initialized to zero, not because that memory location happened to contain zero: a member without a default value is value-initialized (zeroed) when the struct is initialized with empty braces.

As explained here: https://en.cppreference.com/w/cpp/language/zero_initialization, you can also do this:

myStruct _m1{};

In the example above, this code:

data_t d2{};
printf("d2.num1 = %i\nd2.num2 = %i\nd2.num3 = %i\nd2.num4 = %i\n\n",
d2.num1, d2.num2, d2.num3, d2.num4);

...would produce output identical to what I showed above.
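The behavior of the empty braces can be checked with a small sketch. It assumes the data_t definition from above; make_empty_braces is a made-up helper name:

```cpp
#include <cassert>

struct data_t
{
    int num1 = 100;
    int num2 = -100;
    int num3;        // no default value
    int num4 = 150;
};

// Empty braces: members with a default value get that value, and
// members without one (num3) are zeroed.
inline data_t make_empty_braces()
{
    data_t d{};   // same result as `data_t d = {};`
    return d;
}
```

The returned object holds 100, -100, 0 and 150, matching the printed output shown earlier.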

Even in cases where setting the struct to = {0} DOES compile, such as this:

// Does NOT do what I expected! Only sets the FIRST value in the struct to zero! 
// The rest seem to use default values.
data_t d3 = {0};
printf("d3.num1 = %i\nd3.num2 = %i\nd3.num3 = %i\nd3.num4 = %i\n\n",
d3.num1, d3.num2, d3.num3, d3.num4);

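...the output is still not what I expected, as it only sets the first value to zero! (This is aggregate initialization: the 0 initializes only the first member, and each remaining member takes its default value, or zero if it has none.):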

d3.num1 = 0
d3.num2 = -100
d3.num3 = 0
d3.num4 = 150
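The same rule applies to any partial initializer list, not just {0}. Here is a minimal sketch, assuming C++14 or later (where a struct with default member values is still an aggregate) and using made-up helper names:

```cpp
#include <cassert>

struct data_t
{
    int num1 = 100;
    int num2 = -100;
    int num3;        // no default value
    int num4 = 150;
};

// `{0}` supplies a value for num1 only.
inline data_t make_first_zero()
{
    data_t d = {0};
    return d;
}

// `{1, 2}` supplies values for num1 and num2 only; num3 is zeroed and
// num4 keeps its default value.
inline data_t make_first_two()
{
    data_t d = {1, 2};
    return d;
}
```

So make_first_zero() yields 0, -100, 0, 150, and make_first_two() yields 1, 2, 0, 150.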

On C-style arrays, however (NOT structs), these semantics work fine. Refer to this answer here (How to initialize all members of an array to the same value?). The following lines, therefore, both set all elements of the C-style array to zero when using C++:

uint8_t buffer[100] = {0}; // sets all elements to 0 in C OR C++
uint8_t buffer[100] = {}; // sets all elements to 0 in C++ (in C, empty braces are only valid as of C23)

So, after much experimentation, it looks like the following several ways are the ONLY ways to zero-initialize a struct, PERIOD. If you know differently, please comment and/or leave your own answer here.

The only ways possible to zero-initialize a struct in C++ are:

  1. Be explicit:

     // C-style typedef'ed struct
    typedef struct
    {
    int num1 = 100;
    int num2 = -100;
    int num3;
    int num4 = 150;
    } data_t;

    // EXPLICITLY set every value to what you want!
    data_t d1 = {0, 0, 0, 0};
    // OR (using gcc or C++20 only)
    data_t d2 = {.num1 = 0, .num2 = 0, .num3 = 0, .num4 = 0};
  2. Use memset() to force all bytes to zero:

     data_t d3;
    memset(&d3, 0, sizeof(d3));
  3. Set all default values to zero in the first place:

     // C-style typedef'ed struct
    typedef struct
    {
    int num1 = 0;
    int num2 = 0;
    int num3 = 0;
    int num4 = 0;
    } data_t;

    // Set all values to their defaults, which are zero in
    // this case
    data_t d4 = {};
    // OR
    data_t d5{}; // same thing as above in C++

    // Set the FIRST value only to zero, and all the rest
    // to their defaults, which are also zero in this case
    data_t d6 = {0};
  4. Write a constructor for the C++ struct

     // 1. Using a member initializer list
    struct data
    {
    int num1;
    int num2;
    int num3;
    int num4;

    data() :
    num1(0),
    num2(0),
    num3(0),
    num4(0) {}
    };

    data d7; // all values are zero

    // OR: 2. manually setting the values inside the constructor
    struct data
    {
    int num1;
    int num2;
    int num3;
    int num4;

    data()
    {
    num1 = 0;
    num2 = 0;
    num3 = 0;
    num4 = 0;
    }
    };

    data d8; // all values are zero
  5. Use a struct with no default values, and make your object you create from it static

     typedef struct
    {
    int num1;
    int num2;
    int num3;
    int num4;
    } data_t;

    // Objects with static storage duration are zero-initialized,
    // so every value is zero when no other default values are set
    static data_t d9;
  6. So, if you have a struct with non-zero default values, and you want to zero all values, you must do it EXPLICITLY! Here are some more ways:

     // 1. Have a `constexpr` copy of the struct that you use to
    // reset other struct objects. Ex:

    struct data
    {
    int num1 = 1;
    int num2 = 7;
    int num3 = -10;
    int num4 = 55;
    };

    constexpr data DATA_ALL_ZEROS = {0, 0, 0, 0};

    // Now initialize d13 to all zeros using the above `constexpr` struct
    // object
    data d13 = DATA_ALL_ZEROS;


    // OR 2. Use a `zero()` member function to zero the values:

    struct data
    {
    int num1 = 1;
    int num2 = 7;
    int num3 = -10;
    int num4 = 55;

    void zero()
    {
    num1 = 0;
    num2 = 0;
    num3 = 0;
    num4 = 0;
    }
    };

    data d14;
    d14.zero();

The big take-away here is that NONE of these: data_t d{}, data_t d = {}, and data_t d = {0}, actually set all members of a struct to zero when the struct defines non-zero default values!

  1. data_t d{} sets all values to their defaults defined in the struct, and zeroes any member that has no default.
  2. data_t d = {} does the same thing.
  3. And data_t d = {0} sets only the FIRST value to zero, and all other values to their defaults (or to zero where there is none).

SO, BE EXPLICIT!

Note that the above key take-aways I wrote seem to contradict this documentation on cppreference.com, so it has led me to ask this follow-up question listed just below, which has proven VERY helpful to my understanding!

Going further

  1. MOST USEFUL: Follow-up question of mine: Why doesn't initializing a C++ struct to = {0} set all of its members to 0?

References:

  1. VERY USEFUL:
    1. https://en.cppreference.com/w/cpp/language/zero_initialization
    2. https://en.cppreference.com/w/cpp/language/aggregate_initialization
    3. https://en.cppreference.com/w/cpp/language/value_initialization
  2. VERY USEFUL: Initializing all members of an array (not struct) to the same value:
    1. How to initialize all members of an array to the same value?
    2. [gcc only] How to initialize all members of an array to the same value?
  3. https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/cpp/struct_initialization.cpp
    1. Clone this repo and run the code yourself with cpp/run_struct_initialization.sh

Related:

  1. Initializing default values in a struct
  2. [my own answer, which demonstrates this sort of struct modification/aggregate member reassignment within any function: leds[0] = {10, 20, 30, 40, 50};] Arduino Stack Exchange: Initializing Array of structs

Is it good style to memset a struct before using it?

As you've already been told, trying to read values from uninitialized members of a structure leads to undefined behaviour. That is unconditionally bad. Therefore, it is incumbent upon you to ensure that all fields are initialized before they're read.

If you know all the elements of the structure and are going to initialize them explicitly, then the memset() is not necessary. This can be manageable if the structure is under your control — you just have to remember to ensure that all the places where initialization takes place are updated when you add new members to the structure. If you write a function to do that (think 'C analogue of C++ constructor', to a first approximation), then the memset() can be left out. If you set the values ad hoc in many places, you've probably got problems if the structure changes.

In the case of something like struct sigaction, it is from a system-defined header, and different systems can (and do) add extra fields to the structure — over and above the ones you plan to initialize. Note that POSIX only specifies the fields that must be present; it does not dictate the order of the fields, nor does it mandate that there are no other fields in the structure. However, the functions using the extra (non-POSIX) elements of the structure should not do so unless the user indicates that those members are initialized, usually with some explicit flag, so you shouldn't run into problems — but it is better safe than sorry.

Consequently, in contexts where you don't have control over the structure, the memset() approach is readily defensible: it is guaranteed to zero all of the structure, even the bits you don't know about — even if the structure definition changes (grows) after the code is written.
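As an illustration of that defensive pattern, here is a minimal sketch assuming a POSIX system. The wrapper ignore_signal is a made-up name; the library calls themselves (sigaction, sigemptyset) are standard POSIX:

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>   // POSIX: struct sigaction, sigaction(), sigemptyset()

// Hypothetical helper: ignore `signo`, zeroing the whole structure first
// so that any platform-specific extra fields start out as all-bits-zero.
// Returns the result of sigaction(): 0 on success, -1 on error.
inline int ignore_signal(int signo)
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);   // covers fields we don't know about
    sa.sa_handler = SIG_IGN;          // then set the fields we do know
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, nullptr);
}
```

The explicit assignments after the memset are still worth keeping: they document which fields the code actually relies on.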

You may be able to use struct sigaction sa = { 0 }; or struct sigaction *sap = calloc(1, sizeof(*sap)); to zero the structure instead; it depends in part on how fussy a set of compiler options you use (and also the version of the compiler you use; GCC has changed its behaviour over time, for example).

You might want to look up macros such as PTHREAD_MUTEX_INITIALIZER in the POSIX standard — or you may prefer to ignore their existence altogether.

If I zero initialize a struct at the start of a loop using {0}, will it be zeroed out every iteration?


Will every compiler actually zero out the struct at the start of every loop?

Any compiler that conforms to the C Standard will do this. From this Draft C11 Standard:

6.8 Statements and blocks

3    A block allows a set of declarations and statements to be grouped into one syntactic unit. The initializers of objects that have automatic storage duration, and the variable length array declarators of ordinary identifiers with block scope, are evaluated and the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution, as if it were a statement, and within each declaration in the order that declarators appear.

In the case of a for or while loop, a declaration/initializer inside the loop's scope block is reached repeatedly on each and every iteration of the loop.
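This can be observed directly. The sketch below is written in C++ (where automatic variables are likewise initialized each time their declaration is executed); counter_t and count_clean_iterations are made-up names for illustration:

```cpp
#include <cassert>

struct counter_t
{
    int hits;
    int misses;
};

// Returns how many iterations started with a fully zeroed struct.
inline int count_clean_iterations(int iterations)
{
    int clean = 0;
    for (int i = 0; i < iterations; ++i)
    {
        counter_t c = {0};      // initializer runs on every iteration
        if (c.hits == 0 && c.misses == 0)
            ++clean;
        c.hits = i + 1;         // dirty the struct before the next pass
        c.misses = -1;
    }
    return clean;
}
```

Every iteration sees a freshly zeroed struct, so count_clean_iterations(5) returns 5 even though the previous pass wrote non-zero values.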

When is memset to 0 nonportable?

memset(p, 0, n) sets to all-bits-0.

An initializer of { 0 } sets to the value 0.

On just about any machine you've ever heard of, the two concepts are equivalent.

However, there have been machines where the floating-point value 0.0 was not represented by a bit pattern of all-bits-0. And there have been machines where a null pointer was not represented by a bit pattern of all-bits-0, either. On those machines, an initializer of { 0 } would always get you the zero initialization you wanted, while memset might not.

See also question 7.31 and question 5.17 in the C FAQ list.


Postscript: One other difference, as pointed out by @ryker: memset will set any "holes" in a padded structure to 0, while setting that structure to { 0 } might not.
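To see the two notions of "zero" side by side, here is a small sketch (mixed_t and the helper names are made up). The initializer version is guaranteed to give the value zero for every member; the memset version gives all-bits-zero, which matches only on platforms where 0.0 and the null pointer are represented that way:

```cpp
#include <cassert>
#include <cstring>

struct mixed_t
{
    int i;
    double d;
    void* p;
};

// Value zero for every member, whatever the bit patterns are.
inline mixed_t zero_by_initializer()
{
    mixed_t m = {0};
    return m;
}

// All-bits-zero for every byte, padding included.
inline mixed_t zero_by_memset()
{
    mixed_t m;
    std::memset(&m, 0, sizeof m);
    return m;
}

// True on mainstream platforms; not guaranteed by the language.
inline bool memset_matches_value_zero()
{
    mixed_t m = zero_by_memset();
    return m.i == 0 && m.d == 0.0 && m.p == nullptr;
}
```

On common hardware (IEEE 754 floating point, all-bits-zero null pointers) memset_matches_value_zero() returns true, which is why the distinction rarely bites in practice.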

Initialize/reset struct to zero/null

Define a const static instance of the struct with the initial values and then simply assign this value to your variable whenever you want to reset it.

For example:

static const struct x EmptyStruct;

Here I am relying on static initialization to set my initial values, but you could use a struct initializer if you want different initial values.

Then, each time round the loop you can write:

myStructVariable = EmptyStruct;
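Put together, the idea might look like the sketch below. The fields of struct x and the reset helper are assumptions made for illustration; note also that when compiled as C++, a const object of this kind needs an explicit initializer, hence the = {}:

```cpp
#include <cassert>

struct x
{
    int id;          // hypothetical fields
    double weight;
    char tag[8];
};

// One zeroed instance, created once. In C, `static const struct x
// EmptyStruct;` is enough; in C++ the `= {}` is required.
static const x EmptyStruct = {};

// Resetting is then a plain struct assignment.
inline void reset(x& target)
{
    target = EmptyStruct;
}
```

A variable filled with arbitrary values and passed to reset comes back with every member equal to zero.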

Struct zero initialization methods

The first one is the best way by a country mile, as it guarantees that the struct members are initialised as they would be for static storage. It's also clearer.

There's no guarantee from a standards perspective that the two ways are equivalent, although a specific compiler may well optimise the first to the second, even if it ends up clobbering parts of memory discarded as padding.

(Note that in C++, the behaviour of the second way could well be undefined. Yes C is not C++ but a fair bit of C code does tend to end up being ported to C++.)

initializing a structure array using memset

Either

memset(&dev_sys, 0, sizeof dev_sys);

or

memset(dev_sys, 0, NUM_DEVICES * sizeof(struct device_sys));

Or, if you prefer

memset(dev_sys, 0, NUM_DEVICES * sizeof *dev_sys);

but not what you have in your original variant.

Note that in your specific case you can use either &dev_sys or dev_sys as the first argument in all variants. The effect will be the same. However, &dev_sys is more appropriate in the first variant, since it follows the memset(ptr-to-object, object-size) idiom. In the second and third variants it is more appropriate to use dev_sys (or &dev_sys[0]), since it follows the memset(ptr-to-first-element, number-of-elements * element-size) idiom.

P.S. Of course, instead of using all that hackish memset trickery, in your particular case you should have just declared your array with an initializer

struct device_sys dev_sys[NUM_DEVICES] = { 0 };

No memset necessary.
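For comparison, here is a sketch of both approaches on a local array. NUM_DEVICES, the fields of device_sys and the helper names are assumptions, since the original definitions are not shown:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t NUM_DEVICES = 4;   // hypothetical size

struct device_sys                    // hypothetical fields
{
    int id;
    char name[16];
};

inline bool all_zero(const device_sys* devs, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (devs[i].id != 0 || devs[i].name[0] != '\0')
            return false;
    return true;
}

inline bool arrays_are_zeroed()
{
    device_sys by_init[NUM_DEVICES] = {};   // `= { 0 }` in C
    device_sys by_memset[NUM_DEVICES];
    std::memset(by_memset, 0, sizeof by_memset);

    // sizeof on the array itself already covers every element.
    static_assert(sizeof by_memset == NUM_DEVICES * sizeof(device_sys),
                  "array size is element count times element size");

    return all_zero(by_init, NUM_DEVICES) && all_zero(by_memset, NUM_DEVICES);
}
```

Both arrays end up with zeroed members; the initializer version simply gets there without a separate call.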

Speed of memset Vs direct assignment to zero

It is a quality of implementation issue.

(BTW, in pure theory, an implementation might have a NULL pointer which is not an all zero bits word; for such cases the semantics of your §3 is different than those of §1 or §2; but in practice, most common processors today have a linear virtual address space and have their NULL pointer be a word of all zero bits)

Recent GCC compilers (at least on usual x86-64 processors), with optimizations enabled (e.g. gcc -O2) are likely to produce the same (or very similar) machine code (because memset gets expanded as __builtin_memset which gets specifically compiled and often inlined), so using memset is not slower in practice (and could even become faster because of vectorization, e.g. AVX machine instructions)

You could look at the assembler code produced with e.g. gcc -S -fverbose-asm -O2 -march=native

(in some cases, notably when struct xyz has hundreds of fields, the compiler would even synthesize a call to memset for your case 1 and 2!)

In general, will initialization by method 1 or 2 above be faster than method 3?

In practice the answer is generally no (so prefer the most readable approach). If you care that much, benchmark your code.

(don't forget that development time also costs money; in many cases your human time is worth more than the few CPU nanoseconds you might win, and generally won't)


