Sizeof a Union in C/C++

The Standard answers this in section 9.5 of the C++ standard, or section 6.5.2.3 paragraph 5 of the C99 standard (or paragraph 6 of the C11 standard, or section 6.7.2.1 paragraph 16 of the C18 standard):

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time. [Note: one special guarantee is made in order to simplify the use of unions: If a POD-union contains several POD-structs that share a common initial sequence (9.2), and if an object of this POD-union type contains one of the POD-structs, it is permitted to inspect the common initial sequence of any of POD-struct members; see 9.2. ] The size of a union is sufficient to contain the largest of its data members. Each data member is allocated as if it were the sole member of a struct.

That means all members share the same memory region. At most one member is active, but you can't find out which one from the union itself. You have to store the information about the currently active member somewhere else. Storing such a flag alongside the union (for example, a struct with an integer as the type flag and a union as the data store) gives you a so-called "discriminated union": a union that knows which of its types is currently the "active one".

One common use is in lexers, where you have different kinds of tokens and, depending on the token, different information to store (line is put into each struct to show what a common initial sequence is):

struct tokeni {
    int token; /* type tag */
    union {
        struct { int line; } noVal;
        struct { int line; int val; } intVal;
        struct { int line; struct string val; } stringVal;
    } data;
};

The Standard allows you to access line of each member, because that's the common initial sequence of each one.
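A minimal sketch of how such a token might be built and read. The `struct string` layout, the `enum token_kind` values, and the three helper functions are hypothetical; they are not part of the snippet above and only illustrate the pattern of checking the tag before touching a payload.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string type; the snippet above leaves struct string undefined. */
struct string { const char *ptr; size_t len; };

/* Hypothetical tag values for the token field. */
enum token_kind { TOK_NONE, TOK_INT, TOK_STRING };

struct tokeni {
    int token; /* type tag: one of enum token_kind */
    union {
        struct { int line; } noVal;
        struct { int line; int val; } intVal;
        struct { int line; struct string val; } stringVal;
    } data;
};

/* Build an integer token: set the tag and the matching member together. */
static struct tokeni make_int_token(int line, int val)
{
    struct tokeni t;
    t.token = TOK_INT;
    t.data.intVal.line = line;
    t.data.intVal.val = val;
    return t;
}

/* line belongs to the common initial sequence, so it may be read
   through any of the struct members, whichever one is active. */
static int token_line(const struct tokeni *t)
{
    return t->data.noVal.line;
}

/* The payload is only read after the tag has been checked. */
static int token_int_value(const struct tokeni *t)
{
    assert(t->token == TOK_INT);
    return t->data.intVal.val;
}
```

A caller would create a token with `make_int_token(7, 42)`, read its line with `token_line`, and call `token_int_value` only when the tag says the token holds an integer.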

There exist compiler extensions that allow accessing all members regardless of which one currently has its value stored. That allows efficient reinterpretation of the stored bits as the different member types. For example, the following may be used to dissect a float variable into 2 unsigned shorts (assuming a 4-byte float and 2-byte unsigned short):

union float_cast { unsigned short s[2]; float f; };

That can come in quite handy when writing low-level code. In C++, if the compiler does not support that extension but you do it anyway, the behavior is undefined; C, from C99 on, specifies that the stored bytes are reinterpreted as the type of the member you read. So be certain your compiler supports it if you use that trick.
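A small sketch of the trick in C, assuming a 4-byte float and a 2-byte unsigned short (the assertions check that assumption). The helper names are hypothetical. The second function uses `memcpy` instead of the union; it is not mentioned above, but it is a commonly used alternative that gives the same bytes without relying on union punning.

```c
#include <assert.h>
#include <string.h>

union float_cast { unsigned short s[2]; float f; };

/* Split f's object representation into two halves via the union. */
static void float_halves(float f, unsigned short out[2])
{
    union float_cast u;

    /* Assumption: float is exactly as wide as two unsigned shorts. */
    assert(sizeof(float) == 2 * sizeof(unsigned short));

    u.f = f;          /* store through one member ...        */
    out[0] = u.s[0];  /* ... and read back through the other */
    out[1] = u.s[1];
}

/* Same split via memcpy, which copies the bytes directly. */
static void float_halves_memcpy(float f, unsigned short out[2])
{
    assert(sizeof(float) == 2 * sizeof(unsigned short));
    memcpy(out, &f, sizeof f);
}
```

Which half ends up in `out[0]` depends on the byte order of the platform, so code built on this is not portable across architectures.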

Union size in C

I know that the size of union in C is the size of the largest member of the union.

If you know that, then I have some bad news for you :-) It's not necessarily true.

C implementations are allowed to insert padding in between members of structures, and after the final member of both structures and unions, in order to meet alignment requirements. The reason for the latter is to ensure, if you create an array of the union type, all elements of the array will be correctly aligned.

In this case (keeping in mind this is an example, as the sizes and alignment requirements may vary per platform), the most stringent requirement is probably that of uint32_t, in that it will prefer to be on a four-octet boundary.

That, combined with the fact that sTest is five octets in size, means that the size of the union will be eight octets, not five.
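The effect can be reproduced with a sketch like the following. The union shown is hypothetical: it only assumes a five-octet member named sTest next to a uint32_t, and the helper names are made up for illustration. `_Alignof` requires C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union: a 5-octet member next to a uint32_t. */
union example {
    uint8_t  sTest[5];
    uint32_t word;
};

/* Smallest multiple of align that is >= size. */
static size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}

/* Largest member, rounded up to the union's alignment requirement. */
static size_t expected_union_size(void)
{
    size_t largest = sizeof(uint8_t[5]) > sizeof(uint32_t)
                         ? sizeof(uint8_t[5])
                         : sizeof(uint32_t);
    return round_up(largest, _Alignof(union example));
}

/* Properties that hold on every conforming implementation. */
static void check_union_size(void)
{
    assert(sizeof(union example) >= sizeof(uint8_t[5]));
    assert(sizeof(union example) % _Alignof(union example) == 0);
}
```

On a platform where uint32_t is aligned to four octets, `expected_union_size()` yields 8, which is what `sizeof(union example)` typically reports there.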

`sizeof` struct in union definition

You cannot take the size of an anonymous type, so simply make it not anonymous:

union 
{
    struct range// <<< give it a tag here
    { 
        char hi;
        char lo;
    } by_name;

    char as_bytes[sizeof(struct range)]; // <<< Take sizeof here

} U2;

In C++ you can also create a nested typedef, though it perhaps serves little purpose (this does not compile as C, which does not allow a typedef inside a union's member list):

union 
{
    typedef struct 
    {
        char hi;
        char lo;
    } range ; 

    range by_name;

    char as_bytes[sizeof(range)];
} U2;

sizeof(struct) and sizeof(union)

Even though most of the answers are implementation-defined, a programmer who is debugging or troubleshooting needs to know the likely answers on their platform:

a. What is the sizeof(a) and sizeof(b)?

sizeof a = 6 * 4 + 12 * 2 = 48. Important: if you change 12 to 13, this calculation would likely be wrong, as padding would typically be added, probably 2 bytes, and so the size of the struct would not be the sum of the sizes of its elements.

sizeof b = max(6 * 4, 12 * 2) = 24, because in this union, x and y are overlaid. Again, if you change 12 to 13, there's likely padding.

b. if &a = 0x00320000, what is &a.y?

&a.y = 0x00320000 + 6 * 4 = 0x00320018

c. if &b = 0x00320400, what is &b.y?

&b.y = &b (guaranteed)
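These numbers can be checked with `offsetof` and `sizeof`. The definitions below are hypothetical: they only assume what the arithmetic above implies, namely a member x with six 4-byte elements and a member y with twelve 2-byte elements. The type and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical definitions: six 4-byte elements, then twelve 2-byte ones. */
struct a_t { int32_t x[6]; int16_t y[12]; };
union  b_t { int32_t x[6]; int16_t y[12]; };

/* Offset of y inside the struct: 6 * 4 = 24 when no padding is inserted. */
static size_t struct_y_offset(void) { return offsetof(struct a_t, y); }

/* Offset of y inside the union. */
static size_t union_y_offset(void) { return offsetof(union b_t, y); }

/* Properties that do not depend on padding decisions. */
static void check_layout(void)
{
    assert(union_y_offset() == 0);                      /* members overlap  */
    assert(struct_y_offset() >= sizeof(int32_t[6]));    /* y comes after x  */
    assert(sizeof(union b_t) >= sizeof(int32_t[6]));    /* holds the largest */
}
```

Adding the offset to the base address gives the member address, so with &a = 0x00320000 and an offset of 24 (0x18) the result is 0x00320018, as computed above.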

Why does sizeof of a structure or union in C need a variable?

There is nothing in your program called person or u_type. Sure, you have struct person and union u_type, but there are no typedefs that let you use just person and u_type.

Try sizeof(struct person)

To answer the questions in the comments:

struct person { ... }; gives you a type, struct person, where person is a tag. You need to write struct person to use the type. In this case, sizeof(struct person).

struct { ... } person; gives you a variable called person that is a struct (but you can't reuse the type). In this case, sizeof(person).

The most common use is typedef struct { ... } Person; which gives you a type Person, much like the first case, but you can use Person instead of struct person. In this case, sizeof(Person).
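The three forms side by side, as a sketch. The member list is hypothetical, since the original struct body is elided, and the variable is named anon_person here only to keep it visually apart from the tag.

```c
#include <assert.h>
#include <stddef.h>

/* Tag only: the type is spelled "struct person". */
struct person { char name[32]; int age; };

/* Variable only: anon_person exists, but its type has no name to reuse. */
struct { char name[32]; int age; } anon_person;

/* Typedef: the type is spelled "Person". */
typedef struct { char name[32]; int age; } Person;

static size_t size_via_tag(void)      { return sizeof(struct person); }
static size_t size_via_variable(void) { return sizeof anon_person; }
static size_t size_via_typedef(void)  { return sizeof(Person); }

/* All three have the same members, so compilers lay them out alike. */
static void check_sizes(void)
{
    assert(size_via_tag() == size_via_typedef());
    assert(size_via_tag() == size_via_variable());
}
```

Note that sizeof needs parentheses when applied to a type name, but not when applied to an expression such as a variable.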
