Structure padding and packing
Padding aligns structure members to "natural" address boundaries: for example, int members get offsets that are multiples of 4 (offset mod 4 == 0) on a 32-bit platform. Padding is on by default. It inserts the following "gaps" into your first structure:
struct mystruct_A {
char a;
char gap_0[3]; /* inserted by compiler: for alignment of b */
int b;
char c;
char gap_1[3]; /* -"-: for alignment of the whole struct in an array */
} x;
Packing, on the other hand, prevents the compiler from inserting padding. It has to be requested explicitly; under GCC it's __attribute__((__packed__)), so the following:
struct __attribute__((__packed__)) mystruct_A {
char a;
int b;
char c;
};
would produce a structure of size 6 on a 32-bit architecture.
A note though - unaligned memory access is slower on architectures that allow it (like x86 and amd64), and is explicitly prohibited on strict alignment architectures like SPARC.
Memory alignment in C-structs
At least on most machines, a type is only ever aligned to a boundary as large as the type itself [Edit: you can't really demand any "more" alignment than that, because you have to be able to create arrays, and you can't insert padding into an array]. On your implementation, short is apparently 2 bytes, and int 4 bytes.
That means your first struct is aligned to a 2-byte boundary. Since all the members are 2 bytes apiece, no padding is inserted between them.
The second contains a 4-byte item, which gets aligned to a 4-byte boundary. Since it's preceded by 6 bytes, 2 bytes of padding are inserted between v3 and i, giving 6 bytes of data in the shorts, two bytes of padding, and 4 more bytes of data in the int, for a total of 12.
Data Alignment for C struct
The answer depends on the compiler, platform and compile options. Some examples:
https://godbolt.org/z/4tAzB_
I'm afraid the author of the book does not understand the topic.
Structure memory alignment in C
The compiler pads structure members to keep things aligned properly for each data type.
See this link for information on how structure padding actually works. There is a lot of very detailed information there. See also this link for alignment information specific to Intel processors for the various data types.
In essence, because you have a 10-byte char array followed by an int, the compiler has padded the char array with an extra 2 bytes so that the int will be aligned properly on a 4-byte boundary (that is, an address evenly divisible by 4).
It is as if you declared your structure like this:
struct a
{
char arr[10];
char _padding[2];
int i;
float b;
};
Out of habit, I usually allocate char arrays with sizes that are evenly divisible by 4. That way the compiler doesn't have to do it for me, and it makes it easier to visualize what the data structure looks like in memory.
x86 Memory Alignment of struct vs. cache line?
I think the part you might be missing is the alignment requirement that the compiler imposes for various types.
Integer types are generally aligned to a multiple of their own size (e.g. a 64-bit integer will be aligned to 8 bytes); so-called "natural alignment". This is not a strict architectural requirement of x86; unaligned loads and stores still work, but since they are less efficient, the compiler prefers to avoid them.
An aggregate, like a struct, is aligned according to the highest alignment requirement of its members, and padding will be inserted between members if needed to ensure that each one is properly aligned. Padding will also be added at the end so that the overall size of the struct is a multiple of its required alignment.
So in your example, struct Obj has alignment 8, and its size will be rounded up to 48 (with 6 bytes of padding at the end). So there is no need for 24 bytes of padding to be inserted after c[4] (I think you meant to write the padding at addresses 40-63); your obj can be placed at address 40. d can then be placed at address 88.
Note that none of this has anything to do with the cache line size. Objects are not by default aligned to cache lines, though "natural alignment" will ensure that no integer load or store ever has to cross a cache line.
sizeof(), alignment in C structs:
Your struct must be 8*N bytes long, since it has a member that is 8 bytes in size (double). That means the struct sits in memory at an address (A) divisible by 8 (A%8 == 0), and its end address will be (A + 8N), which will also be divisible by 8.
From there, you store two 4-byte variables (int + float), meaning you now occupy the memory area [A,A+8). Now you store an 8-byte variable (double). There is no need for padding since (A+8) % 8 == 0 [since A%8 == 0]. So, with no padding you get 4+4+8 == 16.
If you change the order to int -> double -> float, you'll occupy 24 bytes, since the address the double would otherwise get is not divisible by 8 and 4 bytes of padding are needed to reach a valid address (and the struct will also have padding at the end).
Each cell below represents 4 bytes:
A          A+4        A+8        A+12       A+16       A+20       A+24   [addresses]
|---------||---------||---------||---------||---------||---------|
| int     || float   || double  || double  ||         ||         |  [content - basic case]
|---------||---------||---------||---------||---------||---------|
| int     || padding || double  || double  || float   || padding |  [content - changed order case]
|---------||---------||---------||---------||---------||---------|
In the changed order case, the first padding ensures the double sits at an address that is divisible by 8, and the last padding ensures the struct size is divisible by the largest member's size (8).
memory alignment into structure - alignment size equal to largest member size
The OVERALL alignment of the structure should be that of the element with the greatest alignment requirement. This is required to ensure that, for example, every element of an array of structures is aligned. Without it, struct { int x; char c; } would have size 5, and in an array of them the first element would be aligned, but the next three would have x unaligned.
It is often possible to convince the compiler to generate a "packed" data structure (with no alignment padding) and use that to get a packed array, but it's a bad idea to use that in all but very special cases, because at best it's slower, at worst it causes execution to stop due to "unaligned access trap" in the processor.
If the size of int is four bytes [it is for all compilers I'm aware of; long is either 4 or 8 bytes, depending on the compiler], it will be 4-byte aligned on both 32- and 64-bit targets (at least on x86).
If you want to have a 7 byte "gap" in a struct, this would work:
struct X {
char c;
uint64_t x;
};
which the compiler would of course lay out as:
struct X {
char c;
char padding[7];
uint64_t x;
};
Memory Alignment in C/C++
The examples given in the book depend heavily on the compiler and computer architecture used. If you test them in your own program, you may get totally different results from the author's. I will assume a 64-bit architecture, because the author does too, from what I've read in the description.
Let's look at the examples one by one:
ReallySlowStruct
If the compiler supports struct members that are not byte-aligned, the start of "d" will be at the seventh bit of the first byte of the struct. That sounds very good for saving memory. The problem is that C does not allow bit addressing. So to store newValue in the "d" member, the compiler must emit a lot of bit-shifting operations: store the first two bits of "newValue" in byte 0, shifted 6 bits to the right, then shift "newValue" two bits to the left and store it starting at byte 1. Byte 1 is a non-aligned memory location, which means the bulk memory transfer instructions won't work and the compiler must store one byte at a time.
SlowStruct
It gets better. The compiler can get rid of all the bit fiddling. But writing "d" will still require writing one byte at a time, because it is not aligned to the native "int" size. The native size on a 64-bit system is 8, so in this model every memory address not divisible by 8 can only be accessed one byte at a time. And worse, if I switch off packing, I will waste a lot of memory: every member that is followed by an int will be padded with enough bytes to let the integer start at a memory location divisible by 8. In this case, char a and c will both take up 8 bytes.
FastStruct
This is aligned to the size of int on the target machine. "d" takes up 8 bytes, as it should. Because the chars are all bundled in one place, the compiler does not pad between them and wastes no space; chars are only 1 byte each, so they need no padding. The complete structure adds up to an overall size of 16 bytes, which is divisible by 8, so no padding is needed.
In most scenarios, you never have to be concerned with alignment because the default alignment is already optimal. In some cases, however, you can achieve significant performance improvements, or memory savings, by specifying a custom alignment for your data structures.
In terms of memory space, the compiler pads the structure in a way that naturally aligns each element of the structure.
struct x_
{
char a; // 1 byte
int b; // 4 bytes
short c; // 2 bytes
char d; // 1 byte
} bar[3];
struct x_ is padded by the compiler and thus becomes:
// Shows the actual memory layout
struct x_
{
char a; // 1 byte
char _pad0[3]; // padding to put 'b' on 4-byte boundary
int b; // 4 bytes
short c; // 2 bytes
char d; // 1 byte
char _pad1[1]; // padding to make sizeof(x_) multiple of 4
} bar[3];
Source: https://docs.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=vs-2019
Struct alignment may waste memory?
All instances of st2 have the same layout. Where the object is stored cannot affect that layout, and neither can whether the object is a member of another object. The compiler cannot simply pick one of the members of mySt and store it outside of the object.
Consider passing st2 into a function:
void some_function(st2&);
That function cannot know where that object comes from. But it has to be able to access the members of the object. The layout of the object has to be known at compile time. If one instance of st2 had a different layout from another, how would the function know?
Data alignment of structure using malloc
The question erroneously assumes that
struct st *p = malloc(sizeof(*p));
is the same as
struct st *p = malloc(13);
It is not. To test,
printf("Size of st is %zu\n", sizeof(*p));
which prints 24, not 13.
The proper way to allocate and manage structures is with sizeof(X), and not by assuming anything about how the elements are packed or aligned.