Structure of a C++ Object in Memory VS a Struct

The C++ standard guarantees that memory layouts of a C struct and a C++ class (or struct -- same thing) will be identical, provided that the C++ class/struct fits the criteria of being POD ("Plain Old Data"). So what does POD mean?

A class or struct is POD if:

  • All data members are public and themselves POD or fundamental types (but not reference or pointer-to-member types), or arrays of such
  • It has no user-defined constructors, assignment operators or destructors
  • It has no virtual functions
  • It has no base classes

About the only "C++-isms" allowed are non-virtual member functions, static data members and static member functions.

Since your class has both a constructor and a destructor, it is formally speaking not of POD type, so the guarantee does not hold. (Although, as others have mentioned, in practice the two layouts are likely to be identical on any compiler that you try, so long as there are no virtual functions).

See section [26.7] of the C++ FAQ Lite for more details.
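
As a rough way to check these criteria on a real compiler, the sketch below uses the standard type traits. It assumes a C++11 or later compiler, where the single POD notion is split into "trivial" and "standard-layout". The three types are hypothetical and are not taken from the question.

```cpp
#include <type_traits>

// Hypothetical example types, for illustration only.
struct PlainPoint {                      // no constructors, destructor, virtuals or bases
    int x;
    int y;
    int sum() const { return x + y; }    // non-virtual member functions are allowed
};

class Managed {                          // user-provided constructor and destructor
public:
    Managed() : x(0) {}
    ~Managed() {}
    int x;
};

struct Polymorphic {                     // a virtual function changes the object layout
    virtual ~Polymorphic() {}
    int x;
};

static_assert(std::is_trivial<PlainPoint>::value, "PlainPoint is trivial");
static_assert(std::is_standard_layout<PlainPoint>::value, "PlainPoint is standard-layout");
static_assert(!std::is_trivial<Managed>::value, "a user-provided ctor/dtor makes it non-trivial");
static_assert(std::is_standard_layout<Managed>::value, "but its layout rules are unchanged");
static_assert(!std::is_standard_layout<Polymorphic>::value, "virtual functions break the guarantee");
```

Here Managed fails the trivial test because of its constructor and destructor, yet it is still standard-layout, which is consistent with the remark above that the layouts tend to be identical in practice as long as there are no virtual functions.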

C# struct or struct[] as member of struct contiguous in memory

Would otherArray data be stored contiguously after x, or is otherArray always going to be a reference type either way?

otherArray is always going to be a reference to an array object, even if it only has one element. The struct layout is a property of the struct type, not a property of particular struct values.

Is a struct as a member of a struct stored contiguously with the struct, or is it simply stored as some sort of address to the actual data of the other struct?

Structs are value types, so there is no "some sort of address to the actual data of the other struct"; that is what reference types would do. The nested struct is not necessarily contiguous with the preceding field (although it is in the case of Other and Simple): it follows the default alignment rules unless you specify an explicit Pack or LayoutKind. See here for more info.

Let's consider:

struct Simple
{
    public int x;
    public Other other;
}

struct Other
{
    public int y;
}

and the value:

var s = new Simple();
s.x = unchecked((int)0xabcdefab);
s.other.y = 0x12345678;

You'd expect the size of s to be 8 bytes, and its value to contain the numbers 0xabcdefab and 0x12345678:

// prints 8
Console.WriteLine(sizeof(Simple));
// in an unsafe block, prints 12345678ABCDEFAB
Console.WriteLine(Marshal.ReadInt64(new IntPtr(&s)).ToString("X"));

You can try adding more fields to Other and see that sizeof(Simple) increases.

Compare this to a reference type:

struct Simple
{
    public int x;
    public OtherRef other;
}

class OtherRef
{
    public int y;
}

You can't use & to get the address now, so here's a sharplab link to show that the other field is indeed an address, rather than the value you set it.
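
The same distinction can be sketched in C++, used here only as an analogy to the C# example above: a member held by value is embedded in the enclosing struct, while a pointer member stores just an address. The names SimpleRef, BigOther, SimpleBig and SimpleBigRef are made up for this illustration.

```cpp
#include <cstddef>

struct Other { int y; };

struct Simple {              // Other is embedded by value, like a C# struct field
    int x;
    Other other;
};

struct SimpleRef {           // only an address is stored, like a C# class field
    int x;
    Other* other;
};

// Hypothetical larger variants: growing the nested type grows the
// by-value container but not the pointer-holding one.
struct BigOther { int y[4]; };
struct SimpleBig { int x; BigOther other; };
struct SimpleBigRef { int x; BigOther* other; };

static_assert(offsetof(Simple, other) >= sizeof(int), "other is stored after x, inside Simple");
static_assert(sizeof(Simple) >= sizeof(int) + sizeof(Other), "Simple contains a whole Other");
static_assert(sizeof(SimpleBig) > sizeof(Simple), "more fields in the nested struct enlarge the outer struct");
// Holds on mainstream compilers, where all object pointers have the same size.
static_assert(sizeof(SimpleBigRef) == sizeof(SimpleRef), "a pointer member has a fixed size");
```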

What is the memory layout of a C++ structure?

When it comes to structs and classes, the layout is implementation-defined and varies between compilers, operating systems, and architectures. Each implementation aligns members according to its own rules and inserts padding where needed, and some may even rearrange members in the limited cases where the standard permits it. If you need to know the size of a struct, use sizeof(YourStruct).

Here's a code snippet...

#include <iostream>

struct A {
    char a;
    float b;
    int c;
};

struct B {
    float a;
    int b;
    char c;
};

int main() {
    std::cout << "Sizeof(A) = " << sizeof(A) << '\n';
    std::cout << "Sizeof(B) = " << sizeof(B) << '\n';
    return 0;
}

Output:

Sizeof(A) = 12
Sizeof(B) = 12

For reference, my machine runs 64-bit Windows 7 on an Intel Core 2 Quad Extreme, and I'm using Visual Studio 2017 with C++17.

With my particular setup, both structures are being generated with a different layout, but have the same size in bytes.

In A's case...

char a; // 1 byte
// 3 bytes of padding
float b; // 4 bytes
int c; // 4 bytes (int is 32bit even on x64).

In B's case...

float a; // 4 bytes
int b; // 4 bytes
char c; // 1 byte
// 3 bytes of padding.

Also, your compiler flags and options may have an effect. None of these sizes or offsets is guaranteed, because alignment and padding are implementation-defined, as stated in the standard.
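
Rather than inferring the padding from sizeof alone, the offsets can be queried directly with offsetof. The sketch below reuses the A and B structs from above; the print_layouts helper is a hypothetical name, and the numbers it prints depend on your compiler and target.

```cpp
#include <cstddef>
#include <iostream>

struct A {
    char a;
    float b;
    int c;
};

struct B {
    float a;
    int b;
    char c;
};

// Prints the offset of each member and the total size of both structs.
// Call it from main(); the values are implementation-defined.
void print_layouts() {
    std::cout << "A: a@" << offsetof(A, a) << " b@" << offsetof(A, b)
              << " c@" << offsetof(A, c) << " size=" << sizeof(A) << '\n';
    std::cout << "B: a@" << offsetof(B, a) << " b@" << offsetof(B, b)
              << " c@" << offsetof(B, c) << " size=" << sizeof(B) << '\n';
}
```

Any gap between one member's end and the next member's offset is padding inside the struct, and any gap between the last member's end and sizeof is trailing padding.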


--Edit--

Also, if you don't want this exact behavior, there are pragma directives and specifiers, such as #pragma pack and alignas, that can be used to modify the layout your implementation produces. Here are a few references.

  • How to use alignas to replace pragma pack?
  • https://en.cppreference.com/w/cpp/preprocessor/impl
  • https://en.cppreference.com/w/cpp/language/alignas
  • https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-160
  • https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbclx01/pragma_pack.htm
  • https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbclx01/packqua.htm
  • https://www.iditect.com/how-to/57426535.html
  • https://downloads.ctfassets.net/oxjq45e8ilak/1LriV4eAdhNlu9Zv06H9NJ/53576095f772b5f6cddbbedccb7ebd8a/Alexander_Titov_Know_your_hardware_CPU_memory_hierarchy.pdf
  • https://cpc110.blogspot.com/2020/10/vs2019-alignas-in-struct-definition.html
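
As a minimal sketch of the two mechanisms mentioned above, the example below compares a default layout, a packed layout and an over-aligned layout. It assumes a compiler that supports the non-standard #pragma pack(push, 1) form (MSVC, GCC and Clang do); the struct names are hypothetical.

```cpp
#include <cstddef>

struct Padded {                 // default layout: padding is typically added after tag
    char tag;
    int value;
};

#pragma pack(push, 1)           // non-standard: pack members on 1-byte boundaries
struct Packed {
    char tag;
    int value;
};
#pragma pack(pop)               // restore the previous packing

struct alignas(16) Aligned {    // standard C++11: raise the alignment of the whole type
    char tag;
    int value;
};

static_assert(offsetof(Packed, value) == sizeof(char), "no padding between packed members");
static_assert(sizeof(Packed) == sizeof(char) + sizeof(int), "no trailing padding either");
static_assert(sizeof(Padded) >= sizeof(Packed), "the default layout is never smaller");
static_assert(alignof(Aligned) == 16, "alignas sets the type's alignment");
static_assert(sizeof(Aligned) % 16 == 0, "size is rounded up to a multiple of the alignment");
```

Note that alignas can only make alignment stricter, so it is not a drop-in replacement for packing. Accessing misaligned members of a packed struct may also be slower on some architectures.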

Struct vs class memory overhead

When I researched this more deeply, I started from the following assumptions (they may be inexact; I'm getting old for a programmer). A class instance has extra memory consumption because a reference is required to address it: storing that reference takes an Int32-sized pointer on a 32-bit build. Class instances are always allocated on the heap (I can't remember if C++ has other possibilities; I would venture yes?).

The short answer, found in this article: an object has a 12-byte basic footprint plus 4 possibly unused bytes depending on your class (which no doubt has something to do with padding).

http://www.codeproject.com/Articles/231120/Reducing-memory-footprint-and-object-instance-size

Another issue you'll run into is that arrays also have an overhead. One possibility is to manage your own offsets into a larger array or arrays, which in turn is getting closer to something a more efficient language would be better suited for.

I'm not sure whether there are libraries that provide storage for small objects in an efficient manner. There probably are.

My take on it: use structs, manage your own offsets in a large array, and use proper packing instructions if it serves you (although I suspect this comes at a runtime cost of a few extra instructions each time you address unevenly packed data):

[StructLayout(LayoutKind.Sequential, Pack = 1)]

Struct memory layout in C

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type. But it's entirely implementation-specific.

Padding bytes are introduced so every object is properly aligned. Reordering is not allowed.

Possibly every remotely modern compiler implements #pragma pack which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
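
The guarantees in the quoted paragraphs can be checked mechanically. The sketch below is written in C++ so that the checks run at compile time, but the struct itself is plain C; Record and first_member_shares_address are hypothetical names chosen for illustration.

```cpp
#include <cstddef>

struct Record {
    char kind;
    int weight;
    short count;
};

// Paragraph 13: no padding at the beginning, and addresses increase
// in declaration order (members are never reordered).
static_assert(offsetof(Record, kind) == 0, "no padding before the first member");
static_assert(offsetof(Record, kind) < offsetof(Record, weight), "kind precedes weight");
static_assert(offsetof(Record, weight) < offsetof(Record, count), "weight precedes count");

// Paragraph 12: each member is aligned appropriately for its type.
static_assert(offsetof(Record, weight) % alignof(int) == 0, "weight is int-aligned");

// Paragraph 13: a pointer to the struct, suitably converted,
// points to its initial member.
bool first_member_shares_address() {
    Record r = {'a', 1, 2};
    return static_cast<void*>(&r) == static_cast<void*>(&r.kind);
}
```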

Memory Allocation for an Array of Struct and Class Object

Perhaps the difference between an array of a reference type and an array of a value type is easier to understand with an illustration:

Array of a reference type

Each Point as well as the array is allocated on the heap and the array stores references to each Point. In total you need N + 1 allocations where N is the number of points. You also need an extra indirection to access a field of a particular Point because you have to go through a reference.

Array of a value type

Each Point is stored directly in the array. There is only one allocation on the heap. Accessing a field does not involve indirection. The memory address of the field can be computed directly from the memory address of the array, the index of the item in the array and the location of the field inside the value type.
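
The two layouts can be approximated in C++ as an analogy (this is not how the .NET runtime is implemented): a std::vector<Point> plays the role of the array of a value type, and a vector of std::unique_ptr<Point> plays the role of the array of a reference type. The function names are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Point {
    int x;
    int y;
};

// Value layout: one allocation, and the address of a field follows from
// the array address, the index and the field's offset inside Point.
bool value_field_address_is_computable(std::size_t n, std::size_t i) {
    if (i >= n) return false;
    std::vector<Point> points(n);
    const char* base = reinterpret_cast<const char*>(points.data());
    const char* expected = base + i * sizeof(Point) + offsetof(Point, y);
    return reinterpret_cast<const char*>(&points[i].y) == expected;
}

// Reference-like layout: one allocation for the array of pointers plus
// one per Point, and every field access goes through a pointer.
std::vector<std::unique_ptr<Point>> make_reference_array(std::size_t n) {
    std::vector<std::unique_ptr<Point>> points;
    points.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const int v = static_cast<int>(i);
        points.push_back(std::unique_ptr<Point>(new Point{v, v * 2}));
    }
    return points;
}
```

With the value layout, points[i].y is a single address computation. With the reference-like layout, points[i]->y first loads the pointer and then follows it to wherever that Point was allocated.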

What's the difference between an object and a struct in OOP?

Obviously you can blur the distinctions according to your programming style, but generally a struct is a structured piece of data. An object is a sovereign entity that can perform some sort of task. In most systems, objects have some state and as a result have some structured data behind them. However, one of the primary functions of a well-designed class is data hiding — exactly how a class achieves whatever it does is opaque and irrelevant.

Since classes can be used to represent classic data structures such as arrays, hash maps, trees, etc, you often see them as the individual things within a block of structured data.

An array is a block of unstructured data. In many programming languages, every separate thing in an array must be of the same basic type (such as every one being an integer number, every one being a string, or similar) but that isn't true in many other languages.

As guidelines:

  • use an array as a place to put a large group of things with no other inherent structure or hierarchy, such as "all receipts from January" or "everything I bought in Denmark"
  • use structured data to compound several discrete bits of data into a single block, such as you might want to combine an x position and a y position to describe a point
  • use an object where there's a particular actor or thing that thinks or acts for itself

The implicit purpose of an object is therefore directly to associate tasks with the data on which they can operate and to bundle that all together so that no other part of the system can interfere. Obeying proper object-oriented design principles may require discipline at first but will ultimately massively improve your code structure and hence your ability to tackle larger projects and to work with others.

Memory layout differences in structs

The C++ standard guarantees that memory layouts of a C struct and a C++ class (or struct -- same thing) will be identical, provided that the C++ class/struct fits the criteria of being POD ("Plain Old Data"). So what does POD mean?

A class or struct is POD if:

  • All data members are public and themselves POD or fundamental types (but not reference or pointer-to-member types), or arrays of such
  • It has no user-defined constructors, assignment operators or destructors
  • It has no virtual functions
  • It has no base classes

So yes, in your case the memory layout is the same.

Source: Structure of a C++ Object in Memory Vs a Struct

Overhead of Class vs Structure in C#?

I recommend never using a struct unless you have a very specific use case in mind and know exactly how the struct will benefit the system.

While C# structs do allow members, they work a good bit differently than classes (they can't be subtyped, have no virtual dispatch, and may live entirely on the stack), and the behavior changes depending upon lifting, etc. (Lifting is the process of promoting a value type to the heap -- surprise!)

So, to answer the question: I think one of the biggest misconceptions in C# is using structs "for performance". The reason is that "overhead" can't truly be measured without seeing how the type interacts with the rest of the system and what role, if any of note, it plays. This requires profiling and can't be summed up with such a trivial statement as "less overhead".

There are some good cases for struct value types -- one example is a composite RGB value stored in an array for an image. This is because the RGB type is small, there can be very many in an image, value types can be packed well in arrays, and may help to keep better memory locality, etc.


