Class Contiguous Data

That depends on your compiler.

You can use #pragma pack(1) with e.g. MSVC and gcc, or #pragma pack 1 with aCC.

For example, assuming MSVC/gcc:

#pragma pack(1)
class FourFloats
{
    float f1, f2, f3, f4;
};

Or better:

#pragma pack(push, 1)
class FourFloats
{
    float f1, f2, f3, f4;
};
#pragma pack(pop)

That basically disables padding and guarantees that the floats are contiguous. However, to ensure that the size of your class is actually 4 * sizeof(float), it must not have a vtbl, which means virtual members are off-limits.
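The size claim can be checked at compile time rather than taken on trust. The following is a minimal sketch, assuming a compiler that accepts #pragma pack(push, 1) (MSVC, gcc, and clang do). The names PackedFloats and is_tightly_packed are hypothetical, and a struct is used so that the members are public and offsetof can see them.

```cpp
#include <cstddef>  // offsetof

#pragma pack(push, 1)
struct PackedFloats  // hypothetical name; same layout idea as FourFloats
{
    float f1, f2, f3, f4;
};
#pragma pack(pop)

// Compilation fails if the class is larger than its four floats,
// for example because of padding or a hidden vtable pointer.
static_assert(sizeof(PackedFloats) == 4 * sizeof(float),
              "PackedFloats is larger than four floats");

// Each member should start exactly where the previous one ends.
static_assert(offsetof(PackedFloats, f2) == 1 * sizeof(float), "gap before f2");
static_assert(offsetof(PackedFloats, f3) == 2 * sizeof(float), "gap before f3");
static_assert(offsetof(PackedFloats, f4) == 3 * sizeof(float), "gap before f4");

// Runtime mirror of the checks above (hypothetical helper).
inline bool is_tightly_packed()
{
    return sizeof(PackedFloats) == 4 * sizeof(float)
        && offsetof(PackedFloats, f4) == 3 * sizeof(float);
}
```

Adding a virtual function to PackedFloats would be expected to trip the first static_assert on common implementations, which is the vtbl point made above.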

How to make contiguous data in class C++

From: C++ - Class contiguous data

You can use #pragma pack(1) with e.g. MSVC and gcc, or #pragma pack 1 with aCC.

As said in the answer, #pragma pack(1) disables padding and guarantees that the members are contiguous:

That basically disables padding and guarantees that the floats are contiguous. However, to ensure that the size of your class is actually 4 * sizeof(float), it must not have a vtbl, which means virtual members are off-limits.

See also the official C++ documentation: Implementation defined behavior control

The answer gives two ways:

#pragma pack(1)
class FourFloats
{
    float f1, f2, f3, f4;
};

and

#pragma pack(push, 1)
class FourFloats
{
    float f1, f2, f3, f4;
};
#pragma pack(pop)

C++ Contiguous memory access for object data members

  • The standard guarantees that they will be in that order in memory (if you took their addresses they would increment).

  • It does not guarantee that they will be in contiguous memory, but if that is the most optimal layout then they probably will be. The compiler is allowed to add padding between members (it usually does this to make access more efficient, sacrificing space for speed). If all the members are the same size, padding is unlikely.

Note: Introducing public/private/protected between them complicates things and may change the order.
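The two points above can be observed with offsetof. This is a rough sketch, not a statement about what the standard guarantees: the struct names Mixed and Uniform and the helper functions are hypothetical, and the gap sizes you see depend on your compiler and ABI.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Mixed    // hypothetical: members of different sizes
{
    char tag;
    int  value;
};

struct Uniform  // hypothetical: members all the same size
{
    int a;
    int b;
    int c;
};

// Declaration order is preserved: offsets increase from a to b to c.
inline bool order_preserved()
{
    return offsetof(Uniform, a) < offsetof(Uniform, b)
        && offsetof(Uniform, b) < offsetof(Uniform, c);
}

// Bytes of padding between the first and second member.
inline std::size_t gap_in_mixed()
{
    return offsetof(Mixed, value) - sizeof(char);
}

inline std::size_t gap_in_uniform()
{
    return offsetof(Uniform, b) - sizeof(int);
}
```

On mainstream implementations gap_in_uniform() is expected to be 0, while gap_in_mixed() is typically a few bytes so that value is aligned for int.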

Are you better off using an array?

That depends. Would you normally access them via an index or via the name? I would say that 99% of the time the first version you have is better, but I can imagine use cases where std::array<> could be useful (members accessed via an index).

It was suggested in the comments that std::vector<> could also be used. This is true, and the standard guarantees that its elements are in contiguous locations, but they may not be local to the object (in a std::array<> they are local to the object).
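The difference between "contiguous" and "local to the object" can be made visible by checking whether the element storage lies inside the owning object. A small sketch follows; WithArray, WithVector, and stored_inside are hypothetical names introduced only for this illustration.

```cpp
#include <array>
#include <cstdint>  // std::uintptr_t
#include <vector>

// True when address p lies within the bytes occupied by obj.
template <typename T>
bool stored_inside(const T& obj, const void* p)
{
    const auto begin = reinterpret_cast<std::uintptr_t>(&obj);
    const auto addr  = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < begin + sizeof(T);
}

struct WithArray   // elements live inside the object itself
{
    std::array<float, 4> values{};
};

struct WithVector  // elements live in a separate allocation
{
    std::vector<float> values = std::vector<float>(4);
};

inline bool array_is_local()
{
    WithArray a;
    return stored_inside(a, a.values.data());
}

inline bool vector_is_local()
{
    WithVector v;
    return stored_inside(v, v.values.data());
}
```

Both containers keep their elements contiguous. Only the std::array member keeps them within the enclosing object, so array_is_local() should return true and vector_is_local() false.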

Are members in a C++ class guaranteed to be contiguous?

Because your class is not a polymorphic type, has no base class, and all the members are public, the address of x is guaranteed to be the address of the class.

Also, the address of y is guaranteed to be after the address of x, although there could be an arbitrary amount of padding between them. So yes, your result is a coincidence.

If your class is polymorphic, i.e. has a virtual function somewhere in either it or a base class, or the members are protected or private, then all bets are off.

So in your case, (void*)&(this->x) is (void*)this, and the address of this->y must be higher than the address of this->x.

Finally, if you need the class members to be contiguous, and mutually reachable by pointer arithmetic, use

int x[2];

instead as the member.
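As a sketch of that suggestion, assuming a hypothetical class named Point with the array as its only member:

```cpp
struct Point  // hypothetical name for the class in question
{
    int x[2];  // array elements are adjacent by definition, with no padding between them
};

// Pointer arithmetic within the array is well defined.
inline int sum_via_pointer(const Point& p)
{
    const int* first = p.x;
    return *first + *(first + 1);
}

// The first member of a non-polymorphic class with no base class and
// only public members sits at the address of the object itself.
inline bool first_member_at_object_address(const Point& p)
{
    return static_cast<const void*>(&p) == static_cast<const void*>(p.x);
}
```

If named access is still wanted, small accessor functions returning x[0] and x[1] would be one way to keep it.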

Are class members guaranteed to be contiguous in memory?

It is guaranteed that they appear with increasing addresses in the order declared. This is true in general of data members without intervening access specifiers, so if there are other data members in the class then the only way they could intervene is if there are access specifiers in there.

I don't think it's guaranteed to be safe to modify padding bytes. I don't think it's guaranteed that the implementation won't put "something important" in between data members, although I can't immediately think of anything an implementation would want to put in there. Type information for a strangely-designed accurate-marking GC? Recognizable values to test for buffer overruns?

It's not guaranteed that all-bits-zero represents a null function pointer.

You could deal with the issue of the all-bits-zero representation using something like:

std::fill(&func1, &func4 + 1, (void(*)(void))0);

but that would still leave the issue of padding. You're guaranteed no padding in an array, but not (by the standard) in a class. The ABI used by your implementation might specify struct layout to the degree necessary to ensure that your class above is laid out the same as an array of 4 function pointers.

An alternative is to do the following:

struct function_pointers {
    void (*func1)();
    void (*func2)();
    void (*func3)();
    void (*func4)();
};

class C : private function_pointers
{
public:
    C() : function_pointers() {}
};

The initializer function_pointers() dictates that (since it doesn't have a user-declared constructor) the members of function_pointers are zero-initialized even if the instance of C itself is only default-initialized. function_pointers could be a data member rather than a base class, if you prefer to type a bit more to access func1 etc.

Note that C is now non-POD in C++03. In C++11 C remains standard-layout after this change, but would not be standard-layout if there were any data members defined in C, and it is not a trivial class. So if you were relying on POD/standard/trivial-ness then don't do this. Instead leave the definition of C as it is and use aggregate initialization (C c = {0};) to zero-initialize instances of C.
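The aggregate-initialization route can be sketched as follows. The original definition of C is not shown here, so this assumes it was an aggregate with four public function-pointer members; Handlers, all_null, and make_cleared are hypothetical names.

```cpp
struct Handlers  // hypothetical stand-in for the original aggregate C
{
    void (*func1)();
    void (*func2)();
    void (*func3)();
    void (*func4)();
};

// True when every member compares equal to the null pointer.
inline bool all_null(const Handlers& h)
{
    return h.func1 == nullptr && h.func2 == nullptr
        && h.func3 == nullptr && h.func4 == nullptr;
}

inline Handlers make_cleared()
{
    // func1 is initialized from the null pointer constant 0; the remaining
    // members have no initializer and are therefore value-initialized,
    // which for pointers means null.
    Handlers h = {0};
    return h;
}
```

This sets each member to a null pointer value through initialization, so it does not rely on all-bits-zero being the null representation.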

In Java, are all members of a class stored in contiguous memory?

But when studying Go in detail, I often read about this being a bad
idea unless you really need them to be pointers, because it means the
values for each member can be spread anywhere across dynamic memory,
hurting performance because it's less able to take advantage of
spatial locality in the CPU.

You have no choice in Java.

class MyClass {
    String firstMember;
    int secondMember;
}

The String-valued member is, and can only be, a reference (i.e., effectively a pointer). The int-valued member is a primitive value (i.e., not a pointer).

The Java world is divided into primitive values and objects (of some class, or arrays, and so on). Variables for the former types are not references, variables for the latter types are references.

The Java Language Specification does not talk about object layout at all; that's not a concept that appears in the language.

The JVM Specification specifically says

The Java Virtual Machine does not mandate any particular internal
structure for objects.

Pragmatically, you might guess that the body of a class instance is a single piece of memory, but that still leaves open the questions of alignment, padding, and ordering of members (no reason to keep source-code order that I can see, and some reasons to reorder).

What is the meaning of contiguous memory in C++?

It means that memory is allocated as a single chunk. This is most often used when talking about containers.

For instance, the vector and string classes use a contiguous chunk of memory. This means that if you have a vector that contains the int elements 123, 456, 789, then you can be assured that if you get the pointer to the first element of the vector, by incrementing this pointer, you'll access the second element (456), and by incrementing it again you'll access the last element (789).

std::vector<int> vec = {123, 456, 789};

int* ptr = &vec[0];
*ptr++ == 123; // is true
*ptr++ == 456; // is true
*ptr++ == 789; // is true

The deque class, on the other hand, does not guarantee contiguous storage. This means that if you have a deque that contains the same elements (123, 456, 789), and that you get a pointer to the first element, you cannot be certain that you'll access the second element by incrementing the pointer, or the third by incrementing it again.

std::deque<int> deque = {123, 456, 789};

int* ptr = &deque[0];
*ptr++ == 123; // true
*ptr++ == 456; // not necessarily true and potentially dangerous
*ptr++ == 789; // not necessarily true and potentially dangerous

Another example of a non-contiguous data structure would be the linked list. With a linked list, it's almost unthinkable that incrementing the head pointer could return the second element.

Contiguity is rarely relevant if you follow good C++ practice and use iterators rather than pointers wherever you can, because iterators let each collection manage how it stores its items without you having to know how. You usually need memory to be contiguous when you call C code from your C++ code, as most C functions were designed to work with contiguous memory because that's the simplest way to do it.
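A minimal sketch of the C interop case: c_style_sum below is a hypothetical stand-in for a C function that takes a pointer and a length, and sum_vector is a hypothetical wrapper that hands it the vector's contiguous storage.

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Hypothetical stand-in for a C API: it only understands
// "pointer to first element plus element count".
inline long c_style_sum(const int* data, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}

// Works because std::vector stores its elements contiguously,
// so data() and size() describe the whole sequence.
inline long sum_vector(const std::vector<int>& v)
{
    return c_style_sum(v.data(), v.size());
}
```

The same call would not be valid for a std::deque or a std::list, since neither offers a single pointer that covers all elements.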

How does std::vector support contiguous memory for custom objects of unknown size

Your concept of size is flawed. A std::vector<type> has a size known at compile time, which is the space the vector object itself takes up. It also has a run-time size for the elements it holds (this storage is allocated at run time and the vector holds a pointer to it). You can picture it laid out like

+--------+
|        |
| Vector |
|        |
|        |
+----+---+
     |
     |
     v
+---------+---------+---------+---------+---------+
|         |         |         |         |         |
| Element | Element | Element | Element | Element |
|         |         |         |         |         |
+---------+---------+---------+---------+---------+

So when you have a vector of things that have a vector in them, each Element becomes a vector, and each of those points off to its own storage somewhere else, like

+--------+
|        |
| Vector |
|        |
|        |
+----+---+
     |
     |
     v
+----+----+---------+---------+
| Object  | Object  | Object  |
|  with   |  with   |  with   |
| Vector  | Vector  | Vector  |
+----+----+----+----+----+----+
     |         |         |    +---------+---------+---------+---------+---------+
     |         |         |    |         |         |         |         |         |
     |         |         +--->+ Element | Element | Element | Element | Element |
     |         |              |         |         |         |         |         |
     |         |              +---------+---------+---------+---------+---------+
     |         |    +---------+---------+---------+---------+---------+
     |         |    |         |         |         |         |         |
     |         +--->+ Element | Element | Element | Element | Element |
     |              |         |         |         |         |         |
     |              +---------+---------+---------+---------+---------+
     |    +---------+---------+---------+---------+---------+
     |    |         |         |         |         |         |
     +--->+ Element | Element | Element | Element | Element |
          |         |         |         |         |         |
          +---------+---------+---------+---------+---------+

This way all of the vectors are next to each other, but the elements the vectors have can be anywhere else in memory. It is for this reason you don't want to use a std::vector<std::vector<int>> for a matrix. Each sub-vector gets its memory wherever the allocator happens to put it, so there is no locality between the rows.


Do note that this applies to all of the allocator-aware containers, as they do not store the elements inside the container directly. This is not true for std::array as, like a raw array, the elements are part of the container. If you have a std::array<int, 20> then it is at least sizeof(int) * 20 bytes in size.
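One common alternative for a matrix is a single vector indexed in row-major order, so that all rows share one allocation. The class below is a minimal sketch under that assumption; FlatMatrix and its member names are hypothetical and not taken from the answer above.

```cpp
#include <array>
#include <cstddef>  // std::size_t
#include <vector>

class FlatMatrix  // hypothetical: rows x cols ints in one contiguous block
{
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // Row-major indexing: row r starts at offset r * cols_.
    int& at(std::size_t r, std::size_t c) { return cells_[r * cols_ + c]; }

    // Pointer to the first element of row r.
    const int* row(std::size_t r) const { return cells_.data() + r * cols_; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_;  // the only allocation, shared by every row
};

// The std::array size claim from the text, checked at compile time.
static_assert(sizeof(std::array<int, 20>) >= sizeof(int) * 20,
              "std::array must be large enough to hold its elements");
```

Here consecutive rows are exactly cols() elements apart, so walking the matrix row by row touches memory in order.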


