How to Initialize std::vector Over Already Allocated Memory

Is it possible to initialize std::vector over already allocated memory?

Yes. Containers in the standard library usually take an allocator, so you can supply one that hands out your existing buffer. With C++11's allocator traits it is very easy to write an allocator, because you don't have to provide every member yourself. With an older version of C++, however, you need to implement each member and do the rebinding as well!

For Pre-C++11, you can use the following:

#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <iostream>
#include <iterator>
#include <new>       // placement new
#include <vector>

template<typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;

public:
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;

PreAllocator(T* memory_ptr, std::size_t memory_size) throw() : memory_ptr(memory_ptr), memory_size(memory_size) {};
PreAllocator (const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator (const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) {return *this;}
PreAllocator<T>& operator = (const PreAllocator& other) {return *this;}
~PreAllocator() {}

pointer address (reference value) const {return &value;}
const_pointer address (const_reference value) const {return &value;}

pointer allocate (size_type n, const void* hint = 0) {return memory_ptr;}
void deallocate (T* ptr, size_type n) {}

void construct (pointer ptr, const T& val) {new (ptr) T (val);}

template<typename U>
void destroy (U* ptr) {ptr->~U();}
void destroy (pointer ptr) {ptr->~T();}

size_type max_size() const {return memory_size;}

template<typename U>
struct rebind
{
typedef PreAllocator<U> other;
};
};

int main()
{
int my_arr[100] = {0};
std::vector<int, PreAllocator<int> > my_vec(PreAllocator<int>(&my_arr[0], 100));
my_vec.push_back(1024);
std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";

int* my_heap_ptr = new int[100]();
std::vector<int, PreAllocator<int> > my_heap_vec(PreAllocator<int>(&my_heap_ptr[0], 100));
my_heap_vec.push_back(1024);
std::cout<<"My_Heap_Vec[0]: "<<my_heap_vec[0]<<"\n";
std::cout<<"My_Heap_Ptr[0]: "<<my_heap_ptr[0]<<"\n";

delete[] my_heap_ptr;
my_heap_ptr = NULL;
}

For C++11, you can use the following:

#include <cstddef>
#include <iterator>
#include <vector>
#include <iostream>

template <typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;

public:
typedef std::size_t size_type;
typedef T* pointer;
typedef T value_type;

PreAllocator(T* memory_ptr, std::size_t memory_size) : memory_ptr(memory_ptr), memory_size(memory_size) {}

PreAllocator(const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator(const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) { return *this; }
PreAllocator<T>& operator = (const PreAllocator& other) { return *this; }
~PreAllocator() {}

pointer allocate(size_type n, const void* hint = 0) {return memory_ptr;}
void deallocate(T* ptr, size_type n) {}

size_type max_size() const {return memory_size;}
};

int main()
{
int my_arr[100] = {0};
std::vector<int, PreAllocator<int>> my_vec(0, PreAllocator<int>(&my_arr[0], 100));
my_vec.push_back(1024);
std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";

int* my_heap_ptr = new int[100]();
std::vector<int, PreAllocator<int>> my_heap_vec(0, PreAllocator<int>(&my_heap_ptr[0], 100));
my_heap_vec.push_back(1024);
std::cout<<"My_Heap_Vec[0]: "<<my_heap_vec[0]<<"\n";
std::cout<<"My_Heap_Ptr[0]: "<<my_heap_ptr[0]<<"\n";

delete[] my_heap_ptr;
my_heap_ptr = nullptr;
}

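Notice the difference between the two allocators: the C++11 version drops address, construct, destroy, rebind, and most of the typedefs, because std::allocator_traits supplies defaults for them. Both versions work with heap buffers and stack buffers. The pre-C++11 version is the more portable choice, because it also compiles with older standard libraries whose containers (e.g. std::list) do not go through allocator_traits. Keep in mind that allocate() returns the start of the buffer on every call, so the allocator only suits containers that hold a single contiguous block at a time, such as std::vector.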

You can place the allocator in a header and reuse it in any project. It is useful when you want a container to live in shared memory or in any other buffer that is already allocated.

WARNING:
DO NOT use the same buffer for multiple containers at the same time! A buffer can be reused but just make sure no two containers use it at the same time.

Example:

int my_arr[100] = {0};
std::vector<int, PreAllocator<int> > my_vec(PreAllocator<int>(&my_arr[0], 100));
std::vector<int, PreAllocator<int> > my_vec2(PreAllocator<int>(&my_arr[0], 100));

my_vec.push_back(1024);
my_vec2.push_back(2048);

std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";

Both lines print 2048! Why? Because the second vector overwrote the value written by the first one, since they share the same buffer.

Making a C++ vector that points to already allocated memory

A vector cannot adopt memory that was allocated elsewhere and keep its existing contents.

A working approach:

  • Don't use malloc at all.
  • Create a vector with default allocator, with necessary size.
  • Load the binary file directly into the vector.
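The steps above can be sketched roughly as follows. This is a minimal sketch, not the original poster's code: the helper name load_file and the choice to return an empty vector on failure are assumptions made for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at `path` into a vector of bytes.
// Returns an empty vector if the file cannot be opened or read.
std::vector<char> load_file(const std::string& path)
{
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in) return std::vector<char>();

    const std::streamsize size = in.tellg();
    if (size <= 0) return std::vector<char>();
    in.seekg(0, std::ios::beg);

    // One allocation, owned and released by the vector. No malloc involved.
    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (!in.read(buffer.data(), size))
        buffer.clear();
    return buffer;
}
```

The vector's storage is the read target, so there is no second buffer and no copy after the read.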

In a hypothetical case where you cannot touch the allocation part because it is somewhere deep in the library: Just don't use a vector. You already have a dynamic array. Iterator based algorithms work just fine with pointers. For range based algorithms, you need something like std::span (C++20) or similar.
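To illustrate the point about pointers, here is a small sketch that runs standard algorithms directly over a raw buffer. The function name sort_and_sum and its parameters are made up for the example; data and count stand in for whatever the library hands back.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>

// Sorts a raw buffer in place and returns the sum of its elements.
// No vector is involved: a pointer pair is a valid random-access range.
long sort_and_sum(int* data, std::size_t count)
{
    int* first = data;
    int* last = data + count;   // one past the last element
    std::sort(first, last);
    return std::accumulate(first, last, 0L);
}
```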

But using vector for the allocation would be safer and thus better.

If your files are up to 10 GB, then I would suggest trying out memory mapping the file instead. Mapped memory also cannot be used as storage of a vector, so the approach of not using a vector should be taken. Unfortunately though, there is no standard way to memory map files.

memset of allocated memory after std::vector::reserve

  1. is it safe to 0-initialize the memory using memset after reserve?

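It may appear to work, but you'd better not. reserve() only changes the capacity; size() stays the same, so the elements do not exist yet, and accessing a nonexistent element through [] is UB.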
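A small sketch of the difference, using int instead of ChunkT to keep it short (the function name make_zeroed is made up for the example): resize() creates the elements and value-initializes them, so neither memset nor out-of-range indexing is needed.

```cpp
#include <cstddef>
#include <vector>

// Returns `count` zeroed ints without memset or out-of-range indexing.
std::vector<int> make_zeroed(std::size_t count)
{
    std::vector<int> v;
    v.reserve(count);   // capacity >= count, but size() is still 0: v[0] would be UB here
    v.resize(count);    // now size() == count and every element is value-initialized to 0
    return v;
}
```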


  2. is it guaranteed that I still have 0-initialized memory in the example of ChunkT being struct {size_t keys[512]; size_t values[512];}; after fetching my chunk with ChunkT* newChunk = &myChunks.emplace_back()?

Yes. In your situation, emplace_back() with no arguments constructs a ChunkT via placement new with value-initialization, and value-initializing a POD class like this one zero-initializes it.
ref: POD class initialized with placement new default initialized?

So you don't need to memset the allocated memory to zero. Please correct me if I am wrong.
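One way to check this is to dirty the vector's storage first and then emplace into it. This is only a sketch: the function name is made up, and it uses back() instead of the reference returned by emplace_back() so that it also compiles before C++17.

```cpp
#include <cstddef>
#include <vector>

struct ChunkT {
    std::size_t keys[512];
    std::size_t values[512];
};

// Returns true if a chunk created by emplace_back() in previously
// used (non-zero) storage comes out all zeros.
bool emplaced_chunk_is_zeroed()
{
    std::vector<ChunkT> chunks(1);
    for (std::size_t i = 0; i < 512; ++i) {   // dirty the storage
        chunks[0].keys[i] = 0xFFu;
        chunks[0].values[i] = 0xFFu;
    }
    chunks.clear();          // size 0, capacity (and the dirty bytes) kept
    chunks.emplace_back();   // value-initializes a new element in that storage

    const ChunkT& c = chunks.back();
    for (std::size_t i = 0; i < 512; ++i) {
        if (c.keys[i] != 0 || c.values[i] != 0) return false;
    }
    return true;
}
```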

Is it possible to initialize new std::vector in one line?

I just wonder if it is possible to new and initialize a std::vector at
the same time, something like doing the two things in one line?

Yes, you can, via the std::initializer_list constructor of std::vector:

constexpr vector( std::initializer_list<T> init,
const Allocator& alloc = Allocator() ); (since C++20)

With it, you can write

std::vector<int>* vec = new std::vector<int>{3, 4};


Because I need a vector that is created on the heap!

The terms we use in C++ are automatic and dynamic storage. In most cases you do not need the std::vector<int> object itself to be allocated dynamically; only its elements need to be, and the vector already takes care of that. For this, you simply need a vector of integers.

std::vector<int> vec {3, 4};

However, if what you want is a multidimensional vector, then I suggest a vector of vectors of ints:

std::vector<std::vector<int>> vec{ {3, 4} };

When all inner vectors have the same length, keep a single std::vector and compute the index yourself so that it acts as a two-dimensional array.
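A minimal sketch of that index manipulation, assuming row-major order. The Grid class and its member names are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: a rows x cols grid stored in one contiguous vector.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : cols_(cols), cells_(rows * cols, 0) {}

    // Row-major mapping: element (row, col) lives at index row * cols + col.
    int& at(std::size_t row, std::size_t col) { return cells_[row * cols_ + col]; }
    const int& at(std::size_t row, std::size_t col) const { return cells_[row * cols_ + col]; }

    std::size_t size() const { return cells_.size(); }

private:
    std::size_t cols_;
    std::vector<int> cells_;
};
```

Compared with a vector of vectors, this keeps all elements in one allocation.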

In both cases, the std::vector in the background does the memory management for you.

Is it possible? std::vector<double> my_vec(sz); which is allocated but not initialized or filled

HOORAY! Richard Critten to the rescue! His comment under the question leads directly to the answer.

The zero-spewing culprit is the default allocator template, namely std::allocator. So we replace it, or modify it with an allocator adapter.

I tidied up the code a little, and expanded the comments. Bill, please feel free to post a more comprehensive answer. But the following does the trick very nicely.

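#include <memory>       // std::allocator, std::allocator_traits
#include <new>          // placement new
#include <type_traits>  // std::is_nothrow_default_constructible
#include <utility>      // std::forward

// Allocator adapter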
// Given an allocator A, (std::allocator by default), this adapter
// will, when feasible, override A::construct() with a version that
// employs default construction rather than value-initialization.
// "Feasible" means the object (U *ptr) is default-constructable and
// the default constructor cannot throw exceptions.
//
// Thus it thwarts gratuitous initializations to zeros or whatever.

template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
typedef std::allocator_traits<A> a_t;
public:
// http://en.cppreference.com/w/cpp/language/using_declaration
using A::A; // Inherit constructors from A

template <typename U> struct rebind {
using other =
default_init_allocator
< U, typename a_t::template rebind_alloc<U> >;
};

template <typename U>
void construct(U* ptr)
noexcept(std::is_nothrow_default_constructible<U>::value) {
::new(static_cast<void*>(ptr)) U;
}

template <typename U, typename...Args>
void construct(U* ptr, Args&&... args) {
a_t::construct(static_cast<A&>(*this),
ptr, std::forward<Args>(args)...);
}
};
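For completeness, here is how the adapter might be used. The adapter is repeated so that the example compiles on its own; the alias raw_vector and the function fill_and_sum are hypothetical names. Note that the elements hold indeterminate values until they are written, so the sketch assigns every element before reading any.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

// Same adapter as above, repeated to keep this example self-contained.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    typedef std::allocator_traits<A> a_t;
public:
    using A::A;

    template <typename U> struct rebind {
        using other =
            default_init_allocator<U, typename a_t::template rebind_alloc<U>>;
    };

    // No arguments: default-initialize (no zeroing for trivial types).
    template <typename U>
    void construct(U* ptr)
        noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;
    }

    // With arguments: forward to the wrapped allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        a_t::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};

// Hypothetical alias for readability.
typedef std::vector<double, default_init_allocator<double>> raw_vector;

// Allocates `sz` doubles without zero-filling them, writes 1, 2, ..., sz
// into them, and returns their sum.
double fill_and_sum(std::size_t sz)
{
    raw_vector my_vec(sz);                  // allocated, but not filled
    for (std::size_t i = 0; i < sz; ++i)
        my_vec[i] = static_cast<double>(i + 1);
    return std::accumulate(my_vec.begin(), my_vec.end(), 0.0);
}
```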

Initialize std::vector with given array without std::allocator

Magic

This answer is magic: it depends on the implementation of the compiler's standard library.

We can forcibly access the internal storage of a vector.

Take g++ as an example. It uses three protected pointers, _M_start, _M_finish, and _M_end_of_storage, to handle storage. So we can create a derived class that sets the pointers to the return value of bar() in the constructor and resets them in the destructor, so that the vector never tries to free memory it does not own.

Example code for g++:

static_assert(__GNUC__ == 7 && __GNUC_MINOR__ == 5 && __GNUC_PATCHLEVEL__ == 0);

class Dmy: public std::vector<int>
{
public:
Dmy(int *b, int *e)
{
_M_impl._M_start = b;
_M_impl._M_finish = e;
_M_impl._M_end_of_storage = _M_impl._M_finish;
}

~Dmy()
{
_M_impl._M_start = 0;
_M_impl._M_finish = 0;
_M_impl._M_end_of_storage = 0;
}
};

foo(Dmy(data, end_of_data));

