How Is Push_Back Implemented in Stl Vector

Implementing push_back(T&& c) in custom VectorT class

std::vector would normally be implemented by allocating memory via operator new or some allocator passed to the vector class and constructing new objects in that storage. Here some demonstration for the case of calling operator new directly without use of an allocator. (The real std::vector always uses an allocator, std::allocator by default.):

_vector_data = static_cast<T*>(::operator new(sizeof(T)*new_capacity));

This allocates memory, but does not explicitly create any objects. (Note that this is memory is only correctly aligned for non-overaligned types, but that should be an acceptable limitation.)

Then to place a new object into that storage, you would use a non-allocating placement-new:

template <typename T>
void Vector<T>::push_back(T&& c)
{
    ::new(static_cast<void*>(_vector_data+_vector_size)) T(std::move(c));
    ++_vector_size;
}

(only that you need to verify that the capacity is sufficient first)

If you don't care about some edge cases relating to operator new class overloads, you can use

new(_vector_data+_vector_size) T(std::move(c));

instead of

::new(static_cast<void*>(_vector_data+_vector_size)) T(std::move(c));

and

operator new

instead of

::operator new

In order to remove elements you would then call their destructors explicitly with e.g.

(_vector_data+_vector_size)->~T();

and you would free the memory by calling the operator delete matching the form you used for operator new.

This technically has undefined behavior before C++20, because pointer arithmetic as in _vector_data+_vector_size is only allowed if _vector_data point to an element of an array, but we never created an array. Since C++20 this array is created implicitly.

There is no way to implement a std::vector that has the proper array semantics on pointers to its elements with the requirements you have given before C++20 without relying on semantics that are undefined behavior according to the standard, although this undefined behavior is more of a technicality and would probably never cause issues in practice.

why std::vector::push_back much slower than a manual implementation?

The difference is mainly due to less favorable codegen in the std::vector version. We can see it if we compare the generated assembly (godbolt link).

Your loop (skipping the reallocation part):

$LL4@test:
        xor     esi, esi
        xor     edi, edi
$LL7@test:
        mov     r15, rbx
        cmp     rdi, rbx
        jne     SHORT $LN27@test
        <...skip...>
$LN27@test:
        mov     DWORD PTR [r14+rdi*4], esi
        inc     rdi
        inc     esi
        cmp     esi, 100000                         ; 000186a0H
        jl      $LL7@test
        sub     r12, r13
        jne     $LL4@test

std::vector::push_back loop (again, skipping the reallocation part):

$LL4@test:
        xor     ebx, ebx
        mov     DWORD PTR i$1[rsp], ebx
$LL7@test:
        cmp     rcx, QWORD PTR v$[rsp+16]
        je      SHORT $LN26@test
        mov     DWORD PTR [rcx], ebx
        mov     rcx, QWORD PTR v$[rsp+8]
        add     rcx, 4
        mov     QWORD PTR v$[rsp+8], rcx
        jmp     SHORT $LN5@test
$LN26@test:
        <...skip...>
$LN5@test:
        inc     ebx
        mov     DWORD PTR i$1[rsp], ebx
        cmp     ebx, 100000                         ; 000186a0H
        jl      SHORT $LL7@test
        mov     rcx, QWORD PTR v$[rsp]
        mov     QWORD PTR v$[rsp+8], rcx
        sub     rdi, 1
        jne     SHORT $LL4@test

Clearly we can see more code (11 vs. 8 instructions in the hot path) and more indirection to memory (5 vs. 1). So it's no surprise that it's slower.

More generally, more complex code == slower code.

Could the two versions be optimized identically? I see no reason why not. MSVC 19.28 just can't do it.

Vector push_back move implementation

x is an rvalue reference, but the rule of thumb you must follow is: if it has a name, it's an lvalue. Therefore you must apply std::move to convert its type to an rvalue. If you left out the std::move then x would be copied instead of moved into its destination. More information can be found in Rvalue References Explained.

Details of C++ Vector push_back()

std::vector stores its elements in an array. An array always has fixed size, so if you keep adding elements to a std::vector, its underlying array will eventually fill up. When the array is full and you add another element (via push_back or another member function that adds new elements), it must:

Create a new, larger array,
Copy or move^(*) the elements from the old array to the new array,
Insert the new element into the new array, and
Destroy the old array

This process is called reallocation. A correct implementation of std::vector should resize the array exponentially. The Visual C++ std::vector implementation uses a growth factor of 1.5x; other implementations may use a different growth factor.

^{(*) C++11 adds support for moving objects.}

Is it possible to implement two push_back(..) methods of std::vector in a single universal reference method?

The second version is not equivalent to the first one.

In the first you take an argument of type "reference to T", and in the second U can be of any type, not related to T.

So the second version is actually more an emplace_back than a push_back. It will take any value and try to construct an instance of T using that (though normally emplace_back takes a pack of arguments, for more flexibility).

You could restrict U to a type compatible with T using SFINAE, but then the code might become more complicated than just having two separate overloads.

template<class T>
template<class U, typename std::enable_if_t<std::is_convertible_v<std::decay_t<U>, T>, int> = 0>
void Vector<T>::push_back(U&& value)
{
    // . . .

stdlibc++ for example has two separate overloads for vector::push_back, but there push_back(T&&) simply delegates to emplace_back. So maybe you could do the same.