Why Emplace_Back Is Faster Than Push_Back

Why emplace_back is faster than push_back?

Your test case isn't very helpful. push_back takes a container element and copies/moves it into the container. emplace_back takes arbitrary arguments and constructs from those a new container element. But if you pass a single argument that's already of element type to emplace_back, you'll just use the copy/move constructor anyway.

Here's a better comparison:

Foo x; Bar y; Zip z;

v.push_back(T(x, y, z));  // make temporary, push it back
v.emplace_back(x, y, z);  // no temporary, directly construct T(x, y, z) in place
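
To make the temporary visible, here is a minimal sketch that counts constructor calls. Tracked, Counts, and the two count_* functions are hypothetical names invented for this illustration, and the counts assume the vector has already reserved space so that reallocation does not add extra moves.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how it gets constructed.
struct Tracked {
    static int ctor_calls;  // calls to Tracked(int, std::string)
    static int copy_calls;  // calls to the copy constructor
    static int move_calls;  // calls to the move constructor

    int id;
    std::string name;

    Tracked(int i, std::string n) : id(i), name(std::move(n)) { ++ctor_calls; }
    Tracked(const Tracked& o) : id(o.id), name(o.name) { ++copy_calls; }
    Tracked(Tracked&& o) noexcept : id(o.id), name(std::move(o.name)) { ++move_calls; }
};
int Tracked::ctor_calls = 0;
int Tracked::copy_calls = 0;
int Tracked::move_calls = 0;

struct Counts { int ctor; int copy; int move; };

static void reset_counts() {
    Tracked::ctor_calls = Tracked::copy_calls = Tracked::move_calls = 0;
}

Counts count_push_back() {
    reset_counts();
    std::vector<Tracked> v;
    v.reserve(1);                  // rule out reallocation noise
    v.push_back(Tracked(1, "a"));  // build a temporary, then move it in
    assert(v.size() == 1);
    return {Tracked::ctor_calls, Tracked::copy_calls, Tracked::move_calls};
}

Counts count_emplace_back() {
    reset_counts();
    std::vector<Tracked> v;
    v.reserve(1);
    v.emplace_back(1, "a");        // construct directly in the vector's storage
    assert(v.size() == 1);
    return {Tracked::ctor_calls, Tracked::copy_calls, Tracked::move_calls};
}
```

Calling both functions should show one regular construction each, with push_back adding a single move on top. For a cheap-to-move type that extra move rarely matters.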

The key difference, however, is that emplace_back performs explicit conversions:

std::vector<std::unique_ptr<Foo>> v;
v.emplace_back(new Foo(1, 'x', true));  // constructor is explicit!

This example is mildly contrived now that C++14 provides std::make_unique, in which case you should say v.push_back(std::make_unique<Foo>(1, 'x', true)). However, other constructions are very nice with emplace, too:

std::vector<std::thread> threads;
threads.emplace_back(do_work, 10, "foo");    // call do_work(10, "foo")
threads.emplace_back(&Foo::g, x, 20, false);  // call x.g(20, false)

Why would I ever use push_back instead of emplace_back?

I have thought about this question quite a bit over the past four years. I have come to the conclusion that most explanations about push_back vs. emplace_back miss the full picture.

Last year, I gave a presentation at C++Now on Type Deduction in C++14. I start talking about push_back vs. emplace_back at 13:49, but there is useful information that provides some supporting evidence prior to that.

The real primary difference has to do with implicit vs. explicit constructors. Consider the case where we have a single argument that we want to pass to push_back or emplace_back.

std::vector<T> v;
v.push_back(x);
v.emplace_back(x);

After your optimizing compiler gets its hands on this, there is no difference between these two statements in terms of generated code. The traditional wisdom is that push_back will construct a temporary object, which will then get moved into v whereas emplace_back will forward the argument along and construct it directly in place with no copies or moves. This may be true based on the code as written in standard libraries, but it makes the mistaken assumption that the optimizing compiler's job is to generate the code you wrote. The optimizing compiler's job is actually to generate the code you would have written if you were an expert on platform-specific optimizations and did not care about maintainability, just performance.

The actual difference between these two statements is that the more powerful emplace_back will call any type of constructor out there, whereas the more cautious push_back will call only constructors that are implicit. Implicit constructors are supposed to be safe. If you can implicitly construct a U from a T, you are saying that U can hold all of the information in T with no loss. It is safe in pretty much any situation to pass a T and no one will mind if you make it a U instead. A good example of an implicit constructor is the conversion from std::uint32_t to std::uint64_t. A bad example of an implicit conversion is double to std::uint8_t.

We want to be cautious in our programming. We do not want to use powerful features because the more powerful the feature, the easier it is to accidentally do something incorrect or unexpected. If you intend to call explicit constructors, then you need the power of emplace_back. If you want to call only implicit constructors, stick with the safety of push_back.
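
As a rough sketch of that rule, consider two made-up types: Meters has an explicit constructor and Label has an implicit one. Both types and the function name sum_lengths are assumptions for illustration only. The static_asserts use std::is_convertible (what push_back effectively needs) and std::is_constructible (what emplace_back needs).

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical type with an explicit constructor: a bare double is not a length.
struct Meters {
    explicit Meters(double v) : value(v) {}
    double value;
};

// Hypothetical type with an implicit constructor: any C string is a fine label.
struct Label {
    Label(const char* t) : text(t) {}
    const char* text;
};

// push_back(x) needs an implicit conversion from x to the element type.
static_assert(!std::is_convertible<double, Meters>::value,
              "double does not implicitly convert to Meters");
static_assert(std::is_convertible<const char*, Label>::value,
              "const char* implicitly converts to Label");
// emplace_back(x) only needs the element type to be constructible from x.
static_assert(std::is_constructible<Meters, double>::value,
              "Meters can be constructed explicitly from double");

double sum_lengths() {
    std::vector<Meters> lengths;
    lengths.emplace_back(2.5);       // compiles: calls the explicit constructor
    // lengths.push_back(2.5);       // would not compile: no implicit conversion
    lengths.push_back(Meters(4.0));  // the conversion is spelled out at the call site

    std::vector<Label> labels;
    labels.push_back("implicit");    // fine: the constructor is implicit
    labels.emplace_back("also fine");
    assert(labels.size() == 2);

    return lengths[0].value + lengths[1].value;
}
```

With push_back, the explicit conversion has to appear in the source, which is exactly the safety net described above.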

An example

std::vector<std::unique_ptr<T>> v;
T a;
v.emplace_back(std::addressof(a)); // compiles
v.push_back(std::addressof(a)); // fails to compile

std::unique_ptr<T> has an explicit constructor from T *. Because emplace_back can call explicit constructors, passing a non-owning pointer compiles just fine. However, when v goes out of scope, the destructor will attempt to call delete on that pointer, which was not allocated by new because it is just a stack object. This leads to undefined behavior.
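
The same asymmetry can be checked at compile time. The following is only a sketch: Widget and owned_sum are hypothetical names, and std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

struct Widget { int value = 7; };  // hypothetical stand-in for T

// emplace_back only needs the element to be constructible from the argument,
// so the explicit unique_ptr(T*) constructor qualifies.
static_assert(std::is_constructible<std::unique_ptr<Widget>, Widget*>::value,
              "unique_ptr<Widget> can be constructed explicitly from Widget*");
// push_back needs an implicit conversion, which unique_ptr deliberately lacks.
static_assert(!std::is_convertible<Widget*, std::unique_ptr<Widget>>::value,
              "Widget* does not implicitly convert to unique_ptr<Widget>");

int owned_sum() {
    std::vector<std::unique_ptr<Widget>> v;
    v.push_back(std::make_unique<Widget>());     // ownership is visible at the call site
    v.emplace_back(std::make_unique<Widget>());  // also fine: moves from an owning pointer
    assert(v.size() == 2);

    int sum = 0;
    for (const auto& p : v) sum += p->value;
    return sum;  // every Widget is deleted exactly once when v is destroyed
}
```

Passing an already-owning std::unique_ptr works with either function; the danger is only in handing a raw pointer to emplace_back.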

This is not just invented code. This was a real production bug I encountered. The code was std::vector<T *>, but it owned the contents. As part of the migration to C++11, I correctly changed T * to std::unique_ptr<T> to indicate that the vector owned its memory. However, I was basing these changes off my understanding in 2012, during which I thought "emplace_back does everything push_back can do and more, so why would I ever use push_back?", so I also changed the push_back to emplace_back.

Had I instead left the code as using the safer push_back, I would have instantly caught this long-standing bug and it would have been viewed as a success of upgrading to C++11. Instead, I masked the bug and didn't find it until months later.

push_back is more efficient than emplace_back?

push_back is not more efficient, and the results you observe are due to the vector resizing itself.

When you call emplace after push_back, the vector has to resize itself to make room for the second element. This means that it has to move the A that was originally inside the vector, making emplace appear more complex.
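
The relocation cost can be isolated with a small sketch like the one below. MoveCounter and moves_for_two_inserts are hypothetical names. How many moves happen without reserve depends on the library's growth policy, so only the reserved case is something you can rely on.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts move constructions.
struct MoveCounter {
    static int moves;
    MoveCounter() = default;
    MoveCounter(const MoveCounter&) = default;
    MoveCounter(MoveCounter&&) noexcept { ++moves; }
};
int MoveCounter::moves = 0;

// Inserts two elements in place and reports how many moves the vector performed.
int moves_for_two_inserts(bool reserve_first) {
    MoveCounter::moves = 0;
    std::vector<MoveCounter> v;
    if (reserve_first) v.reserve(2);  // room for both elements up front
    v.emplace_back();                 // constructed in place, no move
    v.emplace_back();                 // without reserve, this may reallocate and
                                      // move the first element to the new buffer
    assert(v.size() == 2);
    return MoveCounter::moves;
}
```

With reserve_first set to true the result is 0. Without it, common implementations report one move, caused by the reallocation rather than by emplace_back itself.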

If you reserve enough space in the vector beforehand, this doesn't happen. Notice the call to va.reserve(2) after va's creation:

#include <iostream>     
#include <vector>    

class A
{
    public:
    A() {std::cout << "A const" << std::endl;}
    ~A() {std::cout << "A dest" << std::endl;}
    A(const A& a) {std::cout << "A copy const" << std::endl;}
    A(A&& a) {std::cout << "A move const" << std::endl;}
    A& operator=(const A& a) {std::cout << "A copy operator=" << std::endl; return *this; }
    A& operator=(A&& a) {std::cout << "A move operator=" << std::endl;  return *this; }
};

int main () {
    std::vector<A> va;
    // Now there's enough room for two elements
    va.reserve(2);
    std::cout <<"push:" << std::endl;
    va.push_back(A());
    std::cout <<std::endl<< "emplace:" << std::endl;
    va.emplace_back(A());

    std::cout <<std::endl<< "end:" << std::endl;

    return 0;
}

The corresponding output is:

push:
A const
A move const
A dest

emplace:
A const
A move const
A dest

end:
A dest
A dest

Can we make things even more efficient? Yes! emplace_back takes whatever arguments you provide it, and forwards them to A's constructor. Because A has a constructor that takes no arguments, you can also use emplace_back with no arguments! In other words, we change

va.emplace_back(A());

to

va.emplace_back(); // No arguments necessary since A is default-constructed

This results in no copy, and no move:

push:
A const
A move const
A dest

emplace:
A const

end:
A dest
A dest

A note on vectors resizing: It's important to note that the implementation of std::vector is smart. If A had been a trivially copyable type, std::vector might have been able to resize in place without additional copying, using a system function similar to realloc. However, because A's constructors and destructor contain code, realloc can't be used here.

push_back vs emplace_back

In addition to what visitor said:

The function void emplace_back(Type&& _Val) provided by MSVC10 is non-conforming and redundant because, as you noted, it is strictly equivalent to push_back(Type&& _Val).

But the real C++0x form of emplace_back is really useful: void emplace_back(Args&&...);

Instead of taking a value_type, it takes a variadic list of arguments, which means that you can now perfectly forward the arguments and construct an object directly in the container without any temporary at all.

That's useful because no matter how much cleverness RVO and move semantics bring to the table, there are still complicated cases where a push_back is likely to make unnecessary copies (or moves). For example, with the traditional insert() function of a std::map, you have to create a temporary, which will then be copied into a std::pair<Key, Value>, which will then be copied into the map:

std::map<int, Complicated> m;
int anInt = 4;
double aDouble = 5.0;
std::string aString = "C++";

// cross your fingers and hope the optimizer is really good
m.insert(std::make_pair(4, Complicated(anInt, aDouble, aString))); 

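// should be easier for the optimizer; std::piecewise_construct is needed so that
// the key and the Complicated constructor arguments are forwarded separately
m.emplace(std::piecewise_construct,
          std::forward_as_tuple(4),
          std::forward_as_tuple(anInt, aDouble, aString));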
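
Here is a hedged sketch of the same comparison that counts how often the mapped value is copied or moved. Complicated is a stand-in defined here, and both function names are made up. Note that std::map::emplace needs std::piecewise_construct to forward several arguments to the mapped type's constructor.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Stand-in for the Complicated type; counts copy and move constructions.
struct Complicated {
    static int relocations;
    int i;
    double d;
    std::string s;

    Complicated(int i_, double d_, std::string s_) : i(i_), d(d_), s(std::move(s_)) {}
    Complicated(const Complicated& o) : i(o.i), d(o.d), s(o.s) { ++relocations; }
    Complicated(Complicated&& o) noexcept : i(o.i), d(o.d), s(std::move(o.s)) { ++relocations; }
};
int Complicated::relocations = 0;

int relocations_with_insert() {
    Complicated::relocations = 0;
    std::map<int, Complicated> m;
    // The temporary is moved into the pair, and the pair's contents are
    // moved again into the map node.
    m.insert(std::make_pair(4, Complicated(1, 5.0, "C++")));
    assert(m.size() == 1);
    return Complicated::relocations;
}

int relocations_with_emplace() {
    Complicated::relocations = 0;
    std::map<int, Complicated> m;
    // Key and mapped value are each built in place from their own argument tuple.
    m.emplace(std::piecewise_construct,
              std::forward_as_tuple(4),
              std::forward_as_tuple(1, 5.0, "C++"));
    assert(m.size() == 1);
    return Complicated::relocations;
}
```

The emplace version should report zero relocations, while the insert version reports at least one. The exact number depends on the standard library.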

So why didn't they implement the right version of emplace_back in MSVC? Actually, it bugged me too a while ago, so I asked the same question on the Visual C++ blog. Here is the answer from Stephan T Lavavej, the official maintainer of the Visual C++ standard library implementation at Microsoft.

Q: Are beta 2 emplace functions just some kind of placeholder right now?

A: As you may know, variadic templates aren't implemented in VC10. We simulate them with preprocessor machinery for things like make_shared<T>(), tuple, and the new things in <functional>. This preprocessor machinery is relatively difficult to use and maintain. Also, it significantly affects compilation speed, as we have to repeatedly include subheaders. Due to a combination of our time constraints and compilation speed concerns, we haven't simulated variadic templates in our emplace functions.

When variadic templates are implemented in the compiler, you can expect that we'll take advantage of them in the libraries, including in our emplace functions. We take conformance very seriously, but unfortunately, we can't do everything all at once.

It's an understandable decision. Anyone who has tried even once to emulate variadic templates with horrible preprocessor tricks knows how disgusting this stuff gets.

Is emplace_back ever better than push_back when adding temporary objects?

Suppose we want to insert an object of type T into a container holding type T objects. Would emplace be better in any case?

No; there would be no practical difference in that case.
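
A small sketch can confirm this by counting copies and moves for both calls. Probe, Tally, and the two tally_* functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that counts copy and move constructions.
struct Probe {
    static int copies;
    static int moves;
    Probe() = default;
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

struct Tally { int copies; int moves; };

Tally tally_push_back() {
    Probe::copies = Probe::moves = 0;
    std::vector<Probe> v;
    v.reserve(2);               // avoid reallocation during the test
    Probe p;
    v.push_back(p);             // lvalue: one copy
    v.push_back(std::move(p));  // rvalue: one move
    assert(v.size() == 2);
    return {Probe::copies, Probe::moves};
}

Tally tally_emplace_back() {
    Probe::copies = Probe::moves = 0;
    std::vector<Probe> v;
    v.reserve(2);
    Probe p;
    v.emplace_back(p);             // forwards an lvalue: one copy
    v.emplace_back(std::move(p));  // forwards an rvalue: one move
    assert(v.size() == 2);
    return {Probe::copies, Probe::moves};
}
```

Both functions produce the same tally, one copy and one move, because an argument that is already a T leaves emplace_back nothing to do except call the copy or move constructor.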

Examples where std::vector::emplace_back is slower than std::vector::push_back?

Essentially this boils down to std implementations. Theoretically, emplace should always be as fast or faster, except the reality is that no standard library implementation takes full advantage of that.

He gave a talk on this exact issue a few years ago: https://www.youtube.com/watch?t=3427&v=smqT9Io_bKo

Check out the first 1 hour of the talk for a more detailed explanation. The Q&A at the end of the talk is relevant as well.

c++ vector emplace_back is faster?

Is there ANY performance difference with these two lines?

No, both will initialise the new element using the copy constructor.

emplace_back can potentially give a benefit when constructing with more (or fewer) than one argument:

output.push_back(foo{bar, wibble}); // Constructs and moves a temporary
output.emplace_back(bar, wibble);   // Initialises directly

The true benefit of emplace is not so much in performance, but in allowing non-copyable (and in some cases non-movable) elements to be created in the container.
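
As an illustration of that last point, here is a sketch with a hypothetical type Pinned that can be neither copied nor moved. It uses std::list because std::vector::emplace_back still requires movable elements in case it has to reallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Hypothetical type that is neither copyable nor movable.
struct Pinned {
    explicit Pinned(std::string n) : name(std::move(n)) {}
    Pinned(const Pinned&) = delete;             // also suppresses the implicit move
    Pinned& operator=(const Pinned&) = delete;
    std::string name;
};

std::size_t build_pinned_list() {
    std::list<Pinned> items;
    items.emplace_back("first");   // constructed directly inside the list node
    items.emplace_back("second");
    // items.push_back(Pinned("third"));  // would not compile: needs a copy or move
    assert(items.front().name == "first");
    return items.size();
}
```

The node-based containers never relocate their elements, so emplace is the only way to get such a type into them.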

Does vector emplace compile faster than push_back?

There are many questions here, since you are comparing emplace, push_back and construction at both run time and compile time. Let's take compilation first.

Compilation involves translating the source code into assembly instructions. Before any assembly instructions are emitted there are usually two phases: lexing and parsing. Lexing splits the source into tokens, and parsing checks the syntax and builds the structure on which the semantic checks are done (again, this is very simplified).

Given the two statements:

my_vector.push_back(value);
my_vector.emplace_back(value);

The work required to scan, parse and evaluate should be nearly identical. Thus there should not be any differences in compilation times.

Also, with the processing speed of most compilers, if there is a difference in compilation times, it is negligible compared to the time starting the compiler and the time performing I/O. At best, I would expect saving a millisecond. If you have over 1000 of these you may save a second of compilation time. Human reaction time to the compiler finishing is more than 1 second. Thus the savings is still negligible or not worthwhile.

As far as run-time goes, you would have to profile. Let's say for example that emplace is 1 ms faster than push_back. You would need to execute over 1000 emplace functions to gain 1 second (if your program is continuously running and not interrupted). You would need to execute over 60000 to save a minute. All this savings may be lost by waiting for I/O or other tasks to finish. More likely, you'll be saving nanoseconds, not milliseconds.
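
If you do want to measure, a rough timing sketch with std::chrono could look like the following. Timing, time_ns, and compare_fill are hypothetical names, and the numbers vary with compiler, optimization level, and machine. A dedicated benchmarking library gives more trustworthy results.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Runs fill() once and returns the elapsed wall-clock time in nanoseconds.
template <typename Fill>
long long time_ns(Fill fill) {
    const auto start = std::chrono::steady_clock::now();
    fill();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

struct Timing {
    long long push_ns;
    long long emplace_ns;
    std::size_t push_size;
    std::size_t emplace_size;
};

// Fills two vectors with n strings each, once per insertion style.
Timing compare_fill(std::size_t n) {
    std::vector<std::string> pushed;
    std::vector<std::string> emplaced;
    pushed.reserve(n);    // keep reallocation out of the measurement
    emplaced.reserve(n);

    Timing t{};
    t.push_ns = time_ns([&] {
        for (std::size_t i = 0; i < n; ++i) pushed.push_back(std::string(32, 'x'));
    });
    t.emplace_ns = time_ns([&] {
        for (std::size_t i = 0; i < n; ++i) emplaced.emplace_back(32, 'x');
    });
    t.push_size = pushed.size();
    t.emplace_size = emplaced.size();
    assert(t.push_size == t.emplace_size);
    return t;
}
```

Run it several times with optimizations enabled and compare the two durations before drawing any conclusion. A single run is mostly noise.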

Focus your efforts on correct and robust code. Only worry about optimizations after your program is correct, has no faults, and doesn't crash. Only optimize if the user says it's slow, it misses critical timing events, or it doesn't fit in memory. Think about how much code you could have written in the time you spent contemplating this micro-optimization.

push_back vs emplace_back to a std::vector<std::string>

It took me a while to really understand the advantage of using std::vector::emplace, as aschepler said.

I found that the best scenario for it is when we have our own class that receives some data when it is constructed.

To make this clearer, let's suppose we have:

A vector of MyObject
MyObject needs to receive 3 arguments to be constructed
The functions get1stElem(), get2ndElem() and get3rdElem() provide the elements necessary to construct a MyObject instance

Then we can have a line like this:

vVec.emplace_back(get1stElem(), get2ndElem(), get3rdElem());

Then std::vector::emplace_back will construct the MyObject in place, which can be more efficient than std::vector::push_back because no temporary MyObject has to be created and then moved.
