Why emplace_back is faster than push_back?
Your test case isn't very helpful. push_back takes a container element and copies/moves it into the container. emplace_back takes arbitrary arguments and constructs a new container element from them. But if you pass a single argument that's already of element type to emplace_back, you'll just use the copy/move constructor anyway.
Here's a better comparison:
Foo x; Bar y; Zip z;
v.push_back(T(x, y, z)); // make temporary, push it back
v.emplace_back(x, y, z); // no temporary, directly construct T(x, y, z) in place
The key difference, however, is that emplace_back performs explicit conversions:
std::vector<std::unique_ptr<Foo>> v;
v.emplace_back(new Foo(1, 'x', true)); // constructor is explicit!
This example is mildly contrived as of C++14, where you should say v.push_back(std::make_unique<Foo>(1, 'x', true)) instead. However, other constructions are very nice with emplace, too:
std::vector<std::thread> threads;
threads.emplace_back(do_work, 10, "foo"); // call do_work(10, "foo")
threads.emplace_back(&Foo::g, x, 20, false); // call x.g(20, false)
Why would I ever use push_back instead of emplace_back?
I have thought about this question quite a bit over the past four years. I have come to the conclusion that most explanations about push_back vs. emplace_back miss the full picture.
Last year, I gave a presentation at C++Now on Type Deduction in C++14. I start talking about push_back vs. emplace_back at 13:49, but there is useful information that provides some supporting evidence prior to that.
The real primary difference has to do with implicit vs. explicit constructors. Consider the case where we have a single argument that we want to pass to push_back or emplace_back.
std::vector<T> v;
v.push_back(x);
v.emplace_back(x);
After your optimizing compiler gets its hands on this, there is no difference between these two statements in terms of generated code. The traditional wisdom is that push_back will construct a temporary object, which will then get moved into v, whereas emplace_back will forward the argument along and construct it directly in place with no copies or moves. This may be true based on the code as written in standard libraries, but it makes the mistaken assumption that the optimizing compiler's job is to generate the code you wrote. The optimizing compiler's job is actually to generate the code you would have written if you were an expert on platform-specific optimizations and did not care about maintainability, just performance.
The actual difference between these two statements is that the more powerful emplace_back will call any type of constructor out there, whereas the more cautious push_back will call only constructors that are implicit. Implicit constructors are supposed to be safe. If you can implicitly construct a U from a T, you are saying that U can hold all of the information in T with no loss. It is safe in pretty much any situation to pass a T and no one will mind if you make it a U instead. A good example of an implicit constructor is the conversion from std::uint32_t to std::uint64_t. A bad example of an implicit conversion is double to std::uint8_t.
We want to be cautious in our programming. We do not want to use powerful features because the more powerful the feature, the easier it is to accidentally do something incorrect or unexpected. If you intend to call explicit constructors, then you need the power of emplace_back. If you want to call only implicit constructors, stick with the safety of push_back.
An example
std::vector<std::unique_ptr<T>> v;
T a;
v.emplace_back(std::addressof(a)); // compiles
v.push_back(std::addressof(a)); // fails to compile
std::unique_ptr<T> has an explicit constructor from T *. Because emplace_back can call explicit constructors, passing a non-owning pointer compiles just fine. However, when v goes out of scope, the destructor will attempt to call delete on that pointer, which was not allocated by new because it is just a stack object. This leads to undefined behavior.
This is not just invented code. This was a real production bug I encountered. The code was std::vector<T *>, but it owned the contents. As part of the migration to C++11, I correctly changed T * to std::unique_ptr<T> to indicate that the vector owned its memory. However, I was basing these changes off my understanding in 2012, during which I thought "emplace_back does everything push_back can do and more, so why would I ever use push_back?", so I also changed the push_back to emplace_back.
Had I instead left the code as using the safer push_back, I would have instantly caught this long-standing bug and it would have been viewed as a success of upgrading to C++11. Instead, I masked the bug and didn't find it until months later.
push_back is more efficient than emplace_back?
push_back is not more efficient, and the results you observe are due to the vector resizing itself.
When you call emplace_back after push_back, the vector has to resize itself to make room for the second element. This means that it has to move the A that was originally inside the vector, making emplace_back appear more complex.
If you reserve enough space in the vector beforehand, this doesn't happen. Notice the call to va.reserve(2) after va's creation:
#include <iostream>
#include <vector>
class A
{
public:
    A() {std::cout << "A const" << std::endl;}
    ~A() {std::cout << "A dest" << std::endl;}
    A(const A& a) {std::cout << "A copy const" << std::endl;}
    A(A&& a) {std::cout << "A move const" << std::endl;}
    A& operator=(const A& a) {std::cout << "A copy operator=" << std::endl; return *this; }
    A& operator=(A&& a) {std::cout << "A move operator=" << std::endl; return *this; }
};
int main () {
    std::vector<A> va;
    // Now there's enough room for two elements
    va.reserve(2);
    std::cout <<"push:" << std::endl;
    va.push_back(A());
    std::cout <<std::endl<< "emplace:" << std::endl;
    va.emplace_back(A());
    std::cout <<std::endl<< "end:" << std::endl;
    return 0;
}
The corresponding output is:
push:
A const
A move const
A dest
emplace:
A const
A move const
A dest
end:
A dest
A dest
Can we make things even more efficient? Yes! emplace_back takes whatever arguments you provide it, and forwards them to A's constructor. Because A has a constructor that takes no arguments, you can also use emplace_back with no arguments! In other words, we change
va.emplace_back(A());
to
va.emplace_back(); // No arguments necessary since A is default-constructed
This results in no copy, and no move:
push:
A const
A move const
A dest
emplace:
A const
end:
A dest
A dest
A note on vectors resizing: It's important to note that the implementation of std::vector is smart. If A had been a trivially copyable type, std::vector might have been able to resize in place without additional copying using a system function similar to realloc. However, because A's constructors and destructor contain code, realloc can't be used here.
push_back vs emplace_back
In addition to what visitor said :
The function void emplace_back(Type&& _Val) provided by MSVC10 is non-conforming and redundant, because as you noted it is strictly equivalent to push_back(Type&& _Val).
But the real C++0x form of emplace_back is really useful: void emplace_back(Args&&...);
Instead of taking a value_type, it takes a variadic list of arguments, which means that you can now perfectly forward the arguments and construct an object directly in a container without any temporary at all.
That's useful because no matter how much cleverness RVO and move semantics bring to the table, there are still complicated cases where a push_back is likely to make unnecessary copies (or moves). For example, with the traditional insert() function of a std::map, you have to create a temporary, which will then be copied into a std::pair<Key, Value>, which will then be copied into the map:
std::map<int, Complicated> m;
int anInt = 4;
double aDouble = 5.0;
std::string aString = "C++";
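// cross your fingers so that the optimizer is really good
m.insert(std::make_pair(4, Complicated(anInt, aDouble, aString)));
// should be easier for the optimizer
// (map::emplace forwards its arguments to the std::pair constructor, so the
// arguments for Complicated have to go through std::piecewise_construct)
m.emplace(std::piecewise_construct, std::forward_as_tuple(4), std::forward_as_tuple(anInt, aDouble, aString));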
So why didn't they implement the right version of emplace_back in MSVC? Actually, it bugged me too a while ago, so I asked the same question on the Visual C++ blog. Here is the answer from Stephan T Lavavej, the official maintainer of the Visual C++ standard library implementation at Microsoft.
Q: Are beta 2 emplace functions just some kind of placeholder right now?
A: As you may know, variadic templates aren't implemented in VC10. We simulate them with preprocessor machinery for things like make_shared<T>(), tuple, and the new things in <functional>. This preprocessor machinery is relatively difficult to use and maintain. Also, it significantly affects compilation speed, as we have to repeatedly include subheaders. Due to a combination of our time constraints and compilation speed concerns, we haven't simulated variadic templates in our emplace functions.
When variadic templates are implemented in the compiler, you can expect that we'll take advantage of them in the libraries, including in our emplace functions. We take conformance very seriously, but unfortunately, we can't do everything all at once.
It's an understandable decision. Everyone who has tried even once to emulate variadic templates with horrible preprocessor tricks knows how disgusting this stuff gets.
Is emplace_back ever better than push_back when adding temporary objects?
Suppose we want to insert an object of type T into a container holding type T objects. Would emplace be better in any case?
No; there would be no practical difference in that case.
Examples where std::vector::emplace_back is slower than std::vector::push_back?
Essentially this boils down to std implementations. Theoretically, emplace should always be as fast or faster, except the reality is that no standard library implementation takes full advantage of that.
He gave a talk on this exact issue a few years ago: https://www.youtube.com/watch?t=3427&v=smqT9Io_bKo
Check out the first 1 hour of the talk for a more detailed explanation. The Q&A at the end of the talk is relevant as well.
c++ vector emplace_back is faster?
Is there ANY performance difference with these two lines?
No, both will initialise the new element using the copy constructor.
emplace_back can potentially give a benefit when constructing with more (or less) than one argument:
output.push_back(foo{bar, wibble}); // Constructs and moves a temporary
output.emplace_back(bar, wibble); // Initialises directly
The true benefit of emplace is not so much in performance, but in allowing non-copyable (and in some cases non-movable) elements to be created in the container.
Does vector emplace compile faster than push_back?
There are many questions here since you are comparing emplace, push_back and construction at run time and at compile time. Let's take compilation first.
Compilation involves translating the source code into assembly instructions. Before emitting any assembly instructions there are usually two phases: lexing and parsing. Lexing splits the source into tokens, and parsing checks the syntax and handles the semantics (again, this is very simplified).
Given the two statements:
my_vector.push_back(value);
my_vector.emplace_back(value);
The work required to scan, parse and evaluate should be nearly identical. Thus there should not be any differences in compilation times.
Also, with the processing speed of most compilers, if there is a difference in compilation times, it is negligible compared to the time starting the compiler and the time performing I/O. At best, I would expect saving a millisecond. If you have over 1000 of these you may save a second of compilation time. Human reaction time to the compiler finishing is more than 1 second. Thus the savings is still negligible or not worthwhile.
As far as run time goes, you would have to profile. Let's say for example that emplace is 1 ms faster than push_back. You would need to execute over 1000 emplace calls to gain 1 second (if your program is continuously running and not interrupted). You would need to execute over 60000 to save a minute. All these savings may be lost by waiting for I/O or other tasks to finish. More likely, you'll be saving nanoseconds, not milliseconds.
Focus your efforts on correct and robust code. Only worry about optimizations after your program is correct, has no faults and doesn't crash. Only optimize if the user says it's slow, it doesn't meet critical timing requirements or it doesn't fit in memory. Think about how much code you could have written in the time you are contemplating this micro-optimization.
push_back vs emplace_back to a std::vector<std::string>
It took me a while to really understand the advantage of using std::vector::emplace_back that aschepler described.
I found that the best scenario for it is when we have our own class that receives some data when it is constructed.
To make it clearer, let's suppose we have:
- A vector of MyObject
- MyObject needs to receive 3 arguments to be constructed
- The functions get1stElem(), get2ndElem() and get3rdElem() provide the elements necessary to construct a MyObject instance
Then we can have a line like this:
vVec.emplace_back(get1stElem(), get2ndElem(), get3rdElem());
Then std::vector::emplace_back will construct the MyObject in place, which can be more efficient than building a temporary and passing it to std::vector::push_back.