Is There a Standard Way of Moving a Range into a Vector

Is there a standard way of moving a range into a vector?

You use a move_iterator with insert:

v1.insert(v1.end(), make_move_iterator(v2.begin()), make_move_iterator(v2.end()));

The example in 24.5.3 is almost exactly this.

You'll get the optimization you want if (a) vector::insert uses iterator-tag dispatch to detect the random-access iterator and precalculate the size (which you've assumed it does in your example that copies), and (b) move_iterator preserves the iterator category of the iterator it wraps (which is required by the standard).

On an obscure point: I'm pretty sure that vector::insert can emplace from the source (which is irrelevant here, since the source is the same type as the destination, so an emplace is the same as a copy/move, but would be relevant to otherwise-identical examples). I haven't yet found a statement that it's required to do so, I've just inferred it from the fact that the requirement on the iterator pair i,j passed to insert is that T be EmplaceConstructible from *i.

Is this the most efficient way of move the contents of one std::vector to the end of another in C++11?

Performance disclaimer: Use profiling.

Performance considerations:

  • push_back has to check for each call if the capacity of the vector is sufficient to insert the element. It is unlikely that the compiler is smart enough to avoid that check within the loop, that is, for every loop iteration it'll have to check, which can also prohibit further optimizations.
  • If there's no call to reserve before, push_back has to adjust the capacity of the vector on the go, maybe multiple times within the loop, which would lead to moving the already stored elements.
  • swap is slightly different from move: move has less strict guarantees on the moved objects, which could allow optimizations
  • As GManNickG pointed out in the comments, vector::insert could reserve the necessary memory before inserting, as it gets the whole range to be inserted. This would probably require a specialization on random access iterators because std::difference for them is in O(1) (it could be applied to all bidirectional iterators, but this could be slower - two loop iterations - than not reserving).

The most efficient way I can think of is to reserve the necessary capacity and then insert the elements (either via push_back or via insert) without capacity checks.

A smart Standard Library implementation could do the call to reserve inside insert and not check for the capacity during insertion. I'm not entirely sure this would comply to the Standard, though.

If your library is smart enough, Andy Prowl's version (see comments) is sufficient:

dst.insert( dst.end(),
std::make_move_iterator(src.begin()),
std::make_move_iterator(src.end()) );

Else, you can write the call to reserve manually before invoking insert, but you cannot (AFAIK) insert/append an element w/o internal capacity checks:

template < typename T, typename FwdIt >
void append(FwdIt src_begin, FwdIt src_end, std::vector<T>& dst)
{
dst.reserve( dst.size() + std::distance(src_begin, src_end) );
// capacity checks might slow the loop inside `insert` down
dst.insert(dst.end(), src_begin, src_end);
}

Example:

int main()
{
std::vector<int> dst = { 0, 1, 2 };
std::vector<int> src = { 3, 42 };

append( std::make_move_iterator(src.begin()),
std::make_move_iterator(src.end()),
dst );
}

It might be better to implement append for different iterator types:

template < typename T, typename FwdIt >
void append(FwdIt src_begin, FwdIt src_end, std::vector<T>& dst,
std::forward_iterator_tag)
{
// let the vector handle resizing
dst.insert(dst.end(), src_begin, src_end);
}

template < typename T, typename RAIt >
void append(RAIt src_begin, RAIt src_end, std::vector<T>& dst,
std::random_access_iterator_tag)
{
dst.reserve( dst.size() + (src_end - src_begin) );
dst.insert(dst.end(), src_begin, src_end);
}

template < typename T, typename FwdIt >
void append(FwdIt src_begin, FwdIt src_end, std::vector<T>& dst)
{
append( src_begin, src_end, dst,
typename std::iterator_traits<FwdIt>::iterator_category() );
}

If a performance problem occurs because of the capacity checks inside the loop, you could try to default-construct the required additional elements first. When they exist (i.e. have been constructed) you can use the non-checked operator[] or simple iterators to move the src objects to their destination:

template < typename T, typename RAIt >
void append(RAIt src_begin, RAIt src_end, std::vector<T>& dst,
std::random_access_iterator_tag)
{
auto src_size = src_end - src_begin;

dst.resize( dst.size() + src_size );

// copy is not required to invoke capacity checks
std::copy( src_begin, src_end, dst.end() - src_size );
// ^this^ should move with the example provided above
}

Convenience wrapper:

template < typename T, typename FwdIt >
void append_move(FwdIt src_begin, FwdIt src_end, std::vector<T>& dst)
{
append( std::make_move_iterator(src_begin),
std::make_move_iterator(src_end),
dst );
}

What is the most effective way to move items within a vector?

It's always important to profile before jumping to any conclusions. The contiguity of vector's data memory may offer significant caching benefits that node-based containers don't. So, perhaps you could give the direct approach a try:

void move_range(size_t start, size_t length, size_t dst, std::vector<T> & v)
{
const size_t final_dst = dst > start ? dst - length : dst;

std::vector<T> tmp(v.begin() + start, v.begin() + start + length);
v.erase(v.begin() + start, v.begin() + start + length);
v.insert(v.begin() + final_dst, tmp.begin(), tmp.end());
}

In C++11, you'd wrap the iterators in the first and third line into std::make_move_iterator.

(The requirement is that dst not lie within [start, start + length), or otherwise the problem is not well-defined.)

How to Move certain elements of std::vector to a new index within the vector?

The above two examples had issues with different input (noted above). I worked out the algorithm to handle the different cases and it passed my unit tests. It can probably be improved for speed.

template <class CONTAINER_TYPE>
void ChangeRecordOrder( CONTAINER_TYPE IN OUT &values,
uint newIndex,
std::vector<uint> IN const &indexesToMove )
{
// Make a copy of the indexesToMove as we need to do fixups to them in certain cases
std::vector<uint> temporaryIndexes = indexesToMove;

for ( uint i=0; i<temporaryIndexes.size(); i++ )
{
uint indexToMove = temporaryIndexes[i];

if ( indexToMove < newIndex )
{
uint leftIndex = indexToMove;
uint rightIndex = newIndex -1;
uint newFirst = leftIndex + 1;
std::rotate( values.begin() + leftIndex, values.begin() + newFirst, values.begin() + rightIndex +1);

// fix up indexes
for( uint j=i+1; j<temporaryIndexes.size(); j++ )
{
uint &futureIndex = temporaryIndexes[j];
if ( futureIndex > leftIndex && futureIndex <=rightIndex )
--futureIndex;
}
}
else if ( indexToMove > newIndex )
{
uint leftIndex = newIndex;
uint rightIndex = indexToMove;
uint newFirst = indexToMove;
std::rotate( values.begin() + leftIndex, values.begin() + newFirst, values.begin() + rightIndex +1);

// fix up indexes
for( uint j=i+1; j<temporaryIndexes.size(); j++ )
{
uint &futureIndex = temporaryIndexes[j];
if ( futureIndex > leftIndex && futureIndex <=rightIndex )
++futureIndex;
}

++newIndex;
}

}
}

How to check whether elements of a range should be moved?

I understand your point.
I do think that this is a real problem.

My answer is that the community has to agree exactly what it means to move nested objected (such as containers).
In any case this needs the cooperation of the container implementors.
And, in the case of standard containers, good specifications.

I am pessimistic that standard containers can be changed to "generalize" the meaning of "move", but that can't prevent new user defined containers from taking advantage of move-idioms.
The problem is that nobody has studied this in depth as far as I know.

As it is now, std::move seem to imply "shallow" move (one level of moving of the top "value type").
In the sense that you can move the whole thing but not necessarily individual parts.
This, in turn, makes useless to try to "std::move" non-owning ranges or ranges that offer pointer/iterator stability.

Some libraries, e.g. related to std::ranges simply reject r-value of references ranges which I think it is only kicking the can.

Suppose you have a container Bag.
What should std::move(bag)[0] and std::move(bag).begin() return? It is really up to the implementation of the container decide what to return.

It is hard to think of general data structures, bit if the data structure is simple (e.g. dynamic arrays) for consistency with structs (std::move(s).field) std::move(bag)[0] should be the same as std::move(bag[0]) however the standard strongly disagrees with me already here: https://en.cppreference.com/w/cpp/container/vector/operator_at
And it is possible that it is too late to change.

Same goes for std::move(bag).begin() which, using my logic, should return a move_iterator (or something of the like that).

To make things worst, std::array<T, N> works how I would expect (std::move(arr[0]) equivalent to std::move(arr)[0]).
However std::move(arr).begin() is a simple pointer so it looses the "forwarding/move" information! It is a mess.

So, yes, to answer your question, you can check if using Type = decltype(*std::forward<Bag>(bag).begin()); is an r-value but more often than not it will not implemented as r-value.
That is, you have to hope for the best and trust that .begin and * are implemented in a very specific way.

You are in better shape by inspecting (somehow) the category of the range itself.
That is, currently you are left to your own devices: if you know that bag is bound to an r-value and the type is conceptually an "owning" value, you currently have to do the dance of using std::make_move_iterator.

I am currently experimenting a lot with custom containers that I have. https://gitlab.com/correaa/boost-multi
However, by trying to allow for this, I break behavior expected for standard containers regarding move.
Also once you are in the realm of non-owning ranges, you have to make iterators movable by "hand".

I found empirically useful to distinguish top-level move(std::move) and element wise move (e.g. bag.mbegin() or bag.moved().begin()).
Otherwise I find my self overloading std::move which should be last resort if anything at all.

In other words, in

template<class MyRange>
void f(MyRange&& r) {
std::copy(std::forward<MyRange>(r).begin(), ..., ...);
}

the fact that r is bound to an r-value doesn't necessarily mean that the elements can be moved, because MyRange can simply be a non-owning view of a larger container that was "just" generated.

Therefore in general you need an external mechanism to detect if MyRange owns the values or not, and not just detecting the "value category" of *std::forward<MyRange>(r).begin() as you propose.

I guess with ranges one can hope in the future to indicate deep moves with some kind of adaptor-like thing "std::ranges::moved_range" or use the 3-argument std::move.

STL-like vector with arbitrary index range

Deque is very much like a vector in that it supports random access and efficient insertion at the end and it also supports efficient insertion at the beginning.

Map supports access based on arbitrary keys, you can have any range you want, or even a sparsely populated array. Iteration over the collection is slow.

Unordered map (tr1) is similar to map except it supports better iteration.

As a general rule of thumb, use a vector (in your case adapt it to the behaviour you want) and only change when you have evidence that the vector is causing slowness.

Inserting into a vector of move-only type

"Efficient" is something to be measured. But if your desire is to shift the elements in one go instead of constantly moving them one item to the right, you can do it with std::rotate. Here's how

vec.resize(vec.size() + size); // Add the "null" pointers to the end.
// Obtain valid first after resize
std::rotate(first, vec.end() - size, vec.end());

Since the function of rotate is to make the middle iterator the "new first" of the range, while the iterator preceding it is the "new last", the above choice of iterators will shift the range of null pointers to their intended location (before first).

Furthermore, since you tagged C++17, you can also pass the standard algorithm an execution policy, and hopefully gain some parallelism to boot.

c++ reassign after moved

This is guaranteed. After the move, the vector is guaranteed to be empty, i.e. has no elements. It's fine to be assigned by other contents.

After the move, other is guaranteed to be empty().



Related Topics



Leave a reply



Submit