What is iterator invalidation?
Iterators are glorified pointers. Iterator invalidation is a lot like pointer invalidation; it means it suddenly points to junk data.
Because it's very natural but wrong to do things like this:
for(iterator it = map.begin(); it != map.end(); ++it) {
map.erase(it->first);
// whoops, now map has been restructured and iterator
// still thinks itself is healthy
}Because that error right there? No compiler error, no warning, you lose. You just have to be trained well enough to watch for them and prevent them. Very insidious bugs if you don't know what you're doing. One of the design philosophies of C++ is speed over safety. The runtime check that would lead iterator invalidation to an exception instead of unspecified behavior is too expensive, in the view of C++ language designers.
You should be on high alert when you are iterating over a data structure and modifying structure itself, instead of merely the objects held in them. At that point you should probably run to the documentation and check whether the operation was illegal.
Iterator invalidation rules for C++ containers
C++17 (All references are from the final working draft of CPP17 - n4659)
Insertion
Sequence Containers
vector
: The functionsinsert
,emplace_back
,emplace
,push_back
cause reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. If no reallocation
happens, all the iterators and references before the insertion point remain valid. [26.3.11.5/1]
With respect to thereserve
function, reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. No reallocation shall take place during insertions that happen after a call toreserve()
until the time when an insertion would make the size of the vector greater than the value ofcapacity()
. [26.3.11.3/6]deque
: An insertion in the middle of the deque invalidates all the iterators and references to elements of the deque. An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque. [26.3.8.4/1]list
: Does not affect the validity of iterators and references. If an exception is thrown there are no effects. [26.3.10.4/1].
Theinsert
,emplace_front
,emplace_back
,emplace
,push_front
,push_back
functions are covered under this rule.forward_list
: None of the overloads ofinsert_after
shall affect the validity of iterators and references [26.3.9.5/1]array
: As a rule, iterators to an array are never invalidated throughout the lifetime of the array. One should take note, however, that during swap, the iterator will continue to point to the same array element, and will thus change its value.
Associative Containers
All Associative Containers
: Theinsert
andemplace
members shall not affect the validity of iterators and references to the container [26.2.6/9]
Unordered Associative Containers
All Unordered Associative Containers
: Rehashing invalidates iterators, changes ordering between elements, and changes which buckets elements appear in, but does not invalidate pointers or references to elements. [26.2.7/9]
Theinsert
andemplace
members shall not affect the validity of references to container elements, but may invalidate all iterators to the container. [26.2.7/14]
Theinsert
andemplace
members shall not affect the validity of iterators if(N+n) <= z * B
, whereN
is the number of elements in the container prior to the insert operation,n
is the number of elements inserted,B
is the container’s bucket count, andz
is the container’s maximum load factor. [26.2.7/15]All Unordered Associative Containers
: In case of a merge operation (e.g.,a.merge(a2)
), iterators referring to the transferred elements and all iterators referring toa
will be invalidated, but iterators to elements remaining ina2
will remain valid. (Table 91 — Unordered associative container requirements)
Container Adaptors
stack
: inherited from underlying containerqueue
: inherited from underlying containerpriority_queue
: inherited from underlying container
Erasure
Sequence Containers
vector
: The functionserase
andpop_back
invalidate iterators and references at or after the point of the erase. [26.3.11.5/3]deque
: An erase operation that erases the last element of adeque
invalidates only the past-the-end iterator and all iterators and references to the erased elements. An erase operation that erases the first element of adeque
but not the last element invalidates only iterators and references to the erased elements. An erase operation that erases neither the first element nor the last element of adeque
invalidates the past-the-end iterator and all iterators and references to all the elements of thedeque
.
[ Note:pop_front
andpop_back
are erase operations. —end note ] [26.3.8.4/4]list
: Invalidates only the iterators and references to the erased elements. [26.3.10.4/3]. This applies toerase
,pop_front
,pop_back
,clear
functions.remove
andremove_if
member functions: Erases all the elements in the list referred by a list iteratori
for which the following conditions hold:*i == value
,pred(*i) != false
. Invalidates only the iterators and references to the erased elements [26.3.10.5/15].unique
member function - Erases all but the first element from every consecutive group of equal elements referred to by the iteratori
in the range[first + 1, last)
for which*i == *(i-1)
(for the version of unique with no arguments) orpred(*i, *(i - 1))
(for the version of unique with a predicate argument) holds. Invalidates only the iterators and references to the erased elements. [26.3.10.5/19]forward_list
:erase_after
shall invalidate only iterators and references to the erased elements. [26.3.9.5/1].remove
andremove_if
member functions - Erases all the elements in the list referred by a list iterator i for which the following conditions hold:*i == value
(forremove()
),pred(*i)
is true (forremove_if()
). Invalidates only the iterators and references to the erased elements. [26.3.9.6/12].unique
member function - Erases all but the first element from every consecutive group of equal elements referred to by the iterator i in the range [first + 1, last) for which*i == *(i-1)
(for the version with no arguments) orpred(*i, *(i - 1))
(for the version with a predicate argument) holds. Invalidates only the iterators and references to the erased elements. [26.3.9.6/16]All Sequence Containers
:clear
invalidates all references, pointers, and iterators referring to the elements of a and may invalidate the past-the-end iterator (Table 87 — Sequence container requirements). But forforward_list
,clear
does not invalidate past-the-end iterators. [26.3.9.5/32]All Sequence Containers
:assign
invalidates all references, pointers and
iterators referring to the elements of the container. Forvector
anddeque
, also invalidates the past-the-end iterator. (Table 87 — Sequence container requirements)
Associative Containers
All Associative Containers
: Theerase
members shall invalidate only iterators and references to the erased elements [26.2.6/9]All Associative Containers
: Theextract
members invalidate only iterators to the removed element; pointers and references to the removed element remain valid [26.2.6/10]
Container Adaptors
stack
: inherited from underlying containerqueue
: inherited from underlying containerpriority_queue
: inherited from underlying container
General container requirements relating to iterator invalidation:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container member function or passing a container as an argument to a library function shall not invalidate iterators to, or change the values of, objects within that container. [26.2.1/12]
no
swap()
function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped. [ Note: The end() iterator does not refer to any element, so it may be invalidated. —end note ] [26.2.1/(11.6)]
As examples of the above requirements:
transform
algorithm: Theop
andbinary_op
functions shall not invalidate iterators or subranges, or modify elements in the ranges [28.6.4/1]accumulate
algorithm: In the range [first, last],binary_op
shall neither modify elements nor invalidate iterators or subranges [29.8.2/1]reduce
algorithm: binary_op shall neither invalidate iterators or subranges, nor modify elements in the range [first, last]. [29.8.3/5]
and so on...
Past-the-end iterator invalidation in C++11
My question is about all containers:
- In C++11, are there any container operations that may invalidate a past-the-end iterator, and where this behavior is ambiguous in the
language specification?
I am not sure what you mean with "where this behavior is ambiguous in the language specification", but there certainly are operations that invalidate past-the-end operators (like insert into a std::vector
or std::string
).
Said differently,
- Can I trust the validity of a past-the-end iterator after performing a container operation that does not say it may invalidate
the past-the-end iterators?
You can trust the past-the-end iterator like any other iterator: Any operation that does not (potentially) invalidate iterators won't invalidate them. Except for the possibility of the standard sporting a bug, that is all operations where it doesn't say that they (potentially) invalidate operators.
Rules for Iterator Invalidation
These rules are container specific. In fact, these are important criteria for deciding which container you use.
For instance, iterators to an std::vector
can get invalidated when an object is inserted (depends in where the object is inserted and if reallocation takes place), and they get invalidated when an object is removed before the iterator. An std::list
does not have this problem. Inserting and removing objects (except for the object the iterator points to) does not invalidate the iterator.
SGI provides good documentation on this.
what are the rules for iterator invalidation?
There are lots of cases where modifying an iterable while iterating over it cause problems. Here's an example where an XML tree's iteration is messed up when an element is removed during iteration. There are plenty of questions on stackoverflow where you get surprising results when iterating over a list:
>>> lst = [1, 1, 2, 3]
>>> for item in lst:
... if item == 1:
... lst.remove(item)
...
>>> print(lst)
[1, 2, 3]
(Note that there is still a 1
in the output list).
So the general rule is that you probably shouldn't do anything that would add or remove items while an iterator is doing it's thing. If you don't know how the iterator is implemented, this is by far the safest tack. However, some iterators are documented to work in specific ways. e.g. take the list example above, it turns out that we can remove the current (or elements at lower indices) if we iterate the list in reverse:
>>> lst = [1, 1, 2, 3]
>>> for item in reversed(lst):
... if item == 1:
... lst.remove(item)
...
>>> print(lst)
[2, 3]
This is due to certain guarantees that are made by the list iterator. Note that due to the the general rule I listed above, I wouldn't advise doing this (It'll probably cause your code readers to scratch their heads to try to figure out why you're iterating over it backwards).
For the list case, you'll see people iterating over a copy of a list if they're planning on removing/adding elements, but it's harder to give advice for the general iterator case without knowing more about the constraints of the problem.
Do the iterator invalidation rules mean thread safety?
There are two kinds of operations on C++ std
containers. Reader and Writer operations (these are not the terms the standard uses, but this reads easier). In addition, there are operations on elements in the container.
A const
method is a Reader method, as are "lookup" functions that are only non-const
because they return a non-const
reference to an element (or similar). There is a complete list in the standard, but common sense should work. vector::operator[]
, begin()
, map::find()
are all "Readers". map::operator[]
is not a "Reader".
You can have any number of threads engaging in Reader operations at the same time no problem.
If you have any thread engaged in a Writer operation, no other access can occur on the container.
You cannot safely have one Reader and one Writer at the same time. If you have any Writers, you must have excluded all other access.
You can safely have 1337 readers at once.
Operations on elements is somewhat similar, except that Writing to an element need only exclude other access to that element. And you are responsible for making your const
method play nice with each other. (the std
guarantees that the const
methods of the container will only call const
methods of the elements)
Note that changing sorting order in a std::
associative container is UB without modifying the container.
An iterator that is not invalidated, where you just operate on the element, will (I believe) count as operations on the element. Only synchronization on that element is required.
Note that std::vector<bool>
does not follow the above rules.
If you violate the above rules, the C++ std
library does not stop you. It just states there is a race condition -- aka, undefined behavior. Anything can happen. In C++ std
library, if you don't need something, you don't pay for it: and a single-threaded use of those containers would not need the weight of synchronization. So you don't get it by default.
A std::shared_timed_mutex
or std::experimental::shared_mutex
can both be useful to guarantee the above holds. unique_lock
for Writing, and shared_lock
for Reading. Write access to elements has to be shared_lock
ed on the container guarded, and somehow guarded against overlapping access to the same element without deadlock.
Iterator invalidation rules are relatively orthogonal to the thread-safety rules.
iterator invalidation in map C++
Using an invalidated iterator is undefined behaviour. In such case, anything could happen.
Why do you see the values? The iterator contains a pointer to some piece of memory, by pure accident, this memory has not yet been returned to the system and has not yet been overwritten. This is why you still can see the already "dead" values.
It does not change anything, it remains undefined behaviour, and the next time you run the program, the memory page the map element resided in could already have been returned to the OS again and you get an access violation (segmentation fault)...
Does std::vector::erase() really invalidate the iterator at the point of erase?
Dereferencing an invalidated iterator has undefined results. Your program may crash, it may stop with a runtime error or break in the debugger (if you are running a debug build with a debug version of the STL with iterator debugging/validation) and it may "seem to work", i.e., deliver the value that was erased from the collection.
This is because iterators MAY be implemented as just pointers. This is not necessarily the case, but defining behavior in this situation as undefined allows such an efficient and simple implementation. Invalid iterators implemented as pointers MAY still point to a valid memory location, which MAY still contain the value it previously contained, even though it is logically no longer part of the data structure (collection) it was a part of. There is no validation code which checks if the iterator is valid when it is dereferenced (except sometimes in debug builds).
This is both one of the characteristics strengths and one of the weaknesses of C++, as it gives your program better performance at the cost of stability and security in case your program does something undefined (due to a bug or using unvalidated user input).
Vector iterator invalidation after inserting
First of all. STL iterators are the same insecure as pointers. Good or bad, this is so. We cannot change this. This means that you need to be careful when you are working with iterators.
Now what happens behind the scenes. Both iterators are essentially pointers. First one points to the current beginning of the vector body, the second one to the current end of the vector body. During insertion vector is allowed to move body of the vector to some other place. If vector will move the body, your iterator end
will continue pointing to the same place. Even if vector will not move the beginning of the vector (sometimes this happens), your iterator will point to the last valid element (not after that last one as it should in normal case). Again, this will not be what you are expecting.
What is written above is just for understanding. You should never use this knowledge. You should check what actions invalidate what types of iterators and base your code only on what is explicitly allowed in the documentation.
Related Topics
Explicit Return Type of Lambda
What Is Monomorphisation with Context to C++
Can Branches with Undefined Behavior Be Assumed Unreachable and Optimized as Dead Code
Fastest Way to Get the Integer Part of Sqrt(N)
C++ Ifstream Error Using String as Opening File Path
Clean Eclipse Index, It Is Out of Sync with Code
What Should I Know About Structured Exceptions (Seh) in C++
Some Clarification Needed About Synchronous Versus Asynchronous Asio Operations
Making a C++ Class a Monitor (In the Concurrent Sense)
What Is the Branch in the Destructor Reported by Gcov
How to Take Screenshot in Opengl
Do Stl Iterators Guarantee Validity After Collection Was Changed