What Is Iterator Invalidation

What is iterator invalidation?

  1. Iterators are glorified pointers. Iterator invalidation is a lot like pointer invalidation; it means it suddenly points to junk data.

  2. Because it's very natural but wrong to do things like this:

    for(iterator it = map.begin(); it != map.end(); ++it) {
    map.erase(it->first);
    // whoops, now map has been restructured and iterator
    // still thinks itself is healthy
    }
  3. Because that error right there? No compiler error, no warning, you lose. You just have to be trained well enough to watch for them and prevent them. Very insidious bugs if you don't know what you're doing. One of the design philosophies of C++ is speed over safety. The runtime check that would lead iterator invalidation to an exception instead of unspecified behavior is too expensive, in the view of C++ language designers.

You should be on high alert when you are iterating over a data structure and modifying structure itself, instead of merely the objects held in them. At that point you should probably run to the documentation and check whether the operation was illegal.

Iterator invalidation rules for C++ containers

C++17 (All references are from the final working draft of CPP17 - n4659)



Insertion

Sequence Containers

  • vector: The functions insert, emplace_back, emplace, push_back cause reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. If no reallocation
    happens, all the iterators and references before the insertion point remain valid. [26.3.11.5/1]

    With respect to the reserve function, reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity(). [26.3.11.3/6]

  • deque: An insertion in the middle of the deque invalidates all the iterators and references to elements of the deque. An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque. [26.3.8.4/1]

  • list: Does not affect the validity of iterators and references. If an exception is thrown there are no effects. [26.3.10.4/1].

    The insert, emplace_front, emplace_back, emplace, push_front, push_back functions are covered under this rule.

  • forward_list: None of the overloads of insert_after shall affect the validity of iterators and references [26.3.9.5/1]

  • array: As a rule, iterators to an array are never invalidated throughout the lifetime of the array. One should take note, however, that during swap, the iterator will continue to point to the same array element, and will thus change its value.

Associative Containers

  • All Associative Containers: The insert and emplace members shall not affect the validity of iterators and references to the container [26.2.6/9]

Unordered Associative Containers

  • All Unordered Associative Containers: Rehashing invalidates iterators, changes ordering between elements, and changes which buckets elements appear in, but does not invalidate pointers or references to elements. [26.2.7/9]

    The insert and emplace members shall not affect the validity of references to container elements, but may invalidate all iterators to the container. [26.2.7/14]

    The insert and emplace members shall not affect the validity of iterators if (N+n) <= z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container’s bucket count, and z is the container’s maximum load factor. [26.2.7/15]

  • All Unordered Associative Containers: In case of a merge operation (e.g., a.merge(a2)), iterators referring to the transferred elements and all iterators referring to a will be invalidated, but iterators to elements remaining in a2 will remain valid. (Table 91 — Unordered associative container requirements)

Container Adaptors

  • stack: inherited from underlying container
  • queue: inherited from underlying container
  • priority_queue: inherited from underlying container


Erasure

Sequence Containers

  • vector: The functions erase and pop_back invalidate iterators and references at or after the point of the erase. [26.3.11.5/3]

  • deque: An erase operation that erases the last element of a deque invalidates only the past-the-end iterator and all iterators and references to the erased elements. An erase operation that erases the first element of a deque but not the last element invalidates only iterators and references to the erased elements. An erase operation that erases neither the first element nor the last element of a deque invalidates the past-the-end iterator and all iterators and references to all the elements of the deque.
    [ Note: pop_front and pop_back are erase operations. —end note ] [26.3.8.4/4]

  • list: Invalidates only the iterators and references to the erased elements. [26.3.10.4/3]. This applies to erase, pop_front, pop_back, clear functions.

    remove and remove_if member functions: Erases all the elements in the list referred by a list iterator i for which the following conditions hold: *i == value, pred(*i) != false. Invalidates only the iterators and references to the erased elements [26.3.10.5/15].

    unique member function - Erases all but the first element from every consecutive group of equal elements referred to by the iterator i in the range [first + 1, last) for which *i == *(i-1) (for the version of unique with no arguments) or pred(*i, *(i - 1)) (for the version of unique with a predicate argument) holds. Invalidates only the iterators and references to the erased elements. [26.3.10.5/19]

  • forward_list: erase_after shall invalidate only iterators and references to the erased elements. [26.3.9.5/1].

    remove and remove_if member functions - Erases all the elements in the list referred by a list iterator i for which the following conditions hold: *i == value (for remove()), pred(*i) is true (for remove_if()). Invalidates only the iterators and references to the erased elements. [26.3.9.6/12].

    unique member function - Erases all but the first element from every consecutive group of equal elements referred to by the iterator i in the range [first + 1, last) for which *i == *(i-1) (for the version with no arguments) or pred(*i, *(i - 1)) (for the version with a predicate argument) holds. Invalidates only the iterators and references to the erased elements. [26.3.9.6/16]

  • All Sequence Containers: clear invalidates all references, pointers, and iterators referring to the elements of a and may invalidate the past-the-end iterator (Table 87 — Sequence container requirements). But for forward_list, clear does not invalidate past-the-end iterators. [26.3.9.5/32]

  • All Sequence Containers: assign invalidates all references, pointers and
    iterators referring to the elements of the container. For vector and deque, also invalidates the past-the-end iterator. (Table 87 — Sequence container requirements)

Associative Containers

  • All Associative Containers: The erase members shall invalidate only iterators and references to the erased elements [26.2.6/9]

  • All Associative Containers: The extract members invalidate only iterators to the removed element; pointers and references to the removed element remain valid [26.2.6/10]

Container Adaptors

  • stack: inherited from underlying container
  • queue: inherited from underlying container
  • priority_queue: inherited from underlying container

General container requirements relating to iterator invalidation:

  • Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container member function or passing a container as an argument to a library function shall not invalidate iterators to, or change the values of, objects within that container. [26.2.1/12]

  • no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped. [ Note: The end() iterator does not refer to any element, so it may be invalidated. —end note ] [26.2.1/(11.6)]

As examples of the above requirements:

  • transform algorithm: The op and binary_op functions shall not invalidate iterators or subranges, or modify elements in the ranges [28.6.4/1]

  • accumulate algorithm: In the range [first, last], binary_op shall neither modify elements nor invalidate iterators or subranges [29.8.2/1]

  • reduce algorithm: binary_op shall neither invalidate iterators or subranges, nor modify elements in the range [first, last]. [29.8.3/5]

and so on...

Past-the-end iterator invalidation in C++11

My question is about all containers:

  • In C++11, are there any container operations that may invalidate a past-the-end iterator, and where this behavior is ambiguous in the
    language specification?

I am not sure what you mean with "where this behavior is ambiguous in the language specification", but there certainly are operations that invalidate past-the-end operators (like insert into a std::vector or std::string).

Said differently,

  • Can I trust the validity of a past-the-end iterator after performing a container operation that does not say it may invalidate
    the past-the-end iterators?

You can trust the past-the-end iterator like any other iterator: Any operation that does not (potentially) invalidate iterators won't invalidate them. Except for the possibility of the standard sporting a bug, that is all operations where it doesn't say that they (potentially) invalidate operators.

Rules for Iterator Invalidation

These rules are container specific. In fact, these are important criteria for deciding which container you use.

For instance, iterators to an std::vector can get invalidated when an object is inserted (depends in where the object is inserted and if reallocation takes place), and they get invalidated when an object is removed before the iterator. An std::list does not have this problem. Inserting and removing objects (except for the object the iterator points to) does not invalidate the iterator.

SGI provides good documentation on this.

what are the rules for iterator invalidation?

There are lots of cases where modifying an iterable while iterating over it cause problems. Here's an example where an XML tree's iteration is messed up when an element is removed during iteration. There are plenty of questions on stackoverflow where you get surprising results when iterating over a list:

>>> lst = [1, 1, 2, 3]
>>> for item in lst:
... if item == 1:
... lst.remove(item)
...
>>> print(lst)
[1, 2, 3]

(Note that there is still a 1 in the output list).

So the general rule is that you probably shouldn't do anything that would add or remove items while an iterator is doing it's thing. If you don't know how the iterator is implemented, this is by far the safest tack. However, some iterators are documented to work in specific ways. e.g. take the list example above, it turns out that we can remove the current (or elements at lower indices) if we iterate the list in reverse:

>>> lst = [1, 1, 2, 3]
>>> for item in reversed(lst):
... if item == 1:
... lst.remove(item)
...
>>> print(lst)
[2, 3]

This is due to certain guarantees that are made by the list iterator. Note that due to the the general rule I listed above, I wouldn't advise doing this (It'll probably cause your code readers to scratch their heads to try to figure out why you're iterating over it backwards).

For the list case, you'll see people iterating over a copy of a list if they're planning on removing/adding elements, but it's harder to give advice for the general iterator case without knowing more about the constraints of the problem.

Do the iterator invalidation rules mean thread safety?

There are two kinds of operations on C++ std containers. Reader and Writer operations (these are not the terms the standard uses, but this reads easier). In addition, there are operations on elements in the container.

A const method is a Reader method, as are "lookup" functions that are only non-const because they return a non-const reference to an element (or similar). There is a complete list in the standard, but common sense should work. vector::operator[], begin(), map::find() are all "Readers". map::operator[] is not a "Reader".

You can have any number of threads engaging in Reader operations at the same time no problem.

If you have any thread engaged in a Writer operation, no other access can occur on the container.

You cannot safely have one Reader and one Writer at the same time. If you have any Writers, you must have excluded all other access.

You can safely have 1337 readers at once.

Operations on elements is somewhat similar, except that Writing to an element need only exclude other access to that element. And you are responsible for making your const method play nice with each other. (the std guarantees that the const methods of the container will only call const methods of the elements)

Note that changing sorting order in a std:: associative container is UB without modifying the container.

An iterator that is not invalidated, where you just operate on the element, will (I believe) count as operations on the element. Only synchronization on that element is required.

Note that std::vector<bool> does not follow the above rules.


If you violate the above rules, the C++ std library does not stop you. It just states there is a race condition -- aka, undefined behavior. Anything can happen. In C++ std library, if you don't need something, you don't pay for it: and a single-threaded use of those containers would not need the weight of synchronization. So you don't get it by default.

A std::shared_timed_mutex or std::experimental::shared_mutex can both be useful to guarantee the above holds. unique_lock for Writing, and shared_lock for Reading. Write access to elements has to be shared_locked on the container guarded, and somehow guarded against overlapping access to the same element without deadlock.


Iterator invalidation rules are relatively orthogonal to the thread-safety rules.

iterator invalidation in map C++

Using an invalidated iterator is undefined behaviour. In such case, anything could happen.

Why do you see the values? The iterator contains a pointer to some piece of memory, by pure accident, this memory has not yet been returned to the system and has not yet been overwritten. This is why you still can see the already "dead" values.

It does not change anything, it remains undefined behaviour, and the next time you run the program, the memory page the map element resided in could already have been returned to the OS again and you get an access violation (segmentation fault)...

Does std::vector::erase() really invalidate the iterator at the point of erase?

Dereferencing an invalidated iterator has undefined results. Your program may crash, it may stop with a runtime error or break in the debugger (if you are running a debug build with a debug version of the STL with iterator debugging/validation) and it may "seem to work", i.e., deliver the value that was erased from the collection.

This is because iterators MAY be implemented as just pointers. This is not necessarily the case, but defining behavior in this situation as undefined allows such an efficient and simple implementation. Invalid iterators implemented as pointers MAY still point to a valid memory location, which MAY still contain the value it previously contained, even though it is logically no longer part of the data structure (collection) it was a part of. There is no validation code which checks if the iterator is valid when it is dereferenced (except sometimes in debug builds).

This is both one of the characteristics strengths and one of the weaknesses of C++, as it gives your program better performance at the cost of stability and security in case your program does something undefined (due to a bug or using unvalidated user input).

Vector iterator invalidation after inserting

First of all. STL iterators are the same insecure as pointers. Good or bad, this is so. We cannot change this. This means that you need to be careful when you are working with iterators.

Now what happens behind the scenes. Both iterators are essentially pointers. First one points to the current beginning of the vector body, the second one to the current end of the vector body. During insertion vector is allowed to move body of the vector to some other place. If vector will move the body, your iterator end will continue pointing to the same place. Even if vector will not move the beginning of the vector (sometimes this happens), your iterator will point to the last valid element (not after that last one as it should in normal case). Again, this will not be what you are expecting.

What is written above is just for understanding. You should never use this knowledge. You should check what actions invalidate what types of iterators and base your code only on what is explicitly allowed in the documentation.



Related Topics



Leave a reply



Submit