How Portable Is End Iterator Decrement

how portable is end iterator decrement?

I think this is the relevant clause:

ISO/IEC 14882:2003 C++ Standard 23.1.1/12 – Sequences

Table 68 lists sequence operations
that are provided for some types of
sequential containers but not others.
An implementation shall provide these
operations for all container types
shown in the "container" column, and
shall implement them so as to take
amortized constant time.


+----------------------------------------------------------------------------+
| Table 68 |
+--------------+-----------------+---------------------+---------------------+
| expression | return type | operational | container |
| | | semantics | |
+--------------+-----------------+---------------------+---------------------+
| a.front() | reference; | *a.begin() | vector, list, deque |
| | const_reference | | |
| | for constant a | | |
+--------------+-----------------+---------------------+---------------------+
| a.back() | reference; | *--a.end() | vector, list, deque |
| | const_reference | | |
| | for constant a | | |
..............................................................................
. . . . .
. . . . .
..............................................................................
| a.pop_back() | void | a.erase(--a.end()) | vector, list, deque |
..............................................................................
. . . . .
. . . . .

So for the containers listed, not only should the iterator returned from end() be decrementable, the decremented iterator should also be dereferencable. (Unless the container is empty, of course. That invokes undefined behavior.)

In fact, vector, list and deque implementations that came with the Visual C++ compiler does it exactly like the table. Of course, that's not to imply that every compiler does it like this:

// From VC++'s <list> implementation

reference back()
{ // return last element of mutable sequence
return (*(--end()));
}

const_reference back() const
{ // return last element of nonmutable sequence
return (*(--end()));
}

Note about the code in the table:

ISO/IEC 14882:2003 C++ Standard 17.3.1.2/6 – Requirements

In some cases the semantic
requirements are presented as C + +
code. Such code is intended as a
specification of equivalence of a
construct to another construct
, not
necessarily as the way the construct
must be implemented.

So while it's true that an implementation may not implement those expressions in terms of begin() and end(), the C++ standard specifies that the two expressions are equivalent. In other words, a.back() and *--a.end() are equivalent constructs according to the above clause. It seems to me that it means that you should be able to replace every instance of a.back() with *--a.end() and vice-versa and have the code still work.


According to Bo Persson, the revision of the C++ standard that I have on hand has a defect with respect to Table 68.

Proposed resolution:

Change the specification in table 68
"Optional Sequence Operations" in
23.1.1/12 for "a.back()" from

*--a.end()

to

{ iterator tmp = a.end(); --tmp; return *tmp; }

and the specification for
"a.pop_back()" from

a.erase(--a.end())

to

{ iterator tmp = a.end(); --tmp; a.erase(tmp); }

It appears that you can still decrement the iterator returned from end() and dereference the decremented iterator, as long as it's not a temporary.

Why can't I decrement std::array::end()?

It depends on how the iterator is defined.

It seems that for the class template std::array the iterator is defined as a pointer. So the functions begin, end. cbegin, cend return just the pointer. Thus as the pointer is returned by value you may not decrease it because an lvalue is required..

For the class template std::vector the iterator is defined as a user-defined class for which the operator --() is defined.

Consider the following demonstrative program

#include <iostream>

class Int
{
public:
Int( int x = 0 ) : x ( x )
{

}

Int & operator --()
{
--x;
return *this;
}

friend std::ostream & operator <<( std::ostream &os, const Int &i )
{
return os << i.x;
}
private:
int x;
};

int f( int x )
{
return x;
}

Int f( Int x )
{
return x;
}

int main()
{
std::cout << --f( Int( 10 ) ) << std::endl;
// error: lvalue required as decrement operand
// std::cout << --f( 10 ) << std::endl;

return 0;
}

Take into account that you can write

auto arrIt = std::prev( arr.end() );

instead of

auto arrIt = --arr.end();

provided that you include header <iterator>.

You can use the operator with the reverse iterators of the class template std::array because the Standard defines them explicitly like

typedef std::reverse_iterator<iterator> reverse_iterator;
typedef std::reverse_iterator<const_iterator> const_reverse_iterator;

and the user-defined class std::reverse_iterator defines the operator --() .

Here is a demonstrative program

#include <iostream>
#include <array>

int main()
{
std::array<double, 2> arr = { { 1, 2 } };

auto it = --arr.rend();

std::cout << *it << std::endl;

return 0;
}

Arithmetic on end() iterator

It is perfectly valid as vector::iterator is a random access iterator. You can perform arithmetic operations on it and it is not platform dependent.

std::vector<double>::iterator it = A.end();
while (it != A.begin()){
--it; //this will skip A.end() and loop will break after processing A.front()
//do something with 'it'
}

But A.end() refers to theoretical past-the-end element, so it does not point to an element and thus shall not be dereferenced. So best practice is to use reverse iterator instead of decrementing end iterator.

for(std::vector<double>::reverse_iterator it = A.rbegin(); it != A.rend(); ++it) {
//do something with 'it'
}

These two loops do the same thing, second is just understandable, cleaner way to do it.

Increment/Decrement for Vector Iterator

The reason for this potential issue is explained quite well by Josuttis:

The reason for this strange problem lies in the fact that iterators of vectors, arrays, and strings might be implemented as ordinary pointers. And for all fundamental data types, such as pointers, you are not allowed to modify temporary values. For structures and classes, however, doing so is allowed.

In other words, it all depends whether std::vector<int>::iterator is defined as a class or is simply a typedef for an int*. Either is allowed by the standard which is why it may cause an issue on some compilers but not others.

When you call coll.begin() a rvalue std::vector<int>::iterator is created. If std::vector<int>::iterator is a class with an implemented prefix operator++ then modification of the rvalue is permitted and so it will compile. However, it std::vector<int>::iterator is a typedef for a pointer to an int, it is an rvalue of a fundamental type and thus may not compile.

Should I allow my custom iterator to go past the end?

It doesn't really matter.

To conform to STL conventions, you need a begin() method which returns an iterator to the start of the collection, and an end() method which returns an iterator to one past the last element. So advancing to end() and comparing for equality is well-defined. However advancing beyond end() is not well-defined. It's better to throw an error, but not necessary, especially if that involves otherwise redundant checks. If people don't use the iterators correctly, you are not responsible for any bugs.

Why does --string::end() compile but --string.size() does not?

I'd like to add that you can customize user-defined member functions like ++ that behave differently when called through lvalue and rvalue references. For example, you could define a custom iterator class that prevents you from calling modifying member functions on a temporary. To do this, you need to use reference qualifiers.

class iterator {
public:
iterator operator++() & { // for lvalues
std::cout << "incrementing...\n";
}
iterator operator++() && = delete; // for rvalues
};

With these qualifiers, you still allow lvalues to be modified:

iterator it = ...;
++it; // totally fine

but you can now prevent temporaries from being modified, which results in a user-defined class that is more consistent with built-in types like the size_t.

++(iterator{}); // ERROR

Live example here

I'm not sure what the standard says about this for std::string's iterator type, but in principle, you could do this for your own iterators and other classes, wherever you think it's always an mistake to modify a temporary.

What's the difference between a sentinel and an end iterator?

Sentinel simply allows the end iterator to have a different type.

The allowed operations on a past-the-end iterator are limited, but this is not reflected in its type. It is not ok to * a .end() iterator, but the compiler will let you.

A sentinel does not have unary dereference, or ++, among other things. It is generally as restricted as the weakest iterators one past the end iterator, but enforced at compile time.

There is a payoff. Often detecting the end state is easier than finding it. With a sentinel, == can dispatch to "detect if the other argument is past the end" at compile time, instead of run time.

The result is that some code that used to be slower than the C equivalent now compiles down to C level speed, such as copying a null terminated string using std::copy. Without sentinels, you either had to scan to find the end before the copy, or pass in iterators with a bool flag saying "I am the end sentinel" (or equivalent), and check it on ==.

There are other similar advantages when working with counting based ranges. In addition, some things like zip ranges1 become easier to express (the end zip sentinel could hold both source sentinels, and return equal if either sentinel does: zip iterators either only compare the first iterator, or compare both).

Another way of thinking about it is that algorithms tend to not use the full richness of the iterator concept on the parameter passed as the past the end iterator, and that iterator is handled way differently in practice. Sentinel means that the caller can exploit that fact, which in turn lets the compiler exploit it easier.


1 A zip range is what you get when you start with 2 or more ranges, and "zip" them together like a zipper. The range is now over tuples of the individual range elements. Advancing a zip iterator advances each of the "contained" iterators, and ditto for dereferencing and comparing.

begin(container)-is defined behavier?

The iterator requirements are rather clear: In 24.2.6 [bidirectional.iterators], Table 110:

--r (Expression) X& (Return type) pre: there exists s such that r == ++s.

Since there is no such s for c.begin(), it can't be decremented without violating the precondition.

Why does dereferencing an iterator make output-stream cut out?

Where did i make a mistake?

This line:

std::cout<<"Before --it: "<<*it<<std::endl;

is de-referencing an iterator that points past the end of the container which is undefined behavior.

Why this case, cuts the output-stream without any exception?

This question comes in 2 parts:

  1. Why does it cut the output stream? Because undefined behavior is undefined. You don't know what will happen. There is plenty of information around about this: http://en.cppreference.com/w/cpp/language/ub

  2. Why no exception? In order to provide an exception to everything that might cause UB, c++ would have to check every de-reference which would be computationally expensive for very little gain. It is up to the user to use the iterator interface correctly. This is the same in the following example:


 int a[5];
a[6] = 10; // Why does c++ allow this when it is clearly a mistake? Because the user is
// responsible for writing good code, not the compiler, and certainly
// not the run-time exception handler.


Related Topics



Leave a reply



Submit