Can Raw Pointers Be Used Instead of Iterators with Stl Algorithms for Containers with Linear Storage

Can raw pointers be used instead of iterators with STL algorithms for containers with linear storage?

One of the features of iterators being based on operator-overloading, is that pointers are already random-access iterators. This was a big design win in the early days of STL, as it made it easier to use algorithms with existing code (as well as making the interface more familiar to programmers). It's perfectly reasonable to wrap an array, add typedef T* iterator; typedef const T* const_iterator, return &array[0] from your begin() and &array[size] from your end(), and then use your container with any iterator-based algorithm. As you already realise, this will work for any container where elements are contiguous in memory (such as an array).

You might implement 'real' iterators if:

  • You have a different-shaped container (such as a tree or list);
  • You want to be able to resize the array without invalidating the iterators;
  • You want to add debugging checks to your iterator use, for example to check if the iterator is used after being invalidated or after the container has been deleted, or to bounds-check;
  • You want to introduce type-safety, and make sure people can't accidentally assign an arbitrary T* to a my_array::iterator.

I'd say this last advantage alone is well worth writing a trivial wrapper class for. If you don't take advantage of C++'s type system by making different kinds of thing have different types, you might as well switch to Javascript :-)

Pointers into elements in a container

The standard specifies when such pointers are invalidated. References into a vector die when you increase its size past capacity or add/remove a preceding element. References into a deque are invalidated if you add/remove from the middle.

Otherwise, references and iterators are safe to keep for the lifespan of the underlying object.

What is the right way of using c++ stl iterators instead of traditional pointers?

The difference between

const vector<double>::iterator

and

vector<double>::const_iterator

is roughly the same as between double * const v and const double *v:

  • the first says that the iterator must remain constant, but what it points to can be changed
  • the second says that the iterator itself is changeable, but what it points to is const.

If you rewrite the function as

void f(const vector<double>::iterator first, const vector<double>::iterator last) {
for(vector<double>::iterator it = first; it != last; it++)
*it = 10;
}

it would compile and run correctly.

C++ create a char* iterator

There are several types of iterators, specified by what properties they have (functionality they support) - there is a nice overview here http://www.cplusplus.com/reference/iterator/

Random access iterators require to implement all the iterator functionality seen in that table.

Raw pointers do in fact support all the operations and are therefore random access operators iterators and can be used for all STL algorithms and containers. Also discussed here Can raw pointers be used instead of iterators with STL algorithms for containers with linear storage?.

Although not necessary, it might still be useful to implement an iterator wrapper for your pointers - this is also discussed in the answers to the question above.

Which STL algorithms are safe to use with single-pass input iterators?

Algorithms which operate on InputIterators and OutputIterators may by contract rely only on a single pass through the range they operate over.

From cppreference1:

On InputIterators:

An InputIterator is an Iterator that can read from the pointed-to element. InputIterators only guarantee validity for single pass algorithms: once an InputIterator i has been incremented, all copies of its previous value may be invalidated.

and on OutputIterators

Assignment through the same value of an output iterator happens only once: algorithms on output iterators must be single-pass algorithms.


This is a list of those algorithms which have arguments of InputIterator and OutputIterator:

  • all_of, any_of and none_of
  • for_each
  • count and count_if
  • mismatch
  • equal
  • find, find_if and find_if_not
  • find_first_of offers InputIterator behavior for the first iterator pair
  • copy and copy_if
  • copy_n
  • move
  • fill_n
  • transform
  • generate_n
  • remove_copy and remove_copy_if
  • replace_copy and replace_copy_if
  • unique_copy
  • is_partitioned
  • partition_copy
  • partial_sort_copy, offers InputIterator behavior for the first iterator pair
  • merge
  • includes
  • set_difference
  • set_intersection
  • set_symmetric_difference
  • set_union
  • lexicographical_compare
  • accumulate
  • inner_product
  • adjacent_difference
  • partial_sum

added in C++17:

  • for_each_n
  • sample either the input or the output iterators may be single pass, but not both--either the input must satisfy ForwardIterator or the output must satisfy RandomAccessIterator.
  • exclusive_scan
  • inclusive_scan
  • transform_reduce
  • transform_exclusive_scan
  • transform_inclusive_scan
  • uninitialized_move
  • uninitialized_move_n

Interestingly, there are three algorithms not on this list that one might expect: max_element, min_element and minmax_element are the standard's algorithms for finding the maximum, minimum and both the minimum and maximum value of a range. One might expect them to iterate over their given range only a single time, and thus require InputIterator arguments. Instead, they require ForwardIterator arguments, because rather than returning the value of the chosen element, they return an iterator to it. Since this violates the single pass requirement of an InputIterator, these algorithms are naturally left with a ForwardIterator.


1. cppreference is backed on both counts by the standard (n4140 draft).

§24.2.3 [input.iterators] states that after ++r where r is the input iterator, "any copies of the previous value of r are no longer required either to be dereferenceable or to be in the domain of =="

§24.2.4 [output.iterators] states for both the expressions *r = o and *r++ = o "After this operation r is not required to be dereferenceable."

Both sections contain notes that mention that the iterator in question is safe for single pass ranges, stating that "Algorithms on (input|output) iterators should never attempt to pass through the same iterator twice. They should be single pass algorithms." Of course, notes aren't binding.

per-container optimizations for functions that operate on iterators

I have also always wished for a way to do this, but the answer is kind-of, but not really. This isn't possible in general, because of the template deduction rules.
Each iterator may be a member of a class, but since multiple classes may have the same iterators, if a function receives these iterators, it is literally and theoretically impossible to deduce which container it came from. That hinders things somewhat.

If you don't need the container type and can work with the iterator type alone, then yes, std::sort could optimize based on the underlying container. But no, std::sort doesn't have a special algorithm for node based containers in any C++ standard library that I know of.

The containers themselves sometimes have specialized versions, see std::list<T>::sort, but the generic std::sort can't make use of that since it works on any arbitrary range, and std::list<T>::sort only works on the entire container.


I really wish they would. Also a specialization of std::lower_bound and similar when called on the tree-based containers would be awesome.



Related Topics



Leave a reply



Submit