What Is an Iterator in General

What is an iterator in general?

In C++ an Iterator is a concept, not a concrete (or abstract) type, but any type that obeys certain iterator like rules.

For example iterators generally can be incremented ++i. They can be accessed (dereferenced) *i to obtain the value they currently point to. They are essentially abstractions of a pointer.

In the Standard Library's containers & algorithms there are different kinds of iterators with different properties. Their properties are listed here:

https://en.cppreference.com/w/cpp/iterator

So when writing algorithms in C++ that accept iterators, generally just accept generic template parameters and use the appropriate iterator properties in the function. The compiler will complain if the user passes something to your function that doesn't obey the iterator rules:

template<typename Iterator>
void my_algorithm(Iterator begin, Iterator end)
{
for(; begin != end; ++begin)
std::cout << *begin << '\n';
}

You can add a whole bunch of specific checks to make sure the user passed something sensible, but that's too broad for this question.

Note:

Whilst currently concepts, such as Iterator, are merely a set of agreed upon semantic properties in the Standard that programmers must follow, a more comprehensive solution that will formalize such concepts (in code) is intended for the next version of the Standard, C++20.

What are Iterators, C++?

Iterators are a way of traversing a collection of objects. Typically, they allow you to access an STL (Standard Template Library) container sequentially in ways similar to accessing a classical C array with a pointer. To access an object through an iterator, you dereference it like a C pointer. To access the next object in a collection, you use the increment (++) operator. Some containers have multiple kinds of iterators that allow you to traverse the collection in different ways.

What exactly is an iterator in C++?

What exactly is an iterator in C++?

Iterator is a concept. The concept describes a type with particular properties and operations with particular behaviours. A type that conforms to the "iterator" concept is said to be an iterator, and similarly objects of such type are also iterators.

More specifically, iterator is a generalisation of a pointer. It generally points to an entity (usually an object), and you can indirect through the iterator to access the pointed object (using the unary operator * which is called the indirection operator) and there is a way to modify the iterator to point to the "next" entity (using the operator ++).

we can conceptualize an iterator as a box that holds values

"holds" may be misleading. The referred entity isn't necessarily nor typically stored in the iterator. It is generally stored elsewhere and the iterator "knows" where the entity is.

Instead of a "box", I would describe it as a sign with with instructions on how to find the entity. As a visual metaphor, it "points" to the entity.

The actual box doesn't change, and that's why (&it) stays the same in both cases

To be clear, the value of it did change. But it is still the same iterator object, so its address is the same.

What are iterator, iterable, and iteration?

Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.

In Python, iterable and iterator have specific meanings.

An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.

An iterator is an object with a next (Python 2) or __next__ (Python 3) method.

Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.

A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.

Understanding Iterators in the STL

There are three building blocks in the STL:

  • Containers
  • Algorithms
  • Iterators

At the conceptual level containers hold data. That by itself isn't very useful, because you want to do something with the data; you want to operate on it, manipulate it, query it, play with it. Algorithms do exactly that. But algorithms don't hold data, they have no data -- they need a container for this task. Give a container to an algorithm and you have an action going on.

The only problem left to solve is how does an algorithm traverse a container, from a technical point of view. Technically a container can be a linked list, or it can be an array, or a binary tree, or any other data structure that can hold data. But traversing an array is done differently than traversing a binary tree. Even though conceptually all an algorithm wants is to "get" one element at a time from a container, and then work on that element, the operation of getting the next element from a container is technically very container-specific.

It appears as if you'd need to write the same algorithm for each container, so that each version of the algorithm has the correct code for traversing the container. But there's a better solution: ask the container to return an object that can traverse over the container. The object would have an interface algorithms know. When an algorithm asks the object to "get the next element" the object would comply. Because the object came directly from the container it knows how to access the container's data. And because the object has an interface the algorithm knows, we need not duplicate an algorithm for each container.

This is the iterator.

The iterator here glues the algorithm to the container, without coupling the two. An iterator is coupled to a container, and an algorithm is coupled to the iterator's interface. The source of the magic here is really template programming. Consider the standard copy() algorithm:

template<class In, class Out>
Out copy(In first, In last, Out res)
{
while( first != last ) {
*res = *first;
++first;
++res;
}
return res;
}

The copy() algorithm takes as parameters two iterators templated on the type In and one iterator of type Out. It copies the elements starting at position first and ending just before position last, into res. The algorithm knows that to get the next element it needs to say ++first or ++res. It knows that to read an element it needs to say x = *first and to write an element it needs to say *res = x. That's part of the interface algorithms assume and iterators commit to. If by mistake an iterator doesn't comply with the interface then the compiler would emit an error for calling a function over type In or Out, when the type doesn't define the function.

what is the type of an iterator in STL?

24.2.1/1 [iterator.requirements.general] sums it up nicely:

Iterators are a generalization of pointers that allow a C++ program to work with different data structures (containers) in a uniform manner. To be able to construct template algorithms that work correctly and efficiently on different types of data structures, the library formalizes not just the interfaces but also the semantics and complexity assumptions of iterators.

The phrase "generalization of pointers" means that pointers are iterators. std::vector<T>::iterator is allowed to be a typedef T *. However, most iterators achieve the interface by operator overloading. (Note that iterators don't need to belong to containers, either.)

Such language is very typical of the way the C++ standard is written. It describes how things behave, but avoids defining interfaces in terms of base classes. There are various kinds of iterators: input, output, forward, bidirectional, and random-access. Each has a different specification, and although random-access is a strict superset of the bidirectional interface, they are totally unrelated in the C++ type system.

An iterator can be any class with ++ and * overloaded, and a valid specialization of std::iterator_traits. There is a base class std::iterator which works with std::iterator_traits to define the necessary interface. It is a good case study in C++ generic programming and traits classes.

Is an iterator in C++ a pointer?

The short answer is:

  • Pointer is a kind of iterator.
  • Pointer can be therefore used as an iterator.
  • Pointer has properties other than iterator.

History

Historically, we have C pointer, and it is adapted into C++ when C++ is invented. Pointer represents a location in memory, therefore can be used as a location in an array.

Later, in 1990s, an idea called "iterator concept" is introduced to C++. The "iterator concept" is related to a library called STL (which is later absorbed into the Standard Library) and a paradigm called "generic programming". The iterator concept is inspired from C pointer to represent a location in containers like vector, deque and others, just like how C pointer represent location in array. The iterator concept is carefully engineered to be compatible with C pointer, hence we can nowadays say C pointer models iterator concept.

Iterator concept

A simplified way to understand iterator concept is that, if a data type suppports a list of operations and behaviours, such that it represent a location in a container, and enable some kind of access to the element, it can be called an iterator.

With carefull design of iterator concept, C pointer fulfill that list. Pointer is therefore a kind of iterator.

Iterator concept being just a set of requirement on types, means that you can create your own iterator through C++ power of data abstraction.

Other properties of pointer

Pointer exhibits other properties, and they are not related to iterator concept.

A significant use of pointer is to express reference semantics, i.e. to refer to an object in a remote memory location. This usage of pointer is later considered unsafe, and causes the invention of "smart pointer". By comparing smart pointers and iterators, we can find that they are totally unrelated concepts.

Another use of pointer is to refer to a raw memory location. This is completely unsafe for application programming, but is a essential tool for microcontroller programming to manipulate hardware.

In Java (or in general), are iterators supposed to be reusable or designed for one-time use?

As others have pointed out, there is no way to reset an iterator to the start of the collection, so it is not reusable.

That's true not only of java, but also of C#.

You may be thinking that you could create an iterator once as a member of your collection class, and each time a new iterator is requested, you can reset it and return it.

That won't work either, because a caller of your collection class may start an iteration nested within another iteration, so the nested iteration would interfere with the outer iteration.

Defining iterator of my own container

The C++ specification on what exactly an STL container is mandates that any STL container type have several different fields available. Some, like begin() and end(), are functions, while others, like iterator, are types. These restrictions also apply to iterators. This allows C++ template functions to introspect on their types of their arguments to look up more properties. For example, all STL iterator types must define an iterator_category field containing a type encoding their capabilities. This way, the STL algorithms can have different implementations of different functions based on the power of the iterators they accept. A class example is the distance function, which takes two iterators and returns the number of spaces between them. If the input is a lowly ForwardIterator or BidirectionalIterator this works by marching the iterators forward and counting how many steps were took, which runs in O(n). If the input is a RandomAccessIterator, then the iterators can just be subtracted to get the result in O(1).

Prior to C++17, the recommendation was to include the <iterator> header and inherit from std::iterator to automatically generate the necessary nested types you’d need (reference, pointer, etc.). That type is now deprecated, and so you’ll need to manually either add typedefs or specialize std::iterator_traits to export this information.

As for what operators you need to overload, at a bare minimum you need to get ++ (prefix and postfix), ==, !=, * (pointer dereference), and -> defined. All iterator types support this. For bidirectional iterators or higher, you should also have -- defined (prefix and postfix). Finally, for random-access iterators, you should support [], +, +=, - (back up many steps and subtract two iterators), -=, <, >, <=, and >=.

Hope this helps!



Related Topics



Leave a reply



Submit