C++11 Range Based Loop: How Does It Really Work

how C++ ranged based loops work internally

The C++11 Standard actually gives the equivalent traditional loop code, which is a pretty rare approach for Standardese. You'll find it in section 6.5.4:

Sample Image

In the expansion, it's clear that the end() value from before the loop starts is saved and checked later. Unfortunately, that iterator is invalidated by the first insert() call. The rule on iterator invalidation:

If no reallocation happens, all the iterators and references before the insertion point remain valid.

Clearly end() is not "before the insertion point".

Since the behavior of insert(p, rv) is "Inserts a copy of rv before p.", the insertion point is expressly before the iterator p (which is v.begin() here). So the loop iterator __begin is also not "before the insertion point" on the first pass.

What is the correct way of using C++11's range-based for?

TL;DR: Consider the following guidelines:

For observing the elements, use the following syntax:
```
for (const auto& elem : container)    // capture by const reference
```
- If the objects are cheap to copy (like ints, doubles, etc.),
  it's possible to use a slightly simplified form:
```
  for (auto elem : container)    // capture by value
```

For modifying the elements in place, use:

for (auto& elem : container)    // capture by (non-const) reference

If the container uses "proxy iterators" (like std::vector<bool>), use:
```
  for (auto&& elem : container)    // capture by &&
```

Of course, if there is a need to make a local copy of the element inside the loop body, capturing by value (for (auto elem : container)) is a good choice.

Detailed Discussion

Let's start differentiating between observing the elements in the container
vs. modifying them in place.

Observing the elements

Let's consider a simple example:

vector<int> v = {1, 3, 5, 7, 9};

for (auto x : v)
    cout << x << ' ';

The above code prints the elements (ints) in the vector:

1 3 5 7 9

Now consider another case, in which the vector elements are not just simple integers,
but instances of a more complex class, with custom copy constructor, etc.

// A sample test class, with custom copy semantics.
class X
{
public:
    X() 
        : m_data(0) 
    {}
    
    X(int data)
        : m_data(data)
    {}
    
    ~X() 
    {}
    
    X(const X& other) 
        : m_data(other.m_data)
    { cout << "X copy ctor.\n"; }
    
    X& operator=(const X& other)
    {
        m_data = other.m_data;       
        cout << "X copy assign.\n";
        return *this;
    }
       
    int Get() const
    {
        return m_data;
    }
    
private:
    int m_data;
};

ostream& operator<<(ostream& os, const X& x)
{
    os << x.Get();
    return os;
}

If we use the above for (auto x : v) {...} syntax with this new class:

vector<X> v = {1, 3, 5, 7, 9};

cout << "\nElements:\n";
for (auto x : v)
{
    cout << x << ' ';
}

the output is something like:

[... copy constructor calls for vector<X> initialization ...]

Elements:
X copy ctor.
1 X copy ctor.
3 X copy ctor.
5 X copy ctor.
7 X copy ctor.
9

As it can be read from the output, copy constructor calls are made during range-based for loop iterations.

This is because we are capturing the elements from the container by value
(the auto x part in for (auto x : v)).

This is inefficient code, e.g., if these elements are instances of std::string,
heap memory allocations can be done, with expensive trips to the memory manager, etc.
This is useless if we just want to observe the elements in a container.

So, a better syntax is available: capture by const reference, i.e. const auto&:

vector<X> v = {1, 3, 5, 7, 9};

cout << "\nElements:\n";
for (const auto& x : v)
{ 
    cout << x << ' ';
}

Now the output is:

 [... copy constructor calls for vector<X> initialization ...]

Elements:
1 3 5 7 9

Without any spurious (and potentially expensive) copy constructor call.

So, when observing elements in a container (i.e., for read-only access),
the following syntax is fine for simple cheap-to-copy types, like int, double, etc.:

for (auto elem : container)

Else, capturing by const reference is better in the general case,
to avoid useless (and potentially expensive) copy constructor calls:

for (const auto& elem : container)

Modifying the elements in the container

If we want to modify the elements in a container using range-based for,
the above for (auto elem : container) and for (const auto& elem : container)
syntaxes are wrong.

In fact, in the former case, elem stores a copy of the original
element, so modifications done to it are just lost and not stored persistently
in the container, e.g.:

vector<int> v = {1, 3, 5, 7, 9};
for (auto x : v)  // <-- capture by value (copy)
    x *= 10;      // <-- a local temporary copy ("x") is modified,
                  //     *not* the original vector element.

for (auto x : v)
    cout << x << ' ';

The output is just the initial sequence:

1 3 5 7 9

Instead, an attempt of using for (const auto& x : v) just fails to compile.

g++ outputs an error message something like this:

TestRangeFor.cpp:138:11: error: assignment of read-only reference 'x'
          x *= 10;
            ^

The correct approach in this case is capturing by non-const reference:

vector<int> v = {1, 3, 5, 7, 9};
for (auto& x : v)
    x *= 10;

for (auto x : v)
    cout << x << ' ';

The output is (as expected):

10 30 50 70 90

This for (auto& elem : container) syntax works also for more complex types,
e.g. considering a vector<string>:

vector<string> v = {"Bob", "Jeff", "Connie"};

// Modify elements in place: use "auto &"
for (auto& x : v)
    x = "Hi " + x + "!";
    
// Output elements (*observing* --> use "const auto&")
for (const auto& x : v)
    cout << x << ' ';

the output is:

Hi Bob! Hi Jeff! Hi Connie!

The special case of proxy iterators

Suppose we have a vector<bool>, and we want to invert the logical boolean state
of its elements, using the above syntax:

vector<bool> v = {true, false, false, true};
for (auto& x : v)
    x = !x;

The above code fails to compile.

g++ outputs an error message similar to this:

TestRangeFor.cpp:168:20: error: invalid initialization of non-const reference of
 type 'std::_Bit_reference&' from an rvalue of type 'std::_Bit_iterator::referen
ce {aka std::_Bit_reference}'
     for (auto& x : v)
                    ^

The problem is that std::vector template is specialized for bool, with an
implementation that packs the bools to optimize space (each boolean value is
stored in one bit, eight "boolean" bits in a byte).

Because of that (since it's not possible to return a reference to a single bit),
vector<bool> uses a so-called "proxy iterator" pattern.
A "proxy iterator" is an iterator that, when dereferenced, does not yield an
ordinary bool &, but instead returns (by value) a temporary object,
which is a proxy class convertible to bool.
(See also this question and related answers here on StackOverflow.)

To modify in place the elements of vector<bool>, a new kind of syntax (using auto&&)
must be used:

for (auto&& x : v)
    x = !x;

The following code works fine:

vector<bool> v = {true, false, false, true};

// Invert boolean status
for (auto&& x : v)  // <-- note use of "auto&&" for proxy iterators
    x = !x;

// Print new element values
cout << boolalpha;        
for (const auto& x : v)
    cout << x << ' ';

and outputs:

false true true false

Note that the for (auto&& elem : container) syntax also works in the other cases
of ordinary (non-proxy) iterators (e.g. for a vector<int> or a vector<string>).

(As a side note, the aforementioned "observing" syntax of for (const auto& elem : container) works fine also for the proxy iterator case.)

Summary

The above discussion can be summarized in the following guidelines:

For observing the elements, use the following syntax:
```
for (const auto& elem : container)    // capture by const reference
```
- If the objects are cheap to copy (like ints, doubles, etc.),
  it's possible to use a slightly simplified form:
```
  for (auto elem : container)    // capture by value
```

For modifying the elements in place, use:

for (auto& elem : container)    // capture by (non-const) reference

If the container uses "proxy iterators" (like std::vector<bool>), use:
```
  for (auto&& elem : container)    // capture by &&
```

Of course, if there is a need to make a local copy of the element inside the loop body, capturing by value (for (auto elem : container)) is a good choice.

Additional notes on generic code

In generic code, since we can't make assumptions about generic type T being cheap to copy, in observing mode it's safe to always use for (const auto& elem : container).

(This won't trigger potentially expensive useless copies, will work just fine also for cheap-to-copy types like int, and also for containers using proxy-iterators, like std::vector<bool>.)

Moreover, in modifying mode, if we want generic code to work also in case of proxy-iterators, the best option is for (auto&& elem : container).

(This will work just fine also for containers using ordinary non-proxy-iterators, like std::vector<int> or std::vector<string>.)

So, in generic code, the following guidelines can be provided:

For observing the elements, use:
```
for (const auto& elem : container)
```
For modifying the elements in place, use:
```
for (auto&& elem : container)
```

Can range-based C++11 for do/check extra operations/conditions?

Unfortunately, you can't put the increment into the range based for loop. However, in your specific case - as std::vector stores its elements contigously in memory - you can simulate option 2 by falling back to pointers (thanks to @M.M and @Jarod42 for corrections and improvements):

for ( const int& val : v )  {
    std::cout << "v at index " << &val-v.data() << " is " << val; 
}

more generic:

for ( const auto& val : v )  {
    std::cout << "v at index " << std::addressof(val)-v.data() << " is " << val; 
}

The other thing you can do is to write a index_range class, that represents a collections of indexes over which you can iterate in your range based for loop:

struct index_range_it {
    size_t idx;
    size_t operator*(){
        return idx;
    }
    index_range_it& operator++() {
        idx++;
        return (*this);
    }
};

bool operator!=(index_range_it l,index_range_it r) {
    return l.idx != r.idx;
}

struct index_range {
    size_t size;
    index_range_it end(){return index_range_it{size};}
    index_range_it begin(){return index_range_it{0};}
};

int main()
{
    for (auto i: index_range{v.size()}){
        std::cout << "v at index " << i << " is " << v[i]; 
    }        
}

A full fledged implementation of this idea can be found e.g. here

Such a range can then also be composed to something, where the iterator returns a proxy object containing the index as well as a reference to the current object and with c++17's structured binding that would be even more convenient to use.

Can C++11 and C++17 Range-Based For Loop iterate to a specific position instead of full range of the map?

You either need an external counter to make early exit, eg:

int n = 0;
for(auto&& [k, v] : map)
{
    if(++n > 10) break;
    std::cout << k << ": " << v << std::endl;
}

Or, if you are not afraid of copying the map, you can do:

auto copy = std::map<...>{map.begin(), std::next(map.begin(), 10)};
for(auto&& [k, v] : copy)
{
    std::cout << k << ": " << v << std::endl;
}

Finally, if you can use C++20, then you can simply do this:

#include <ranges>
for(auto&& [k, v] : map | std::views::take(10))
{
    std::cout << k << ": " << v << std::endl;
}

What is the meaning of range-based for loop is a C++11 extension and what is expected expression?

warning: range-based for loop is a C++11 extension [-Wc++11-extensions]

What this means is that you are compiling your code with a version of C++ selected that is prior to C++11, but as an extension, your compiler is going to allow you to do it anyway. However, the code is not portable because not all compilers will allow it. That is why it is warning you.

If you want your code to be portable, you should either stop using the range-based-for, or compile with the C++11 version (or greater) of the C++ Standard selected.

This may mean adding a flag like -std=c++11 to your compile command, but it varies by compiler and development environment.

I would also advise configuring your compiler to adhere to strict standards if you want portable code. For example, with GCC or clang this can be done by adding -pedantic-errors to the compile command.

Does a C++11 range-based for loop condition get evaluated every cycle?

It is only evaluated once. The standard defines a range-based for statement as equivalent to:

{
    auto && __range = range-init;
    for ( auto __begin = begin-expr, __end = end-expr; __begin != __end; ++__begin ) {
        for-range-declaration = *__begin;
        statement
    }
}

where range-init is the expression (surrounded by parentheses) or braced-init-list after the :

Is there any advantage of using a range for loop rather than an iterator?

Iterators predate range-based for loops, so they used to be the only of these two alternatives available. Nowadays, range-based for has mostly replaced iterators when dealing with a simple loop.

However, iterators can be used in various other contexts as well. For example, you can do std::sort(v.begin(), std::next(v.begin(), 5)) to sort the first five elements of a vector while leaving the rest of it alone.

Going back to iterating over a whole container:

If you can accomplish what you want with a range-based for, then it leads to more legible code, so they are preferable by default.
If you need iterators for some reason, such as using an algorithm that requires them, or because you need to jump ahead or back while iterating, then use those instead.

Also: In the later case, you can/should still use auto when declaring the iterator:

for(auto it = just_a_vec.begin(); it < just_a_vec.end(); it++) {
}

Edit: as asked: here's a simple, if a bit contrived, example where an iterator-based loop can still be useful:

// adds all values in the vector, but skips over twos values when encountering a 0
// e.g.: {1,2,0,4,5,2} => 5
int my_weird_accum(const std::vector<int>& data) {
  int result = 0; 
 
  for(auto it = data.begin(); it != data.end(); ++it) {
    auto v = *it;
    result += v;

    if(v == 0) {
      // skip over the next two
      assert(std::distance(it, data.end()) > 2);
      std::advance(it, 2);
    }
  }
  return 0;
}

C++ for loop and range-based loop performance

Here's a crude test. I'm not saying this is a definitive answer as to which is faster, but it seems to me in this particular case, the gcc compiler is able to optimize both loops to roughly the same performance level. You can definitely improve on the testing method, if you wish.

On my system (Ubuntu 14.04, some sort of i7, 8 GB DDR3, gcc):

Without optimization (g++ main.cpp -std=c++11):

Old-fashioned loop: 5.45131 seconds.

Range-based loop: 9.90306 seconds.

With optimization (g++ main.cpp -O3 -std=c++11):

Old-fashioned loop: 0.469001 seconds.

Range-based loop: 0.467045 seconds.

#include <iostream>
#include <vector>
#include <time.h>
using namespace std;

double time_elapsed(timespec& start, timespec& end)
{
    return ((1e9 * end.tv_sec + end.tv_nsec) - 
            (1e9 * start.tv_sec + start.tv_nsec)) / 1.0e9;
}

int main()
{
    vector<int> v(1e9, 42);

    timespec start, end;

//  Old-fashioned loop.
    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    size_t size = v.size();
    for (size_t i = 0; i < size; i++)
    {
        v[i] *= v[i];
    }
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);

    cout << "Old-fashioned loop: " << time_elapsed(start, end) << " seconds\n";

//  Range-based loop.
    clock_gettime(CLOCK_MONOTONIC_RAW, &start); 
    for (int& val : v)
    {
        val *= val;
    }
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);

    cout << "Range-based loop: " << time_elapsed(start, end) << " seconds.\n";
}

How to use the range based for loop iterator?

I want to replace that for loop with a range based one.

That would not work for your use case. The range-for loop works when you only need the value of the object that you get by dereferencing the iterator. It is not going to work for your case since you are using the iterator in the loop. There is no standard mechanism by which you can get the iterator from the value.