Using C++11 Range-Based for Loop Correctly in Qt

Using C++11 range-based for loop correctly in Qt

template<class T>
std::remove_reference_t<T> const& as_const(T&&t){return t;}

might help. An implicitly shared object returned an rvalue can implicitly detect write-shraring (and detatch) due to non-const iteration.

This gives you:

for(auto&&item : as_const(foo()))
{
}

which lets you iterate in a const way (and pretty clearly).

If you need reference lifetime extension to work, have 2 overloads:

template<class T>
T const as_const(T&&t){return std::forward<T>(t);}
template<class T>
T const& as_const(T&t){return t;}

But iterating over const rvalues and caring about it is often a design error: they are throw away copies, why does it matter if you edit them? And if you behave very differently based off const qualification, that will bite you elsewhere.

Should I use qAsConst on QHash::keys() in a C++11 range-based for

No, it does not even compile (Qt5.9 - MSVC 2015) :

QMap<QString, int> map;
for(auto key : qAsConst(map.keys())) {
// do something with key or map.value(key)
}

error: use of deleted function 'void qAsConst(const T&&) [with T =
QList]'

What is the correct way of using C++11's range-based for?

TL;DR: Consider the following guidelines:

  1. For observing the elements, use the following syntax:

    for (const auto& elem : container)    // capture by const reference
    • If the objects are cheap to copy (like ints, doubles, etc.),
      it's possible to use a slightly simplified form:

        for (auto elem : container)    // capture by value
  2. For modifying the elements in place, use:

    for (auto& elem : container)    // capture by (non-const) reference
    • If the container uses "proxy iterators" (like std::vector<bool>), use:

        for (auto&& elem : container)    // capture by &&

Of course, if there is a need to make a local copy of the element inside the loop body, capturing by value (for (auto elem : container)) is a good choice.



Detailed Discussion

Let's start differentiating between observing the elements in the container
vs. modifying them in place.

Observing the elements

Let's consider a simple example:

vector<int> v = {1, 3, 5, 7, 9};

for (auto x : v)
cout << x << ' ';

The above code prints the elements (ints) in the vector:

1 3 5 7 9

Now consider another case, in which the vector elements are not just simple integers,
but instances of a more complex class, with custom copy constructor, etc.

// A sample test class, with custom copy semantics.
class X
{
public:
X()
: m_data(0)
{}

X(int data)
: m_data(data)
{}

~X()
{}

X(const X& other)
: m_data(other.m_data)
{ cout << "X copy ctor.\n"; }

X& operator=(const X& other)
{
m_data = other.m_data;
cout << "X copy assign.\n";
return *this;
}

int Get() const
{
return m_data;
}

private:
int m_data;
};

ostream& operator<<(ostream& os, const X& x)
{
os << x.Get();
return os;
}

If we use the above for (auto x : v) {...} syntax with this new class:

vector<X> v = {1, 3, 5, 7, 9};

cout << "\nElements:\n";
for (auto x : v)
{
cout << x << ' ';
}

the output is something like:

[... copy constructor calls for vector<X> initialization ...]

Elements:
X copy ctor.
1 X copy ctor.
3 X copy ctor.
5 X copy ctor.
7 X copy ctor.
9

As it can be read from the output, copy constructor calls are made during range-based for loop iterations.

This is because we are capturing the elements from the container by value
(the auto x part in for (auto x : v)).

This is inefficient code, e.g., if these elements are instances of std::string,
heap memory allocations can be done, with expensive trips to the memory manager, etc.
This is useless if we just want to observe the elements in a container.

So, a better syntax is available: capture by const reference, i.e. const auto&:

vector<X> v = {1, 3, 5, 7, 9};

cout << "\nElements:\n";
for (const auto& x : v)
{
cout << x << ' ';
}

Now the output is:

 [... copy constructor calls for vector<X> initialization ...]

Elements:
1 3 5 7 9

Without any spurious (and potentially expensive) copy constructor call.

So, when observing elements in a container (i.e., for read-only access),
the following syntax is fine for simple cheap-to-copy types, like int, double, etc.:

for (auto elem : container) 

Else, capturing by const reference is better in the general case,
to avoid useless (and potentially expensive) copy constructor calls:

for (const auto& elem : container) 

Modifying the elements in the container

If we want to modify the elements in a container using range-based for,
the above for (auto elem : container) and for (const auto& elem : container)
syntaxes are wrong.

In fact, in the former case, elem stores a copy of the original
element, so modifications done to it are just lost and not stored persistently
in the container, e.g.:

vector<int> v = {1, 3, 5, 7, 9};
for (auto x : v) // <-- capture by value (copy)
x *= 10; // <-- a local temporary copy ("x") is modified,
// *not* the original vector element.

for (auto x : v)
cout << x << ' ';

The output is just the initial sequence:

1 3 5 7 9

Instead, an attempt of using for (const auto& x : v) just fails to compile.

g++ outputs an error message something like this:

TestRangeFor.cpp:138:11: error: assignment of read-only reference 'x'
x *= 10;
^

The correct approach in this case is capturing by non-const reference:

vector<int> v = {1, 3, 5, 7, 9};
for (auto& x : v)
x *= 10;

for (auto x : v)
cout << x << ' ';

The output is (as expected):

10 30 50 70 90

This for (auto& elem : container) syntax works also for more complex types,
e.g. considering a vector<string>:

vector<string> v = {"Bob", "Jeff", "Connie"};

// Modify elements in place: use "auto &"
for (auto& x : v)
x = "Hi " + x + "!";

// Output elements (*observing* --> use "const auto&")
for (const auto& x : v)
cout << x << ' ';

the output is:

Hi Bob! Hi Jeff! Hi Connie!

The special case of proxy iterators

Suppose we have a vector<bool>, and we want to invert the logical boolean state
of its elements, using the above syntax:

vector<bool> v = {true, false, false, true};
for (auto& x : v)
x = !x;

The above code fails to compile.

g++ outputs an error message similar to this:

TestRangeFor.cpp:168:20: error: invalid initialization of non-const reference of
type 'std::_Bit_reference&' from an rvalue of type 'std::_Bit_iterator::referen
ce {aka std::_Bit_reference}'
for (auto& x : v)
^

The problem is that std::vector template is specialized for bool, with an
implementation that packs the bools to optimize space (each boolean value is
stored in one bit, eight "boolean" bits in a byte).

Because of that (since it's not possible to return a reference to a single bit),
vector<bool> uses a so-called "proxy iterator" pattern.
A "proxy iterator" is an iterator that, when dereferenced, does not yield an
ordinary bool &, but instead returns (by value) a temporary object,
which is a proxy class convertible to bool.
(See also this question and related answers here on StackOverflow.)

To modify in place the elements of vector<bool>, a new kind of syntax (using auto&&)
must be used:

for (auto&& x : v)
x = !x;

The following code works fine:

vector<bool> v = {true, false, false, true};

// Invert boolean status
for (auto&& x : v) // <-- note use of "auto&&" for proxy iterators
x = !x;

// Print new element values
cout << boolalpha;
for (const auto& x : v)
cout << x << ' ';

and outputs:

false true true false

Note that the for (auto&& elem : container) syntax also works in the other cases
of ordinary (non-proxy) iterators (e.g. for a vector<int> or a vector<string>).

(As a side note, the aforementioned "observing" syntax of for (const auto& elem : container) works fine also for the proxy iterator case.)

Summary

The above discussion can be summarized in the following guidelines:

  1. For observing the elements, use the following syntax:

    for (const auto& elem : container)    // capture by const reference
    • If the objects are cheap to copy (like ints, doubles, etc.),
      it's possible to use a slightly simplified form:

        for (auto elem : container)    // capture by value
  2. For modifying the elements in place, use:

    for (auto& elem : container)    // capture by (non-const) reference
    • If the container uses "proxy iterators" (like std::vector<bool>), use:

        for (auto&& elem : container)    // capture by &&

Of course, if there is a need to make a local copy of the element inside the loop body, capturing by value (for (auto elem : container)) is a good choice.



Additional notes on generic code

In generic code, since we can't make assumptions about generic type T being cheap to copy, in observing mode it's safe to always use for (const auto& elem : container).

(This won't trigger potentially expensive useless copies, will work just fine also for cheap-to-copy types like int, and also for containers using proxy-iterators, like std::vector<bool>.)

Moreover, in modifying mode, if we want generic code to work also in case of proxy-iterators, the best option is for (auto&& elem : container).

(This will work just fine also for containers using ordinary non-proxy-iterators, like std::vector<int> or std::vector<string>.)

So, in generic code, the following guidelines can be provided:

  1. For observing the elements, use:

    for (const auto& elem : container)
  2. For modifying the elements in place, use:

    for (auto&& elem : container)

What is a container detachement in the C++11 ranged based loop over QList? Is it a performance only problem?

Copy-on-write (=Implicit sharing) concept

It is important to understand that copy-on-write (= implicit shared) classes externally behaves like "normal" classes that perform a deep copy of their data. They only postpone this (potentially) expensive copy operation as long as possible. A deep copy is made (=detaching), only if the following sequence occurs:

  • The list is implicitly shared, i.e. the object is copied by value (and at least 2 instances still exist)
  • A non-const member function is accessed on the implicitly shared object.

Your questions

  1. Only if the container is shared (by another copy on write instance of this list), a copy of the list will be made (as a non-const member is invoked on the list object). Note that the C++ range loop is just a short hand for a normal iterator based for loop (see [1] for the exact equivalence which depends on the exactly used C++ version):

    for (QList<QString>::iterator& it = q.begin(); x != q.end(); ++it)
    {
    QString &x = *it;
    ...
    }

    Note that the begin method is a const member function if and only if the list q itself is declared const. If you would write it in full yourself, you should use constBegin and constEnd instead.

    So,

    QList<QString> q;
    q.resize(10);
    QList<QString>& q2 = q; // holds a reference to the same list instance. Modifying q, also modifies q2.
    for (QString &x: q) { .. }

    doesn't perform any copy, as list q isn't implicitly shared with another instance.

    However,

    QList<QString> q;
    q.resize(10);
    QList<QString> q2 = q; // Copy-on-write: Now q and q2 are implicitly shared. Modifying q, doesn't modify q2. Currently, no copy is made yet.
    for (QString &x: q) { .. }

    does make a copy of the data.

  2. This is mostly a performance issue. Only if the list contain some special type with a weird copy constructor/operator, this may be not the case, but this would probably indicate a bad design. In rare cases, you may also encounter the Implicit sharing iterator problem, by detaching (i.e. deep copy) a list when an iterator is still active.

    Therefore, it is good practice to avoid unneeded copies in all circumstances by writing:

    QList<QString> q = ...;
    for (QString &x: qAsConst(q)) { .. }

    or

    const QList<QString> q = ...;
    for (QString &x: q) { .. }
  3. Modifications in the loop aren't broken and work as expected, i.e. they behave as if the QList doesn't use implicit sharing, but performs a deep copy during a copy constructor/operator. For example,

    QList<QString> q;
    q.resize(10);
    QList<QString>& q2 = q;
    QList<QString> q3 = q;
    for (QString &x: q) {x = "TEST";}

    q and q2 are identical, all containing 10 times "TEST". q3 is a different list, containing 10 empty (null) strings.

Also check the Qt documentation itself about Implicit Sharing, which is used extensively by Qt. In modern C++, this performance optimization construct could be (partially) replaced by newly introduced move concept.

Improve understanding by inspecting source code

Every non-const function calls detach, before actually modifying the data, f.ex. [2]:

inline iterator begin() { detach(); return reinterpret_cast<Node *>(p.begin()); }
inline const_iterator begin() const noexcept { return reinterpret_cast<Node *>(p.begin()); }
inline const_iterator constBegin() const noexcept { return reinterpret_cast<Node *>(p.begin()); }

However, detach only effectively detaches/deep copy the data when the list is actually shared [3]:

inline void detach() { if (d->ref.isShared()) detach_helper(); }

and isShared is implemented as follows [4]:

bool isShared() const noexcept
{
int count = atomic.loadRelaxed();
return (count != 1) && (count != 0);
}

i.e. more than 1 copy (= another copy except for the object itself) exists.

How to make my custom type to work with range-based for loops?

The standard has been changed since the question (and most answers) were posted in the resolution of this defect report.

The way to make a for(:) loop work on your type X is now one of two ways:

  • Create member X::begin() and X::end() that return something that acts like an iterator

  • Create a free function begin(X&) and end(X&) that return something that acts like an iterator, in the same namespace as your type X

And similar for const variations. This will work both on compilers that implement the defect report changes, and compilers that do not.

The objects returned do not have to actually be iterators. The for(:) loop,

for( range_declaration : range_expression )

unlike most parts of the C++ standard, is specified to expand to something equivalent to:

{
auto && __range = range_expression ;
for (auto __begin = begin_expr,
__end = end_expr;
__begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}

where the variables beginning with __ are for exposition only, and begin_expr and end_expr is the magic that calls begin/end

The requirements on the begin/end return value are simple: You must overload pre-++, ensure the initialization expressions are valid, binary != that can be used in a boolean context, unary * that returns something you can assign-initialize range_declaration with, and expose a public destructor.

Doing so in a way that isn't compatible with an iterator is probably a bad idea, as future iterations of C++ might be relatively cavalier about breaking your code if you do.

As an aside, it is reasonably likely that a future revision of the standard will permit end_expr to return a different type than begin_expr. This is useful in that it permits "lazy-end" evaluation (like detecting null-termination) that is easy to optimize to be as efficient as a hand-written C loop, and other similar advantages.


¹ Note that for(:) loops store any temporary in an auto&& variable, and pass it to you as an lvalue. You cannot detect if you are iterating over a temporary (or other rvalue); such an overload will not be called by a for(:) loop. See [stmt.ranged] 1.2-1.3 from n4527.

² Either call the begin/end method, or ADL-only lookup of free function begin/end, or magic for C-style array support. Note that std::begin is not called unless range_expression returns an object of type in namespace std or dependent on same.


In c++17 the range-for expression has been updated

{
auto && __range = range_expression ;
auto __begin = begin_expr;
auto __end = end_expr;
for (;__begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}

with the types of __begin and __end have been decoupled.

This permits the end iterator to not be the same type as begin. Your end iterator type can be a "sentinel" which only supports != with the begin iterator type.

A practical example of why this is useful is that your end iterator can read "check your char* to see if it points to '0'" when == with a char*. This allows a C++ range-for expression to generate optimal code when iterating over a null-terminated char* buffer.

struct null_sentinal_t {
template<class Rhs,
std::enable_if_t<!std::is_same<Rhs, null_sentinal_t>{},int> =0
>
friend bool operator==(Rhs const& ptr, null_sentinal_t) {
return !*ptr;
}
template<class Rhs,
std::enable_if_t<!std::is_same<Rhs, null_sentinal_t>{},int> =0
>
friend bool operator!=(Rhs const& ptr, null_sentinal_t) {
return !(ptr==null_sentinal_t{});
}
template<class Lhs,
std::enable_if_t<!std::is_same<Lhs, null_sentinal_t>{},int> =0
>
friend bool operator==(null_sentinal_t, Lhs const& ptr) {
return !*ptr;
}
template<class Lhs,
std::enable_if_t<!std::is_same<Lhs, null_sentinal_t>{},int> =0
>
friend bool operator!=(null_sentinal_t, Lhs const& ptr) {
return !(null_sentinal_t{}==ptr);
}
friend bool operator==(null_sentinal_t, null_sentinal_t) {
return true;
}
friend bool operator!=(null_sentinal_t, null_sentinal_t) {
return false;
}
};

live example of this.

Minimal test code is:

struct cstring {
const char* ptr = 0;
const char* begin() const { return ptr?ptr:""; }// return empty string if we are null
null_sentinal_t end() const { return {}; }
};

cstring str{"abc"};
for (char c : str) {
std::cout << c;
}
std::cout << "\n";

Here is a simple example.

namespace library_ns {
struct some_struct_you_do_not_control {
std::vector<int> data;
};
}

Your code:

namespace library_ns {
int* begin(some_struct_you_do_not_control& x){ return x.data.data(); }
int* end(some_struct_you_do_not_control& x){ return x.data.data()+x.data.size(); }
int const* cbegin(some_struct_you_do_not_control const& x){ return x.data.data(); }
int* cend(some_struct_you_do_not_control const& x){ return x.data.data()+x.data.size(); }
int const* begin(some_struct_you_do_not_control const& x){ return cbegin(x); }
int const* end(some_struct_you_do_not_control const& x){ return cend(x); }
}

this is an example how you can augment a type you don't control to be iterable.

Here I return pointers-as-iterators, hiding the fact I have a vector under the hood.

For a type you do own, you can add methods:

struct egg {};
struct egg_carton {
auto begin() { return eggs.begin(); }
auto end() { return eggs.end(); }
auto cbegin() const { return eggs.begin(); }
auto cend() const { return eggs.end(); }
auto begin() const { return eggs.begin(); }
auto end() const { return eggs.end(); }
private:
std::vector<egg> eggs;
};

here I reuse the vector's iterators. I use auto for brevity; in c++11 I'd have to be more verbose.

Here is a quick and dirty iterable range-view:

template<class It>
struct range_t {
It b, e;
It begin() const { return b; }
It end() const { return e; }

std::size_t size() const
// C++20 only line: (off C++20 it generates a hard error)
requires std::random_access_iterator<It>
{
return end()-begin(); // do not use distance: O(n) size() is toxic
}

bool empty() const { return begin()==end(); }

range_t without_back() const {
if(emptty()) return *this;
return {begin(), std::prev(end())};
}

range_t without_back( std::size_t n ) const
// C++20 only line: (see below)
requires !std::random_access_iterator<It>
{
auto r=*this;
while(n-->0 && !r.empty())
r=r.without_back();
return r;
}

range_t without_front() const {
if(empty()) return *this;
return {std::next(begin()), end()};
}

range_t without_front( std::size_t n ) const
// C++20 only line: (see below)
requires !std::random_access_iterator<It>
{
auto r=*this;
while(n-->0 && !r.empty())
r=r.without_front();
return r;
}

// C++20 section:
range_t without_back( std::size_t n ) const
requires std::random_access_iterator<It>
{
n = (std::min)(n, size());
return {b, e-n};
}
range_t without_front( std::size_t n ) const
requires std::random_access_iterator<It>
{
n = (std::min)(n, size());
return {b+n, e};
}
// end C++20 section

decltype(auto) front() const { return *begin(); }
decltype(auto) back() const { return *(std::prev(end())); }
};
template<class It>
range_t(It,It)->range_t<It>;
template<class C>
auto make_range( C&& c ) {
using std::begin; using std::end;
return range_t{ begin(c), end(c) };
}

using c++17 template class deduction.

std::vector<int> v{1,2,3,4,5};
for (auto x : make_range(v).without_front(2) ) {
std::cout << x << "\n";
}

prints 3 4 5, skipping first 2.

for' loop vs Qt's 'foreach' in C++

It really doesn't matter in most cases.

The large number of questions on StackOverflow regarding whether this method or that method is faster, belie the fact that, in the vast majority of cases, code spends most of its time sitting around waiting for users to do something.

If you are really concerned, profile it for yourself and act on what you find.

But I think you'll most likely find that only in the most intense data-processing-heavy work does this question matter. The difference may well be only a couple of seconds and even then, only when processing huge numbers of elements.

Get your code working first. Then get it working fast (and only if you find an actual performance issue).

Time spent optimising before you've finished the functionality and can properly profile, is mostly wasted time.

temporary object in range-based for

Since you're using C++11, you could use initialization list instead. This will pass valgrind:

int main() {
for (auto i : QList<int>{1, 2, 3})
std::cout << i << std::endl;
return 0;
}

The problem is not totally related to range-based for or even C++11. The following code demonstrates the same problem:

QList<int>& things = QList<int>() << 1;
things.end();

or:

#include <iostream>

struct S {
int* x;

S() { x = NULL; }
~S() { delete x; }

S& foo(int y) {
x = new int(y);
return *this;
}
};

int main() {
S& things = S().foo(2);
std::cout << *things.x << std::endl;
return 0;
}

The invalid read is because the temporary object from the expression S() (or QList<int>{}) is destructed after the declaration (following C++03 and C++11 §12.2/5), because the compiler has no idea that the method foo() (or operator<<) will return that temporary object. So you are now refering to content of freed memory.

How to Write the Range-based For-Loop With Argv?

You don't, because the system can't tell how long argv is at compile time. Someone can likely find the proper section of the standard to quote you about this.

There is a way around it though, and that's to create a custom class to wrap argv. It's not even that hard.

class argv_range {
public:
argv_range(int argc, const char * const argv[])
: argc_(argc), argv_(argv)
{
}

const char * const *begin() const { return argv_; }
const char * const *end() const { return argv_ + argc_; }

private:
const int argc_;
const char * const *argv_;
};

Here's how you use it:

for (const char *arg: argv_range(argc, argv)) {
// Do something.
}

Yeah, I use a lot of consts. Basically, argv is an array of character pointers, none of which should be modified, each pointing to a string, none of the characters of which should be modified either.



Related Topics



Leave a reply



Submit