Which Greedy Initializer-List Examples Are Lurking in the Standard Library

I assume that your examples for std::vector<int> and std::string were also meant to cover the other containers, e.g., std::list<int>, std::deque<int>, etc., which obviously have the same problem as std::vector<int>. Likewise, int isn't the only affected type: the same applies to char, short, long, and their unsigned versions (and possibly a few other integral types, too).
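A minimal sketch of the container case, assuming a C++11 or later compiler. The function names are made up for illustration; each one only reports how many elements the respective initialization produces.

```cpp
#include <cstddef>
#include <deque>
#include <list>
#include <string>
#include <vector>

// Parentheses select the (count, value) constructor: three elements, each 0.
std::size_t vector_paren_size() { return std::vector<int>(3, 0).size(); }

// Braces select the std::initializer_list<int> constructor: elements 3 and 0.
std::size_t vector_brace_size() { return std::vector<int>{3, 0}.size(); }

// The other sequence containers behave the same way.
std::size_t list_paren_size()  { return std::list<int>(3, 0).size(); }
std::size_t list_brace_size()  { return std::list<int>{3, 0}.size(); }
std::size_t deque_paren_size() { return std::deque<int>(3, 0).size(); }
std::size_t deque_brace_size() { return std::deque<int>{3, 0}.size(); }

// std::string: (count, char) versus a list of two chars ('\x03' and 'x').
std::size_t string_paren_size() { return std::string(3, 'x').size(); }
std::size_t string_brace_size() { return std::string{3, 'x'}.size(); }
```

In every pair, the parenthesized form yields three elements and the braced form yields two.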

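There is also std::valarray<T>, and T is allowed to be an integral type (std::valarray<int> is valid). Note that its fill constructor takes the value first and the count second, so these two have different semantics:

std::valarray<double>(0.0, 3); // three elements, each 0.0
std::valarray<double>{0.0, 3}; // two elements: 0.0 and 3.0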
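The difference can be checked with a short sketch like the following. The helper names are hypothetical. The int variant uses the value 1 rather than 0 on purpose, since a literal 0 could also be read as a null pointer for the (pointer, count) constructor.

```cpp
#include <cstddef>
#include <valarray>

// (value, count) constructor: 3 elements, each 0.0.
std::size_t valarray_paren_size() {
    std::valarray<double> v(0.0, 3);
    return v.size();
}

// std::initializer_list<double> constructor: the 2 elements 0.0 and 3.0.
std::size_t valarray_brace_size() {
    std::valarray<double> v{0.0, 3};
    return v.size();
}

// The same split with an integral element type.
std::size_t int_valarray_paren_size() {
    std::valarray<int> v(1, 4);   // 4 elements, each 1
    return v.size();
}

std::size_t int_valarray_brace_size() {
    std::valarray<int> v{1, 4};   // 2 elements: 1 and 4
    return v.size();
}
```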

There are a few other standard C++ class templates with a constructor that takes a std::initializer_list<T>, but I don't think any of them has an overloaded constructor that would be selected when using parentheses instead of braces.

C++11 initializer list fails - but only on lists of length 2

Introduction

Imagine the following declaration, and usage:

struct A {
    A (std::initializer_list<std::string>);
};

A {{"a"}};           // (A), initialization of 1 string
A {{"a", "b"}};      // (B), initialization of 1 string << !!
A {{"a", "b", "c"}}; // (C), initialization of 3 strings

In (A) and (C), each C-style string initializes one (1) std::string, but, as you have stated in your question, (B) differs.

The compiler sees that it's possible to construct a std::string from a begin- and an end-iterator, and upon parsing statement (B) it will prefer that construction over using "a" and "b" as individual initializers for two elements.

A { std::string { "a", "b" } }; // the compiler's interpretation of (B)


Note: The type of "a" and "b" is char const[2], a type which can implicitly decay into char const*, a pointer type which is suitable to act as an iterator denoting either begin or end when creating a std::string. But we must be careful: we are causing undefined behavior, since there is no (guaranteed) relation between the two pointers upon invoking said constructor.
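To observe the iterator-pair constructor being chosen without running into undefined behavior, one can pass two pointers into the same array. This is only a sketch. The array and function names are made up, and it deliberately differs from (B), where the two pointers are unrelated.

```cpp
#include <string>
#include <vector>

// Both pointers refer into the same array, so [first, last) is a valid range.
std::string from_pointer_pair() {
    static const char text[] = "hello";
    const char* first = text;
    const char* last  = text + 3;
    return std::string{first, last};   // iterator-pair constructor: "hel"
}

// The same pair wrapped in double braces: the inner braces initialize ONE
// std::string element, so the vector ends up with a single element.
std::vector<std::string> nested_pointer_pair() {
    static const char text[] = "hello";
    return std::vector<std::string>{{text, text + 3}};
}
```

With "a" and "b" in place of the two pointers, the same constructor is selected, but the range is no longer valid.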



Explanation

When you invoke a constructor taking a std::initializer_list using double braces {{ a, b, ... }}, there are two possible interpretations:

  1. The outer braces refer to the constructor itself, and the inner braces denote the elements to take part in the std::initializer_list, or:

  2. The outer braces refer to the std::initializer_list, whereas the inner braces denote the initialization of an element inside it.

Interpretation 2) is preferred whenever it is possible, and since std::string has a constructor taking two iterators, it is the one being called when you have std::vector<std::string> {{ "hello", "there" }}.

Further example:

std::vector<std::string> {{"this", "is"}, {"stackoverflow"}}.size (); // yields 2


Solution

Don't use double braces for such initialization.
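A sketch of what that looks like in practice, assuming the goal is a vector of two separate strings. The function names are illustrative only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Single braces: each string literal initializes its own element.
std::size_t single_brace_count() {
    std::vector<std::string> v{"hello", "there"};
    return v.size();
}

// Spelling out the element type leaves no room for the iterator-pair reading.
std::size_t explicit_element_count() {
    std::vector<std::string> v{std::string{"hello"}, std::string{"there"}};
    return v.size();
}

// Returns the second element, to confirm the literals were not merged.
std::string second_element() {
    std::vector<std::string> v{"hello", "there"};
    return v[1];
}
```

Both forms produce two elements, "hello" and "there".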

When to use initializer list constructors?

This is a topic that can span an entire book chapter. See a quote from this recent draft Item by Scott Meyers of his upcoming Effective Modern C++ (reformatted for clarity):

Most developers end up choosing one kind of delimiter as a default,
using the other only when they have to.

  • Braces-by-default folks are attracted by their wide applicability,
    their prevention of narrowing conversions, and their avoidance of
    C++’s most vexing parse. Such folks understand that in some cases
    (e.g., creation of a std::vector with a given size and initial element
    value), parentheses are required.

  • In contrast, the go-parentheses-go crowd embraces parentheses as their
    default argument delimiter. They’re attracted to its consistency with
    the C++98 syntactic tradition, its avoidance of the
    auto-deduced-a-std::initializer_list problem, and the knowledge that
    their object creation calls won’t be inadvertently waylaid by
    std::initializer_list constructors. They concede that sometimes only
    braces will do (e.g., when creating a container with particular
    values).


Neither approach is rigorously better than the other. My advice is to
pick one and apply it consistently.
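The trade-offs named in the quote can be illustrated with a small sketch. Widget and the function names are hypothetical and not taken from the book. The lines that would not compile, or would not do what they appear to do, are left as comments.

```cpp
#include <initializer_list>
#include <type_traits>

struct Widget { int value = 7; };

// Most vexing parse: braces always create an object.
int braces_avoid_vexing_parse() {
    // Widget w();  // would declare a function named w, not an object
    Widget w{};     // an actual Widget object
    return w.value;
}

// Narrowing: braces reject it, parentheses silently accept it.
int parens_allow_narrowing() {
    double d = 3.7;
    // int n{d};    // ill-formed: narrowing conversion from double to int
    int n(d);       // compiles; the fractional part is discarded
    return n;
}

// auto with "= {...}" deduces std::initializer_list<int>, not a container.
bool auto_deduces_initializer_list() {
    auto a = {1, 2, 3};
    return std::is_same<decltype(a), std::initializer_list<int>>::value;
}

// auto with parentheses deduces the plain type.
bool auto_with_parens_deduces_int() {
    auto b(1);
    return std::is_same<decltype(b), int>::value;
}
```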

Are there any plans in the C++ standard to address the inconsistency of initializer-list constructors?

Has there been any discussion or plans by the C++ standard committee to fix this type of ambiguity / unpleasantness?

There have been many fixes to initialization since C++11. For instance, you initially couldn't copy construct aggregates using list-initialization (CWG 1467). That really minor fix broke some code in an undesirable way, which led to a new issue refining the earlier fix so that it no longer applies when there's an initializer_list constructor (CWG 2137). It's hard to touch anything in these clauses without lots of unexpected consequences and breaking code, even in small cases. I doubt there will be a push to make any kind of large change to initialization in the future. At this point, the amount of code breakage would be tremendous.

The best solution is just to be aware of the pitfalls of initialization and be careful about what you're doing. My rule of thumb is to only use {}s when I deliberately need the behavior that {} provides, and () otherwise.

Note that this isn't really any different from the more well-known pitfall of:

vector<int> a{10}; // vector of 1 element
vector<int> b(10); // vector of 10 elements

One example would be requiring initializer list constructors to be called like this: vector<int> u({3}), which is already currently legal.

You have the same problem you had before, for the same reasons:

vector<int> u({3});    // vector of one element: 3
vector<string> v({3}); // vector of three elements: "", "", and ""

Even if you were to require the former (which would be impossible), you couldn't make the latter ill-formed.

Calling constructor with () is different from {}

Because std::string has a constructor taking a std::initializer_list, the first example will use that constructor to create a string object with two characters. Initialization like this is called list initialization.

The second example will create a string object with six characters, all initialized to 's'. This form of initialization is called direct initialization.

List initialization and direct initialization can behave the same, with two exceptions: narrowing conversions (from larger types to smaller types) are forbidden in list initialization, and, as noticed here, list initialization prefers a constructor taking a std::initializer_list if the class has one.
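The two examples under discussion are not reproduced here. A pair of initializations consistent with the description (two characters versus six copies of 's') might look like the following sketch, which is an assumption and not the original code.

```cpp
#include <string>

// List initialization: the std::initializer_list<char> constructor is used,
// giving the two characters '\x06' and 's'.
std::string with_braces() { return std::string{6, 's'}; }

// Direct initialization: the (count, char) constructor is used,
// giving six copies of 's'.
std::string with_parens() { return std::string(6, 's'); }
```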

Will using brace-init syntax change construction behavior when an initializer_list constructor is added later?

What happens to the construction of f? My understanding is that it will no longer call the first constructor but instead now call the init list constructor. If so, this seems bad. Why are so many people recommending using the {} syntax over () for object construction when adding an initializer_list constructor later may break things silently?

On one hand, it's unusual to have the initializer-list constructor and the other one both be viable. On the other hand, "universal initialization" got a bit too much hype around the C++11 standard release, and it shouldn't be used without question.
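The silent change described in the question can be reproduced with two hypothetical versions of the same class, here called Before and After. The member `which` exists only to record which constructor ran.

```cpp
#include <initializer_list>

// Version 1: only a two-int constructor.
struct Before {
    int which;
    Before(int, int) : which(1) {}
};

// Version 2: an initializer-list constructor has been added later.
struct After {
    int which;
    After(int, int) : which(1) {}
    After(std::initializer_list<int>) : which(2) {}
};

int before_braces() { return Before{1, 2}.which; }  // (int, int) constructor
int after_braces()  { return After{1, 2}.which; }   // initializer-list constructor
int after_parens()  { return After(1, 2).which; }   // still (int, int)
```

The braced call site is textually identical in both versions, yet it resolves to a different constructor once the initializer-list overload exists. The parenthesized call is unaffected.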

Braces work best for things like aggregates and containers, so I prefer to use them when surrounding things which will be owned/contained. Parentheses, on the other hand, are good for arguments which merely describe how something new will be generated.

I can imagine a case where I'm constructing an rvalue using {} syntax (to avoid most vexing parse) but then later someone adds an std::initializer_list constructor to that object. Now the code breaks and I can no longer construct it using an rvalue because I'd have to switch back to () syntax and that would cause most vexing parse. How would one handle this situation?

The MVP only happens with ambiguity between a declarator and an expression, and that only happens as long as all the constructors you're trying to call are default constructors. An empty list {} always calls the default constructor, not an initializer-list constructor with an empty list. (This means that it can be used at no risk. "Universal" value-initialization is a real thing.)

If there's any subexpression inside the braces/parens, the MVP problem is already solved.
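A small sketch of the empty-braces rule, using a made-up type Gadget that has both a default constructor and an initializer-list constructor:

```cpp
#include <initializer_list>

struct Gadget {
    int which;
    Gadget() : which(0) {}                             // default constructor
    Gadget(std::initializer_list<int>) : which(1) {}   // initializer-list constructor
};

// Empty braces: the default constructor is called, not the list constructor.
int empty_braces() { return Gadget{}.which; }

// A non-empty list goes to the initializer-list constructor.
int one_element() { return Gadget{42}.which; }

// For comparison, the vexing form with empty parentheses:
//   Gadget g();   // declares a function g returning Gadget, not an object
```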

Custom brace initializer

You're almost there. MyStruct x = { 1, "string" }; is called copy list initialization. It will attempt to construct a MyStruct from the available constructors with the parameters supplied from the braced-init-list.

Your issue is that your constructor takes a char*, while "string" is a const char[N], which can decay to a const char* but not to a char*. So, making that change:

struct MyStruct {
    int field1;
    const char* field2;
    MyStruct(int a, const char* b): field2(b) {
        field1 = a;
    }
    MyStruct(int a): MyStruct(a, nullptr) {}
    ~MyStruct() {}
};

Then

MyStruct x = { 1, "string" };

will work. If you want to make this a little more bulletproof, you can change field2 to be a std::string and use

struct MyStruct {
    int field1;
    std::string field2;
    MyStruct(int a, const std::string& b): field1(a), field2(b) {}
    MyStruct(int a): MyStruct(a, "") {}
    ~MyStruct() {}
};
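For completeness, a short usage sketch of the std::string version. The struct is repeated so the snippet stands alone, and the two helper functions are hypothetical.

```cpp
#include <string>

struct MyStruct {
    int field1;
    std::string field2;
    MyStruct(int a, const std::string& b) : field1(a), field2(b) {}
    MyStruct(int a) : MyStruct(a, "") {}   // delegates to the two-argument constructor
};

// Copy list initialization with both fields: "string" converts to std::string.
MyStruct make_two_fields() {
    MyStruct x = {1, "string"};
    return x;
}

// Copy list initialization with one value: field2 becomes an empty string.
MyStruct make_one_field() {
    MyStruct y = {2};
    return y;
}
```

Because neither constructor is marked explicit, both forms of copy list initialization are accepted.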

