Why the Compiler Doesn't Allow std::string Inside a Union

Why doesn't the compiler allow std::string inside a union?

Think about it. How does the compiler know what type is in the union?

It doesn't. The fundamental operation of a union is essentially a bitwise cast. Operations on values contained within unions are only safe when each type can essentially be filled with garbage. std::string can't, because that would result in memory corruption. Use boost::variant or boost::any.

Why is std::string incompatible with C unions?

From Wikipedia:

C++ does not allow for a data member to be any type that has a full fledged constructor/destructor
and/or copy constructor, or a non-trivial copy assignment operator. In particular, it is impossible to
have the standard C++ string as a member of a union.

Think about it this way: If you have a union of a class type like std::string and a primitive type (let's say a long), how would the compiler know when you are using the class type (in which case the constructor/destructor will need to be called) and when you are using the simple type? That's why full-fledged class types are not allowed as members of a union.

C++ unions with string types

You should read all the answers on the [old] Q&A you linked, not just one.

kennytm's answer explains that the rule was relaxed in C++11, and gives an example.

How to use a union of two types

From here

If a union contains a non-static data member with a non-trivial special member function (default constructor, copy/move constructor, copy/move assignment, or destructor), that function is deleted by default in the union and needs to be defined explicitly by the programmer.

You must explicitly define a destructor for your union to replace the one that is deleted automatically because of the string member.

Also note that this is only valid in C++11 and later. In earlier versions you cannot have a type with non-trivial special member functions inside a union at all.
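To make the rule above concrete, here is a minimal sketch of a hand-written tagged union for C++11 or later. The names IntOrString, Kind and Data are hypothetical and chosen only for this illustration. It shows one way to supply the deleted special members and to track the active member yourself. It is not the only possible design.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>
#include <utility>  // std::move

// Hypothetical example type: a wrapper that remembers which union member is active.
class IntOrString {
public:
    explicit IntOrString(int v) : kind_(Kind::Int) { data_.i = v; }

    explicit IntOrString(std::string v) : kind_(Kind::Str) {
        // Start the lifetime of the string member with placement new.
        new (&data_.s) std::string(std::move(v));
    }

    // Copying is disabled to keep the sketch short.
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    ~IntOrString() {
        // Only the wrapper knows which member is active, so it has to end
        // the string's lifetime by hand.
        if (kind_ == Kind::Str) data_.s.~basic_string();
    }

    bool holds_string() const { return kind_ == Kind::Str; }

    int as_int() const {
        assert(kind_ == Kind::Int);
        return data_.i;
    }

    const std::string& as_string() const {
        assert(kind_ == Kind::Str);
        return data_.s;
    }

private:
    enum class Kind { Int, Str };

    union Data {
        int i;
        std::string s;
        Data() : i(0) {}  // user-provided: the implicit one is deleted
        ~Data() {}        // user-provided: real cleanup happens in ~IntOrString
    };

    Kind kind_;
    Data data_;
};
```

Constructing IntOrString(42) or IntOrString(std::string("text")) and then querying holds_string() is enough to exercise it. The amount of bookkeeping is a fair illustration of why this may not be a great idea in practice.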

From a practical point of view, this may still not be a great idea.

Union member has a non-trivial copy constructor

You cannot.

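A union combines two pieces of functionality: the ability to store an object which may be one of a select number of types, and the ability to effectively convert between those types by reading a member other than the one last written. (In standard C++ that kind of read is undefined behavior, although some compilers, GCC among them, document it as an extension.) You could put an integer in and look at its representation as a double, and so forth.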
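Looking at one type's representation through another is not guaranteed to work through a union in standard C++. As a swapped-in alternative (not something the answer itself proposes), the sketch below copies the bytes with std::memcpy. The function name bits_of is hypothetical, and the code assumes a 64-bit double.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Returns the object representation of d as an unsigned 64-bit integer.
std::uint64_t bits_of(double d) {
    std::uint64_t bits = 0;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined byte-wise copy
    return bits;
}
```

On a platform with IEEE 754 doubles, bits_of(1.0) yields 0x3FF0000000000000.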

Because a union must support both of these pieces of functionality (and for a few other reasons, like being able to construct one), a union prevents you from doing certain things. Namely, you cannot put "live" objects in them. Any object that is "living" enough that it needs a non-default copy constructor (among many other restrictions) cannot be a member of a union.

After all, a union object does not really have the concept of which type of data it actually stores. It does not store one type of data; it stores all of them, at the same time. It is up to you to be able to fish the right type out. So how could it reasonably copy one union value into another?

Before C++11, the members of a union had to be POD (plain-old-data) types. C++11 loosens that rule, but a member with a non-trivial copy constructor causes the union's own copy constructor to be deleted unless you write one yourself. And std::string's copy constructor is non-trivial.

What you likely want is a boost::variant. That is an object that can store a number of possible types, just like a union. Unlike a union however, it is type-safe. It therefore knows what is actually in the union; it is therefore able to copy itself and otherwise behave like a regular C++ object.
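The answer recommends boost::variant. As an assumption for this sketch, a C++17 compiler is available, so the example uses the standard library's std::variant instead, which follows the same idea. The names Value, describe and copy_of are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical alias: a value that is either an int or a std::string.
using Value = std::variant<int, std::string>;

// Returns a short description of whichever alternative is active.
std::string describe(const Value& v) {
    if (std::holds_alternative<int>(v)) {
        return "int: " + std::to_string(std::get<int>(v));
    }
    return "string: " + std::get<std::string>(v);
}

// Copying needs no hand-written special members: the variant knows its active type.
Value copy_of(const Value& v) { return v; }
```

For example, describe(Value(7)) gives "int: 7", and a copy of a Value holding std::string("seven") still describes itself as "string: seven".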

Why is string string allowed and int int not allowed by the compiler?

string is not a C++ reserved word, but int is, and a reserved word cannot be used as an identifier.

And it's syntactically fine for a class name and an object name to be the same.

class test {};
int main() {
    test test; // test is an object of type test.
}
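The same point can be sketched with std::string itself. The function name name_reuse_demo is hypothetical. The line that would not compile is left as a comment so that the rest builds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct test {};

std::size_t name_reuse_demo() {
    std::string string = "abc";  // a variable named 'string' of type std::string
    test test;                   // the object name hides the class name from here on
    (void)test;                  // silence the unused-variable warning
    // int int = 0;              // would not compile: 'int' is a keyword
    return string.size();        // 'string' now refers to the variable
}
```

Calling name_reuse_demo() returns 3, the length of "abc".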

Why can the compiler not optimize out an unused static std::string?

With the short string optimization (SSO), compiling that code may be equivalent to taking the address of one of std::string's member variables. The constructor has to determine the string length at compile time and decide whether the string fits into the internal storage of the std::string object or whether memory has to be allocated dynamically. The compiler would then have to discover that the result is never read before the allocation code could be optimized out.

The lack of optimization in this case might be an optimizer limitation that only shows up in simple, atypical examples like this one:

const int i = 3;

int main()
{
    return (long long)(&i); // to make sure that address was used
}

GCC generates this code:

i:
    .long 3 ; this is a variable
main:
    push rbp
    mov rbp, rsp
    mov eax, OFFSET FLAT:i
    pop rbp
    ret

GCC does not optimize this code either:

const int i = 3;
const int *p = &i;
int main() { return 0; }

Static variables declared at file scope, especially const-qualified ones, can be optimized out under the as-if rule unless their address is used. GCC does that only for const-qualified ones, regardless of the use case. Taking the address of a variable is observable behaviour, because the address can be passed somewhere. Logic that traced this would be too complex to implement and of little practical value.

Of course, code that doesn't use the address

const int i = 3;
int main() { return i; }

results in the reserved storage being optimized out:

main:
    mov eax, 3
    ret

What about C++20's constexpr construction of std::string? Under the older rules it could not be a compile-time expression if the result depended on the arguments. It is possible that std::string would allocate memory dynamically if the string is too long, which isn't a compile-time action. At the time of writing, it appears that the only mainstream compiler supporting the C++20 features required for this is MSVC, under certain conditions.


