C++ #Include Semantics

What is semantics?

The word semantics is used to describe an underlying meaning of something.

You can say that an operation has move semantics when it transfers state from one object to another. In reality, of course, some pointers are probably just copied over and that's it, but semantically your object has been moved.

Another example is ownership transfer, where the most important thing being moved is the responsibility (i.e., the promise to release some resource). In that case, from the computational point of view, pretty much nothing happens, but semantically the ownership is transferred.

The same goes for copy semantics: you can say that passing an object to a function has copy semantics, i.e., your object is duplicated and the function gets a standalone copy with its own lifetime.
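To make the difference between copy semantics and ownership transfer concrete, here is a minimal sketch. The function names (grow_copy, consume, semantics_demo) are made up for illustration; std::vector and std::unique_ptr are the standard library types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Copy semantics: the parameter is a standalone copy with its own lifetime.
inline std::size_t grow_copy(std::vector<int> v) {
    v.push_back(42);              // modifies only the local copy
    return v.size();
}

// Ownership transfer: the callee takes over the promise to release the
// resource; computationally only a pointer is handed over.
inline int consume(std::unique_ptr<int> p) {
    return *p;                    // the int is released when p goes out of scope
}

// Hypothetical demo function exercising both helpers.
inline bool semantics_demo() {
    std::vector<int> original{1, 2, 3};
    std::size_t grown = grow_copy(original);     // original is duplicated
    bool copy_ok = (grown == 4 && original.size() == 3);

    std::unique_ptr<int> owner(new int(7));
    int value = consume(std::move(owner));       // ownership moves to consume()
    bool move_ok = (value == 7 && owner == nullptr);

    return copy_ok && move_ok;
}
```

The caller's vector is untouched after grow_copy, while owner is empty after consume: the syntax of the two calls is nearly the same, the semantics are not.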

The other side of the coin is syntax, which is how you describe what you want following the rules of the language.

C++ has a really flexible syntax (operator overloading, user-defined conversions, macros and whatnot), so almost any desirable semantics can be attached to any particular syntax.

Is move semantics in C++ something C is missing?

Of course, there is a similar technique in C. We have been doing "move semantics" in C for ages.

Firstly, "move semantics" in C++ is based on a bunch of overload resolution rules that describe how functions with rvalue reference parameters behave during overload resolution. Since C does not support function overloading, this specific matter is not applicable to C. You can still implement move semantics in C manually, by writing dedicated data-moving functions with dedicated names and explicitly calling them when you want to move the data instead of copying it. E.g., for your own data type struct HeavyStruct you can write both copy_heavy_struct(dst, src) and move_heavy_struct(dst, src) functions with appropriate implementations. You'll just have to manually choose the most appropriate/efficient one to call in each case.

Secondly, the primary purpose of implicit move semantics in C++ is to serve as an alternative to implicit deep-copy semantics in contexts where deep copying is unnecessarily inefficient. Since C does not have implicit deep-copy semantics, the problem does not even arise in C. C always performs shallow copying, which is already pretty similar to move semantics. Basically, you can think of C as an always-move language. It just needs a bit of manual tweaking to bring its move semantics to perfection.

Of course, it is probably impossible to literally reproduce all features of C++ move semantics, since, for example, it is impossible to bind a C pointer to an rvalue. But virtually everything can be "emulated". It just requires a bit more work to be done explicitly/manually.
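The copy_heavy_struct / move_heavy_struct pair mentioned above could look roughly like the sketch below. The fields of HeavyStruct and the free_heavy_struct helper are assumptions, since the answer only names the type and the two functions. The code sticks to the C subset (with the malloc cast that C++ requires), so it should build as either C or C++.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the answer names struct HeavyStruct but not its fields. */
struct HeavyStruct {
    char  *buffer;   /* heap-allocated payload, owned by the struct */
    size_t size;     /* number of bytes in buffer */
};

/* Deep copy: dst gets its own allocation. Returns 0 on success, -1 on failure.
   Assumes dst does not currently own a buffer. */
int copy_heavy_struct(struct HeavyStruct *dst, const struct HeavyStruct *src) {
    char *fresh = NULL;
    if (src->size > 0) {
        fresh = (char *)malloc(src->size);
        if (fresh == NULL) return -1;
        memcpy(fresh, src->buffer, src->size);
    }
    dst->buffer = fresh;
    dst->size = src->size;
    return 0;
}

/* Move: dst takes over src's allocation; src is left empty but valid.
   Assumes dst does not currently own a buffer. */
void move_heavy_struct(struct HeavyStruct *dst, struct HeavyStruct *src) {
    *dst = *src;           /* shallow copy of pointer and size */
    src->buffer = NULL;    /* src no longer owns the data */
    src->size = 0;
}

/* Hypothetical cleanup helper; safe to call on a moved-from struct. */
void free_heavy_struct(struct HeavyStruct *h) {
    free(h->buffer);
    h->buffer = NULL;
    h->size = 0;
}
```

The only thing that distinguishes the move from a plain struct assignment is that the source is reset afterwards, which is the "bit of manual tweaking" the answer refers to.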

C++11 auto semantics

The rule is simple: it is how you declare it.

int i = 5;
auto a1 = i; // value
auto & a2 = i; // reference

The next example proves it:

#include <typeinfo>
#include <iostream>

template< typename T >
struct A
{
    static void foo(){ std::cout<< "value" << std::endl; }
};
template< typename T >
struct A< T&>
{
    static void foo(){ std::cout<< "reference" << std::endl; }
};

float& bar()
{
    static float t=5.5;
    return t;
}

int main()
{
    int i = 5;
    int &r = i;

    auto a1 = i;
    auto a2 = r;
    auto a3 = bar();

    A<decltype(i)>::foo(); // value
    A<decltype(r)>::foo(); // reference
    A<decltype(a1)>::foo(); // value
    A<decltype(a2)>::foo(); // value
    A<decltype(bar())>::foo(); // reference
    A<decltype(a3)>::foo(); // value
}

The output:

value
reference
value
value
reference
value
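The same rule can also be checked at compile time. The sketch below uses std::is_same with static_assert, and additionally shows that plain auto drops top-level const. The function name auto_rules_demo is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical demo function: compile-time and run-time checks of auto deduction.
inline bool auto_rules_demo() {
    int i = 5;
    int& r = i;
    const int c = 7;

    auto a1 = r;          // int: the reference is dropped, a1 is a copy
    auto& a2 = r;         // int&: declared as a reference, so it is one
    auto a3 = c;          // int: top-level const is dropped as well
    const auto& a4 = i;   // const int&

    static_assert(std::is_same<decltype(a1), int>::value, "a1 is a value");
    static_assert(std::is_same<decltype(a2), int&>::value, "a2 is a reference");
    static_assert(std::is_same<decltype(a3), int>::value, "a3 is a non-const value");
    static_assert(std::is_same<decltype(a4), const int&>::value, "a4 is a const reference");

    a1 = 10;                              // changes only the copy
    bool copy_independent = (i == 5);
    a2 = 20;                              // writes through to i
    bool ref_aliases = (i == 20);

    (void)a3; (void)a4;                   // silence unused-variable warnings
    return copy_independent && ref_aliases;
}
```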

What is the rationale for semantic violations not requiring a diagnostic?

A possible rationale is given by Rice's theorem: non-trivial semantic properties of programs are undecidable.

For example, division by zero is a semantic violation, and you cannot decide by static analysis of the C source code alone that it won't happen.

A standard cannot require total detection of such undefined behavior, even though some tools (e.g. Frama-C) are sometimes capable of detecting it.

See also the halting problem. You should not expect a C compiler to solve it!
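Because the compiler cannot be required to prove that a divisor is never zero, the check has to be done at run time whenever the value is not under your control. A minimal sketch; checked_divide is a made-up helper name, not a standard function.

```cpp
#include <cassert>
#include <climits>

// Whether b is zero depends on run-time data, so no diagnostic can be
// required for the division in general. This hypothetical helper refuses
// to divide instead of running into undefined behavior.
inline bool checked_divide(int a, int b, int& result) {
    if (b == 0) return false;                    // division by zero
    if (a == INT_MIN && b == -1) return false;   // quotient would overflow int
    result = a / b;
    return true;
}
```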

Best Practice and Semantics of namespace nested functions and the use of extern "C"

This is for MSVC.

The namespace itself is not name-mangled, but the name of the namespace is incorporated into the function's (or object's) name when name mangling occurs. This process is undocumented, but described here.

Answering your specific questions by jumping around:

1) There is no Standard-defined behavior regarding name mangling. What the Standard actually says is that implementations provide a C-compatible linkage for extern "C" constructs:

7.5.3 [Linkage specifications]

Every implementation shall provide for
linkage to functions written in the C
programming language, "C", and linkage
to C++ functions, "C++". [Example:

complex sqrt(complex); // C++ linkage by default
extern "C" {
    double sqrt(double); // C linkage
}

—end example]

Ultimately what this means is that since C has no concept of namespaces, if you declare extern "C" functions or objects in namespaces, your exported names will lose the namespace qualification. This leads to...

3) Yes, you can have a linkage problem. Try this:

main.h

#ifndef MAIN_API
# define MAIN_API __declspec(dllexport)
#endif

namespace x
{
    extern "C" MAIN_API void foo();
}

namespace y
{
    extern "C" MAIN_API void foo();
}

main.cpp

#include <cstdlib>
#include <iostream>
using namespace std;
#define MAIN_API __declspec(dllexport)
#include "main.h"

void x::foo()
{
    cout << "x::foo()\n";
}

void y::foo()
{
    cout << "y::foo()\n";
}

int main()
{
}

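This will fail to build because the extern "C"-ed versions of x::foo() and y::foo() have lost their namespace identification, so they end up with exactly the same name: foo. As far as the language is concerned, both declarations refer to one and the same C-linkage function, so depending on the toolchain the failure is reported either as a redefinition at compile time or as a duplicate symbol at link time.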

2) Best practices regarding this. If you must export a C-ABI for functions in namespaces, you have to be careful that the names you end up exporting are not the same. To some degree, this defeats the purpose of using a namespace in the first place. But you can do something like this:

#ifndef MAIN_API
# define MAIN_API __declspec(dllexport)
#endif

namespace x
{
    extern "C" MAIN_API void x_foo();
}

namespace y
{
    extern "C" MAIN_API void y_foo();
}
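Another way to arrange this, sketched below under a few assumptions, is to keep the real implementations as ordinary C++ functions inside their namespaces and put thin extern "C" wrappers with hand-made unique names at global scope. EXAMPLE_API is a hypothetical stand-in for the export macro; it expands to nothing here so the sketch builds with any compiler. The functions return int instead of printing, purely to make the result easy to check.

```cpp
#include <cassert>

// Hypothetical export macro. With MSVC it would expand to
// __declspec(dllexport); here it is left empty on purpose.
#define EXAMPLE_API

// Ordinary C++ linkage: the namespace is part of each mangled name,
// so these two functions cannot collide.
namespace x { int foo() { return 1; } }
namespace y { int foo() { return 2; } }

// C linkage: no mangling and no namespace in the exported symbol,
// so uniqueness has to come from the names themselves.
extern "C" EXAMPLE_API int x_foo() { return x::foo(); }
extern "C" EXAMPLE_API int y_foo() { return y::foo(); }
```

C++ callers keep using x::foo() and y::foo(), while C callers see only x_foo and y_foo.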

What's the meaning of VOID() in C?

Looks like a preprocessor macro. Your editor should be able to find what it is. Or try

gcc -E source.c > source2.c

This runs only the preprocessor and replaces macros with what they actually expand to.
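For illustration only, here is one thing a macro of that name could be. This definition is purely hypothetical; the VOID in your code base may expand to something entirely different, which is exactly why looking at the preprocessor output is the reliable way to find out.

```cpp
#include <cassert>

// Purely hypothetical definition, chosen only to show what macro
// expansion looks like. It discards the value of an expression.
#define VOID(expr) ((void)(expr))

// Hypothetical helper with a visible side effect.
inline int count_calls() {
    static int calls = 0;
    return ++calls;
}

inline int void_macro_demo() {
    VOID(count_calls());    // after preprocessing: ((void)(count_calls()));
    return count_calls();   // second call, so this returns 2 on the first run
}
```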

What is move semantics?

I find it easiest to understand move semantics with example code. Let's start with a very simple string class which only holds a pointer to a heap-allocated block of memory:

#include <cstring>
#include <algorithm>

class string
{
    char* data;

public:

    string(const char* p)
    {
        size_t size = std::strlen(p) + 1;
        data = new char[size];
        std::memcpy(data, p, size);
    }

Since we chose to manage the memory ourselves, we need to follow the rule of three. I am going to defer writing the assignment operator and only implement the destructor and the copy constructor for now:

    ~string()
    {
        delete[] data;
    }

    string(const string& that)
    {
        size_t size = std::strlen(that.data) + 1;
        data = new char[size];
        std::memcpy(data, that.data, size);
    }

The copy constructor defines what it means to copy string objects. The parameter const string& that binds to all expressions of type string which allows you to make copies in the following examples:

string a(x);                                  // Line 1
string b(x + y);                              // Line 2
string c(some_function_returning_a_string()); // Line 3

Now comes the key insight into move semantics. Note that only in the first line where we copy x is this deep copy really necessary, because we might want to inspect x later and would be very surprised if x had changed somehow. Did you notice how I just said x three times (four times if you include this sentence) and meant the exact same object every time? We call expressions such as x "lvalues".

The arguments in lines 2 and 3 are not lvalues, but rvalues, because the underlying string objects have no names, so the client has no way to inspect them again at a later point in time. Rvalues denote temporary objects which are destroyed at the next semicolon (to be more precise: at the end of the full-expression that lexically contains the rvalue). This is important because during the initialization of b and c, we could do whatever we wanted with the source string, and the client couldn't tell the difference!

C++11 (called C++0x while it was being drafted) introduces a new mechanism called "rvalue reference" which, among other things, allows us to detect rvalue arguments via function overloading. All we have to do is write a constructor with an rvalue reference parameter. Inside that constructor we can do anything we want with the source, as long as we leave it in some valid state:

    string(string&& that)   // string&& is an rvalue reference to a string
    {
        data = that.data;
        that.data = nullptr;
    }

What have we done here? Instead of deeply copying the heap data, we have just copied the pointer and then set the original pointer to null (to prevent the delete[] in the source object's destructor from releasing the data we just took over). In effect, we have "stolen" the data that originally belonged to the source string. Again, the key insight is that under no circumstance could the client detect that the source had been modified. Since we don't really do a copy here, we call this constructor a "move constructor". Its job is to move resources from one object to another instead of copying them.

Congratulations, you now understand the basics of move semantics! Let's continue by implementing the assignment operator. If you're unfamiliar with the copy and swap idiom, learn it and come back, because it's an awesome C++ idiom related to exception safety.

    string& operator=(string that)
    {
        std::swap(data, that.data);
        return *this;
    }
};

Huh, that's it? "Where's the rvalue reference?" you might ask. "We don't need it here!" is my answer :)

Note that we pass the parameter that by value, so that has to be initialized just like any other string object. Exactly how is that going to be initialized? In the olden days of C++98, the answer would have been "by the copy constructor". In C++11, the compiler chooses between the copy constructor and the move constructor based on whether the argument to the assignment operator is an lvalue or an rvalue.

So if you say a = b, the copy constructor will initialize that (because the expression b is an lvalue), and the assignment operator swaps the contents with a freshly created, deep copy. That is the very definition of the copy and swap idiom -- make a copy, swap the contents with the copy, and then get rid of the copy by leaving the scope. Nothing new here.

But if you say a = x + y, the move constructor will initialize that (because the expression x + y is an rvalue), so there is no deep copy involved, only an efficient move. The parameter that is still an independent object from the argument, but its construction was trivial, since the heap data didn't have to be copied, just moved. It wasn't necessary to copy it because x + y is an rvalue, and again, it is okay to move from string objects denoted by rvalues.

To summarize, the copy constructor makes a deep copy, because the source must remain untouched. The move constructor, on the other hand, can just copy the pointer and then set the pointer in the source to null. It is okay to "nullify" the source object in this manner, because the client has no way of inspecting the object again.

I hope this example got the main point across. There is a lot more to rvalue references and move semantics which I intentionally left out to keep it simple. If you want more details please see my supplementary answer.
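For reference, here are the pieces of the class assembled into one block, with a small demo that counts how often each constructor runs. The demo namespace, the deep_copies and moves counters, the c_str() accessor and move_demo() are additions for illustration and not part of the class as developed above. The demo uses std::move to turn an lvalue into an rvalue, because a temporary such as x + y may be constructed directly into the parameter without any constructor call being observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

namespace demo {

// Hypothetical instrumentation, not part of the original class.
static int deep_copies = 0;
static int moves = 0;

class string {
    char* data;

public:
    string(const char* p) {
        std::size_t size = std::strlen(p) + 1;
        data = new char[size];
        std::memcpy(data, p, size);
    }

    ~string() { delete[] data; }              // delete[] on a null pointer is a no-op

    string(const string& that) {              // copy constructor: deep copy
        std::size_t size = std::strlen(that.data) + 1;
        data = new char[size];
        std::memcpy(data, that.data, size);
        ++deep_copies;
    }

    string(string&& that) noexcept : data(that.data) {   // move constructor
        that.data = nullptr;                  // the source no longer owns the buffer
        ++moves;
    }

    string& operator=(string that) {          // copy and swap
        std::swap(data, that.data);
        return *this;
    }

    const char* c_str() const { return data; }   // hypothetical accessor
};

// Hypothetical demo: one assignment from an lvalue, one from an rvalue.
inline bool move_demo() {
    string a("a"), b("b");
    deep_copies = 0;
    moves = 0;

    a = b;                                    // lvalue: that is copy-constructed
    bool copied = deep_copies == 1 && moves == 0
               && std::strcmp(a.c_str(), "b") == 0
               && std::strcmp(b.c_str(), "b") == 0;

    a = std::move(b);                         // rvalue: that is move-constructed
    bool moved = deep_copies == 1 && moves == 1
              && b.c_str() == nullptr
              && std::strcmp(a.c_str(), "b") == 0;

    return copied && moved;
}

} // namespace demo
```

After the second assignment b holds a null pointer, which is the "valid but emptied" state the move constructor leaves behind; its destructor still runs safely.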


