Returning Large Objects in Functions

The second approach is more idiomatic and more expressive. Reading the code, it is clear that the function places no preconditions on an argument (it takes none) and that it creates the object itself. The first approach is less clear to the casual reader: the call implies that the object will be changed (it is passed by reference), but not whether the passed object must satisfy any preconditions.

As for the copies: the code you posted uses copy construction, not the assignment operator. The C++ standard permits the return value optimization, and all major compilers implement it. If you are not sure, you can run the following snippet with your compiler:

#include <iostream>
class X
{
public:
    X() { std::cout << "X::X()" << std::endl; }
    X( X const & ) { std::cout << "X::X( X const & )" << std::endl; }
    X& operator=( X const & ) { std::cout << "X::operator=(X const &)" << std::endl; return *this; }
};
X f() {
    X tmp;
    return tmp;
}
int main() {
    X x = f();
}

With g++ you will get a single line, X::X(). The compiler reserves space on the stack for the x object, then calls the function, which constructs tmp directly in x (in fact, tmp is x). The operations inside f() are applied directly to x, which makes this equivalent to your first code snippet (pass by reference).

If you were not using copy construction (had you written X x; x = f();), both x and tmp would be created and the assignment operator applied, yielding three lines of output: X::X() / X::X() / X::operator=. So it can be a little less efficient in some cases.
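
The difference between the two forms can also be checked with counters instead of printed lines. The following is a minimal sketch, not the original poster's code: Counted, make(), by_initialization() and by_assignment() are hypothetical names, and only the counts that do not depend on copy elision are asserted.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts special member calls instead of printing.
struct Counted {
    static int constructed, copied, assigned;
    Counted() { ++constructed; }
    Counted(const Counted&) { ++copied; }
    Counted& operator=(const Counted&) { ++assigned; return *this; }
};
int Counted::constructed = 0;
int Counted::copied = 0;
int Counted::assigned = 0;

void reset_counts() {
    Counted::constructed = Counted::copied = Counted::assigned = 0;
}

Counted make() {
    Counted tmp;
    return tmp;  // the copy out of tmp may be elided (NRVO)
}

// Initialization: a single default construction, no assignment.
void by_initialization() {
    reset_counts();
    Counted x = make();
    (void)x;
    assert(Counted::constructed == 1);
    assert(Counted::assigned == 0);
}

// Assignment: x and tmp are both default-constructed, then x is assigned to.
void by_assignment() {
    reset_counts();
    Counted x;
    x = make();
    assert(Counted::constructed == 2);
    assert(Counted::assigned == 1);
}
```

Counted::copied is deliberately not asserted: how many copies remain depends on whether the compiler elides them.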

Creating and returning a big object from a function

For that specific case, you can take advantage of the fact that compilers nowadays are smart enough to optimize for it. The optimization is called named return value optimization (NRVO), so it's okay to return "big" objects like that. The compiler can see such opportunities (especially in something as simple as your code snippet) and generate the binary so that no copies are made.

You can also return unnamed temporaries:

Object f()
{
    return Object();
}

This invokes (unnamed) return value optimization (RVO) on just about all modern C++ compilers. In fact, Visual C++ implements this particular optimization even if all optimizations are turned off.

These kinds of optimizations are specifically allowed by the C++ standard:

ISO 14882:2003 C++ Standard, §12.8 para. 15:
Copying Class Objects

When certain criteria are met, an
implementation is allowed to omit the
copy construction of a class object,
even if the copy constructor and/or
destructor for the object have side
effects. In such cases, the
implementation treats the source and
target of the omitted copy operation
as simply two different ways of
referring to the same object, and the
destruction of that object occurs at the
later of the times when the two
objects would have been destroyed
without the optimization. This elision
of copy operations is permitted in the
following circumstances (which may be
combined to eliminate multiple
copies):

in a return statement in a function with a class return type,
when the expression is the name of a
non-volatile automatic object with the
same cv-unqualified type as the
function return type, the copy
operation can be omitted by
constructing the automatic object
directly into the function's return
value

when a temporary class object that has not been bound to a reference
would be copied to a class object with
the same cv-unqualified type, the copy
operation can be omitted by
constructing the temporary object
directly into the target of the
omitted copy.

Compilers generally try to apply NRVO and/or RVO, although they may fail to do so in certain circumstances, such as multiple return paths. Nevertheless, it's a very useful optimization, and you shouldn't be afraid to rely on it.
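
To make the "multiple return paths" case concrete, here is a small sketch under stated assumptions: Widget, single_path(), two_paths() and copies_when_returning() are hypothetical names. The type declares only a copy constructor, so any copy that is not elided shows up in the counter.

```cpp
#include <cassert>

// Hypothetical instrumented type with a copy constructor only.
struct Widget {
    static int constructed, copied;
    Widget() { ++constructed; }
    Widget(const Widget&) { ++copied; }
};
int Widget::constructed = 0;
int Widget::copied = 0;

// One named object on every path: a good NRVO candidate.
Widget single_path() {
    Widget w;
    return w;
}

// Two different named objects may be returned. They are alive at the same
// time, so they cannot both be built in the caller's return slot.
Widget two_paths(bool first) {
    Widget a;
    Widget b;
    if (first) return a;
    return b;
}

// Returns how many copies were made while initializing a Widget from two_paths().
int copies_when_returning(bool first) {
    Widget::constructed = Widget::copied = 0;
    Widget w = two_paths(first);
    (void)w;
    assert(Widget::constructed == 2);  // a and b are always both constructed
    return Widget::copied;
}
```

Restructuring such a function so that every path returns the same local object gives the compiler the chance to apply NRVO again.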

If in doubt, you can always test your compiler by inserting "debugging statements" and see for yourself:

#include <cstdio>

class Foo
{
public:
    Foo()                      { ::printf("default constructor\n"); }
    // "Rule of 3" for copyable objects
    ~Foo()                     { ::printf("destructor\n");          }
    Foo(const Foo&)            { ::printf("copy constructor\n");    }
    Foo& operator=(const Foo&) { ::printf("copy assignment\n"); return *this; }
};

Foo getFoo()
{
    return Foo();
}

int main()
{
    Foo f = getFoo();
}

If the returned object isn't meant to be copyable, or (N)RVO fails (which is unlikely), then you can try returning a proxy object:

#include <cstdio>

struct ObjectProxy
{
private:
    ObjectProxy() {}
    friend class Object;    // Allow Object class to grab the resource.
    friend ObjectProxy f(); // Only f() can create instances of this class.
};

class Object
{
public:
    Object() { ::printf("default constructor\n"); }
    ~Object() { ::printf("destructor\n"); }
    // copy functions undefined to prevent copies
    Object(const Object&);
    Object& operator=(const Object&);
    // but we can accept a proxy
    Object(const ObjectProxy&)
    {
        ::printf("proxy constructor\n");
        // Grab resource from the ObjectProxy.
    }
};

ObjectProxy f()
{
    // Acquire large/complex resource like files
    // and store a reference to it in ObjectProxy.
    return ObjectProxy();
}

int main()
{
     Object o = f();
}

Of course, this isn't exactly obvious so proper documentation would be needed (at least a comment about it).

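You can also return a smart pointer of some kind (like std::auto_ptr or boost::shared_ptr or something similar) to an object allocated on the free store. This is necessary if you need to return instances of derived types, because returning a Base by value would slice the object. Note that std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement.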

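#include <memory>

class Base
{
public:
    virtual ~Base() {} // required: the object is deleted through a Base pointer
};
class Derived : public Base {};

// or boost::shared_ptr or any other smart pointer
std::auto_ptr<Base> f()
{
    return std::auto_ptr<Base>(new Derived);
}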
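
For C++11 and later, the same idea can be sketched with std::unique_ptr. This swaps in a different smart pointer from the one used above; Shape, Circle, make_shape() and demo() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Shape {                       // hypothetical base class
public:
    virtual ~Shape() {}             // deletion happens through a Shape pointer
    virtual std::string name() const { return "shape"; }
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

// Ownership is handed to the caller; only a pointer is transferred,
// so the size of the pointed-to object does not matter.
std::unique_ptr<Shape> make_shape() {
    return std::unique_ptr<Shape>(new Circle);
}

void demo() {
    std::unique_ptr<Shape> s = make_shape();
    assert(s->name() == "circle");  // the dynamic type survives the return
}
```

The price is a free-store allocation, so this is mainly worthwhile when polymorphism is actually needed.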

How do C++ functions return a big object or a structure?

Obviously you can't pass this using the stack...

Actually, the theory is that whenever a function is called and its stack frame is set up, room is also made for the return object. It is then up to the calling function to copy that return value somewhere within its own stack frame so that it can hold on to it.

This directly corresponds to how it works in C and C++. You have a return ...; statement, which copies some value into the return object. The return object is a temporary object, so the calling code has to store it somewhere, with something like int value = foo();.

However, it is pretty much never necessary for the called function to reserve separate space for the return value and copy it out afterwards. Instead, the calling function makes room for it and the called function places the return value directly there. That is exactly what return value optimization does, and it is one form of copy elision.
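
The "caller makes room, callee fills it in" idea can be imitated by hand. The sketch below is only an illustration of the principle, not what any particular compiler or ABI literally emits: Big, make_big_into() and built_in_callers_storage() are hypothetical names, and the explicit pointer stands in for the hidden return-slot address.

```cpp
#include <cassert>
#include <new>

struct Big {
    int data[256];
    const void* built_at;            // where the constructor actually ran
    Big() : built_at(this) {
        for (int i = 0; i < 256; ++i) data[i] = i;
    }
};

// Hand-written stand-in for the hidden return slot:
// the caller owns the storage, the callee constructs into it.
Big* make_big_into(void* slot) {
    return new (slot) Big();
}

bool built_in_callers_storage() {
    alignas(Big) unsigned char slot[sizeof(Big)];  // caller reserves room
    Big* b = make_big_into(slot);                  // callee fills it in place
    bool same = (b->built_at == static_cast<const void*>(slot));
    assert(b->data[255] == 255);
    b->~Big();                                     // manual lifetime management
    return same;
}
```

With real return-by-value the compiler does this bookkeeping itself, which is why no copy is needed when elision applies.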

Which return method is better for large data in C++/C++11?

First of all, the proper technical term for what you are doing is NRVO. RVO relates to temporaries being returned:

X foo() {
   return make_x();
}

NRVO refers to named objects being returned:

X foo() {
    X x = make_x();
    x.do_stuff();
    return x;
}

Second, (N)RVO is a compiler optimization and is not mandated (C++17 later made elision mandatory when a temporary is returned, but NRVO remains optional). However, you can be pretty sure that a modern compiler will apply (N)RVO quite aggressively.

Third, (N)RVO is not a C++11 feature; it existed long before 2011.

Fourth, what C++11 does add is the move constructor. If your class supports move semantics, the returned object is moved, not copied, even when (N)RVO does not happen. Unfortunately, not every type can be moved efficiently.

Fifth, returning through a reference parameter is a terrible antipattern. It means the object is effectively created twice (first as an 'empty' object, then again when it is populated with data), and it precludes types for which an 'empty' state is not a valid invariant.
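
The last point is easiest to see with a type that has no default constructor. The following sketch is an assumption-laden illustration: Grid, make_identity(), fill_identity() and demo() are hypothetical names, not code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type with no valid "empty" state: its size is fixed at construction.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return cells_[r * cols_ + c]; }
    std::size_t rows() const { return rows_; }
private:
    std::size_t rows_, cols_;
    std::vector<double> cells_;
};

// Return by value: the object is created once, already populated.
Grid make_identity(std::size_t n) {
    Grid g(n, n);
    for (std::size_t i = 0; i < n; ++i) g.at(i, i) = 1.0;
    return g;
}

// Output parameter: the caller must already have some Grid to pass in,
// and that object is then overwritten.
void fill_identity(std::size_t n, Grid& out) {
    out = Grid(n, n);
    for (std::size_t i = 0; i < n; ++i) out.at(i, i) = 1.0;
}

void demo() {
    Grid a = make_identity(3);
    Grid b(1, 1);                // placeholder, built only to be replaced
    fill_identity(3, b);
    assert(a.at(2, 2) == 1.0 && a.at(0, 1) == 0.0);
    assert(b.rows() == 3);
}
```

The placeholder Grid b(1, 1) exists only because the output-parameter interface demands an object up front.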

What is the best way to return multiple large objects in C++?

My main question is why foo() ends up copying? RVO should elide the
tuple from being copied but shouldn't the compiler be smart enough to
not copy the A struct? The tuple constructor could be a move
constructor

No, the move constructor could only construct it from another tuple<> object. {a,b} constructs the tuple from the component types, so the A and B objects are copied.
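
The question's foo() is not shown here, so the following is a hypothetical reconstruction of the situation rather than the original code: A is an instrumented stand-in, and copy_into_tuple() and move_into_tuple() are made-up names. It contrasts building the tuple from lvalues with moving the components in explicitly.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical instrumented component type.
struct A {
    static int copied, moved;
    A() {}
    A(const A&) { ++copied; }
    A(A&&) noexcept { ++moved; }
};
int A::copied = 0;
int A::moved = 0;

struct B {};

// The tuple elements are built from lvalues, so A is copied into the tuple.
std::tuple<A, B> copy_into_tuple() {
    A a;
    B b;
    return std::tuple<A, B>(a, b);
}

// The tuple elements are built from rvalues, so A is moved into the tuple.
std::tuple<A, B> move_into_tuple() {
    A a;
    B b;
    return std::make_tuple(std::move(a), std::move(b));
}

void demo() {
    A::copied = A::moved = 0;
    std::tuple<A, B> t1 = copy_into_tuple();
    (void)t1;
    assert(A::copied == 1);   // the copy into the tuple element

    A::copied = A::moved = 0;
    std::tuple<A, B> t2 = move_into_tuple();
    (void)t2;
    assert(A::copied == 0);   // no deep copy of A at all
    assert(A::moved >= 1);    // the move into the tuple element
}
```

Here std::move is applied to the individual components, not to the whole return expression, so the returned tuple itself remains eligible for elision.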

What is going on with quux()? I didn't think that the additional
std::move() call was necessary, but I don't understand why it ends up
causing an additional move to actually occur, i.e. I'd expect it to
have the same output as bar().

The second move happens when you move the tuple. Moving it prevents the copy elision that occurs in bar(). It is well known that wrapping the entire return expression in std::move() is harmful.
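
The effect can be reproduced without a tuple. In this sketch Payload, plain(), pessimized() and moves_for() are hypothetical names; bar() and quux() from the question are not shown here, so this is an analogous pair rather than the original functions. Some compilers warn about the second function (for example with a pessimizing-move diagnostic), which is the point of the example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type.
struct Payload {
    static int copied, moved;
    Payload() {}
    Payload(const Payload&) { ++copied; }
    Payload(Payload&&) noexcept { ++moved; }
};
int Payload::copied = 0;
int Payload::moved = 0;

Payload plain() {
    Payload p;
    return p;             // NRVO candidate; otherwise treated as an rvalue and moved
}

Payload pessimized() {
    Payload p;
    return std::move(p);  // no longer the plain name of a local: this move cannot be elided
}

// Returns how many move constructions happen while initializing from make().
int moves_for(Payload (*make)()) {
    Payload::copied = Payload::moved = 0;
    Payload result = make();
    (void)result;
    assert(Payload::copied == 0);  // neither variant falls back to a copy
    return Payload::moved;
}
```

Comparing moves_for(plain) with moves_for(pessimized) on your own compiler shows whether the plain return was elided; the std::move variant always pays for at least one move.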

Does passing a large object into a function affect speed?

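Python passes references to objects rather than copies (this is often described as "pass by object reference"), so passing a large object to a function does not copy it and should not, by itself, affect performance. Working with a larger object inside the function can, however.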
