What Happens When a C++ Reference Leaves Its Scope

What happens when a C++ reference leaves its scope?

This is undefined behavior, and you were simply lucky that the memory for a hadn't been reused for anything else yet. In a more complex scenario you would almost certainly get garbage. On my machine this code prints random garbage, most likely because I am on a 64-bit machine whose calling convention passes arguments in registers, and registers get reused far more often than main memory.

So, to answer your question of "what happens": in this scenario, the reference is likely little more than a limited pointer with friendlier syntax :-). Under the hood, the address of a is stored. Later the a object goes out of scope, but the B object's reference to that a is not "auto-magically" updated to reflect this. Hence you now have an invalid (dangling) reference.

Using this invalid reference will yield just about anything, sometimes crashes, sometimes just bunk data.
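
As a rough illustration of the point above, here is a minimal sketch. The names Holder and OwningHolder are hypothetical (they are not from the original question). It shows that a reference member simply aliases the original object while that object is alive, and that holding a copy is one way to avoid the dangling reference altogether.

```cpp
// Hypothetical class mirroring the "B holds a reference to a" situation.
struct Holder {
    const int& ref;  // an alias only; it does not keep the referent alive
    explicit Holder(const int& r) : ref(r) {}
};

// Returns true if the reference member refers to the very same object.
bool aliases_original() {
    int a = 42;
    Holder h(a);
    return &h.ref == &a;  // valid: a is still alive at this point
}

// One possible fix: own a copy instead of referring to a local.
struct OwningHolder {
    int value;
};

OwningHolder make_owning() {
    int a = 42;
    return OwningHolder{a};  // the value is copied, so nothing dangles
}
```

Returning a Holder from a function like make_owning would leave its ref member dangling, which is exactly the situation described above.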


EDIT: Thanks to Omnifarious, I've been thinking about this. There is a rule in C++ that basically says that if you bind a temporary directly to a const reference, the temporary's lifetime is extended to match that of the reference. That raised a new question.
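
A small sketch of that rule, assuming a hypothetical Noisy type that appends to a log string when it is constructed and destroyed. The log shows that the temporary is still alive while the local const reference is in use and is destroyed only when the reference's scope ends.

```cpp
#include <string>

// Hypothetical type that records its construction and destruction in a log.
struct Noisy {
    std::string* log;
    explicit Noisy(std::string* l) : log(l) { *log += "ctor;"; }
    ~Noisy() { *log += "dtor;"; }
};

// Returns a temporary (prvalue) Noisy.
Noisy make_noisy(std::string* log) { return Noisy(log); }

std::string lifetime_extension_log() {
    std::string log;
    {
        // The temporary is bound directly to a local const reference,
        // so its lifetime is extended to the end of this block.
        const Noisy& r = make_noisy(&log);
        (void)r;
        log += "use;";
    }  // the extended temporary is destroyed here
    log += "end;";
    return log;
}
```

Note that this extension applies to a reference bound directly to the temporary in a local scope like this one; a reference member initialized from a constructor argument does not keep the argument alive.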

EDIT: Moved to separate question for those interested (const reference to temporary oddity)

C++: Reference to out of scope object

This causes undefined behaviour. Don't do it.

Implementation-wise, realistically, the reference would point into the stack where the stack frame for the call to foo used to be. In many cases that memory still holds the old values, so the error is often not immediately apparent. Therefore, you should take care never to create a dangling reference like that.

What happens when a variable goes out of scope?

The actual behavior of your code sample is determined by two primary factors: 1) the behavior is undefined by the language, 2) an optimizing compiler will generate machine code that does not physically match your C code.

For example, despite the fact that the behavior is undefined, GCC can (and will) easily optimize your code to a mere

printf("ptr = %d\n", 17);

which means that the output you see has very little to do with what happens to any variables in your code.

If you want the behavior of your code to better reflect what happens physically, you should declare your pointers as pointers to volatile. The behavior will still be undefined, but at least it will restrict some optimizations.

Now, as to what happens to local variables when they go out of scope. Nothing physical happens. A typical implementation will allocate enough space in the program stack to store all variables at the deepest level of block nesting in the current function. This space is typically allocated in the stack in one shot at the function startup and released back at the function exit.

That means that the memory formerly occupied by tmp remains reserved in the stack until the function exits. That also means that the same stack space can (and will) be reused by different variables having approximately the same level of "locality depth" in sibling blocks. The space will hold the value of the last variable until some other variable declared in a sibling block overwrites it. In your example nothing overwrites the space formerly occupied by tmp, so you will typically see the value 17 survive intact in that memory.

However, if you do this

#include <stdio.h>

int main(void) {
    volatile int *ptr;
    volatile int *ptrd;

    { // Block
        int tmp = 17;
        ptr = &tmp; // Just to see if the memory is cleared
    }

    { // Sibling block
        int d = 5;
        ptrd = &d;
    }

    /* Undefined behavior: both objects are dead at this point */
    printf("ptr = %d %d\n", *ptr, *ptrd);
    printf("%p %p\n", (void *)ptr, (void *)ptrd);
}

you will see that the space formerly occupied by tmp has been reused for d and its former value has been overwritten. The second printf will typically output the same pointer value for both pointers.
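
The address reuse can also be observed without dereferencing a dead pointer. The following is only a sketch (the function name is made up for illustration): it records each address as an integer while the variable is still alive and compares the integers afterwards. Whether the two addresses match depends on the compiler and optimization level, so no particular result is guaranteed.

```cpp
#include <cstdint>

// Returns true if the compiler happened to place both sibling-block
// variables at the same address. The result is implementation-specific.
bool sibling_blocks_share_address() {
    std::uintptr_t first = 0;
    std::uintptr_t second = 0;
    {
        volatile int tmp = 17;
        first = reinterpret_cast<std::uintptr_t>(&tmp);  // taken while tmp is alive
    }
    {
        volatile int d = 5;
        second = reinterpret_cast<std::uintptr_t>(&d);   // taken while d is alive
    }
    return first == second;  // compares integers only; no dead object is read
}
```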

What happens in C++ when I pass an object by reference and it goes out of scope?

It works exactly as for pointers: using something (pointer/reference) that refers to an object that no longer exists is undefined behavior. It may appear to work but it can break at any time.

Warning: what follows is a quick explanation of why such method calls can seem to work on several occasions, for informational purposes only; when writing actual code you should rely only on what the standard says.

As for the behavior you are observing: on most (all?) compilers method calls are implemented as function calls with a hidden this parameter that refers to the instance of the class on which the method is going to operate. But in your case, the this pointer isn't being used at all (the code in the function is not referring to any field, and there's no virtual dispatch), so the (now invalid) this pointer is not used and the call succeeds.

In other instances it may appear to work even if it's referring to an out-of-scope object because its memory hasn't been reused yet (although the destructor has already run, so the method will probably find the object in an inconsistent state).

Again, you shouldn't rely on this information, it's just to let you know why that call still works.
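
To make the "hidden this parameter" idea concrete, here is a conceptual sketch. Widget and the *_lowered functions are hypothetical names, and real compilers and ABIs differ in detail. It shows why a method that never touches this behaves like a free function that ignores its first argument, while a method that reads a field must actually dereference the pointer.

```cpp
// Hypothetical class with one method that ignores the object
// and one that reads a field.
struct Widget {
    int value;
    int answer() const { return 42; }  // never touches *this
    int get() const { return value; }  // reads through this
};

// Roughly what the compiler conceptually generates for each method:
// a plain function that takes the object pointer as an extra parameter.
int answer_lowered(const Widget* /*self*/) { return 42; }
int get_lowered(const Widget* self) { return self->value; }
```

With a live object both forms agree. With a dead object, answer_lowered has nothing to go wrong with because self is never read, which is why such a call can appear to work even though it is still undefined behavior.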

Using a reference member out of scope

How come we are still getting the value 123 if the argument was destroyed?

Because nothing guarantees you won't. In C++, accessing an object whose lifetime has ended (and your temporary is dead when you access it) results in undefined behavior. Undefined behavior doesn't mean "crash", or "get empty result". It means the language specification doesn't prescribe an outcome. You can't reason about the results of the program from a pure C++ perspective.

What may happen is that your C++ implementation reserves storage for that temporary. Even though it may reuse that location after p is initialized, it doesn't have to. So you end up reading the "proper value" by sheer luck.
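
One way to guard against this kind of bug is to make the class refuse temporaries at compile time by deleting the rvalue overload of the constructor. This is a sketch with a hypothetical RefHolder class, not the code from the question.

```cpp
#include <type_traits>

// Hypothetical class that stores a reference to its constructor argument.
struct RefHolder {
    const int& ref;
    explicit RefHolder(const int& r) : ref(r) {}
    explicit RefHolder(const int&&) = delete;  // reject temporaries
};

// Lvalues are accepted; rvalues (temporaries) no longer compile.
static_assert(std::is_constructible<RefHolder, int&>::value,
              "binding to a named object is allowed");
static_assert(!std::is_constructible<RefHolder, int&&>::value,
              "binding to a temporary is rejected");

int read_from_live_object() {
    int arg = 123;
    RefHolder h(arg);  // fine: arg outlives h
    return h.ref;
}
```

With this in place, writing RefHolder h(123); becomes a compile error instead of a silent dangling reference.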

When is an object out of scope?

First, remember that objects in C++ can be created either on the stack or on the heap.

A scope (loosely called a stack frame here, although strictly a stack frame belongs to a function call) is delimited by a block. That can be as big as a function body or as small as a flow-control block (while/if/for etc.). An arbitrary {} pair enclosing an arbitrary block of code also constitutes such a frame. Any local variable defined within a frame will go out of scope once the program exits that frame. When a stack variable goes out of scope, its destructor is called.

So here is a classic example of a stack frame (an execution of a function) and a local variable declared within it, which will go out of scope once the stack frame exits - once the function finishes:

void bigSideEffectGuy () {
    BigHeavyObject b (200);
    b.doSomeBigHeavyStuff();
}
bigSideEffectGuy();
// a BigHeavyObject called b was created during the call,
// and it went out of scope after the call finished.
// The destructor ~BigHeavyObject() was called when that happened.

Here is an example where we see a stack frame being just the body of an if statement:

if (myCondition) {
    Circle c (20);
    c.draw();
}
// c is now out of scope
// The destructor ~Circle() has been called
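
The destructor call at the closing brace can be made visible with a small counter. This is a sketch using a hypothetical ScopeProbe type that increments a counter on construction and decrements it on destruction.

```cpp
// Hypothetical helper: tracks how many probes are currently alive.
struct ScopeProbe {
    int& live;
    explicit ScopeProbe(int& counter) : live(counter) { ++live; }
    ~ScopeProbe() { --live; }
};

// Number of live probes while still inside the block.
int live_inside_block() {
    int live = 0;
    int seen = 0;
    {
        ScopeProbe p(live);
        seen = live;  // the probe is alive here
    }                 // ~ScopeProbe() runs at this brace
    return seen;
}

// Number of live probes after the block has been exited.
int live_after_block() {
    int live = 0;
    {
        ScopeProbe p(live);
    }
    return live;
}
```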

The only way for a stack-created object to "remain in scope" after the frame is exited is if it is the return value of a function. But that is not really "remaining in scope", because the object is copied or moved into the caller (or, when the compiler elides the copy, constructed directly in the caller's storage). So the original goes out of scope, but its value survives. Example:

Circle myFunc () {
    Circle c (20);
    return c;
}
// The original c went out of scope.
// But the object was copied (or moved) back to another
// scope (the previous stack frame) as a return value.
// If the compiler elides that copy, no destructor runs here;
// otherwise c's destructor runs after the copy is made.

Now, an object can also be declared on the heap. For the sake of this discussion, think of the heap as an amorphous blob of memory. Unlike the stack, which automatically allocates and de-allocates the necessary memory as you enter and exit stack frames, you must manually reserve and free heap memory.

An object declared on the heap does, after a fashion, "survive" between stack frames. One could say that an object declared on the heap never goes out of scope, but that's really because the object is never really associated with any scope. Such an object must be created via the new keyword, and must be referred to by a pointer.

It is your responsibility to free the heap object once you are done with it. You free heap objects with the delete keyword. The destructor on a heap object is not called until you free the object.

The pointers that refer to heap objects are themselves usually local variables associated with scopes. Once you are done using the heap object, you allow the pointer(s) referring to it to go out of scope. If you haven't explicitly freed the object the pointer is pointing to, then the block of heap memory will never be freed until the process exits (this is called a memory leak).
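
The following sketch illustrates that distinction, using a hypothetical Tracked type with a static counter of live instances. The local pointer goes out of scope, yet the heap object stays alive until delete is called through another pointer that still holds its address.

```cpp
// Hypothetical type that counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns how many Tracked objects exist after the inner pointer is gone.
int alive_after_pointer_scope_exit() {
    Tracked* outer = nullptr;
    {
        Tracked* inner = new Tracked;  // inner is a local pointer
        outer = inner;                 // keep the address in an outer pointer
    }  // inner is gone, but the heap object is untouched
    int alive = Tracked::alive;
    delete outer;  // the destructor runs only now
    return alive;
}
```

Without the copy into outer, the address would be lost at the closing brace and the object could never be freed, which is the memory leak described above.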

Think of it all this way: an object created on the stack is like a balloon taped to a chair in a room. When you exit the room, the balloon automatically pops. An object created on the heap is like a balloon on a ribbon, tied to a chair in a room. The ribbon is the pointer. When you exit the room, the ribbon automatically vanishes, but the balloon just floats to the ceiling and takes up space. The proper procedure is to pop the balloon with a pin, and then exit the room, whereupon the ribbon will disappear. But the good thing about the balloon on the ribbon is that you can also untie the ribbon, hold it in your hand, exit the room, and take the balloon with you.

So to go to your linked list example: typically, nodes of such a list are declared on the heap, with each node holding a pointer to the next node. All of this is sitting on the heap and never goes out of scope. The only thing that could go out of scope is the pointer that points to the root of the list - the pointer you use to reference into the list in the first place. That can go out of scope.

Here's an example of creating stuff on the heap, and the root pointer going out of scope:

if (myCondition) {
    Node* list_1 = new Node (3);
    Node* list_2 = new Node (4);
    Node* list_3 = new Node (5);

    list_1->next = list_2;
    list_2->next = list_3;
    list_3->next = nullptr;
}
// The list still exists
// However list_1 just went out of scope
// So the list is "marooned" as a memory leak
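
One common way to avoid marooning the list is to let each node own the next one through std::unique_ptr, so that the whole chain is released when the root goes out of scope. This is a sketch with its own hypothetical Node definition (including a live-instance counter added purely for illustration), not the Node from the question.

```cpp
#include <memory>

// Hypothetical node type: each node owns its successor.
struct Node {
    static int live;  // illustration only: number of nodes currently alive
    int value;
    std::unique_ptr<Node> next;
    explicit Node(int v) : value(v) { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds a three-node list and reports how many nodes exist while
// the root is still in scope.
int nodes_alive_inside_scope() {
    auto head = std::make_unique<Node>(3);
    head->next = std::make_unique<Node>(4);
    head->next->next = std::make_unique<Node>(5);
    return Node::live;
}  // head goes out of scope here and the whole chain is destroyed
```

The chain is destroyed recursively, node by node, so for very long lists an explicit loop that releases nodes one at a time may be preferable.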

Referencing a char* that went out of scope

Inside the scope where b is defined, it is assigned the address of a string literal. These literals typically live in a read-only section of memory as opposed to the stack.

When you do a=b you assign the value of b to a, i.e. a now contains the address of a string literal. This address is still valid after b goes out of scope.

If you had taken the address of b and then attempted to dereference that address, then you would invoke undefined behavior.

So your code is valid and does not invoke undefined behavior, but the following does:

int *a = NULL;
{
    int b = 6;
    a = &b;
}

printf("b=%d\n", *a);

Another, more subtle example:

char *a = NULL;
{
    char b[] = "stackoverflow";
    a = b;
}

printf(a);

The difference between this example and yours is that b, which is an array, decays to a pointer to the first element when assigned to a. So in this case a contains the address of a local variable which then goes out of scope.
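
In C++, the two cases can be contrasted with a short sketch (the function names are made up for illustration). A pointer to a string literal stays valid after the inner scope ends, and copying a local array into a std::string before the scope ends is one way to keep the characters safely.

```cpp
#include <string>

// The literal has static storage duration, so its address
// remains valid after b goes out of scope.
const char* literal_outlives_scope() {
    const char* a = nullptr;
    {
        const char* b = "stackoverflow";
        a = b;  // copies the address of the literal, not the address of b
    }
    return a;
}

// The local array dies at the closing brace, so the characters
// are copied into a std::string while the array is still alive.
std::string copy_before_scope_ends() {
    std::string a;
    {
        char b[] = "stackoverflow";
        a = b;  // copies the characters, not the address
    }
    return a;
}
```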

EDIT:

As a side note, it's bad practice to pass a variable as the first argument of printf, as that can lead to a format string vulnerability. Better to use a string constant as follows:

printf("%s", a);

Or more simply:

puts(a);

How can we return a variable by reference while the scope has gone as

Note: OP and the other answers suggest variations on returning a pre-existing object (global, static in function, member variable). This answer, however, discusses returning a variable whose lifetime starts in the function, which I thought was the spirit of the question, i.e.:

how can we return a variable by reference while the scope of the returning function has gone and its vars have been destroyed as soon as returning the var.

The only way to return by reference a new object is by dynamically allocating it:

int& foo() {
    return *(new int);
}

Then, later on:

delete &myref;

Now, of course, that is not the usual way of doing things, nor what people expect when they see a function that returns a reference. See all the caveats at Deleting a reference.

It could make some sense, though, if the object is one of those that "commits suicide" later by calling delete this. Again, this is not typical C++ either. More information about that at Is delete this allowed?.

Instead, when you want to return an object that is constructed inside a function, what you usually do is either:

  • Return by value (possibly taking advantage of copy elision).
  • Return a dynamically allocated object (either returning a raw pointer to it or a class wrapping it, e.g. a smart pointer).

But neither of these two approaches returns the actual object by reference.
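
The two usual approaches listed above can be sketched as follows. The function names are hypothetical, and std::unique_ptr is used here as one example of a class wrapping the dynamically allocated object.

```cpp
#include <memory>
#include <vector>

// Return by value: the local is moved or its copy is elided,
// so the caller receives its own object.
std::vector<int> make_by_value() {
    std::vector<int> v{1, 2, 3};
    return v;
}

// Return a dynamically allocated object wrapped in a smart pointer:
// ownership is transferred to the caller and released automatically.
std::unique_ptr<int> make_on_heap() {
    return std::make_unique<int>(7);
}
```

In both cases the caller ends up owning an object that outlives the function's locals, without any dangling reference and without a manual delete.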


