Can a Local Variable's Memory Be Accessed Outside Its Scope

Can a local variable's memory be accessed outside its scope?

How can it be? Isn't the memory of a local variable inaccessible outside its function?

You rent a hotel room. You put a book in the top drawer of the bedside table and go to sleep. You check out the next morning, but "forget" to give back your key. You steal the key!

A week later, you return to the hotel, do not check in, sneak into your old room with your stolen key, and look in the drawer. Your book is still there. Astonishing!

How can that be? Aren't the contents of a hotel room drawer inaccessible if you haven't rented the room?

Well, obviously that scenario can happen in the real world no problem. There is no mysterious force that causes your book to disappear when you are no longer authorized to be in the room. Nor is there a mysterious force that prevents you from entering a room with a stolen key.

The hotel management is not required to remove your book. You didn't make a contract with them that said that if you leave stuff behind, they'll shred it for you. If you illegally re-enter your room with a stolen key to get it back, the hotel security staff is not required to catch you sneaking in. You didn't make a contract with them that said "if I try to sneak back into my room later, you are required to stop me." Rather, you signed a contract with them that said "I promise not to sneak back into my room later", a contract which you broke.

In this situation anything can happen. The book can be there -- you got lucky. Someone else's book can be there and yours could be in the hotel's furnace. Someone could be there right when you come in, tearing your book to pieces. The hotel could have removed the table and book entirely and replaced it with a wardrobe. The entire hotel could be just about to be torn down and replaced with a football stadium, and you are going to die in an explosion while you are sneaking around.

You don't know what is going to happen; when you checked out of the hotel and stole a key to illegally use later, you gave up the right to live in a predictable, safe world because you chose to break the rules of the system.

C++ is not a safe language. It will cheerfully allow you to break the rules of the system. If you try to do something illegal and foolish like going back into a room you're not authorized to be in and rummaging through a desk that might not even be there anymore, C++ is not going to stop you. Safer languages than C++ solve this problem by restricting your power -- by having much stricter control over keys, for example.

UPDATE

Holy goodness, this answer is getting a lot of attention. (I'm not sure why -- I considered it to be just a "fun" little analogy, but whatever.)

I thought it might be germane to update this a bit with a few more technical thoughts.

Compilers are in the business of generating code which manages the storage of the data manipulated by that program. There are lots of different ways of generating code to manage memory, but over time two basic techniques have become entrenched.

The first is to have some sort of "long lived" storage area where the "lifetime" of each byte in the storage -- that is, the period of time when it is validly associated with some program variable -- cannot be easily predicted ahead of time. The compiler generates calls into a "heap manager" that knows how to dynamically allocate storage when it is needed and reclaim it when it is no longer needed.

The second method is to have a "short-lived" storage area where the lifetime of each byte is well known. Here, the lifetimes follow a "nesting" pattern. The longest-lived of these short-lived variables will be allocated before any other short-lived variables, and will be freed last. Shorter-lived variables will be allocated after the longest-lived ones, and will be freed before them. The lifetime of these shorter-lived variables is "nested" within the lifetime of longer-lived ones.

Local variables follow the latter pattern; when a method is entered, its local variables come alive. When that method calls another method, the new method's local variables come alive. They'll be dead before the first method's local variables are dead. The relative order of the beginnings and endings of lifetimes of storages associated with local variables can be worked out ahead of time.

For this reason, local variables are usually generated as storage on a "stack" data structure, because a stack has the property that the first thing pushed on it is going to be the last thing popped off.

It's like the hotel decides to only rent out rooms sequentially, and you can't check out until everyone with a room number higher than you has checked out.
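The nesting of lifetimes described above can be observed directly in C++. The following is a minimal sketch, not anything from the original answer: Tracer, inner, outer and run_nesting_demo are hypothetical names, and the log simply records when each local's lifetime begins and ends.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: logs the start and end of its own lifetime.
struct Tracer {
    std::string name;
    std::vector<std::string>& log;

    Tracer(std::string n, std::vector<std::string>& l)
        : name(std::move(n)), log(l) {
        log.push_back("begin " + name);
    }
    ~Tracer() { log.push_back("end " + name); }
};

void inner(std::vector<std::string>& log) {
    Tracer b("inner local", log);
}   // b ends here, before the caller's local

void outer(std::vector<std::string>& log) {
    Tracer a("outer local", log);
    inner(log);
}   // a ends here, after b

// Runs the two nested calls and returns the recorded order of events.
std::vector<std::string> run_nesting_demo() {
    std::vector<std::string> log;
    outer(log);
    return log;
}
```

Printing the entries returned by run_nesting_demo() should show the outer local beginning first and ending last, which is the last-in, first-out order that makes a stack a natural fit.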

So let's think about the stack. In many operating systems you get one stack per thread and the stack is allocated to be a certain fixed size. When you call a method, stuff is pushed onto the stack. If you then pass a pointer to the stack back out of your method, as the original poster does here, that's just a pointer to the middle of some entirely valid million-byte memory block. In our analogy, you check out of the hotel; when you do, you just checked out of the highest-numbered occupied room. If no one else checks in after you, and you go back to your room illegally, all your stuff is guaranteed to still be there in this particular hotel.

We use stacks for temporary stores because they are really cheap and easy. An implementation of C++ is not required to use a stack for storage of locals; it could use the heap. It doesn't, because that would make the program slower.

An implementation of C++ is not required to leave the garbage you left on the stack untouched so that you can come back for it later illegally; it is perfectly legal for the compiler to generate code that turns back to zero everything in the "room" that you just vacated. It doesn't because again, that would be expensive.

An implementation of C++ is not required to ensure that when the stack logically shrinks, the addresses that used to be valid are still mapped into memory. The implementation is allowed to tell the operating system "we're done using this page of stack now. Until I say otherwise, issue an exception that destroys the process if anyone touches the previously-valid stack page". Again, implementations do not actually do that because it is slow and unnecessary.

Instead, implementations let you make mistakes and get away with it. Most of the time. Until one day something truly awful goes wrong and the process explodes.

This is problematic. There are a lot of rules and it is very easy to break them accidentally. I certainly have many times. And worse, the problem often only surfaces when memory is detected to be corrupt billions of nanoseconds after the corruption happened, when it is very hard to figure out who messed it up.

More memory-safe languages solve this problem by restricting your power. In "normal" C# there simply is no way to take the address of a local and return it or store it for later. You can take the address of a local, but the language is cleverly designed so that it is impossible to use it after the lifetime of the local ends. In order to take the address of a local and pass it back, you have to put the compiler in a special "unsafe" mode, and put the word "unsafe" in your program, to call attention to the fact that you are probably doing something dangerous that could be breaking the rules.

For further reading:

  • What if C# did allow returning references? Coincidentally that is the subject of today's blog post:

    https://ericlippert.com/2011/06/23/ref-returns-and-ref-locals/

  • Why do we use stacks to manage memory? Are value types in C# always stored on the stack? How does virtual memory work? And many more topics in how the C# memory manager works. Many of these articles are also germane to C++ programmers:

    https://ericlippert.com/tag/memory-management/

Can a pointer point to a local variable's memory outside its scope? [duplicate]

Because you're returning a pointer to a local variable, this is undefined behavior. This includes "appearing" to work, but it's a terrible idea to rely on it in the general case.

In this specific case, the value is left on the stack, and it appears the generated code fetches *ptr just after the call to foo, and before any other function calls. As such, the value has not been overwritten by any other function calls.

If you were to instead insert a function call between the foo(&ptr) and cout << ... statements, the value would more than likely be garbage.
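For comparison, here is a sketch of three common ways to hand a value back to a caller without pointing into a dead stack frame. The function names are hypothetical and are not taken from the question; std::make_unique assumes C++14 or later.

```cpp
#include <memory>

// Return by value: the caller receives its own copy.
int make_value() {
    int local = 42;
    return local;
}

// Out-parameter: the caller owns the storage being written to.
void fill_value(int* out) {
    *out = 42;
}

// Dynamic storage: the int lives until the unique_ptr is destroyed.
std::unique_ptr<int> make_owned() {
    return std::make_unique<int>(42);
}
```

In each case the object that is read after the call has a lifetime that extends past the call, so no later function call can overwrite it behind your back.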

How can we access auto and static variables outside their scope in C?

As you said, static variables exist throughout the life cycle of the program, i.e. the memory allocated to them is not destroyed as long as the program is running. So, to access such a variable outside its scope, we can pass around a pointer to that memory location. A small example to show this:

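#include <stdio.h>

int *func(void)
{
    static int a = 0;   /* one instance; lives for the whole program */
    a++;
    printf("a in func = %d\n", a);
    return &a;          /* safe: a outlives this call */
}

int main(void)
{
    int *p;
    p = func();
    printf("a in main from ptr : %d\n", *p);
    (*p)++;             /* increments a; *p++ would advance the pointer instead */
    p = func();
    return 0;
}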

As you can see in the example, func() returns a pointer to the static variable it has declared, and anyone who wishes to access the variable a can use that pointer. NOTE: we can only do this because a static variable's lifetime spans the whole program. Regardless of whether the static variable lives in a different function or a different file, as long as you can somehow get hold of a pointer to it, you can use it.

Now consider the case of an auto variable.

What happens if you run the above program after changing a from static to auto?
The compiler emits the warning "function returns address of local variable [-Wreturn-local-addr]", and when executing, we get a segmentation fault.
The cause is that an auto variable exists only while its scope is executing: a has storage only as long as func() is running. Once func() returns, that storage is released and p is left dangling, so dereferencing p is undefined behavior. A segmentation fault is one possible outcome, not a guaranteed one; the program could also appear to work or print garbage.

Why can't I access a variable declared with new outside of the scope it was declared in?

There is a difference between x (a pointer) and the thing it points to (an int).

The int is not "declared on the heap": the heap is not a scope, nor does it contain declarations. x, on the other hand, is just a normal variable on the stack that disappears when the execution of its containing block completes.

The int on the heap does continue to exist on the heap, but when you throw away x (the pointer) you have no way to access it and the int has leaked.
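A small sketch of the distinction, using hypothetical function names since the asker's code is not shown here. The first function copies the address out of the inner block before x disappears; the second transfers ownership with std::unique_ptr (std::make_unique assumes C++14 or later) so nothing can leak.

```cpp
#include <memory>
#include <utility>

// Raw pointer version: keep a copy of the address in an outer variable.
int read_after_inner_scope() {
    int* outer = nullptr;       // declared in the enclosing scope
    {
        int* x = new int(7);    // x is a local pointer; the int is on the heap
        outer = x;              // copy the address before x disappears
    }                           // x is gone here, the int is not
    int value = *outer;         // still valid: the int's lifetime has not ended
    delete outer;               // release it exactly once
    return value;
}

// Smart pointer version: ownership moves out of the inner scope.
int read_with_unique_ptr() {
    std::unique_ptr<int> owner;
    {
        auto x = std::make_unique<int>(7);
        owner = std::move(x);   // x is now empty; owner holds the int
    }
    return *owner;              // the int is freed when owner is destroyed
}
```

If the line `outer = x;` were removed from the first function, the int would still exist after the inner block, but no variable would hold its address any more, which is exactly the leak described above.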

Accessing a scope-local variable through reference

C compilers aren't required to generate an error when you do something like this. That's part of what makes C fast.

That also means that if you do something you're not supposed to do, you can trigger undefined behavior. This essentially means that no guarantees can be made regarding what the program will do. It could crash, it could output strange results, or it could appear to work properly, as in your case.

How UB manifests itself can change by making a seemingly unrelated change, such as adding an unused local variable or a call to printf for debugging.

In this particular case the memory in question hasn't been reserved for some other use so it is still in the valid memory space for the program and hasn't yet been overwritten. But again, you can't rely on that behavior.

Pointer to local variable is stored outside the scope of this variable

The ret_ptr parameter to the function in question is expected to point to a variable in the calling function. This pointer is then dereferenced for both reading and writing this external variable.

The if (ret_ptr == NULL) block checks whether the caller actually passed in the address of some variable. If not, this pointer is then made to point to the local variable unused so that the pointer can still be safely dereferenced later in the code. But since ret_ptr now points to a local, changes made by dereferencing it are not seen outside the function. This is fine, since the caller passed in NULL for ret_ptr. Similarly, since ret_ptr is a parameter, any changes to it are not visible outside of the function.

Nothing needs to be refactored here. The code works as intended with regard to ret_ptr. This is a false positive from PVS-Studio.

EDIT:

This is NOT a false positive. The unused variable is defined in a narrower scope than ret_ptr, namely the first if block in the function. After the if block, ret_ptr is dereferenced. If it was pointing to unused, that variable's lifetime has already ended, and dereferencing ret_ptr invokes undefined behavior.

To fix this, unused must be declared at function scope, above the if block:

gearman_job_st *gearman_worker_grab_job(gearman_worker_st *worker_shell,
                                        gearman_job_st *job,
                                        gearman_return_t *ret_ptr)
{
    gearman_return_t unused;
    if (ret_ptr == NULL)
    {
        ret_ptr= &unused;
    }

    if (worker_shell and worker_shell->impl())
    {
        ...
    }

    assert(*ret_ptr != GEARMAN_MAX_RETURN);
    return NULL;
}

Why isn't the local variable going out of scope? [duplicate]

Local variables are allocated on the stack. When you call fun() from main(), the stack looks something like this:

+---------------+ <---- Stack pointer
| local var x   |
+---------------+ <---- Address of 'x'
| Return addr   |
| in main()     |
+---------------+
| Local vars of |
| main()        |
+---------------+
| ...           |
+---------------+

When you go back to main(), the local variables, return address and parameters are popped from the stack. But the stack is not cleared (doing so would consume too much CPU!). So, only the stack pointer moves:

+---------------+
| local var x   |
+---------------+ <---- Address of 'x'
| Return addr   |
| in main()     |
+---------------+ <---- Stack pointer moved with the pops
| Local vars of |
| main()        |
+---------------+
| ...           |
+---------------+

Everything above the stack pointer in the diagram (that is, at lower addresses) is considered invalid, even though it has not been cleared. That is why you are lucky enough to still read the value of x in main().

But let's say that you call another function right after fun():

#include <stdio.h>

int *p = NULL;

void fun2(void)
{
    int var = 18;
    int var2 = 43;

    /* %p expects a void *, hence the casts */
    printf("fun2() called, var@%p=%d, var2@%p=%d\n",
           (void *)&var, var, (void *)&var2, var2);
}

int *fun(void)
{
    int x = 5;
    p = &x;
    return p;   /* dangling as soon as fun() returns */
}

// Driver Code
int main(int argc, char *argv[])
{
    int *px;

    px = fun();
    printf("x@%p=%d\n", (void *)px, *px);   /* undefined behavior: x is dead */
    if (argc != 1) {
        fun2();
    }
    printf("x@%p=%d\n", (void *)px, *px);
    return 0;
}

When the program does not call fun2(), it behaves like yours, except that I also print the address of x:

$ gcc try.c -o try
$ ./try
x@0x7ffd5beb5f04=5

When the program is passed any parameter, we call fun2() after fun() and we display x before and after the call to fun2():

$ ./try any_param
x@0x7ffeadacc084=5
fun2() called, var@0x7ffeadacc080=18, var2@0x7ffeadacc084=43
x@0x7ffeadacc084=43

We can see that the value of x changes to 43 after the call to fun2(), because the local variable var2 of fun2() was placed at the address that x occupied while fun() was running. Hence the same stack address, 0x7ffeadacc084, and the new value 43, which is in fact the value of var2.

Here is what the stack looks like after calling fun2() (the former data of fun() has been overwritten by the data of fun2()):

+---------------+
| local var var |
+---------------+ <---- Address of 'var' = 0x7ffeadacc080
| local var var2|
+---------------+ <---- Address of 'var2' = 0x7ffeadacc084
| Return addr   |
| in main()     |
+---------------+ <---- Stack pointer moved with the pops
| Local vars of |
| main()        |
+---------------+
| ...           |
+---------------+

PS: The stack grows from high to low addresses. Hence, var is located at a lower address than var2.

How can a local variable change the value of a global variable in C++?

How can the operation n / 10, which is inside the inner loop while(n>0), change the value of n, which is used by that same while loop?

For the same reason that digit_sum = digit_sum + last_digit; can change the value of digit_sum, even though it is declared outside of the loop, same as n.

How can an operation on a local variable inside the inner while loop change the value of a variable from an enclosing scope, declared outside the while loop?

Why shouldn't it be able to? n (and digit_sum) is in scope inside the loop. A variable's lifetime is the duration of the scope that it is declared in, but inner scopes also have access to it.
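The asker's code is not reproduced here, so the following is a hypothetical reconstruction of a digit-sum loop of the kind being discussed. It assumes the loop contains an assignment such as n = n / 10; the division alone would not modify n.

```cpp
// Hypothetical reconstruction; names follow the ones quoted in the question.
int digit_sum_of(int n) {
    int digit_sum = 0;                        // declared outside the loop
    while (n > 0) {
        int last_digit = n % 10;              // local to the loop body
        digit_sum = digit_sum + last_digit;   // modifies the outer variable
        n = n / 10;                           // the assignment is what changes n
    }
    return digit_sum;
}
```

Both n and digit_sum are declared in the enclosing function scope, so they are visible and assignable inside the loop body, while last_digit is created afresh on every iteration and is not visible after the loop.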


