Why Don't I Get a Segmentation Fault When I Write Beyond the End of an Array?

Why don't I get a segmentation fault when I write beyond the end of an array?

I'm guessing you're coming from Java or a Java-like language where, the moment you step outside the bounds of an array, you get an ArrayIndexOutOfBoundsException.

Well, C expects more from you: it reserves the space you ask for, but it doesn't check whether you stay inside the bounds of that reserved space. Once you step outside it, the program has that dreaded undefined behavior.
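A minimal sketch of what that looks like (the array name and values are made up for illustration):

#include <stdio.h>

int main(void) {
    int a[4] = {0, 0, 0, 0};

    a[4] = 42;                   /* one past the end: C performs no bounds check */
    printf("still running\n");   /* may print, may crash, may do anything at all */
    return 0;
}

Whether this crashes depends entirely on what happens to live next to a in memory.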

And remember for the future: if you have a bug in your program that you can't seem to find, and everything looks fine when you read the code or step through it in a debugger, there is a good chance you're "out of bounds" and accessing memory that was never allocated for that object.

Why doesn't a segmentation fault occur here? [duplicate]

C++ has a notion of undefined behavior. Accessing a vector out-of-bounds is a typical example of undefined behavior (because vector::operator[] performs no bounds checking). Nothing meaningful can be said about the outcome of the program.

But to explain what probably happens...

A vector is a class that usually contains a pointer to a heap-allocated array and its size (the "capacity").

An empty vector has a null pointer value. Dereferencing a null pointer often immediately leads to a segfault because there's no virtual memory region allocated at address 0.

On the other hand, a vector of capacity 2 points to an array of size 2. Writing past it is often possible: you simply overwrite heap memory that happens to reside immediately after that array. This is called a heap buffer overflow. In a simple program it may appear to work fine, but in a larger program something will eventually go wrong further down the line.
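A C analogue of that heap buffer overflow, as a sketch (using malloc directly in place of the vector's internal allocation):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p = malloc(2 * sizeof *p);   /* room for exactly two ints */
    if (p == NULL)
        return 1;

    p[0] = 1;
    p[1] = 2;
    p[2] = 3;               /* heap buffer overflow: scribbles past the block */

    printf("%d\n", p[2]);   /* may appear to work in a small program */
    free(p);                /* corruption often surfaces here, or much later */
    return 0;
}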

Segmentation fault doesn't come up immediately after accessing out-of-bounds memory

Your array tab will be located someplace on the stack. When you print past the end of the array, you are actually printing values of other memory locations on the stack.

The reason it takes around 1000 iterations to get the seg fault is that the stack is mapped in pages, and pages are usually 4 KB in size. Once you read around 1000 ints, you are around 4000 bytes past where you should be and you have crossed over to an unmapped page. Reading from an unmapped page is what actually triggers the seg fault.

Take note that I am only explaining what happened on your system. There is no guarantee that the stack will be mapped in pages or that the pages will be 4 KB in size. Technically, you are triggering undefined behavior and anything can happen. You might find it illuminating to do printf("%p\n", &tab[i]); on each iteration and see what the last address printed is before you get the seg fault. If I'm right about the 4 KB pages, the last address you see printed will end in ffc, because that is the address of the last 4-byte int on the page.
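A sketch of that experiment, assuming a stack array named tab as in the question (the size 10 is invented):

#include <stdio.h>

int main(void) {
    int tab[10] = {0};   /* small stack array, as in the question */

    for (int i = 0; ; i++) {
        printf("%p = ", (void *)&tab[i]);   /* the address about to be read */
        printf("%d\n", tab[i]);             /* out of bounds once i >= 10 */
    }
}

It keeps printing until the read crosses into an unmapped page and the segfault fires.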

Why doesn't my program crash when I write past the end of an array?

Something I wrote some time ago for educational purposes...

Consider the following C program:

int q[200];

int main(void) {
    int i;
    for (i = 0; i < 2000; i++) {
        q[i] = i;   /* writes far past q[199] */
    }
    return 0;
}

After compiling and executing it, a core dump is produced:

$ gcc -ggdb3 segfault.c
$ ulimit -c unlimited
$ ./a.out
Segmentation fault (core dumped)

Now use gdb to perform a post-mortem analysis:

$ gdb -q ./a.out core
Program terminated with signal 11, Segmentation fault.
[New process 7221]
#0 0x080483b4 in main () at s.c:8
8 q[i]=i;
(gdb) p i
$1 = 1008
(gdb)

Huh, the program didn't segfault when we wrote outside the 200 allocated items; instead it crashed at i=1008. Why?

Enter pages.

One can determine the page size in several ways on UNIX/Linux; one way is to use the system function sysconf() like this:

#include <stdio.h>
#include <unistd.h> // sysconf(3)

int main(void) {
    printf("The page size for this system is %ld bytes.\n",
           sysconf(_SC_PAGESIZE));

    return 0;
}

which gives the output:

The page size for this system is 4096 bytes.

or one can use the command-line utility getconf like this:

$ getconf PAGESIZE
4096

post mortem

It turns out that the segfault occurs not at i=200 but at i=1008. Let's figure out why. Start gdb to do some post-mortem analysis:

$ gdb -q ./a.out core

Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
[New process 4605]
#0 0x080483b4 in main () at seg.c:6
6 q[i]=i;
(gdb) p i
$1 = 1008
(gdb) p &q
$2 = (int (*)[200]) 0x804a040
(gdb) p &q[199]
$3 = (int *) 0x804a35c

q's last element, q[199], begins at address 0x804a35c and occupies the four bytes up through 0x804a35f. The page size is, as we saw earlier, 4096 bytes, and the 32-bit word size of the machine means that a virtual address breaks down into a 20-bit page number and a 12-bit offset.

q[] ended in virtual page number:

0x804a = 32842

at offset:

0x35c = 860

The last element, q[199], occupies bytes 860 through 863 of that page, so the first byte past the array sits at offset 860 + 4 = 864. That left:

4096 - 864 = 3232

bytes on the page on which q[] was allocated. That space can hold:

3232 / 4 = 808

integers, and the code treated it as if it contained elements of q at positions 200 to 1007.

We all know that those elements don't exist and the compiler didn't complain; neither did the hardware, since we have write permission to that page. Only when i=1008 did q[i] refer to an address on a different page, one for which we didn't have write permission; the virtual memory hardware detected this and triggered the segfault.

An integer is stored in 4 bytes, meaning that this page contains 808 (3232 / 4) additional fake elements, so accessing q[200], q[201], and so on all the way up to element 199 + 808 = 1007 (q[1007]) happens not to trigger a seg fault, even though it is still undefined behavior. When accessing q[1008] you enter a new page, for which the permissions are different.
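One can check this arithmetic from inside the program itself; a sketch (the exact numbers depend on where the linker places q):

#include <stdio.h>
#include <stdint.h>   /* uintptr_t */
#include <unistd.h>   /* sysconf(3) */

int q[200];

int main(void) {
    uintptr_t end = (uintptr_t)&q[200];   /* one past the end: legal to form */
    long page = sysconf(_SC_PAGESIZE);

    long left = page - (long)(end % (uintptr_t)page);   /* bytes left on q's page */
    printf("bytes left on the page: %ld\n", left);
    printf("extra ints that 'fit':  %ld\n", left / (long)sizeof(int));
    return 0;
}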

Why don't I get a Segmentation Fault? [duplicate]

Because undefined behavior doesn't mean "you will receive a segfault", that would be defined behavior.

Let's assume you're running in debug mode and your compiler is padding your stack/local variable space. You're probably just writing into some unused part of the stack space.

Build a release version on a Monday when your compiler is feeling cranky and now you overwrite the return address, or the code that sets up the call to printf, whatever. Oops.

Just one possible outcome, but you get the idea.

Why does this program NOT segfault? [duplicate]

char rev[2] reserves 2 * sizeof(char) bytes of memory for the array rev. You are accessing memory that was not allocated for it, which may or may not cause visible errors.

It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.

When you do rev[2] or rev[3] you are accessing addresses rev + 2 and rev + 3, which were not allocated for rev. Since it's a small program and nothing important lives there, it doesn't cause any visible errors.

Regarding the edit:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char arr[4] = "TEST";   /* exactly 4 chars: no room for the '\0' */
    char rev[2] = "00";     /* exactly 2 chars: no room for the '\0' */

    printf("%s\n", rev);    /* %s expects a terminator: undefined behavior */
    return 0;
}

%s prints until a null terminator is encountered, and the sizes you have given arr and rev leave no room for that terminator. Try changing the declarations as follows:

char arr[5] = "TEST";
char rev[3] = "00";

The program will now work as intended: arr will contain TEST\0 and rev will contain 00\0, where \0 is the null character in C.
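A safer habit is to omit the size and let the compiler count the characters, terminator included; a sketch:

#include <stdio.h>

int main(void) {
    char arr[] = "TEST";   /* compiler sizes it to 5, '\0' included */
    char rev[] = "00";     /* sized to 3, '\0' included */

    printf("%s\n", rev);
    return 0;
}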


Definitive List of Common Reasons for Segmentation Faults

WARNING!


The following are potential reasons for a segmentation fault. It is virtually impossible to list all reasons. The purpose of this list is to help diagnose an existing segfault.

The relationship between segmentation faults and undefined behavior cannot be stressed enough! All of the below situations that can create a segmentation fault are technically undefined behavior. That means that they can do anything, not just segfault; as someone once said on USENET, "it is legal for the compiler to make demons fly out of your nose." Don't count on a segfault happening whenever you have undefined behavior. You should learn which undefined behaviors exist in C and/or C++, and avoid writing code that has them!

More information on Undefined Behavior:

  • What is the simplest standard conform way to produce a Segfault in C?
  • Undefined, unspecified and implementation-defined behavior
  • How undefined is undefined behavior?

What Is a Segfault?

In short, a segmentation fault is caused when the code attempts to access memory that it doesn't have permission to access. Every program is given a piece of memory (RAM) to work with, and for security reasons, it is only allowed to access memory in that chunk.

For a more thorough technical explanation about what a segmentation fault is, see What is a segmentation fault?.

Here are the most common reasons for a segmentation fault error. Again, these should be used in diagnosing an existing segfault. To learn how to avoid them, learn your language's undefined behaviors.

This list is also no replacement for doing your own debugging work. (See that section at the bottom of the answer.) These are things you can look for, but your debugging tools are the only reliable way to zero in on the problem.


Accessing a NULL or uninitialized pointer

If you have a pointer that is NULL (ptr = 0) or that is completely uninitialized (it isn't set to anything at all yet), attempting to access or modify memory through that pointer is undefined behavior.

int* ptr = 0;   /* a null pointer */
*ptr += 5;      /* dereferencing it: undefined behavior, often a segfault */

Since a failed allocation with malloc returns a null pointer (as does new(std::nothrow) in C++; plain new throws instead), you should always check that your pointer is not NULL before working with it.
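A minimal sketch of that check in C:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *ptr = malloc(100 * sizeof *ptr);

    if (ptr == NULL) {   /* malloc can fail: check before use */
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    *ptr = 5;            /* safe only after the check */
    free(ptr);
    return 0;
}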

Note also that even reading values (without dereferencing) of uninitialized pointers (and variables in general) is undefined behavior.

Sometimes this use of an uninitialized pointer can be quite subtle, such as trying to interpret such a pointer as a string in a C print statement.

char id[32];              /* destination buffer, declared here so the snippet is self-contained */
char* ptr;                /* never initialized */
sprintf(id, "%s", ptr);   /* interprets an indeterminate pointer as a string: UB */

See also:

  • How to detect if variable uninitialized/catch segfault in C
  • Concatenation of string and int results in seg fault C

Accessing a dangling pointer

If you use malloc or new to allocate memory, and then later free or delete that memory through the pointer, that pointer is now considered a dangling pointer. Dereferencing it (as well as simply reading its value, granted you didn't assign some new value to it such as NULL) is undefined behavior, and can result in a segmentation fault.

Something* ptr = new Something(123, 456);
delete ptr;                           // ptr now dangles
std::cout << ptr->foo << std::endl;   // use after free: undefined behavior

See also:

  • What is a dangling pointer?
  • Why my dangling pointer doesn't cause a segmentation fault?

Stack overflow

[No, not the site you're on now, but what it was named for.] Oversimplified, the "stack" is like that spike you stick order papers on in some diners. This problem can occur when you put too many orders on that spike, so to speak. In the computer, any variable that is not dynamically allocated, and any function call that has yet to return, goes on the stack.

One cause of this might be deep or infinite recursion, such as when a function calls itself with no way to stop. Because that stack has overflowed, the order papers start "falling off" and taking up other space not meant for them. Thus, we can get a segmentation fault. Another cause might be the attempt to initialize a very large array: it's only a single order, but one that is already large enough by itself.

int stupidFunction(int n)
{
    return stupidFunction(n);   // no base case: recurses until the stack is exhausted
}

Another cause of a stack overflow would be having too many (non-dynamically allocated) variables at once.

int stupidArray[600851475143];   // over two terabytes: far larger than any stack

One case of a stack overflow in the wild came from a simple omission of a return statement in a conditional intended to prevent infinite recursion in a function. The moral of that story: always ensure your error checks work!

See also:

  • Segmentation Fault While Creating Large Arrays in C
  • Seg Fault when initializing array

Wild pointers

Creating a pointer to some random location in memory is like playing Russian roulette with your code - you could easily miss and create a pointer to a location you don't have access rights to.

int n = 123;
int* ptr = (&n + 0xDEADBEEF); //This is just stupid, people.

As a general rule, don't create pointers to literal memory locations. Even if they work one time, the next time they might not. You can't predict where your program's memory will be at any given execution.

See also:

  • What is the meaning of "wild pointer" in C?

Attempting to read past the end of an array

An array is a contiguous region of memory, where each successive element is located at the next address in memory. However, most arrays don't have an innate sense of how large they are, or what the last element is. Thus, it is easy to blow past the end of the array and never know it, especially if you're using pointer arithmetic.

If you read past the end of the array, you may wind up going into memory that is uninitialized or belongs to something else. This is technically undefined behavior. A segfault is just one of those many potential undefined behaviors. [Frankly, if you get a segfault here, you're lucky. Others are harder to diagnose.]

// like most UB, this code is a total crapshoot.
int arr[3] {5, 151, 478};
int i = 0;
while (arr[i] != 16)   // no element is 16, so i runs off the end
{
    std::cout << arr[i] << std::endl;
    i++;
}

Or the frequently seen one, using <= instead of < in a for loop (reads one element too far):

char arr[10];
for (int i = 0; i <= 10; i++)   // <= goes one element past the end
{
    std::cout << arr[i] << std::endl;
}

Or even an unlucky typo which compiles fine and allocates only one element, initialized with dim, instead of dim elements:

int* my_array = new int(dim);   // parentheses: ONE int with value dim; new int[dim] was intended

Additionally, it should be noted that you are not even allowed to create (let alone dereference) a pointer which points outside the array; you may only form a pointer to an element within the array, or to one past the end. Otherwise, you are triggering undefined behaviour.
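A defensive idiom that avoids the off-by-one errors above is to derive the element count from the array itself; a sketch in C (this only works while the array has not decayed to a pointer):

#include <stdio.h>

int main(void) {
    int arr[3] = {5, 151, 478};
    size_t n = sizeof arr / sizeof arr[0];   /* element count, not byte count */

    for (size_t i = 0; i < n; i++)           /* i < n, never i <= n */
        printf("%d\n", arr[i]);
    return 0;
}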

See also:

  • I have segfaults!

Forgetting a NUL terminator on a C string.

C strings are, themselves, arrays with some additional behaviors. They must be null-terminated, meaning they have a \0 at the end, to be reliably used as strings. This is done automatically in some cases, and not in others.

If this is forgotten, some functions that handle C strings never know when to stop, and you can get the same problems as with reading past the end of an array.

char str[3] = {'f', 'o', 'o'};   // no terminator!
int i = 0;
while (str[i] != '\0')           // may read far past the array looking for '\0'
{
    std::cout << str[i] << std::endl;
    i++;
}

With C strings, it is hit-and-miss whether the missing \0 causes visible problems. To avoid undefined behavior, you should assume it matters and write char str[4] = {'f', 'o', 'o', '\0'}; instead.


Attempting to modify a string literal

If you assign a string literal to a char*, the literal itself cannot be modified. For example...

char* foo = "Hello, world!";   // foo points at read-only string literal data
foo[7] = 'W';                  // attempts to modify the literal

...triggers undefined behavior, and a segmentation fault is one possible outcome.

See also:

  • Why is this string reversal C code causing a segmentation fault?

Mismatching Allocation and Deallocation methods

You must use malloc and free together, new and delete together, and new[] and delete[] together. If you mix 'em up, you can get segfaults and other weird behavior.
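As a sketch, the correct pairings look like this (the C++ variants are shown in the comment):

#include <stdlib.h>

int main(void) {
    int *p = malloc(10 * sizeof *p);   /* malloc ...          */
    free(p);                           /* ... pairs with free */

    /* In C++ the same rule reads:
     *   T *q = new T;      ...   delete q;
     *   T *r = new T[10];  ...   delete[] r;
     * Never cross the pairs.
     */
    return 0;
}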

See also:

  • Behaviour of malloc with delete in C++
  • Segmentation fault (core dumped) when I delete pointer

Errors in the toolchain.

A bug in the machine code backend of a compiler is quite capable of turning valid code into an executable that segfaults. A bug in the linker can definitely do this too.

Particularly scary in that this is not UB invoked by your own code.

That said, you should always assume the problem is you until proven otherwise.


Other Causes

The possible causes of Segmentation Faults are about as numerous as the number of undefined behaviors, and there are far too many for even the standard documentation to list.

A few less common causes to check:

  • UD2 generated on some platforms due to other UB
  • c++ STL map::operator[] done on an entry being deleted

DEBUGGING

Firstly, read through the code carefully. Most errors are caused simply by typos or mistakes. Make sure to check all the potential causes of the segmentation fault. If this fails, you may need to use dedicated debugging tools to find out the underlying issues.

Debugging tools are instrumental in diagnosing the causes of a segfault. Compile your program with the debugging flag (-g), and then run it with your debugger to find where the segfault is likely occurring.
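A minimal sketch of such a session with GCC and GDB (prog.c is a placeholder name):

$ gcc -g -O0 prog.c -o prog
$ gdb ./prog
(gdb) run
(gdb) backtrace

run lets the program crash under the debugger; backtrace then shows the chain of calls that led to the faulting line.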

Recent compilers support building with -fsanitize=address, which typically results in a program that runs about 2x slower but detects address errors more accurately. However, other errors (such as reading from uninitialized memory or leaking non-memory resources such as file descriptors) are not caught by this method, and it is impossible to use many debugging tools and ASan at the same time.
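Enabling it is a single compiler flag; a sketch with GCC (Clang accepts the same flag, and prog.c is again a placeholder):

$ gcc -g -fsanitize=address prog.c -o prog
$ ./prog

When the program makes an invalid access, ASan aborts it with a report describing the bad address and where the access happened.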

Some Memory Debuggers

  • GDB | Mac, Linux
  • valgrind (memcheck) | Linux
  • Dr. Memory | Windows

Additionally, it is recommended to use static analysis tools to detect undefined behaviour; but again, they are merely tools to help you find undefined behaviour, and they don't guarantee to find all occurrences of it.

If you are really unlucky however, using a debugger (or, more rarely, just recompiling with debug information) may influence the program's code and memory sufficiently that the segfault no longer occurs, a phenomenon known as a heisenbug.

In such cases, what you may want to do is to obtain a core dump, and get a backtrace using your debugger.

  • How to generate a core dump in Linux on a segmentation fault?
  • How do I analyse a program's core dump file with GDB when it has command-line parameters?

Why do I not get a segmentation fault? [duplicate]

Part of the performance of C is that it does not have much in the way of built-in error checking. All strcpy knows is that it was passed a pointer of the correct type; it doesn't know how much allocated memory that pointer points at. The resulting machine code simply reads from the src pointer up to the first null byte and then pastes the bytes into the dst pointer. If it doesn't overwrite somebody else's memory, there is no error.

What is "someone else's memory"? Generally a process is allocated memory in pages. When you malloc one byte, the process is given a whole page of memory, probably a few kilobytes, to slice and dice as it needs. Segmentation faults occur when your process tries to write outside its allocated pages, and for other reasons. The error is typically generated by the operating system and/or hardware which is doing the memory management. Since src is only a few dozen bytes it is unlikely to walk outside the process' page. If you make src a longer string, you'll probably get the segfault you're expecting.

There are various malloc wrapper libraries used for debugging which, through various tricks, make C check for memory mistakes; Valgrind and Electric Fence are among the most famous.

PS: I'm a little hazy on exactly how this stuff works, but it's more satisfying than "it's undefined behavior". Please feel free to edit where my explanation is lacking.

Segmentation fault not happening where it should [duplicate]

The behaviour of vals[8] is undefined.

It's equivalent to *(vals + 8) which is dereferencing memory outside the bounds of the array.

A "segmentation fault" is one of many things that could happen. The compiler could also eat your cat.


