Access Array Beyond the Limit in C and C++

Accessing an array outside its bounds is undefined behavior. In the C99 draft standard, Annex J.2 (Undefined behavior) includes the following point:

An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).

and the draft C++ standard in section 5.7 Additive operators paragraph 5 says:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

For completeness' sake, section 5.2.1 Subscripting, paragraph 1 says:

[...]The expression E1[E2] is identical (by definition) to *((E1)+(E2)) [ Note: see 5.3 and 5.7 for details of * and + and 8.3.4 for details of arrays. —end note ]
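The identity quoted above can be sketched in a few lines. This is only an illustration: element_via_pointer is a hypothetical helper name, and the assert is there to keep the sketch inside the array, since the standard defines the result only for in-range offsets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the standard text): reads element i of an
// n-element array using explicit pointer arithmetic instead of a subscript.
int element_via_pointer(const int* a, std::size_t n, std::size_t i) {
    assert(i < n);    // stay inside the array; anything else is undefined
    return *(a + i);  // by definition the same expression as a[i]
}
```

For any in-range i, element_via_pointer(a, n, i) and a[i] read the same object, which is why the out-of-range rule for pointer addition also covers subscripting.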

It is important to note that the compiler is not required to produce a warning (diagnostic) for undefined behavior. The draft C++ standard, in section 1.4 Implementation compliance, paragraph 1, says:

The set of diagnosable rules consists of all syntactic and semantic rules in this International Standard except for those rules containing an explicit notation that “no diagnostic is required” or which are described as resulting in “undefined behavior.”

(C) Why can I access array elements beyond the given limit?

When you create an array in C, the program allocates the memory it needs and gives you a pointer to the first element. So when you write array[0], you are adding 0 to the base pointer of that array; array[1] adds 1 element (4 bytes, assuming a 4-byte int) to the initial pointer, which gives you the second element, and so on. (Don't forget that an array is a contiguous segment of memory: every value sits right next to the previous one.) If you try to reach a position outside the array, C performs no check. The program will often not crash; it simply reads the memory the pointer ends up at, which in most cases is garbage. The language will not stop you, but the behavior is undefined, so a crash is possible too.
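To make the "adds 1 element, not 1 byte" point concrete, here is a minimal sketch that measures the byte distance between two neighbouring elements. byte_stride is a hypothetical helper name, and the sketch only ever forms pointers inside the array or one past its end, which is the range the standard allows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte distance between element i and element i + 1
// of an n-element int array. Requires i + 1 <= n so both pointers are valid;
// pointing one past the end is allowed, dereferencing it is not.
std::size_t byte_stride(const int* array, std::size_t n, std::size_t i) {
    assert(i + 1 <= n);
    const char* lo = reinterpret_cast<const char*>(array + i);
    const char* hi = reinterpret_cast<const char*>(array + i + 1);
    return static_cast<std::size_t>(hi - lo);
}
```

The result is sizeof(int) for every valid i, whatever that size happens to be on the platform (commonly 4 bytes).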

Hope it helps :)

Accessing an array out of bounds gives no error, why?

Welcome to every C/C++ programmer's bestest friend: Undefined Behavior.

There is a lot that is not specified by the language standard, for a variety of reasons. This is one of them.

In general, whenever you encounter undefined behavior, anything might happen. The application may crash, it may freeze, it may eject your CD-ROM drive or make demons come out of your nose. It may format your hard drive or email all your porn to your grandmother.

It may even, if you are really unlucky, appear to work correctly.

The language simply says what should happen if you access the elements within the bounds of an array. It is left undefined what happens if you go out of bounds. It might seem to work today, on your compiler, but it is not legal C or C++, and there is no guarantee that it will still work the next time you run the program. Or that it hasn't overwritten essential data even now, and you just haven't encountered the problems it is going to cause. Yet.

As for why there is no bounds checking, there are a couple of aspects to the answer:

  • An array is a leftover from C. C arrays are about as primitive as you can get. Just a sequence of elements with contiguous addresses. There is no bounds checking because it is simply exposing raw memory. Implementing a robust bounds-checking mechanism would have been almost impossible in C.
  • In C++, bounds-checking is possible on class types. But an array is still the plain old C-compatible one. It is not a class. Further, C++ is also built on another rule which makes bounds-checking non-ideal. The C++ guiding principle is "you don't pay for what you don't use". If your code is correct, you don't need bounds-checking, and you shouldn't be forced to pay for the overhead of runtime bounds-checking.
  • So C++ offers the std::vector class template, which allows both. operator[] is designed to be efficient. The language standard does not require that it performs bounds checking (although it does not forbid it either). A vector also has the at() member function which is guaranteed to perform bounds-checking. So in C++, you get the best of both worlds if you use a vector. You get array-like performance without bounds-checking, and you get the ability to use bounds-checked access when you want it.
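The difference between the two std::vector access paths described above can be sketched as follows. at_rejects and unchecked_get are hypothetical helper names; the assert in unchecked_get only documents the precondition that operator[] itself leaves unchecked.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: true if v.at(i) refuses the index by throwing
// std::out_of_range, which at() is guaranteed to do for i >= v.size().
bool at_rejects(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical helper: operator[] performs no check of its own, so the
// caller must guarantee i < v.size(); an out-of-range i is undefined behavior.
int unchecked_get(const std::vector<int>& v, std::size_t i) {
    assert(i < v.size());
    return v[i];
}
```

With a three-element vector, at(3) throws while at(2) succeeds; calling v[3] directly would compile without complaint and be undefined.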

Array index out of bound behavior

The problem is that C/C++ doesn't actually do any boundary checking with regard to arrays. It depends on the OS to ensure that you are accessing valid memory.

In this particular case, you are declaring a stack-based array. Depending upon the particular implementation, accessing outside the bounds of the array will simply access another part of the already allocated stack space (most operating systems and threads reserve a certain portion of memory for the stack). As long as you just happen to be playing around in the pre-allocated stack space, nothing will crash (note I did not say it will work).

What's happening on the last line is that you have now accessed beyond the part of memory that is allocated for the stack. As a result you are indexing into a part of memory that is not allocated to your process or is allocated in a read only fashion. The OS sees this and sends a seg fault to the process.

This is one of the reasons that C/C++ is so dangerous when it comes to boundary checking.
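Since neither the language nor the OS will reliably catch a stray index, the check has to live in your own code. A minimal sketch, assuming the caller knows the array length n; read_or is a hypothetical helper name, not a standard facility.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns arr[i] when i is a valid index into an
// n-element array, and `fallback` otherwise, so the out-of-bounds read
// never happens in the first place.
int read_or(const int* arr, std::size_t n, std::size_t i, int fallback) {
    assert(arr != nullptr);
    return i < n ? arr[i] : fallback;
}
```

An index far beyond the array, which might otherwise land in unmapped memory and trigger a seg fault, simply yields the fallback value here.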

Index limit for access array with pointer arithmetic in C

Most modern operating systems use a flat memory model, which means that a pointer is able to represent any address in the virtual address space.

On 64-bit systems, pointers are 64 bits, on 32-bit systems, they are 32 bits.

On 32-bit operating systems, the virtual address space of every process is 2^32 bytes, which is 4 GiB (about 4.3 billion bytes).

On 64-bit operating systems, the virtual address space of every process is 2^64 bytes, which is an astronomically high number.

"So what does happen if I try to access *(array+i) where i is bigger than the maximum [pointer]?"

You will never encounter such a situation on systems with a flat memory model, since a pointer is always large enough to represent any address in a process' virtual address space.

It may be theoretically possible for you to encounter such a situation on a system with a segmented memory model. However, that is certainly not the reason why you are having the problem you describe.

"As I understand pointers are always integers so the last entry of the array I can access would be the maximum integer value of that specific system."

The last element of the array that you can legally access depends on how much memory you allocated for the array.
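A short sketch of the distinction: the pointer width is a property of the platform, while the last legal index is a property of the allocation. pointer_bits and last_valid_index are hypothetical helper names, and std::vector stands in for whatever buffer was actually allocated.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Width of an object pointer in bits on the platform this is compiled for
// (typically 32 or 64 on flat-memory systems).
std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// Hypothetical helper: the largest index that may legally be accessed
// depends only on how many elements were allocated, not on pointer width.
std::size_t last_valid_index(const std::vector<int>& buf) {
    assert(!buf.empty());
    return buf.size() - 1;
}
```

For a buffer of 8 elements the last valid index is 7 on any system, however many bits a pointer has.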

Using another variable for array index limit in C++

Here you go:

const int limit = 10;
int data_1[limit], data_2[limit], data_3[limit];

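limit must be a compile-time constant (const or constexpr). With a non-const variable, the declaration would be a variable-length array, which is not standard C++.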

EDIT:
As other answers have mentioned, limit could also simply be defined through a preprocessing step, like so:

#define LIMIT 10 // Usually preprocessor-defined variables are in all caps
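As a further option in modern C++, the limit can be a constexpr constant shared by std::array objects, so the same name sizes the arrays and bounds the loops over them. This is a sketch rather than the original answer's code: kLimit and sum_first are hypothetical names, and C++11 or later is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical name; plays the role of `limit` in the snippet above.
constexpr std::size_t kLimit = 10;

// Hypothetical helper: sums the first `count` entries and refuses to
// read past the shared limit.
int sum_first(const std::array<int, kLimit>& data, std::size_t count) {
    assert(count <= kLimit);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Declaring std::array<int, kLimit> data_1, data_2, data_3; then gives three arrays whose size() always agrees with kLimit.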

