What Is a Buffer Overflow and How to Cause One

What is a buffer overflow and how do I cause one?

A buffer overflow is basically when a crafted section (or buffer) of memory is written outside of its intended bounds. If an attacker can manage to make this happen from outside of a program it can cause security problems as it could potentially allow them to manipulate arbitrary memory locations, although many modern operating systems protect against the worst cases of this.

While both reading and writing outside of the intended bounds are generally considered a bad idea, the term "buffer overflow" is generally reserved for writing outside the bounds, as this can cause an attacker to easily modify the way your code runs. There is a good article on Wikipedia about buffer overflows and the various ways they can be used for exploits.

In terms of how you could program one yourself, it would be a simple matter of:

char a[4];
strcpy(a,"a string longer than 4 characters"); // write past end of buffer (buffer overflow)
printf("%s\n",a[6]); // read past end of buffer (also not a good idea)

Whether that compiles and what happens when it runs would probably depend on your operating system and compiler.

What is the difference between a buffer overflow attack and a ROP attack?

A ROP attack is one kind of payload you can deliver via a buffer-overflow vulnerability, for buffers on the stack. (Overflowing other buffers could let you overwrite other data, e.g. in a struct or nearby other globals, but not take control of the program-counter.)


A buffer overflow is when incorrect bounds checking or handling of implicit-length data (e.g. strcpy or strcat) lets malicious input write memory past the end of an array. This gets interesting when the array was allocated on the call-stack, so one of the things following it is the return address of this function.

(In theory overwriting a static variable past the end of a static array could be useful as an exploit, and that would also be a buffer overflow. But usually a buffer overflow implies a buffer on the stack, allowing the attacker to control the return address. And thus to gain control of the instruction pointer.)

As well as a new return address, your malicious data will include more data which will be in memory below and above that return address. Part of this is the payload. Controlling the return address alone is usually not sufficient: in most processes there isn't anywhere you can jump to that (without other inputs) will execve a shell listening on a TCP port, for example.

Traditionally your payload would be machine-code ("shellcode"), and the return address would be the stack address where you knew that payload would land. (+- a NOP slide so you didn't have to get it exactly right).

Stack ASLR and non-executable stacks have made the traditional shellcode injection method of exploiting a buffer overflow impossible in normal modern programs. "Buffer overflow attack" used to (I think) imply shellcode injection, because there was no need to look for more complicated attacks. But that's no longer true.


A ROP attack is when the payload is a sequence of return addresses and data to be popped by pop instructions, and/or some strings like "/bin/sh". The first return address in the payload sends execution to some already-existing bytes at a known address in an executable page.



and try to execute the function check() by giving a certain input to gets() function.

The code for check() already exists in the target program, so the simplest attack would be a ROP attack.

This is the absolute simplest form of ROP attack, where code to do exactly what you want exists at a single known address, without needing any "function args". So it makes a good example to introduce the topic.

Is this a ROP attack or a buffer overflow attack?

It's both. It's a buffer overflow to inject a ROP payload.

If the program was compiled with -z execstack -no-pie, you could also choose to inject e.g. x86 shellcode that did mov eax, imm32 / jmp eax to jump to the known absolute address of check. In that case it would be a buffer overflow but not a ROP attack; it would be a code-injection attack.

(You might not call it "shellcode" because the purpose isn't to run a shell replacing the program, but rather to do something using the existing code of the
program. But terminology is often used sloppily, so I think many people would call any injectable machine code "shellcode" regardless of what it does.)



Buffer overflow attack:

When a buffer has a certain size, fill the buffer and an add additional code so that the attacker can execute another function in the code or his/her own shellcode.

The "in the code" option would be a ROP attack. You point the return address at code which is already in memory.

The "or his/her own shellcode" option would be a code-injection attack. You point the return address at the buffer you just overflowed. (Either directly or via a ret2reg ROP attack to defeat stack ASLR, by looking for a jmp esp gadget on x86 for example.)

This "Buffer Overflow" definition is still slightly too narrow: it excludes overwriting some other critical variable (like bool user_authenticated) without overwriting a return address.

But yes, code injection and ROP attacks are the 2 main ways, with code injection normally made impossible by non-executable stack memory.

How are buffer overflows used to exploit computers?

This is the most widely known document on the subject: Smashing the Stack for Fun and Profit

However, 'stack overflows' have nothing to do with buffer overflows. Stack overflows are generally just an error case in bad code that can't be exploited for anything outside of a crash (DoS).

Edit: You also asked about heap overflows. This is a good document on the subject: http://www.w00w00.org/files/articles/heaptut.txt

What is causing the Stack Buffer Overflow?

Let's think of a case where there are two points [-1, 0] and [0, 1], and the value of k is 2. If you use map, you will get only 1 <key, value> pair for your case, because the key (in this case sqrt(1)) is same for both points. So, you need to use multimap, where you can have the same key multiple times. Read here.

The working code example based on your code:

class Solution 
{
public:
vector<vector<int>> kClosest(vector<vector<int>>& points, int k)
{
multimap<int,vector<int>> m;
vector<vector<int>> answer;
for (int i = 0; i < points.size(); i++)
{
int x = (points[i][0] * points[i][0]) + (points[i][1] * points[i][1]);
m.insert(pair<int, vector<int>>(x,{points[i][0], points[i][1]}));
}
int i = 0;
for (auto it = m.begin(); it != m.end() && i < k; it++, i++) {
//cout << it->first << " : " << it->second[0] << ", " << it->second[1] << endl;
answer.push_back(it->second);
}

return answer;
}
};

Suggestions:

  • You don't need to bother about whether to store the distance value in double or float. Since you are comparing sqrt(some_value1) with sqrt(some_value2), you can omit the sqrt altogether.
  • std::advance has a linear time complexity. You can just simply initialize the iterator at the beginning of the for loop, and increment with ++ operator. This operator has a constant time complexity. Read more about std::advance here.

C++ Buffer Overflow

I modified your program a little bit to make it more illustrative:

#include <iostream>

int main( void )
{
int authentication = 0;
char cUsername[ 10 ];
char cPassword[ 10 ];

std::cout << "Username: ";
std::cin >> cUsername;

std::cout << "Pass: ";
std::cin >> cPassword;

if( std::strcmp( cUsername, "admin" ) == 0 && std::strcmp( cPassword, "adminpass" ) == 0 )
{
authentication = 1;
}
if( authentication )
{
std::cout << "Access granted\n";
std::cout << ( char )authentication;
}
else
{
std::cout << "Wrong username and password\n";
}

return ( 0 );
}

I compiled it with x64 compiler command-line MS compiler, no optimizations. So now we have an exe that we want to "hack". We load the program with WinDbg (really good debugger) and take a look at the disassembly (notice, I've supplied full debug info, for clarity):

00000001`3f1f1710 4883ec68        sub     rsp,68h
00000001`3f1f1714 488b0515db0300 mov rax,qword ptr [Prototype_Console!__security_cookie (00000001`3f22f230)]
00000001`3f1f171b 4833c4 xor rax,rsp
00000001`3f1f171e 4889442450 mov qword ptr [rsp+50h],rax
00000001`3f1f1723 c744243800000000 mov dword ptr [rsp+38h],0 // This gives us address of "authentication" on stack.
00000001`3f1f172b 488d156e1c0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0x78 (00000001`3f2233a0)]
00000001`3f1f1732 488d0d47f00300 lea rcx,[Prototype_Console!std::cout (00000001`3f230780)]
00000001`3f1f1739 e8fdf9ffff call Prototype_Console!ILT+310(??$?6U?$char_traitsDstdstdYAAEAV?$basic_ostreamDU?$char_traitsDstd (00000001`3f1f113b)
00000001`3f1f173e 488d542428 lea rdx,[rsp+28h] // This gives us address of "cUsername" on stack.
00000001`3f1f1743 488d0df6f00300 lea rcx,[Prototype_Console!std::cin (00000001`3f230840)]
00000001`3f1f174a e823faffff call Prototype_Console!ILT+365(??$?5DU?$char_traitsDstdstdYAAEAV?$basic_istreamDU?$char_traitsDstd (00000001`3f1f1172)
00000001`3f1f174f 488d153e1c0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0x6c (00000001`3f223394)]
00000001`3f1f1756 488d0d23f00300 lea rcx,[Prototype_Console!std::cout (00000001`3f230780)]
00000001`3f1f175d e8d9f9ffff call Prototype_Console!ILT+310(??$?6U?$char_traitsDstdstdYAAEAV?$basic_ostreamDU?$char_traitsDstd (00000001`3f1f113b)
00000001`3f1f1762 488d542440 lea rdx,[rsp+40h] // This gives us address of "cPassword" on stack.
00000001`3f1f1767 488d0dd2f00300 lea rcx,[Prototype_Console!std::cin (00000001`3f230840)]
00000001`3f1f176e e8fff9ffff call Prototype_Console!ILT+365(??$?5DU?$char_traitsDstdstdYAAEAV?$basic_istreamDU?$char_traitsDstd (00000001`3f1f1172)
00000001`3f1f1773 488d15321c0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0x84 (00000001`3f2233ac)]
00000001`3f1f177a 488d4c2428 lea rcx,[rsp+28h]
00000001`3f1f177f e86c420000 call Prototype_Console!strcmp (00000001`3f1f59f0)
00000001`3f1f1784 85c0 test eax,eax
00000001`3f1f1786 751d jne Prototype_Console!main+0x95 (00000001`3f1f17a5)
00000001`3f1f1788 488d15291c0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0x90 (00000001`3f2233b8)]
00000001`3f1f178f 488d4c2440 lea rcx,[rsp+40h]
00000001`3f1f1794 e857420000 call Prototype_Console!strcmp (00000001`3f1f59f0)
00000001`3f1f1799 85c0 test eax,eax
00000001`3f1f179b 7508 jne Prototype_Console!main+0x95 (00000001`3f1f17a5)
00000001`3f1f179d c744243801000000 mov dword ptr [rsp+38h],1
00000001`3f1f17a5 837c243800 cmp dword ptr [rsp+38h],0
00000001`3f1f17aa 7426 je Prototype_Console!main+0xc2 (00000001`3f1f17d2)
00000001`3f1f17ac 488d15151c0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0xa0 (00000001`3f2233c8)]
00000001`3f1f17b3 488d0dc6ef0300 lea rcx,[Prototype_Console!std::cout (00000001`3f230780)]
00000001`3f1f17ba e87cf9ffff call Prototype_Console!ILT+310(??$?6U?$char_traitsDstdstdYAAEAV?$basic_ostreamDU?$char_traitsDstd (00000001`3f1f113b)
00000001`3f1f17bf 0fb6542438 movzx edx,byte ptr [rsp+38h]
00000001`3f1f17c4 488d0db5ef0300 lea rcx,[Prototype_Console!std::cout (00000001`3f230780)]
00000001`3f1f17cb e825f9ffff call Prototype_Console!ILT+240(??$?6U?$char_traitsDstdstdYAAEAV?$basic_ostreamDU?$char_traitsDstd (00000001`3f1f10f5)
00000001`3f1f17d0 eb13 jmp Prototype_Console!main+0xd5 (00000001`3f1f17e5)
00000001`3f1f17d2 488d15ff1b0300 lea rdx,[Prototype_Console!std::_Iosb<int>::end+0xb0 (00000001`3f2233d8)]
00000001`3f1f17d9 488d0da0ef0300 lea rcx,[Prototype_Console!std::cout (00000001`3f230780)]
00000001`3f1f17e0 e856f9ffff call Prototype_Console!ILT+310(??$?6U?$char_traitsDstdstdYAAEAV?$basic_ostreamDU?$char_traitsDstd (00000001`3f1f113b)
00000001`3f1f17e5 33c0 xor eax,eax
00000001`3f1f17e7 488b4c2450 mov rcx,qword ptr [rsp+50h]
00000001`3f1f17ec 4833cc xor rcx,rsp
00000001`3f1f17ef e8bc420000 call Prototype_Console!__security_check_cookie (00000001`3f1f5ab0)
00000001`3f1f17f4 4883c468 add rsp,68h
00000001`3f1f17f8 c3 ret

Now, since we know how x64 stack works we can start "hacking". RSP is stack pointer, function stack is addresses above RSP value (stack grows into smaller addresses). So, we see that RSP+28h is where cUsername, RSP+38h is authentication, and RSP+40h is cPassword, where 28h, 38h and 40h are hexadecimal offsets. Here is little image to illustrate:

-----> old RSP value // Stack frame of caller of `main` is above, stack frame of main is below 

16 bytes of
"cPassword"
+40h
8 bytes of "authentication"
+38h
16 bytes of
"cUsername"
+28h

-----> RSP value = old RSP-68h

What do we see from here? We see that compiler aligned data on 8 byte boundary: for example, we asked to allocate 10 bytes for cUsername, but we got 16 bytes - x64 bit stack is aligned on 8-byte boundary, naturally. That means in order to write into authentication we need to write into cUsername MORE that 16 bytes (symbols). Notice also, that compiler put cPassword higher that authentication - we cannot overwrite authentication using cPassword, only cUsername.

So now we run our program and input Username: 0123456789abcdef1. 0123456789abcdef = 16 bytes, the next 1 is going to be put into lower byte of authentication - good enough for us:

Username: 0123456789abcdef1
Pass: whatever
Access granted
1

Why is this code vulnerable to buffer overflow attacks?

On most compilers the maximum value of an unsigned short is 65535.

Any value above that gets wrapped around, so 65536 becomes 0, and 65600 becomes 65.

This means that long strings of the right length (e.g. 65600) will pass the check, and overflow the buffer.


Use size_t to store the result of strlen(), not unsigned short, and compare len to an expression that directly encodes the size of buffer. So for example:

char buffer[100];
size_t len = strlen(str);
if (len >= sizeof(buffer) / sizeof(buffer[0])) return -1;
memcpy(buffer, str, len + 1);


Related Topics



Leave a reply



Submit