User-Space Memory Editing Programs

Gaining access to another process's memory under Linux is fairly straightforward (assuming you have sufficient user privileges).

For example, the file /dev/mem provides access to the machine's physical memory (on kernels where it is enabled). The mappings for an individual process are listed in /proc/<pid>/maps, and that process's memory can be read and written through /proc/<pid>/mem.


How (and why?) do memory editors work

On Windows, the function typically used to alter the memory of another process is called WriteProcessMemory:

https://learn.microsoft.com/en-in/windows/win32/api/memoryapi/nf-memoryapi-writeprocessmemory

If you search the Cheat Engine source code for WriteProcessMemory you can find it both in its Pascal code and in its C kernel-driver code. It needs PROCESS_VM_WRITE and PROCESS_VM_OPERATION access to the target process, which in practice means you need to run Cheat Engine as administrator.

WriteProcessMemory is used any time you want to alter the runtime behavior of another process. There are legitimate uses, such as with Cheat Engine or ModOrganizer, and of course lots of illegitimate ones. It's worth mentioning that anti-virus software is typically trained to look for this API call (among others) so unless your application has been whitelisted it might get flagged because of it.

How to prevent memory editing to prevent hooking

I think you can't protect against hooking entirely. A hacker can still modify your executable file on disk in such a way that it won't install hooks. Or he can simply install hooks inside your executable himself.

There are many techniques to hinder this. For example, you can add checks for modification of your program: hidden checks over particular sections of code, plus code obfuscation. Many other techniques exist, and they are usually combined to build an effective software-protection system. But none of them can make your code fully protected from a hacker's modifications; if such protection existed, no software or game would ever be pirated. Pirating software that is distributed to clients is only a matter of difficulty: the more complicated the protection system is, the more time it takes to break, but it is never impossible.

Accessing memory used by other program

It crashes because 0x11111111 does not point to a valid address within your app's memory space.

As for cheat engine, there are a couple of ways to access another program's memory:

1) Run code inside the target process's memory space. There are various ways to inject code into another process, such as SetWindowsHookEx() or CreateRemoteThread().

2) Use ReadProcessMemory() and WriteProcessMemory() from outside the process.

How to correct relative addressing after memory copy function to user space?

May the -fno-pic compiler option be what you want?

Kernel mode - user mode communication via shared-memory without using system threads

One of the problems with using shared memory for communication is that there's no cooperation with the scheduler. Specifically, there's no way for a task to block (so it uses no CPU time) until data arrives/changes, and then unblock when data does arrive/change.

For communication between user-space and kernel you'd have the same problem. E.g. user-space code modifies data in the "shared with kernel" memory, and the kernel doesn't know if/when that happened; so (to avoid the kernel wasting CPU time constantly polling the shared memory) user-space code has to use a normal kernel call to say "Hey, look at the shared memory now!", but if you're using a normal kernel call anyway then you could just pass a pointer to the data without using shared memory.

Another problem with using shared memory for communication is the security risk. Specifically, a task can see that data arrived/changed and validate the new data to make sure it's acceptable; then a malicious attacker can change the data after it was validated, and the task acts on the "validated" data. A very simple example would be something like "if(new_value_in_shared_memory < MAX_VALUE) { myPrivateArray[new_value_in_shared_memory]++; }" (where the malicious attacker changes new_value_in_shared_memory after it was checked, tricking the task into modifying something that it shouldn't). For tasks that trust each other (e.g. a process communicating with a fork() of itself) this isn't a problem at all; when participants don't trust each other (e.g. the kernel doesn't trust user-space code) it's a major pain (it's extremely easy to make a mistake and get pwned).

The easiest way to guard against this kind of attack is to copy the data from the shared memory into a private buffer, then validate it (knowing that it's impossible for the attacker to modify the copy), then act on the data in the copy. This copying adds overhead and mostly ruins any performance advantage of shared memory.

For the "user-space communicating with kernel" case there are a few possible alternatives - the kernel can suspend all threads that can access the shared memory during the "validate then use" phase (which would be a performance disaster, especially for multi-CPU systems); and the kernel could do virtual memory management tricks (e.g. set the underlying pages to a "page fault if user-space tries to modify it" state) during the "validate then use" phase (which would be a performance disaster, especially for multi-CPU systems).

Note that the same kind of "modified after validated" security risk occurs for "kernel call accepts pointer to data from user-space" and for "kernel call relies on data from user-space task's stack". However; for both of these cases (which don't involve shared memory but do involve "kernel accesses task's normal memory") typically the kernel doesn't actually access the data and only transfers it. For example, for a write() the kernel might forward the data to file system code (without touching the data itself), which might forward the data to a storage device driver (without touching the data itself), which might transfer the data to a hard drive (without touching the data itself).

How does the referencing of objects and variables in programs work?

How does a program find empty memory to store new variables/objects in physical memory?

Modern operating systems use logical address translation. A process sees a range of logical addresses—its address space. The system hardware breaks the address range into pages. The size of the page is system dependent and is often configurable. The operating system manages page tables that map logical pages to physical page frames of the same size.

The address space is divided into a range of pages that forms the system space, shared by all processes, and a user space that is generally unique to each process.

Within the user and system spaces, pages may be valid or invalid. An invalid page has not yet been mapped to the process address space. Most pages are likely to be invalid.

Memory is always allocated from the operating system in pages. The operating system will have system services that transform invalid pages into valid pages with mappings to physical memory. In order to map pages, the operating system needs to find (or the application needs to specify) a range of pages that are invalid, and then has to allocate physical page frames to map to those pages. Note that physical page frames do not have to be mapped contiguously to logical pages.

You mention stacks and heaps. Stacks and heaps are just memory. The operating system cannot tell whether memory is a stack, a heap, or something else. User-mode libraries for memory allocation (such as those that implement malloc/free) allocate memory in pages to create heaps. The only thing that makes this memory a heap is that there is a heap manager controlling it. The heap manager can then allocate smaller blocks of memory from the pages allocated to the heap.

A stack is simpler. It is just a contiguous range of pages. Typically an operating system service that creates a thread or process will allocate a range of pages for a stack and assign the hardware stack pointer register to the high end of the stack range.

How does a program know where an object starts and where an object ends in memory? With number variables I can imagine there is some extra information provided in memory that shows the program how many bits the variable occupies, but correct me if I'm wrong.

This depends upon how the program is created and how the object is created in memory. For typed languages, the linker binds variables to addresses. The linker also generates instructions for mapping those addresses to the address space. For stack/auto variables, the compiler generates offsets from a pointer to the stack. When a function/subroutine gets called, the compiler generates code to allocate the memory required by the procedure, which it does by simply subtracting from the stack pointer. The memory gets freed by simply adding that value back to the stack pointer.

In the case of typeless languages, such as assembly language or Bliss, the programmer has to keep track of the type of each location. When memory is dynamically allocated, the programmer also has to keep track of the type. Most programming languages help with this by giving pointers types.

This is similar to my first question, but: when a variable has a value represented only by zeros, how does the program not confuse that with free memory?

Free memory is invalid. Accessing free memory causes a hardware exception.

Does the object value null mean that the address of an object is a bunch of 0's, or does the object point to literally nothing? And if so, how is the "reference" stored to assign it an address later on?

The linker defines the initial state of a program's user address space. Most linkers do not map the first page (or even more than one page). That page is then invalid. That means a null pointer, as you say, references absolutely nothing. If you try to dereference a null pointer you will usually get some kind of access violation exception.

Most operating systems will allow the user to map the first page; some linkers will allow the user to override the default setting and map it. This is not commonly done, as it makes detecting memory errors difficult.

Kernel Mode memory size for an x86 LARGEADDRESSAWARE program on an x64 machine?

There are a few misconceptions that are confusing you.

First, let's look at 32-bit Windows. The virtual address space for each process has a certain part allocated to the process itself, and a certain part for whatever the kernel needs. However, all the processes share the same kernel memory - the fact that you even have kernel memory in your own virtual address space is basically a performance optimization to avoid having to switch address spaces when dealing with kernel objects and data in your application.

By default, this is a 1:1 split, so you get 2 GiB of user address space and 2 GiB of kernel address space. This was (ab)used by early 32-bit Windows software (when your computer might have had as little as 4 MiB of memory total with a 486 CPU or similar): because of the way memory was laid out, your user address space never had any pointers above the 2 GiB barrier - effectively giving you the highest bit of any pointer free for your own data. Often this was used to allow a hybrid "if it fits, this is a value, otherwise it's a pointer to a structure" approach, saving memory and a bit of indirection. Since this was so widespread, the default has remained the same split as in the early days to prevent compatibility issues. However, you also have a way to opt in to a different split - 3 GiB of user space and 1 GiB of kernel space. This is what the /3GB option does. But that's not enough - your application must also opt in using /LARGEADDRESSAWARE. This basically says "I don't do weird stuff with my pointers".

It should be noted that a 32-bit OS or process doesn't necessarily mean you can only address 4 GiB of memory - it just limits what's directly accessible to the CPU at any one point. For memory-intensive server software, even the "32-bit" versions may support addressing a lot more memory - for example, 32-bit MS SQL Server supports up to 64 GiB through AWE. This is basically another layer of virtualization which allows remapping which physical memory backs a window of virtual addresses. In theory, there is no limit to the amount of memory you can address, with or without AWE - after all, nothing is preventing you from having your own hardware that acts as a memory-mapped file, effectively giving you unlimited address space. Of course, as in the days of segmented memory, it's not very easy to work with or practical :)

On 64-bit Windows, the /3GB option no longer makes any sense and is ignored. The default address space split depends on the exact version of Windows, but is in the "terabytes and more" range, way beyond the 32-bit limits. For modern Windows, this is usually 128 TiB user + 128 TiB kernel. 32-bit applications still have to use /LARGEADDRESSAWARE as before. However, since the kernel is now 64-bit, it can't be in the same address space as the user process anyway, so a 32-bit application on a 64-bit OS has full access to the 4 GiB of address space.

Of course, those limits are still well below what 64-bit is theoretically capable of addressing. However, most 64-bit CPUs actually can't address the whole 64-bit address space - the most common width the last time I checked was just 48 bits. And surprise, surprise - that gives you 256 TiB of address space, the limit in Windows. Not a Microsoft conspiracy after all! :) This isn't something new, actually. The fact that the Intel x86's 32-bit ALU is paired with a 32-bit address space is quite an outlier in CPU history - CPUs often have a higher or lower address-space width (for either virtual or physical addressing) than their ALU width. The MS-DOS typical limit of 1 MiB of addressable memory (with 640 KiB left for user applications) comes from this as well - the "16-bit" CPUs of the time used 20-bit addresses.


