What Is Memory Fragmentation

What is memory fragmentation?

Imagine that you have a "large" (32 bytes) expanse of free memory:

----------------------------------
| |
----------------------------------

Now, allocate some of it (5 allocations):

----------------------------------
|aaaabbccccccddeeee |
----------------------------------

Now, free the first four allocations but not the fifth:

----------------------------------
| eeee |
----------------------------------

Now, try to allocate 16 bytes. Oops, I can't, even though there's nearly double that much free.

On systems with virtual memory, fragmentation is less of a problem than you might think, because large allocations only need to be contiguous in virtual address space, not in physical address space. So in my example, if I had virtual memory with a page size of 2 bytes then I could make my 16 byte allocation with no problem. Physical memory would look like this:

----------------------------------
|ffffffffffffffeeeeff |
----------------------------------

whereas virtual memory (being much bigger) could look like this:

------------------------------------------------------...
| eeeeffffffffffffffff
------------------------------------------------------...

The classic symptom of memory fragmentation is that you try to allocate a large block and you can't, even though you appear to have enough memory free. Another possible consequence is the inability of the process to release memory back to the OS (because each of the large blocks it has allocated from the OS, for malloc etc. to sub-divide, has something left in it, even though most of each block is now unused).

Tactics to prevent memory fragmentation in C++ work by allocating objects from different areas according to their size and/or their expected lifetime. So if you're going to create a lot of objects and destroy them all together later, allocate them from a memory pool. Any other allocations you do in between them won't be from the pool, hence won't be located in between them in memory, so memory will not be fragmented as a result. Or, if you're going to allocate a lot of objects of the same size then allocate them from the same pool. Then a stretch of free space in the pool can never be smaller than the size you're trying to allocate from that pool.

Generally you don't need to worry about it much, unless your program is long-running and does a lot of allocation and freeing. It's when you have mixtures of short-lived and long-lived objects that you're most at risk, but even then malloc will do its best to help. Basically, ignore it until your program has allocation failures or unexpectedly causes the system to run low on memory (catch this in testing, for preference!).

The standard libraries are no worse than anything else that allocates memory, and standard containers all have an Alloc template parameter which you could use to fine-tune their allocation strategy if absolutely necessary.

Memory fragmentation

As per ISO/IEC 9899:201x -> 7.22.3

The order and contiguity of storage allocated by successive calls to
the aligned_alloc, calloc, malloc, and realloc functions is
unspecified.

A good memory manager will be able to tackle the issue to an extent. However, there are other aspects like data alignment [1] which causes internal fragmentation.

What you could do if you rely on inbuilt memory management?

  1. Use a profiler - say valgrind - with memory check option to find the memory which is not freed after use.
    Example:


     valgrind --leak-check=yes myprog arg1 arg2
  2. Follow good practices. Example - In C++, if you intend others to inherit from your polymorphic class, you may declare its destructor virtual.

  3. Use smart pointers.

Notes:

  1. Internal fragmentation.

  2. If you were to use your own memory management system, you may consider Boehm-Demers-Weiser garbage collector.

  3. Valgrind Instrumentation Framework.

  4. Memory not freed after use will contribute to fragmentation.

What causes memory fragmentation in .NET

You know, I somewhat doubt the memory profiler here. The memory management system in .NET actually tries to defragment the heap for you by moving around memory (that's why you need to pin memory for it to be shared with an external DLL).

Large memory allocations taken over longer periods of time is prone to more fragmentation. While small ephemeral (short) memory requests are unlikely to cause fragmentation in .NET.

Here's also something worth thinking about. With the current GC of .NET, memory allocated close in time, is typically spaced close together in space. Which is the opposite of fragmentation. i.e. You should allocate memory the way you intend to access it.

Is it a managed code only or does it contains stuff like P/Invoke, unmanaged memory (Marshal.AllocHGlobal) or stuff like GCHandle.Alloc(obj, GCHandleType.Pinned)?

How to solve Memory Fragmentation

First, I agree with the other posters who suggested a resource leak. You really want to rule that out first.

Hopefully, the heap manager you are currently using has a way to dump out the actual total free space available in the heap (across all free blocks) and also the total number of blocks that it is divided over. If the average free block size is relatively small compared to the total free space in the heap, then you do have a fragmentation problem. Alternatively, if you can dump the size of the largest free block and compare that to the total free space, that will accomplish the same thing. The largest free block would be small relative to the total free space available across all blocks if you are running into fragmentation.

To be very clear about the above, in all cases we are talking about free blocks in the heap, not the allocated blocks in the heap. In any case, if the above conditions are not met, then you do have a leak situation of some sort.

So, once you have ruled out a leak, you could consider using a better allocator. Doug Lea's malloc suggested in the question is a very good allocator for general use applications and very robust most of the time. Put another way, it has been time tested to work very well for most any application. However, no algorithm is ideal for all applications and any management algorithm approach can be broken by the right pathelogical conditions against it's design.

Why are you having a fragmentation problem? - Sources of fragmentation problems are caused by the behavior of an application and have to do with greatly different allocation lifetimes in the same memory arena. That is, some objects are allocated and freed regularly while other types of objects persist for extended periods of time all in the same heap.....think of the longer lifetime ones as poking holes into larger areas of the arena and thereby preventing the coalesce of adjacent blocks that have been freed.

To address this type of problem, the best thing you can do is logically divide the heap into sub arenas where the lifetimes are more similar. In effect, you want a transient heap and a persistent heap or heaps that group things of similar lifetimes.

Some others have suggested another approach to solve the problem which is to attempt to make the allocation sizes more similar or identical, but this is less ideal because it creates a different type of fragmentation called internal fragmentation - which is in effect the wasted space you have by allocating more memory in the block than you need.

Additionally, with a good heap allocator, like Doug Lea's, making the block sizes more similar is unnecessary because the allocator will already be doing a power of two size bucketing scheme that will make it completely unnecessary to artificially adjust the allocation sizes passed to malloc() - in effect, his heap manager does that for you automatically much more robustly than the application will be able to make adjustments.

Avoid memory fragmentation when memory pools are a bad idea

If everything really is/works as you say it does and there is no bug you have not yet found, then try this:

malloc and other memory allocation usually uses chunks of 16 bytes anyway, even if the actual requested size is smaller than 16 bytes. So you only need 4000/16 - 30/16 ~ 250 different memory pools.

const int chunk_size = 16;

memory_pools pool[250]; // 250 memory pools, managing '(idx+1)*chunk_size' size

char* reserve_mem(size_t sz)
{
size_t pool_idx_to_use = sz/chunk_size;
char * rc=pool[pool_idx_to_use].allocate();
}

IOW, you have 250 memory pools. pool[0] allocates and manages chunk with a length of 16 bytes. pool[100] manages chunks with 1600 bytes etc...

If you know the length distribution of your strings in advance, you can reserve initial memory for the pools based on this knowledge. Otherwise I'd just probably reserve memory for the pools in 4096 bytes increment.

Because while the malloc C heap usually allocates memory in multiple of 16 bytes it will (at least unter Windows, but I am guessing, Linux is similar here) ask the OS for memory - which usually works with 4K pages. IOW, the "outer" memory heap managed by the operating system reserves and frees 4096 bytes.

So increasing your own internal memory pool in 4096 bytes means no fragmentation in the OS app heap. This 4096 page size (or multiple of...) comes from the processor architecture. Intel processors have a builtin page size of 4K (or multiple of). Don't know about other processors, but I suspect similar architectures there.

So, to sum it up:

Use chunks of multiple of 16 bytes for your strings per memory pool.
Use chunks of multiple of 4K bytes to increase your memory pool.

That will align the memory use of your application with the memory management of the OS and avoid fragmentation as much as possible.

From the OS point of view, your application will only increment memory in 4K chunks. That's very easy to allocate and release. And there is no fragmentation.

From the internal (lib) C heap management point of view, your application will use memory pools and waste at most 15 bytes per string. Also all similar length allocations will be heaped together, so also no fragmentation.

Does memory fragmentation slows down New/Malloc?

Memory allocation is platform specific, depending on the platform.

I would say "Yes, new takes time to allocate memory. How much time depends on many factors, such as algorithm, level of fragmentation, processor speed, optimizations, etc.

The best answer for how much time is taken, is to profile and measure. Write a simple program that fragments the memory, then measure the time for allocating memory.

There is no direct method for a program to find out the difficulty of finding available memory locations. You may be able to read a clock, allocate memory, then read again. Another idea is to set a timer.

Note: in many embedded systems, dynamic memory allocation is frowned upon. In critical systems, fragmentation can be the enemy. So fixed sized arrays are used. Fixed sized memory allocations (at compile time) remove fragmentation as an defect issue.

Edit 1: The Search

Usually, memory allocation requires a call to a function. The impact of the this is that the processor may have to reload its instruction cache or pipeline, consuming extra processing time. There also may be extra instruction for passing parameters such as the minimal size. Local variables and allocations at compile time usually don't need a function call for allocation.

Unless the allocation algorithm is linear (think array access), it will require steps to find an available slot. Some memory management algorithms use different strategies based on the requested size. For example, some memory managers may have separate pools for sizes of 64-bits or smaller.

If you think of a memory manager as having a linked list of blocks, the manager will need to find the first block greater than or equal in size to the request. If the block is larger than the requested size, it may be split and the left over memory is then created into a new block and added to the list.

There is no standard algorithm for memory management. They differ based on the needs of the system. Memory managers for platforms with restricted (small) sizes of memory will be different than those that have large amounts of memory. Memory allocation for critical systems may be different than those for non-critical systems. The C++ standard does not mandate the behavior of a memory manager, only some requirements. For example, the memory manager is allowed to allocate from a hard drive, or a network device.

The significance of the impact depends on the memory allocation algorithm. The best path is to measure the performance on your target platform.

Dealing with fragmentation in a memory pool?

I wanted to add my 2 cents only because no one else pointed out that from your description it sounds like you are implementing a standard heap allocator (i.e what all of us already use every time when we call malloc() or operator new).

A heap is exactly such an object, that goes to virtual memory manager and asks for large chunk of memory (what you call "a pool"). Then it has all kinds of different algorithms for dealing with most efficient way of allocating various size chunks and freeing them. Furthermore, many people have modified and optimized these algorithms over the years. For long time Windows came with an option called low-fragmentation heap (LFH) which you used to have to enable manually. Starting with Vista LFH is used for all heaps by default.

Heaps are not perfect and they can definitely bog down performance when not used properly. Since OS vendors can't possibly anticipate every scenario in which you will use a heap, their heap managers have to be optimized for the "average" use. But if you have a requirement which is similar to the requirements for a regular heap (i.e. many objects, different size....) you should consider just using a heap and not reinventing it because chances are your implementation will be inferior to what OS already provides for you.

With memory allocation, the only time you can gain performance by not simply using the heap is by giving up some other aspect (allocation overhead, allocation lifetime....) which is not important to your specific application.

For example, in our application we had a requirement for many allocations of less than 1KB but these allocations were used only for very short periods of time (milliseconds). To optimize the app, I used Boost Pool library but extended it so that my "allocator" actually contained a collection of boost pool objects, each responsible for allocating one specific size from 16 bytes up to 1024 (in steps of 4). This provided almost free (O(1) complexity) allocation/free of these objects but the catch is that a) memory usage is always large and never goes down even if we don't have a single object allocated, b) Boost Pool never frees the memory it uses (at least in the mode we are using it in) so we only use this for objects which don't stick around very long.

So which aspect(s) of normal memory allocation are you willing to give up in your app?



Related Topics



Leave a reply



Submit