Heap Corruption Under Win32; How to Locate

Heap corruption under Win32; how to locate?

My first choice would be a dedicated heap tool such as pageheap.exe.

Rewriting new and delete might be useful, but that doesn't catch allocations made by lower-level code. If that is what you need, it is better to hook the low-level allocation APIs using Microsoft Detours.

Also run some sanity checks: verify that your run-time libraries match (release vs. debug, multi-threaded vs. single-threaded, DLL vs. static lib), look for bad deletes (e.g., delete where delete [] should have been used), and make sure you're not mixing and matching your allocation functions.
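To make the "bad deletes" point concrete, here is a minimal sketch of matched pairs. The function names are made up for the example; the second variant shows that letting std::unique_ptr<int[]> own the array removes the chance of a delete / delete [] mismatch.

```cpp
#include <memory>
#include <numeric>

// Each allocation form has exactly one matching release:
//   new T     -> delete
//   new T[n]  -> delete []
//   malloc    -> free
int sum_with_raw_array(int n) {
    int* values = new int[n];
    std::iota(values, values + n, 1);                 // fill with 1, 2, ..., n
    int total = std::accumulate(values, values + n, 0);
    delete [] values;                                 // array form; plain delete here is undefined behavior
    return total;
}

// std::unique_ptr<int[]> calls delete [] itself, so the pair cannot be mismatched.
int sum_with_owned_array(int n) {
    std::unique_ptr<int[]> values(new int[n]);
    std::iota(values.get(), values.get() + n, 1);
    return std::accumulate(values.get(), values.get() + n, 0);
}
```

Both functions compute the same sum; only the ownership style differs.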

Also try selectively turning off threads and see when/if the problem goes away.

What does the call stack etc look like at the time of the first exception?

Finding heap corruption

Use the debug version of the Microsoft runtime libraries. Turn on red-zoning and get your heap automatically checked every 128 (say) heap operations by calling _CrtSetDbgFlag() once during initialisation.

_CRTDBG_DELAY_FREE_MEM_DF can be quite useful for finding memory-used-after-free bugs, but your heap size grows monotonically while using it.
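A rough sketch of that setup, assuming a debug build against the MSVC debug CRT. The helper name enable_crt_heap_checks and the constant kHasDebugCrt are hypothetical; the flags themselves come from <crtdbg.h>. On any other toolchain the helper does nothing and returns false.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
constexpr bool kHasDebugCrt = true;
#else
constexpr bool kHasDebugCrt = false;
#endif

// Hypothetical helper: call once during initialisation.
// Returns true if the debug-heap checks were switched on.
bool enable_crt_heap_checks() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);  // read the current flags
    flags |= _CRTDBG_ALLOC_MEM_DF;                    // debug-heap bookkeeping and fill patterns
    flags |= _CRTDBG_DELAY_FREE_MEM_DF;               // keep freed blocks around to catch use-after-free
    // Clear the upper 16 bits, then request a full heap check every 128 operations.
    flags = (flags & 0x0000FFFF) | _CRTDBG_CHECK_EVERY_128_DF;
    _CrtSetDbgFlag(flags);
    return true;
#else
    return false;
#endif
}
```

Because delayed free never hands memory back, it is probably best enabled only for the debugging session that needs it.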

Immediate detection of heap corruption errors on Windows. How?

Can it produce any false positives?

So, this will only catch bugs of the class "use after free()". For that purpose, I think, it's reasonably good.

If you try to delete something that wasn't new'ed, that's a different type of bug. In delete you should first check if the memory has been indeed allocated. You shouldn't be blindly freeing the memory and marking it as inaccessible. I'd try to avoid that and report (by, say, doing a debug break) when there's an attempt to delete something that shouldn't be deleted because it was never new'ed.
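One way to sketch that check is a registry of live allocations that refuses to release anything it did not hand out. AllocationRegistry and its member names are hypothetical, and this is an illustration rather than the original poster's code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_set>

// Hypothetical registry of live allocations.
class AllocationRegistry {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) {
            std::lock_guard<std::mutex> lock(mutex_);
            live_.insert(p);
        }
        return p;
    }

    // Returns true if p was a live allocation and has now been released.
    // Returns false, and frees nothing, for unknown or already released
    // pointers; a debug allocator would trigger a debug break here.
    bool release(void* p) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (live_.erase(p) == 0) {
                return false;
            }
        }
        std::free(p);
        return true;
    }

private:
    std::mutex mutex_;
    std::unordered_set<void*> live_;
};
```

If this were wired into a replacement operator new/delete, the registry's own container would have to allocate outside the hooked path, otherwise it would recurse into itself.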

Can it miss some of the heap corruptions? (even if we replace malloc/realloc/free?)

Obviously, this won't catch all corruptions of heap data between new and the respective delete. It will only catch those attempted after delete.

E.g.:

MyObj* myObj = new MyObj(1,2,3);
// corruption of *myObj happens here and may go unnoticed
delete myObj;

It fails on a 32-bit target with an OUT_OF_MEMORY error and runs only on 64-bit. Am I right that we simply run out of virtual address space on 32-bit?

Typically you have about 2 GB of virtual address space available on 32-bit Windows. That's good for at most ~524288 new's like those in the provided code (2 GB divided into 4 KB pages). With objects bigger than 4 KB, you'll be able to allocate fewer instances than that, and address space fragmentation will reduce the number further.
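The arithmetic behind that figure can be written out as a small sketch. It assumes every allocation is rounded up to whole 4 KB pages, which is what the ~524288 number implies; max_allocations and kTwoGiB are names invented for the example, and object_bytes is assumed to be greater than zero.

```cpp
#include <cstdint>

// Upper bound on the number of page-granular allocations that fit into a
// given amount of address space, ignoring fragmentation.
constexpr std::uint64_t max_allocations(std::uint64_t address_space_bytes,
                                        std::uint64_t object_bytes,
                                        std::uint64_t page_bytes = 4096) {
    // Round each object up to a whole number of pages.
    return address_space_bytes /
           (((object_bytes + page_bytes - 1) / page_bytes) * page_bytes);
}

constexpr std::uint64_t kTwoGiB = 2ull * 1024 * 1024 * 1024;

// One page per small object: 2^31 / 2^12 = 2^19.
static_assert(max_allocations(kTwoGiB, 64) == 524288, "small objects");
// A 5000-byte object needs two pages, which halves the count.
static_assert(max_allocations(kTwoGiB, 5000) == 262144, "two-page objects");
```

Real numbers will be lower still, since the process image, stacks, and loaded DLLs also consume address space.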

It's a perfectly expected outcome if you create many object instances during the life cycle of your program.

How to debug heap corruption errors?

Application Verifier combined with Debugging Tools for Windows is an amazing setup. You can get both as a part of the Windows Driver Kit or the lighter Windows SDK. (Found out about Application Verifier when researching an earlier question about a heap corruption issue.) I've used BoundsChecker and Insure++ (mentioned in other answers) in the past too, although I was surprised how much functionality was in Application Verifier.

Electric Fence (aka "efence"), dmalloc, valgrind, and so forth are all worth mentioning, but most of these are much easier to get running under *nix than Windows. Valgrind is ridiculously flexible: I've debugged large server software with many heap issues using it.

When all else fails, you can provide your own global operator new/delete and malloc/calloc/realloc overloads -- how to do so will vary a bit depending on compiler and platform -- and this will be a bit of an investment -- but it may pay off over the long run. The desirable feature list should look familiar from dmalloc and Electric Fence, and the surprisingly excellent book Writing Solid Code:

sentry values: allow a little more space before and after each alloc, respecting maximum alignment requirement; fill with magic numbers (helps catch buffer overflows and underflows, and the occasional "wild" pointer)
alloc fill: fill new allocations with a magic non-0 value -- Visual C++ will already do this for you in Debug builds (helps catch use of uninitialized vars)
free fill: fill in freed memory with a magic non-0 value, designed to trigger a segfault if it's dereferenced in most cases (helps catch dangling pointers)
delayed free: don't return freed memory to the heap for a while, keep it free filled but not available (helps catch more dangling pointers, catches proximate double-frees)
tracking: being able to record where an allocation was made can sometimes be useful
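A minimal sketch of the first three items (sentry values, alloc fill, free fill), layered on malloc/free. Everything in the guarded namespace is hypothetical, the fill bytes are chosen only for illustration, and delayed free and tracking are left out to keep it short.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace guarded {

// Sentry size equals the maximum alignment so the user block stays aligned.
constexpr std::size_t kGuard = alignof(std::max_align_t);
constexpr unsigned char kSentryByte = 0xFD;  // illustrative magic values
constexpr unsigned char kAllocByte  = 0xCD;
constexpr unsigned char kFreeByte   = 0xDD;
static_assert(sizeof(std::size_t) <= kGuard, "size header must fit in one guard slot");

// Layout: [size header][front sentry][user bytes][back sentry]
inline void* allocate(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(size + 3 * kGuard));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof size);                       // remember the user size
    std::memset(raw + kGuard, kSentryByte, kGuard);             // front sentry
    std::memset(raw + 2 * kGuard, kAllocByte, size);            // alloc fill
    std::memset(raw + 2 * kGuard + size, kSentryByte, kGuard);  // back sentry
    return raw + 2 * kGuard;
}

// True if both sentries around a live block are still intact.
inline bool sentries_intact(const void* user) {
    const unsigned char* p = static_cast<const unsigned char*>(user);
    std::size_t size = 0;
    std::memcpy(&size, p - 2 * kGuard, sizeof size);
    const unsigned char* front = p - kGuard;
    const unsigned char* back = p + size;
    for (std::size_t i = 0; i < kGuard; ++i) {
        if (front[i] != kSentryByte || back[i] != kSentryByte) return false;
    }
    return true;
}

// Checks the sentries, free-fills the user bytes if they look sane, then
// releases the block. Returns false if an overflow or underflow was detected.
inline bool release(void* user) {
    const bool ok = sentries_intact(user);
    unsigned char* p = static_cast<unsigned char*>(user);
    if (ok) {
        std::size_t size = 0;
        std::memcpy(&size, p - 2 * kGuard, sizeof size);
        std::memset(p, kFreeByte, size);  // only useful when combined with delayed free
    }
    std::free(p - 2 * kGuard);
    return ok;
}

}  // namespace guarded
```

A one-byte overrun is enough to trip the check: after writing to p[size] on a block from guarded::allocate(size), guarded::sentries_intact(p) reports false.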

Note that in our local homebrew system (for an embedded target) we keep the tracking separate from most of the other stuff, because the run-time overhead is much higher.

If you're interested in more reasons to overload these allocation functions/operators, take a look at my answer to "Any reason to overload global operator new and delete?"; shameless self-promotion aside, it lists other techniques that are helpful in tracking heap corruption errors, as well as other applicable tools.

Because I keep finding my own answer here when searching for alloc/free/fence values MS uses, here's another answer that covers Microsoft dbgheap fill values.

Need Sample code for heap corruption in C++

This ought to corrupt heap:

char *cp = new char[10];
(*(cp - 5))++;

That should corrupt the header in front of the allocated memory block. It should also give you an idea of how to create specific kinds of corrupted header data, if you look up the structure of the header created by your compiler.

You might want to experiment with optimizations disabled, as this is undefined behavior, and the optimizer might do some funny things with UB code. When in doubt, examine the compiler's assembly output or the machine code disassembly in a debugger to see what code got generated.

How to detect heap corruption errors under MinGW?

There is a tool provided by Microsoft called Application Verifier. It is a GUI tool that changes system settings to run selected applications in a controlled environment. This makes it possible to crash your program as soon as it causes a detectable memory error. The crash is controlled, so it can be debugged.

Fortunately, it is obtainable from Microsoft as a separate download. Another way to get it is to install the Windows SDK with the Application Verifier checkbox selected. The SDK also offers an Application Verifier redistributable option.

After you configure Application Verifier to watch your app, you need to debug it. Debugging under MinGW is a more common subject, already explained on Stack Overflow. Searching Stack Overflow for the [mingw] [debugging] tags turns up useful articles. One of them is How do I use the MinGW gdb debugger to debug a C++ program in Windows?. Gdb is the one I used.

The general questions How to debug heap corruption errors? and Heap corruption detection tool for C++ were helpful to find this tool, but I wasn't sure if it is compatible with MinGW. It is.

What's the best way of finding a heap corruption that only occurs under a performance test?

The best tool is Application Verifier in combination with GFlags, but there are many other solutions that may help.

For example, you could specify a heap check every 16 malloc, realloc, free, and _msize operations with the following code:

#include <crtdbg.h>

int main()
{
    // Get the current bits
    int tmp = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

    // Clear the upper 16 bits and OR in the desired frequency
    tmp = (tmp & 0x0000FFFF) | _CRTDBG_CHECK_EVERY_16_DF;

    // Set the new bits
    _CrtSetDbgFlag(tmp);

    return 0;
}

MFC: Why does this heap corruption happen? (array_s.cpp / afxcoll.inl)

It turned out that release and debug dlls were both being loaded (because of another dll):

msvcr100.dll
msvcr100d.dll
msvcp100.dll
msvcp100d.dll

The "modules" window sure does help, if you only know that you should look there.

Thanks to PaulMcKenzie and IInspectable for pointing me in the right direction.
