Unexplainable core dump
So, unlikely as it may seem, we appear to have hit an actual bona-fide CPU bug.
http://support.amd.com/us/Processor_TechDocs/41322_10h_Rev_Gd.pdf has erratum #721:
721 Processor May Incorrectly Update Stack Pointer
Description
Under a highly specific and detailed set of internal timing conditions,
the processor may incorrectly update the stack pointer after a long series
of push and/or near-call instructions, or a long series of pop
and/or near-return instructions. The processor must be in 64-bit mode for
this erratum to occur.
Potential Effect on System
The stack pointer value jumps by a value of approximately 1024, either in
the positive or negative direction.
This incorrect stack pointer causes unpredictable program or system behavior,
usually observed as a program exception or crash (for example, a #GP or #UD).
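To check whether a machine even belongs to the processor family this revision guide covers, one option is to look at /proc/cpuinfo. The sketch below is only an illustration: the function name is made up for this example, it assumes the usual Linux "key : value" cpuinfo layout, and a match means only that the CPU reports itself as AMD Family 10h (cpu family 16), not that it is actually affected by erratum #721.

```python
def is_amd_family_10h(cpuinfo_text):
    """Return True if the first CPU entry is AuthenticAMD with cpu family 16 (0x10).

    Hypothetical helper: it only inspects the first processor block of the
    text it is given and makes no claim about specific errata.
    """
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor entry
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (fields.get("vendor_id") == "AuthenticAMD"
            and fields.get("cpu family") == "16")


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(is_amd_family_10h(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

A positive result is at most a hint to go read the revision guide for the exact model and stepping.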
Analyzing core dump generated by multiple applications with gdb
I don't think you can achieve what you want with a single invocation of gdb. But you could run gdb twice, in different terminal windows. I have done that more than once, and it works quite well (except, of course, that your own brain could get slightly overloaded).
A gdb process can debug only one program, with one debugged process or (for post-mortem debugging) one core file.
And a given core file is produced by the abnormal termination of one single process (not several), so I don't understand your question.
Apparently, you have a crash in some execution of python, probably extended by your faulty C code. I suggest getting a debuggable variant of Python, perhaps by installing the python3-all-dbg package or something similar, and then using gdb on it. Of course, compile the C code you plug into Python with debugging enabled. Perhaps you violated some invariant of the Python garbage collector.
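Since each gdb process handles one core file, several cores can still be processed in sequence by starting one gdb per core. The following is a minimal sketch of that idea, assuming gdb is installed and on the PATH; the helper names and the example paths are hypothetical, and only the standard gdb options --batch and -ex are used.

```python
import subprocess


def gdb_backtrace_command(executable, core_file):
    """Build the argument list for a non-interactive gdb run that prints a backtrace."""
    return ["gdb", "--batch", "-ex", "bt", executable, core_file]


def backtrace(executable, core_file):
    """Run one gdb process for one core file and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero on a damaged core; keep its output anyway
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical (executable, core) pairs; replace them with real paths.
    pairs = [("/usr/bin/python3-dbg", "core.1234"), ("./my_app", "core.5678")]
    for exe, core in pairs:
        print("=== %s / %s ===" % (exe, core))
        print(backtrace(exe, core))
```

Each core must be paired with the executable that produced it, otherwise the symbols gdb shows will not match.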
How to generate a core dump in Linux on a segmentation fault?
This depends on what shell you are using. If you are using bash, then the ulimit command controls several settings relating to program execution, such as whether a program is allowed to dump core. If you type
ulimit -c unlimited
then that will tell bash that its programs can dump cores of any size. You can specify a numeric size instead of unlimited if you want (bash reads the value in 1024-byte blocks, so 53248 corresponds to 52 MiB), but in practice this shouldn't be necessary since the size of core files will probably never be an issue for you.
In tcsh, you'd type
limit coredumpsize unlimited
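The same limit can also be raised from inside a process, which then passes it on to the children it starts. Below is a small sketch using Python's standard resource module (Unix only); the function name is made up for this example. It raises the soft core-size limit to the hard limit, which an unprivileged process is allowed to do. If the hard limit itself is 0, core dumps stay disabled.

```python
import resource


def enable_core_dumps():
    """Raise the soft RLIMIT_CORE to the hard limit and return the new (soft, hard) pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    if hard == resource.RLIM_INFINITY:
        print("core size limit: unlimited")
    else:
        print("core size limit: %d bytes" % soft)
```

Whether a core file is then actually written still depends on system settings such as the kernel's core pattern.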
Dumping only stack trace in linux core dumps
You can set /proc/$PID/coredump_filter to 0x10.
See http://man7.org/linux/man-pages/man5/core.5.html
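To see what a particular mask selects, the bit meanings listed in core(5) can be decoded with a few lines of code. This is a sketch for illustration: the function and table names are invented here, and the bit descriptions are paraphrased from the man page linked above.

```python
# Bit positions of /proc/PID/coredump_filter, paraphrased from core(5).
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
    7: "private DAX pages",
    8: "shared DAX pages",
}


def decode_coredump_filter(mask):
    """Return the descriptions of the mapping types selected by the mask, lowest bit first."""
    return [name
            for bit, name in sorted(COREDUMP_FILTER_BITS.items())
            if mask & (1 << bit)]


if __name__ == "__main__":
    for value in (0x10, 0x33):
        print(hex(value), decode_coredump_filter(value))
```

Decoding the value before writing it to the file is a cheap way to confirm that the mask covers the mappings you expect to find in the dump.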
Large unexplained memory in the memory dump of a .NET process
After investigation, the problem turned out to be heap fragmentation caused by pinned buffers. I'll explain how to investigate and what pinned buffers are.
All the profilers I used agreed that most of the heap was free, so the next step was to look at fragmentation. This can be done with WinDbg, for example:
!dumpheap -stat
Then I looked at the "Fragmented blocks larger than..." section. WinDbg reports that objects lie between the free blocks, which makes compaction impossible. Next I looked at what is holding these objects and whether they are pinned. Here, for example, is the object at address 0000000bfaf93b80:
!gcroot 0000000bfaf93b80
It displays the reference graph:
00000004082945e0 (async pinned handle)
-> 0000000535b3a3e0 System.Threading.OverlappedData
-> 00000006f5266d38 System.Threading.IOCompletionCallback
-> 0000000b35402220 System.Net.Sockets.SocketAsyncEventArgs
-> 0000000bf578c850 System.Net.Sockets.Socket
-> 0000000bf578c900 System.Net.SocketAddress
-> 0000000bfaf93b80 System.Byte[]
00000004082e2148 (pinned handle)
-> 0000000bfaf93b80 System.Byte[]
The last two lines tell you the object is pinned.
Pinned objects are buffers that can't be moved because their address is shared with unmanaged code. Here you can guess it is the system TCP layer. When managed code needs to pass the address of a buffer to external code, it has to "pin" the buffer so that the address remains valid: the GC cannot move it.
These buffers, while being a very small part of the memory, make compaction impossible and thus cause a large memory "leak", even if it is not exactly a leak but rather a fragmentation problem. This can happen on the LOH or on the generational heaps just the same. Now the question is what is causing these pinned objects to live forever: find the root cause of the leak that causes the fragmentation.
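A toy model can show why a handful of pinned buffers hurts so much. The sketch below is not how the CLR garbage collector works internally; it simply treats the heap as a row of equal slots (an assumption made for this example), marks some of them as pinned, and reports the largest contiguous free run that remains once everything else has been freed.

```python
def largest_free_run(heap_slots, pinned):
    """Largest contiguous run of free slots when only the pinned slots stay occupied.

    heap_slots: total number of slots in the toy heap.
    pinned: set of slot indexes that cannot be moved or freed.
    """
    best = run = 0
    for slot in range(heap_slots):
        if slot in pinned:
            run = 0  # a pinned slot splits the free space
        else:
            run += 1
            best = max(best, run)
    return best


if __name__ == "__main__":
    heap = 1000
    scattered = set(range(99, heap, 100))  # 10 pinned slots, 1% of the heap
    print("no pinned slots:", largest_free_run(heap, set()))
    print("10 scattered pinned slots:", largest_free_run(heap, scattered))
```

In this model, pinning 1% of the slots at regular intervals caps the largest possible allocation at about a tenth of the heap, even though 99% of it is free.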
You can read similar questions here:
https://ayende.com/blog/181761-C/the-curse-of-memory-fragmentation
.NET deletes pinned allocated buffer (good explanation of pinned objects in the answer)
Note: the root cause was in a third-party library, AerospikeClient, which uses the .NET async Socket API that is known for pinning the buffers sent to it. While AerospikeClient properly used a buffer pool, the buffer pool was re-created whenever their client was re-created. Since we re-created their client every hour instead of creating it once and keeping it, the buffer pool was re-created each time, causing a growing number of pinned buffers and, in turn, unbounded fragmentation. What remains unclear is why old buffers are never unpinned when transmission is over, or at least when their client is disposed.
Identify concrete type of object behind auto_ptr from core dump
What I'd really like to know is whether the pointer belongs to an IBar or an IBaz
GDB should be able to tell you that. Use (gdb) set print object on. From the documentation:
When displaying a pointer to an object, identify the actual (derived)
type of the object rather than the declared type, using the virtual
function table. Note that the virtual function table is required—this
feature can only work for objects that have run-time type
identification; a single virtual method in the object's declared type
is sufficient.
Update:
it only outputs the IFoo* interface
That likely means that the pointer really is pointing to an IFoo (e.g. the object that was of type IBar or IBaz has already been destructed).
Would working with dynamic_cast imply
Yes, dynamic_cast can't work without RTTI; if you are using dynamic_cast, print object on should just work.