How to Programmatically Cause a Core Dump in C/C++

How to programmatically cause a core dump in C/C++

Raising signal number 6 (SIGABRT on Linux) is one way to do it. Keep in mind that SIGABRT is not required to be 6 in every POSIX implementation, so use the SIGABRT constant itself for anything other than quick'n'dirty debug code.

#include <signal.h>
: : :
raise (SIGABRT);

Calling abort() will also cause a core dump, and you can even do this without terminating your process by calling fork() followed by abort() in the child only - see this answer for details.

How can a C program produce a core dump of itself without terminating?

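#include <stdlib.h>   /* abort */
#include <unistd.h>   /* fork */

void create_dump(void)
{
    if (fork() == 0) {
        /* Child only: crash in your favorite way here. abort() raises
           SIGABRT, whose default action is to terminate and dump core. */
        abort();
    }
    /* The parent (or a failed fork) simply returns and keeps running. */
}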

Fork the process, then crash the child: this gives you a snapshot whenever you want.
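A slightly fuller sketch of the same idea, in case it helps: the parent also reaps the crashed child so it does not linger as a zombie, and reports which signal killed it. The function name snapshot_via_fork is made up for illustration, and whether a core file actually appears still depends on the core size limit and the system's core dump settings.

```c
#include <signal.h>     /* SIGABRT */
#include <stdlib.h>     /* abort */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WIFSIGNALED, WTERMSIG */
#include <unistd.h>     /* fork */

/* Hypothetical helper: snapshot the current process by crashing a forked copy.
 * Returns the signal that terminated the child, or -1 on error. */
int snapshot_via_fork(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;      /* fork failed, nothing was dumped */
    if (child == 0)
        abort();        /* child: dies from SIGABRT, dumping core if allowed */

    /* Parent: wait for the child so it is reaped, then carry on. */
    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling snapshot_via_fork() from the running program should return SIGABRT when the child crashed as intended; the caller keeps running either way. Note that the parent blocks until the child has finished dying, which includes the time spent writing the core file.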

Strcmp generate a core dump

Can't write comments yet but,

Like Nunchy wrote, tmp is not defined in that context.
I also noticed that your code never increments the map iterator, which would result in a never-ending loop.

I'm assuming you did not copy your actual code into your post but instead rewrote it hastily which resulted in some typos, but if not, try making sure you're using temp and not tmp in your call to strcmp, and make sure the loop actually increments the iterator.

As one of the comments on your post also points out, make sure that both the map and the function parameter actually contain data.

How to create a core dump even if the process is normally running?

Call gdb, then

attach pid
gcore

where pid is the process id of the process in question.

How to enable core dump in my Linux C++ program

You need to set ulimit -c. If this parameter is 0, no core dump file is created. So run ulimit -c unlimited, then check that the setting took effect with ulimit -a. The core dump file is created when an application does something inappropriate, for example an invalid memory access. On my system the file is named core.<process-pid-here>.
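If you would rather not depend on the shell's ulimit setting, the program can raise its own limit at startup. Below is a minimal sketch using getrlimit()/setrlimit() with RLIMIT_CORE; the helper name enable_core_dumps is made up for illustration. It can only raise the soft limit up to the hard limit, so it does not help if the hard limit is already 0.

```c
#include <sys/resource.h>  /* getrlimit, setrlimit, struct rlimit, RLIMIT_CORE */

/* Hypothetical helper: raise the soft core-file size limit to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;  /* an unprivileged process may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Call it early in main(), before anything can crash. Where the file ends up, and what it is called, is still decided by the system (on Linux, by /proc/sys/kernel/core_pattern).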

Why does creating and writing to a very large vector cause a core dump?

The problem is that you run out of memory.

A lot of operating systems allocate memory lazily. This means that the OS will not actually commit the memory you ask for until you use it. You would need at least 75 106 434 393 bytes (about 70 GiB) even if each bool took a single bit, but Rust doesn't bit-pack Vec<bool>: each element takes one byte, so you are asking for 600 851 475 143 bytes (about 600 GB, or roughly 560 GiB). Your OS could not find that much memory.

It's an error that your OS can't handle because it already told you "OK" when you asked for the memory. It's a critical error, so it ends your process with a core dump.

I thought Rust was all about safety and avoiding core dumps?

A core dump doesn't necessarily imply that your program is not safe. As you can see, your program didn't do an out-of-bounds memory access; it just doesn't have enough memory. It's the best way to handle this error from your OS's point of view, and there is nothing unsafe according to Rust's definition of safe.

BTW, on my machine (Arch Linux), your program is simply killed:

[1]    4901 killed     cargo run

Fork and core dump with threads

Are you familiar with process checkpoint-restart? In particular, CRIU? It seems to me it might provide an easy option for you.

I want to obtain a core dump of a running Linux process without interrupting the process [and] to somehow obtain the relevant data of the other, original threads.

Forget about not interrupting the process. If you think about it, a core dump has to interrupt the process for the duration of the dump; your true goal must therefore be to minimize the duration of this interruption. Your original idea of using fork() does interrupt the process, it just does so for a very short time.

Is the memory that contains all of the threads' stacks still available and accessible in the forked process?

No, not in a usable form. fork() retains only the thread that makes the call; the other threads do not exist in the child, so their register state, and with it any reliable way to walk their stacks, is lost.

Here is the procedure I'd use, assuming CRIU is unsuitable:

1. Have a parent process that generates a core dump of the child process whenever the child is stopped. (Note that more than one consecutive stop event may be generated; only the first one until the next continue event should be acted on.)
   You can detect the stop/continue events using waitpid(child,,WUNTRACED|WCONTINUED).

2. Optional: Use sched_setaffinity() to restrict the process to a single CPU, and sched_setscheduler() (and perhaps sched_setparam()) to drop the process priority to IDLE.
   You can do this from the parent process, which only needs the CAP_SYS_NICE capability in both its effective and permitted sets. You can grant it with setcap 'cap_sys_nice=pe' parent-binary, if you have filesystem capabilities enabled, as most current Linux distributions do.
   The intent is to minimize the progress of the other threads between the moment a thread decides it wants a snapshot/dump and the moment when all threads have been stopped. I have not tested how long it takes for the changes to take effect; they certainly happen at the end of the current timeslices at the very earliest. So this step should probably be done a bit beforehand.
   Personally, I don't bother. On my four-core machine, the SIGSTOP in the next step alone yields latencies between threads similar to those of a mutex or a semaphore, so I don't see any need to strive for even better synchronization.

3. When a thread in the child process decides it wants to take a snapshot of itself, it sends a SIGSTOP to itself (via kill(getpid(), SIGSTOP)). This stops all threads in the process.

4. The parent process will receive the notification that the child was stopped. It first examines /proc/PID/task/ to obtain the TIDs for each thread of the child process (and perhaps the /proc/PID/task/TID/ pseudofiles for other information), then attaches to each TID using ptrace(PTRACE_ATTACH, TID). ptrace(PTRACE_GETREGS, TID, ...) then obtains the per-thread register states, which can be used in conjunction with /proc/PID/task/TID/smaps and /proc/PID/task/TID/mem to obtain the per-thread stack trace and whatever other information you're interested in. (For example, you could create a debugger-compatible core file for each thread.)

5. When the parent process is done grabbing the dump, it lets the child process continue. I believe you need to send a separate SIGCONT signal to let the entire child process continue, instead of just relying on ptrace(PTRACE_CONT, TID), but I haven't checked this; do verify this, please.
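To make the stop/inspect/continue cycle above a bit more concrete, here is a rough, Linux-only sketch. It is deliberately incomplete: it leaves out the ptrace() attach and the actual dump, and the child is a single-threaded stand-in that stops itself immediately. The names count_tasks and stop_inspect_continue are made up for this illustration.

```c
#include <ctype.h>      /* isdigit */
#include <dirent.h>     /* opendir, readdir, closedir */
#include <signal.h>     /* kill, SIGSTOP, SIGCONT */
#include <stdio.h>      /* snprintf */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid, WUNTRACED, WIFSTOPPED */
#include <unistd.h>     /* fork, getpid, _exit */

/* Count the thread IDs listed under /proc/PID/task; returns -1 on error. */
static int count_tasks(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;
    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
        if (isdigit((unsigned char)entry->d_name[0]))
            count++;                    /* each numeric entry is one TID */
    closedir(dir);
    return count;
}

/* Hypothetical demo: the child stops itself, the parent notices the stop,
 * lists the child's threads, then lets it continue and reaps it.
 * Returns the number of threads seen while stopped, or -1 on error. */
int stop_inspect_continue(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        kill(getpid(), SIGSTOP);        /* "I want a snapshot": stops the whole process */
        _exit(0);                       /* reached only after SIGCONT */
    }

    int status = 0;
    int tasks = -1;
    if (waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status))
        tasks = count_tasks(child);     /* the ptrace attach and dump would go here */

    kill(child, SIGCONT);               /* let the child run again */
    waitpid(child, &status, 0);         /* reap it once it exits */
    return tasks;
}
```

In a real setup the parent would loop on waitpid() with WUNTRACED|WCONTINUED and act only on the first stop event before the next continue event, as described in step 1.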

I do believe that the above will yield a minimal delay in wall clock time between the threads in the process stopping. Quick tests on an AMD Athlon II X4 640 running Xubuntu with a 3.8.0-29-generic kernel indicate that tight loops incrementing a volatile variable in the other threads only advance the counters by a few thousand, depending on the number of threads (there's too much noise in the few tests I made to say anything more specific).

Limiting the process to a single CPU, and even to IDLE priority, will drastically reduce that delay even further. CAP_SYS_NICE capability allows the parent to not only reduce the priority of the child process, but also lift the priority back to original levels; filesystem capabilities mean the parent process does not even have to be setuid, as CAP_SYS_NICE alone suffices. (I think it'd be safe enough -- with some good checks in the parent program -- to be installed in e.g. university computers, where students are quite active in finding interesting ways to exploit the installed programs.)

It is possible to create a kernel patch (or module) that provides a boosted kill(getpid(), SIGSTOP) that also tries to kick off the other threads from running CPUs, and thus try to make the delay between the threads stopping even smaller. Personally, I wouldn't bother. Even without the CPU/priority manipulation I get sufficient synchronization (small enough delays between the times the threads are stopped).

Do you need some example code to illustrate my ideas above?
