Linux, Where Are the Return Codes Stored of System Daemons and Other Processes

Linux, where are the return codes stored of system daemons and other processes?

When a process terminates, its parent process must acknowledge this by calling the wait or waitpid function. These functions also return the exit status. After the call to wait or waitpid, the process table entry is removed and the exit status is no longer stored anywhere in the operating system. If you need it later, check whether the software you use to start the process saves the exit status somewhere.
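As a minimal sketch of this acknowledgement step, the hypothetical helper below forks a child that exits with a given code and then collects that code with waitpid. The function name and the convention of returning -1 on errors are assumptions made for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, then reap it.
 * Returns the child's exit code, or -1 on error or abnormal termination. */
int spawn_and_reap(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* Parent: waitpid acknowledges the termination and removes the
     * child's process table entry; the status is only handed out here. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_reap(42) should return 42. Once waitpid has returned, the kernel no longer keeps that status, so the caller has to store it if it is needed later.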

If the parent process has not yet acknowledged that the child has terminated (the child is then a zombie), you can read its exit status from the /proc file system: it is the last field in /proc/[pid]/stat. It is stored in the same format that wait returns it, so for a normal exit you have to divide by 256 to get the exit code. You probably also have to be root.
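The decoding step can be sketched as a small parser. The helper below is hypothetical: it takes one line as read from /proc/[pid]/stat, picks the last space-separated field and decodes it with the standard wait macros, which amounts to the division by 256 for a normal exit. Whether the kernel exposes a non-zero value there depends on your permissions.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: extract the exit code from one line of
 * /proc/[pid]/stat. Returns -1 if the line cannot be parsed or the
 * process did not exit normally (for example, it was killed by a signal). */
int exit_code_from_stat_line(const char *line)
{
    const char *last = strrchr(line, ' ');   /* space before the last field */
    char *end;
    int status;

    if (last == NULL)
        return -1;

    status = (int)strtol(last + 1, &end, 10);
    if (end == last + 1)
        return -1;                           /* no digits found */

    /* The field uses the wait status encoding, so WEXITSTATUS performs
     * the "divide by 256" step for a normal exit. */
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, a shortened, made-up line such as "1234 (mydaemon) Z 1 1234 1234 0 -1 768" decodes to exit code 3, because 768 / 256 = 3. Real stat lines have many more fields, but the exit status is still the last one.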

Controlling a C daemon from another program

Instead of running the program via popen why not use the good old POSIX fork + exec? It gives you a bit more flexibility.

Now, to answer your question:

My problem is to detect that the daemon has been stopped.

To do this, you have to listen for the SIGCHLD signal in your parent/controlling process. This is good enough as long as you invoked the process directly. But if you called a shell script which then forked your daemon, it gets difficult. This is why most daemons write a so-called PID file: a file written by the daemon early on, with its PID as the only content. Normally, people put it in /tmp/mydaemon.pid or something like that.

On Linux, your controlling process can read the PID from this file and then test every second whether /proc/<pid>/exe exists. If it does not, you know the daemon died. For example, if your child program's PID is 1234, then /proc/1234/exe is a symbolic link to the actual location of the child program's executable.

Something like this:

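#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f;
    int pid_child;          /* read as int because %d expects an int */
    char proc_path[256];

    f = fopen("/tmp/mydaemon.pid", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    if (fscanf(f, "%d", &pid_child) != 1) {
        fprintf(stderr, "No PID found in PID file\n");
        fclose(f);
        return 1;
    }
    fclose(f);
    snprintf(proc_path, sizeof(proc_path), "/proc/%d/exe", pid_child);

    /* Poll once per second until the /proc entry disappears */
    while (1) {
        if (access(proc_path, F_OK) == 0) {
            printf("Program alive !\n");
            sleep(1);
        } else {
            printf("Program dead!\n");
            break;
        }
    }
    return 0;
}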

In fact, this is roughly how many init systems are implemented. See rc, systemd, upstart etc. for a better understanding of how they implement this in more detail.
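For the daemon side of the PID file convention described above, a minimal sketch might look like this. Both helper names are hypothetical, and the path is left to the caller; the file contains nothing but the PID, as described.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for the daemon: write our own PID as the only
 * content of the file at `path`. Returns 0 on success, -1 on error. */
int write_pid_file(const char *path)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    if (fprintf(f, "%ld\n", (long)getpid()) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper for the controller: read the PID back.
 * Returns the PID, or -1 if the file is missing or malformed. */
pid_t read_pid_file(const char *path)
{
    long value;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return (pid_t)value;
}
```

The daemon would call write_pid_file early during startup, after its last fork, so that the recorded PID is the one that keeps running.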

How to get the correct exit code of a shell command executed via pipe using popen() and pclose() in a daemonized process?

The issue was that my program did not handle the SIGCHLD signal.

After evaluating errno and checking the error given by pclose() I was able to find a solution. errno returned ECHILD which means "No child processes" according to the errno man page. I caught this error by adding some more handling to my pclose_wrapper lambda function:

...
  auto pclose_wrapper = [&rc](FILE *cmd) {
    rc = pclose(cmd);
    if (rc < 0) {
      /* Log the error if pclose returns "-1" signaling an error occurred */
      syslog(LOG_ERR, "rc is negative - %s", strerror(errno));
    }
  };
...

After some more research and looking at the man page for signal I found that my program was ignoring the SIGCHLD signal as mentioned before. This signal informs a process if a child process stopped or terminated.

The solution was to add sigaction(SIGCHLD, &newSigAction, NULL); and to remove the line signal(SIGCHLD, SIG_IGN); which explicitly ignored the signal.

Here is the working code:

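#include <array>   // For std::array
#include <cerrno>  // For errno
#include <csignal> // For sigaction and the SIG* constants
#include <cstdio>  // For popen, pclose and fgets
#include <cstdlib> // For exit and EXIT_*
#include <cstring> // For strerror
#include <memory>  // For std::unique_ptr
#include <string>
#include <sys/stat.h> // For umask
#include <sys/wait.h>
#include <syslog.h> // For all syslog things
#include <unistd.h>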

void signalHandler(int sig) {
  switch (sig) {
  case SIGINT:
  case SIGTERM:
    break;
  case SIGCHLD:
    /* Some child related action */
    break;
  }
}

int main(int argc, char *argv[]) {
  /* Open log file to be able to use syslog */
  setlogmask(LOG_UPTO(LOG_DEBUG));
  openlog("MyDemoProg", LOG_PID, LOG_DAEMON);

#if 1 // Set to 0 to disable the daemonizing

  pid_t pid = fork();

  if (pid < 0)
    exit(EXIT_FAILURE);

  if (pid > 0)
    exit(EXIT_SUCCESS);

  if (setsid() < 0)
    exit(EXIT_FAILURE);

  struct sigaction newSigAction;
  newSigAction.sa_handler = signalHandler;
  sigemptyset(&newSigAction.sa_mask);
  newSigAction.sa_flags = 0;

  sigaction(SIGHUP, &newSigAction, NULL);  /* catch hangup signal */
  sigaction(SIGTERM, &newSigAction, NULL); /* catch term signal */
  sigaction(SIGINT, &newSigAction, NULL);  /* catch interrupt signal */
  sigaction(SIGCHLD, &newSigAction,
            NULL); /* catch child stopped or terminated signal */

  pid = fork();
  if (pid < 0)
    exit(EXIT_FAILURE);

  if (pid > 0)
    exit(EXIT_SUCCESS);

  umask(0);
  chdir("/");
  for (int x = sysconf(_SC_OPEN_MAX); x >= 0; x--) {
    close(x);
  }
  syslog(LOG_DEBUG, "Daemonizing is enabled");
#else
  syslog(LOG_DEBUG, "Daemonizing is disabled");
#endif

  std::string command = "ls /var/bla/; sleep 2; echo test";
  syslog(LOG_DEBUG, "Command is: %s", command.c_str());

  int rc = -999;
  std::array<char, 16> buffer;
  std::string commandResult;

  // A wrapper function to be able to get the return code while still using the
  // automatic close function wizardry of unique_ptr
  auto pclose_wrapper = [&rc](FILE *cmd) {
    rc = pclose(cmd);
    if (rc < 0) {
      /* Log the error if pclose returns "-1" signaling an error occurred */
      syslog(LOG_ERR, "rc is negative - %s", strerror(errno));
    }
  };
  {
    const std::unique_ptr<FILE, decltype(pclose_wrapper)> pipe(
        popen(command.c_str(), "r"), pclose_wrapper);

    if (!pipe) {
      syslog(LOG_ERR, "Could not open pipe! Exiting");
      return EXIT_FAILURE;
    }

    /* Read in the pipe and save the content to a buffer */
    while (::fgets(buffer.data(), buffer.size(), pipe.get()) != nullptr) {
      commandResult += buffer.data();
    }
  }
  syslog(LOG_DEBUG, "Command result is: %s", commandResult.c_str());
  syslog(LOG_DEBUG, "Return code is: %d", rc);

  return EXIT_SUCCESS;
}

And here is the output of both the non-daemonized and daemonized version:

(I added a syslog message to indicate if the daemonizing code was enabled or disabled.)

May 10 09:24:30 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10872]: Daemonizing is disabled
May 10 09:24:30 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10872]: Command is: ls /var/bla/; sleep 2; echo test
May 10 09:24:32 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10872]: Command result is: test
May 10 09:24:32 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10872]: Return code is: 0
---
May 10 09:24:49 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10881]: Daemonizing is enabled
May 10 09:24:49 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10881]: Command is: ls /var/bla/; sleep 2; echo test
May 10 09:24:51 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10881]: Command result is: test
May 10 09:24:51 MY-EMBEDDED-DEVICE daemon.debug MyDemoProg[10881]: Return code is: 0

Now both versions give the expected return code of "0".

Does adding '&' make it run as a daemon?

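Yes, the process will run in the background either way. Strictly speaking, '&' only backgrounds the process within the current shell session, whereas the --detach option detaches it completely and runs it as a daemon.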

You can verify this by looking at the opt parser in the source code (if you really want to verify this):

.. cmdoption:: --detach
    Detach and run in the background as a daemon.

https://github.com/celery/celery/blob/d59518f5fb68957b2d179aa572af6f58cd02de40/celery/bin/beat.py#L12

https://github.com/celery/celery/blob/d59518f5fb68957b2d179aa572af6f58cd02de40/celery/platforms.py#L365

Ultimately, the code below is what detaches it in the DaemonContext. Notice the fork and exit calls:

def _detach(self):
    if os.fork() == 0:      # first child
        os.setsid()         # create new session
        if os.fork() > 0:   # pragma: no cover
            # second child
            os._exit(0)
    else:
        os._exit(0)
    return self

Creating a daemon in Linux

In Linux I want to add a daemon that cannot be stopped and which monitors filesystem changes. If any changes are detected, it should write the path to the console where it was started, followed by a newline.

Daemons work in the background and (usually) don't belong to a TTY, which is why you can't use stdout/stderr in the way you probably want.
Usually a syslog daemon (syslogd) is used for logging messages to files (debug, error, ...).

Besides that, there are a few required steps to daemonize a process.

If I remember correctly these steps are:

fork - Fork off the parent process and let it terminate if forking was successful. Because the parent process has terminated, the child process now runs in the background.
setsid - Create a new session. The calling process becomes the leader of the new session and the process group leader of the new process group. The process is now detached from its controlling terminal (CTTY).
Catch signals - Ignore and/or handle signals.
fork again - Let the parent process terminate to ensure that you get rid of the session leading process. (Only session leaders may get a TTY again.)
chdir - Change the working directory of the daemon.
umask - Change the file mode mask according to the needs of the daemon.
close - Close all open file descriptors that may be inherited from the parent process.

To give you a starting point: Look at this skeleton code that shows the basic steps. This code can now also be forked on GitHub: Basic skeleton of a linux daemon

/*
 * daemonize.c
 * This example daemonizes a process, writes a few log messages,
 * sleeps 20 seconds and terminates afterwards.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <syslog.h>

static void skeleton_daemon()
{
    pid_t pid;

    /* Fork off the parent process */
    pid = fork();

    /* An error occurred */
    if (pid < 0)
        exit(EXIT_FAILURE);

    /* Success: Let the parent terminate */
    if (pid > 0)
        exit(EXIT_SUCCESS);

    /* On success: The child process becomes session leader */
    if (setsid() < 0)
        exit(EXIT_FAILURE);

    /* Catch, ignore and handle signals */
    /* TODO: Implement a working signal handler */
    signal(SIGCHLD, SIG_IGN);
    signal(SIGHUP, SIG_IGN);

    /* Fork off for the second time*/
    pid = fork();

    /* An error occurred */
    if (pid < 0)
        exit(EXIT_FAILURE);

    /* Success: Let the parent terminate */
    if (pid > 0)
        exit(EXIT_SUCCESS);

    /* Set new file permissions */
    umask(0);

    /* Change the working directory to the root directory */
    /* or another appropriate directory */
    chdir("/");

    /* Close all open file descriptors */
    int x;
    for (x = sysconf(_SC_OPEN_MAX); x>=0; x--)
    {
        close (x);
    }

    /* Open the log file */
    openlog ("firstdaemon", LOG_PID, LOG_DAEMON);
}

int main()
{
    skeleton_daemon();

    while (1)
    {
        //TODO: Insert daemon code here.
        syslog (LOG_NOTICE, "First daemon started.");
        sleep (20);
        break;
    }

    syslog (LOG_NOTICE, "First daemon terminated.");
    closelog();

    return EXIT_SUCCESS;
}

Compile the code: gcc -o firstdaemon daemonize.c
Start the daemon: ./firstdaemon
Check if everything is working properly: ps -xj | grep firstdaemon
The output should be similar to this one:


+------+------+------+------+-----+-------+------+------+------+-----+
| PPID | PID  | PGID | SID  | TTY | TPGID | STAT | UID  | TIME | CMD |
+------+------+------+------+-----+-------+------+------+------+-----+
|    1 | 3387 | 3386 | 3386 | ?   |    -1 | S    | 1000 | 0:00 | ./  |
+------+------+------+------+-----+-------+------+------+------+-----+

What you should see here is:

The daemon has no controlling terminal (TTY = ?)
The parent process ID (PPID) is 1 (The init process)
The PID != SID, which means that our process is NOT the session leader (because of the second fork())
Because PID != SID, our process can't take control of a TTY again
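The PID != SID observation can also be checked programmatically. The sketch below is a hypothetical probe, not part of the skeleton: it repeats the setsid and second fork steps in throwaway processes and reports whether each one is a session leader, using getsid(0) == getpid() as the test. Unlike the real daemon, the first child waits for the grandchild so that it can relay the result.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical probe: repeat the setsid + second fork steps and report
 * who ends up as session leader (PID == SID).
 * Returns 0 on success, -1 on error. */
int probe_session_leadership(int *child_is_leader, int *grandchild_is_leader)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* First child: becomes session leader via setsid */
        int flags;
        pid_t pid2;

        if (setsid() < 0)
            _exit(4);
        flags = (getsid(0) == getpid()) ? 1 : 0;

        pid2 = fork();
        if (pid2 < 0)
            _exit(4);
        if (pid2 == 0)          /* grandchild: same session, new PID */
            _exit(getsid(0) == getpid() ? 1 : 0);

        if (waitpid(pid2, &status, 0) < 0 || !WIFEXITED(status))
            _exit(4);
        if (WEXITSTATUS(status))
            flags |= 2;
        _exit(flags);           /* relay both results to the caller */
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) & 4)
        return -1;              /* something failed in the child */

    *child_is_leader = WEXITSTATUS(status) & 1;
    *grandchild_is_leader = (WEXITSTATUS(status) >> 1) & 1;
    return 0;
}
```

After a successful call, child_is_leader should be 1 and grandchild_is_leader should be 0, which matches the ps output: the process that survives the second fork has a PID different from its SID.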

Reading the syslog:

Locate your syslog file. Mine is here: /var/log/syslog
Do a: grep firstdaemon /var/log/syslog
The output should be similar to this one:


  firstdaemon[3387]: First daemon started.
  firstdaemon[3387]: First daemon terminated.

A note:
In reality you would also want to implement a signal handler and set up the logging properly (Files, log levels...).
