How to Find the Reason for a Dead Process Without a Log File on Unix

What killed my process and why?

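If neither the user nor a sysadmin killed the program, the kernel may have. The kernel only kills a process under exceptional circumstances such as extreme resource starvation (think memory plus swap exhaustion). On Linux, the out-of-memory killer reports its victims in the kernel log, so the output of dmesg is worth checking.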

Log the reason for process termination with C++ on Linux

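You can register a function to handle unexpected exceptions (exceptions that violate a dynamic exception specification):

set_unexpected()

If such an exception is not dealt with, the application calls terminate(). Note that set_unexpected() was deprecated in C++11 and removed in C++17, so it is only available with older language standards.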

You can register a function to log things on termination:

set_terminate()

You can add your own atexit() logging function. Have it check a flag that you set just before leaving main, so that it only logs when exit happens abnormally (that is, when the flag was never set).
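A minimal sketch of the set_terminate() and atexit() ideas above. The names g_clean_exit, describe_exception, log_on_terminate, log_on_exit and install_termination_logging are made up for illustration, and logging goes to stderr only as a placeholder for whatever log you actually use.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <string>

// Hypothetical flag: set it to true as the last statement of main().
static bool g_clean_exit = false;

// Turn an exception_ptr into a message that can be logged.
std::string describe_exception(std::exception_ptr ep) {
    if (!ep) return "no active exception";
    try {
        std::rethrow_exception(ep);
    } catch (const std::exception& e) {
        return std::string("uncaught exception: ") + e.what();
    } catch (...) {
        return "uncaught exception of unknown type";
    }
}

// Terminate handler: log, then abort (a terminate handler must not return).
void log_on_terminate() {
    std::fprintf(stderr, "terminate: %s\n",
                 describe_exception(std::current_exception()).c_str());
    std::abort();
}

// atexit handler: only complain if main() did not finish normally.
void log_on_exit() {
    if (!g_clean_exit)
        std::fprintf(stderr, "exit() was called before main() finished\n");
}

// Call this once at the top of main().
void install_termination_logging() {
    std::set_terminate(log_on_terminate);
    std::atexit(log_on_exit);
}
```

In main you would call install_termination_logging() first and set g_clean_exit = true right before the final return. Keep in mind that abort() does not run atexit handlers, so the two log lines cover different failure paths.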

Signal handlers can be tricky (especially if you want them to be portable). If you use them, you are limited in what you can do safely inside, so I usually limit myself to setting a global flag that normal code can then handle (of course, if you are terminating, that is very limiting).
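The "only set a flag" approach might look roughly like this for SIGTERM. This is a sketch, not a complete shutdown scheme; the names g_term_requested, on_sigterm, install_sigterm_flag and termination_requested are invented for the example.

```cpp
#include <csignal>
#include <signal.h>   // sigaction() (POSIX)

// The only thing the handler touches: a flag whose type is safe to
// write from inside a signal handler.
static volatile std::sig_atomic_t g_term_requested = 0;

extern "C" void on_sigterm(int) {
    g_term_requested = 1;
}

// Install the handler; returns true on success.
bool install_sigterm_flag() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, nullptr) == 0;
}

// Polled from normal code, for example once per main-loop iteration,
// where it is safe to write a log line and shut down in an orderly way.
bool termination_requested() {
    return g_term_requested != 0;
}
```

This only helps for signals you can catch. SIGKILL cannot be handled at all, so a process killed that way never gets the chance to log anything itself.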

How to detect defunct processes on Linux?

From Wikipedia:

On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the process that started the (now zombie) process to read its exit status.

If the parent fetches the exit status by calling wait, waitpid or the like, the zombie should disappear.

You can detect whether a process is alive through the wait functions (man wait).
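If you are not the parent and cannot call wait yourself, one Linux-specific way to spot a zombie is the state field in /proc/<pid>/stat, which reads Z for a defunct process. The sketch below assumes /proc is mounted; proc_state, is_zombie and zombie_demo are hypothetical helper names.

```cpp
#include <fstream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the one-letter state from /proc/<pid>/stat ('Z' means zombie),
// or '\0' if the file cannot be read (for example, the pid is gone).
char proc_state(pid_t pid) {
    std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!std::getline(in, line)) return '\0';
    // The command name is wrapped in parentheses and may itself contain
    // spaces or ')', so find the last ')' and read the field after it.
    std::string::size_type close = line.rfind(')');
    if (close == std::string::npos || close + 2 >= line.size()) return '\0';
    return line[close + 2];
}

bool is_zombie(pid_t pid) { return proc_state(pid) == 'Z'; }

// Demo: a child that has exited is a zombie until the parent reaps it.
bool zombie_demo() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) _exit(0);                       // child exits immediately
    siginfo_t info;
    // WNOWAIT blocks until the child has exited but leaves it unreaped.
    if (waitid(P_PID, pid, &info, WEXITED | WNOWAIT) != 0) return false;
    bool zombie_before = is_zombie(pid);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;   // reap the child
    bool zombie_after = is_zombie(pid);
    return zombie_before && !zombie_after;
}
```

zombie_demo() is expected to return true on a typical Linux system: the child shows up as Z before waitpid() and has no /proc entry afterwards.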

Tracking the death of a child process

Typically you write a handler for SIGCHLD which calls waitpid() on pid -1. You can use the return value from that to determine what pid died. For example:

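#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>

void my_sigchld_handler(int sig)
{
    pid_t p;
    int status;
    int saved_errno = errno;   /* waitpid() may overwrite errno */

    (void)sig;

    /* waitpid() returns 0 when children exist but none has exited yet,
     * and -1 when there are no children left, so loop only while p > 0. */
    while ((p = waitpid(-1, &status, WNOHANG)) > 0)
    {
        /* Handle the death of pid p */
    }

    errno = saved_errno;
}

/* It's better to use sigaction() over signal(). You won't run into the
 * issue where BSD signal() acts one way and Linux or SysV acts another. */

/* The following goes in your setup code, e.g. near the top of main(): */
struct sigaction sa;

memset(&sa, 0, sizeof(sa));
sa.sa_handler = my_sigchld_handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;   /* restart system calls interrupted by SIGCHLD */

sigaction(SIGCHLD, &sa, NULL);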

Alternatively you can call waitpid(pid, &status, 0) with the child's process ID specified, and synchronously wait for it to die. Or use WNOHANG to check its status without blocking.
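Once waitpid() has given you a status, the W* macros tell you whether the child exited on its own or was killed by a signal, which is often the closest you get to "why did it die" without a log file. A small sketch follows; describe_status and run_child_and_describe are names made up for the example.

```cpp
#include <csignal>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Translate a waitpid() status into a human-readable reason.
std::string describe_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "stopped or continued";
}

// Demo helper: fork a child that either exits with `code` or, if `sig`
// is non-zero, kills itself with that signal, then report what the
// parent learns from waitpid().
std::string run_child_and_describe(int code, int sig) {
    pid_t pid = fork();
    if (pid < 0) return "fork failed";
    if (pid == 0) {
        if (sig != 0) raise(sig);
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return "waitpid failed";
    return describe_status(status);
}
```

A child that the kernel's out-of-memory killer took down shows up here as killed by SIGKILL. The status does not say who sent the signal, so you still need the kernel log to tell that apart from a manual kill -9.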

Ending tail -f started in a shell script

The best answer I can come up with is this

  1. Put a timeout on the read, tail -f logfile | read -t 30 line
  2. Start tail with --pid=$$, that way it'll exit when the bash-process has finished.

It'll cover all cases I can think of (server hangs with no output, server exits, server starts correctly).

Don't forget to start your tail before the server.

tail -n0 -F logfile 2>/dev/null | while read -t 30 line

The -F will 'read' the file even if it doesn't exist (and start reading it when it appears). The -n0 won't read anything already in the file, so you can keep appending to the logfile instead of overwriting it each time, and do standard log rotation on it.

EDIT:
OK, so here is a rather crude 'solution', if you're using tail. There are probably better solutions using something other than tail, but I have to give it to you: tail gets you out of the broken pipe quite nicely. A 'tee' that is able to handle SIGPIPE would probably work better. Having the Java process actively drop a file on the file system with an 'I'm alive' message of some sort is probably even easier to wait for.

function startServer() {
    touch logfile

    # 30 second timeout.
    sleep 30 &
    timerPid=$!

    # tail exits once the timer process is gone, which ends the loop.
    tail -n0 -F --pid=$timerPid logfile | while read -r line
    do
        if echo "$line" | grep -q 'Started'; then
            echo 'Server Started'
            # stop the timer..
            kill $timerPid
        fi
    done &

    startJavaprocess > logfile &

    # wait for the timer to expire (or be killed)
    wait $timerPid
}

How do I write a bash script that deletes my system log files after asking for permission at a certain time everyday?

#!/bin/sh
# I would want it to self-execute at some time every day.
# Would that mean that the shell needs to always run in the background?

If you think of the implications of running in the foreground, you will understand why it needs to run in the background. What if you're not logged-in at that exact time? What if you're typing a document? You might consider creating a pop-up window (Windows-style), but where? Which console?

# Is there a method by which I can add this execution to a daemon or other 
# background-process's code-content? (If so please give me details)

Try man crontab. In the crontab, you specify when jobs must run. For example:

# m  h  dom  mon  dow  cmd
0    3  *    *    *    /usr/local/bin/backup.diaadm

runs, on my system, a backup of my slide administration every night at 3:00.

touch /Permission.txt

It is a bad idea to create files like this in the root-directory. It should probably be in a directory under /var/local somewhere.

# Process creates a file in root directory
chmod 0747 /Permission.txt
# Sets permissions so that I can write to the file without being root.
# I also set group permissions to read-only, my logic being that I want direct
# and raw data to go into this file without it being modified by shared
# Ownership processes..?

747 is a bizarre permission. If you're on a multi-user system, it should (probably) be 0774 and you put the administrators in the correct group.

I do not understand your flawed logic. You do not want the data to be modified by shared access, but you give write permissions to the world (=everyone on the system)?

echo "Delete System Logs for: "%d"/"%m"/"%y" ?">/Permission.txt
open /Permission.txt
Input=/Permission.txt</dev/f0

I am not sure what you are trying to accomplish here. It is not valid bash.

# Not just for the purposes of this script but for future scripts-I want to be
# able to capture raw binary data from peripheral devices. Is the output
# of the keyboard interpreted by the OS or any other process before saving
# to the text file?
# How do I get raw binary-output to a file from hardware devices? How can
# I go about understanding the opcode syntax-whether it uses even or odd
# parity; whether it is big or small endian; and other binary nuances.

In general, peripheral devices under Unix are represented by a device file under /dev. You should be able to do a cat /dev/devicename to read the input from a device. However, if you want to use raw input from a device, bash might not be the right tool. Perl and Python have better support for this kind of thing, and C or C++ is best suited for this kind of job.

while read Input ; do

if [ $Input == "Yes"|"yes" ]; then
echo "Exit">/Permission.txt

Here, you overwrite the file that you are reading. I do not understand why and it seems so illogical that I doubt that you want that.

     echo "sudo rm -rf /private/var/log/*"
echo "********" #The Password

Using sudo in a script is not a great idea. You should make sure that the script runs with the right permissions. Echoing the password is a bad idea. If you must, you can set up passwordless sudo via visudo.

   fi

if [ $Input == "No"|"no" ]; then
echo "Exit">/Permission.txt

See my previous comment on this.

 fi

end

end?

#  1 not sure where the process is in the system and whether or not sending 
# that string to tty will result in the command being executed as if terminal
# were running, will the ";" be seen as the return button being pressed?

No. Why would it? Echoing to the tty echoes to the tty.

You were probably trying something like this:

while read -r Input ; do
    if [ "$Input" = "Yes" ]; then
        sudo rm -rf /private/var/log/*
    elif [ "$Input" = "No" ]; then
        exit
    fi
done < /Permission.txt

You should also, at the very least, put your code through shellcheck.

Killing a defunct process on UNIX system

You have killed the process, but a dead process doesn't disappear from the process table until its parent process performs a task called "reaping" (essentially calling wait(3) for that process to read its exit status). Dead processes that haven't been reaped are called "zombie processes."

The parent process id you see for 31756 is process id 1, which always belongs to init. That process should reap its zombie processes periodically, but if it can't, they will remain zombies in the process table until you reboot.

tail -f makes disk full?

The space a file occupies cannot be reclaimed until all references to that file are gone. Therefore, any process that has the file open will prevent the file from being deleted from the disk.

An active tail -f following the file, for example.

If these files need to be deleted to free disk space (e.g. because they are very big, or there are very many of them), then having processes lying around that hold references to them will prevent their deletion, and eventually lead to the disk filling up.

Edit in response to the comment on the other answer:

The diagnostics you report are exactly what you would expect to see in the situation that Adam and I describe. df reports that 56G of disk are in use, and du reports that only 10G are visible in the folder. The discrepancy is because there are 46G worth of files that have been removed from the folder, but cannot be physically removed from disk because some processes are holding references to them.

It's easy enough to experiment with this yourself: find a filesystem it's safe to play with, and create a humongous file. Write a C program that opens the file and goes into an infinite loop. Now, do the following:

  • Start the program
  • Check the output of df
  • rm the file
  • Check the output of df again
  • Stop your program
  • Check the output of df again

You will see that the output of df doesn't change after rming the file, but does change once you stop the program (thus removing the last reference to the file).
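If you would rather not hunt for a humongous file, the same effect can be observed from inside one program: after unlink() the file has no directory entry left, yet the open descriptor still reports its full size. This is a sketch with a small scratch file under /tmp (an assumption; any writable directory works), and UnlinkedInfo and hold_unlinked_file are made-up names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

struct UnlinkedInfo {
    bool ok;        // false if any system call failed
    nlink_t links;  // directory entries still pointing at the file
    off_t size;     // bytes still reachable through the open descriptor
};

// Create a scratch file of `bytes` bytes, remove it while it is still
// open, and report what fstat() says about it afterwards.
UnlinkedInfo hold_unlinked_file(std::size_t bytes) {
    UnlinkedInfo r = {false, 0, 0};
    char path[] = "/tmp/held-open-XXXXXX";   // assumed scratch location
    int fd = mkstemp(path);
    if (fd < 0) return r;

    std::vector<char> buf(bytes, 'x');
    bool wrote = write(fd, buf.data(), buf.size()) ==
                 static_cast<ssize_t>(buf.size());
    bool removed = unlink(path) == 0;        // the "rm" step

    struct stat st;
    if (wrote && removed && fstat(fd, &st) == 0) {
        r.ok = true;
        r.links = st.st_nlink;
        r.size = st.st_size;
    }
    // A long-running process (like tail -f) would keep fd open here;
    // the space is only released once the descriptor is closed.
    close(fd);
    return r;
}
```

For hold_unlinked_file(4096) you should get links == 0 and size == 4096: the file is gone from the directory, but its data is still allocated until the close().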

If you need even more evidence that this is what's going on, you may be able to get information from the /proc filesystem, if you have it. Specifically, find the PID of one of the tail -f processes (or other processes you think might be the cause), and look at the directory /proc/<pid>/fd to see all of the files it has open.
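On Linux, each entry in /proc/<pid>/fd is a symbolic link to the open file, and the link target of a removed file ends in " (deleted)". A rough sketch that lists those entries for one process is below; it assumes a Linux /proc, you need permission to read the other process's fd directory, and deleted_open_files is a hypothetical name.

```cpp
#include <cstdlib>      // mkstemp(), handy for trying this on a scratch file
#include <dirent.h>
#include <limits.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Return the link targets in /proc/<pid>/fd that refer to removed files.
std::vector<std::string> deleted_open_files(pid_t pid) {
    std::vector<std::string> result;
    const std::string dir = "/proc/" + std::to_string(pid) + "/fd";
    DIR* d = opendir(dir.c_str());
    if (!d) return result;   // no such pid, no /proc, or no permission

    const std::string suffix = " (deleted)";
    while (dirent* e = readdir(d)) {
        if (e->d_name[0] == '.') continue;           // skip "." and ".."
        char target[PATH_MAX];
        const std::string link = dir + "/" + e->d_name;
        ssize_t n = readlink(link.c_str(), target, sizeof(target) - 1);
        if (n <= 0) continue;
        std::string t(target, static_cast<std::size_t>(n));
        if (t.size() >= suffix.size() &&
            t.compare(t.size() - suffix.size(), suffix.size(), suffix) == 0)
            result.push_back(t);
    }
    closedir(d);
    return result;
}
```

Running this against the PID of a suspect tail -f should list the removed log file if that process is what keeps the space allocated. From a shell, ls -l /proc/<pid>/fd shows the same link targets.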

(I don't have *nix at home, so I can't actually check to see just what you'll see in /proc/<pid>/fd in this situation.)

How can I keep running a unix program in the background even if I log out?

My preferred method, and arguably the easiest, is using screen:

screen -d -m ./myProcess

