Shell Script for Process Monitoring

Start and monitor a process inside a shell script until completion

The best thing you could do is to put some kind of instrumentation in your application,
and let it report the actual progress in terms of work items processed / total amount of work.
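
As a minimal sketch of that kind of instrumentation, assuming the application can be changed to write its counters to a file: the stand-in worker below records "items done" and "total items" after each work item, and the monitor turns those into a percentage. The names progress_file, do_work and show_progress, and the 5-item workload, are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: progress_file, do_work and show_progress are hypothetical names.
progress_file=$(mktemp)

# Stand-in for the instrumented application: after each work item it
# records "<items done> <total items>" in the progress file.
do_work() {
    total=$1
    i=1
    while [ "$i" -le "$total" ]; do
        : # real work for item $i would go here
        echo "$i $total" > "$progress_file"
        i=$(( i + 1 ))
    done
}

# Read the two counters and print the percentage actually completed.
show_progress() {
    read -r done_items total_items < "$progress_file" || return 1
    echo "$(( 100 * done_items / total_items ))% complete ($done_items/$total_items)"
}

do_work 5
show_progress
```

In a real setup the worker would run in the background and the monitor would call show_progress periodically; the point is only that the percentage comes from real counters rather than from elapsed time.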

Failing that, you can estimate progress from how long the process has been running.

Here's a sample of what I've used in the past. Works in ksh93 and bash.

#! /bin/ksh
set -u
prog_under_test="sleep"
args_for_prog=30

# max is the expected run time in seconds; interval is the polling period.
max=30 interval=1 n=0

main() {
    # Start the program in the background; remember its PID and start time.
    ($prog_under_test $args_for_prog) & pid=$! t0=$SECONDS

    while is_running $pid; do
        sleep $interval
        (( delta_t = SECONDS-t0 ))
        (( percent=100*delta_t/max ))
        report_progress $percent
    done
    echo
}

# kill -0 sends no signal; it only tests whether the process still exists.
is_running() { (kill -0 ${1:?is_running: missing process ID}) 2>/dev/null; }

function report_progress { typeset percent=$1
    printf "\r%5.1f %% complete (est.) " $(( percent ))
}

main
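
The polling loop only tells you that the process has gone away, not whether it succeeded. Here is a small sketch of one way to collect the exit status as well, assuming the process is a child of the monitoring shell so that wait can report on it. run_and_wait is an illustrative helper name, not part of the script above.

```shell
#!/bin/sh
# Sketch only: run_and_wait is a hypothetical helper.
# Start the given command in the background, poll until it has gone away,
# then collect its exit status with wait.
run_and_wait() {
    "$@" &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
    done
    wait "$pid"    # returns the exit status of the finished child
}

if run_and_wait sh -c 'exit 3'; then
    echo "succeeded"
else
    echo "failed with status $?"
fi
```

A progress report could go inside the while loop in place of the bare sleep.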

Shell Script for checking processes

I finally found a way to collect the process information into a text file. See the code below.

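for i in $(cat /tmp/serverlist)
do
    # The command substitution is left unquoted on purpose so that the
    # uname and ps output collapse onto a single line per server.
    echo "$i:" $(ssh -l root "$i" "uname -n;ps -eo comm | grep -i syslog") >> sysloginfo.txt
done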

That produces output like this:

xx.xx.xx.xx: xxxxxxx101 syslogd
xx.xx.xx.xx: xxxxxxx102 syslog-ng
xx.xx.xx.xx: xxxxxxx103
xx.xx.xx.xx: xxxxxxx104 syslog-ng syslog-ng
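
Each line has the form "address: hostname process...", so a host with no syslog process shows up as a line with nothing after the hostname (like xxxxxxx103 above). A small sketch for listing those hosts follows, assuming the file format shown and addresses that contain no colon. hosts_without_syslog is an illustrative name and the sample data is made up.

```shell
#!/bin/sh
# Sketch only: hosts_without_syslog is a hypothetical helper.
# Input lines look like "address: hostname [process ...]".
# Split on the colon, then count the words that follow it: exactly one
# word means only the hostname came back and no syslog process was found.
hosts_without_syslog() {
    awk -F: '{ n = split($2, f, " "); if (n == 1) print f[1] }' "$1"
}

# Made-up sample in the same shape as sysloginfo.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.0.0.1: host101 syslogd
10.0.0.2: host102 syslog-ng
10.0.0.3: host103
10.0.0.4: host104 syslog-ng syslog-ng
EOF

hosts_without_syslog "$sample"
```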

Bash script to monitor process and sendmail if failed

So, one thing about your tests is that you're pushing the output to /dev/null, which means that VAL1 and VAL2 will always be empty.
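
To see why, here is a minimal sketch in which echo stands in for the original ps and grep pipeline (the variable names are illustrative). When the output goes to /dev/null, command substitution has nothing left to capture, so the variable is empty even though the command succeeded. If only the yes/no answer matters, test the exit status instead.

```shell
#!/bin/sh
# Output redirected to /dev/null: nothing is left for $(...) to capture.
empty=$(echo "process found" > /dev/null)
# Output left alone: the variable receives the text.
captured=$(echo "process found")

echo "empty='$empty' captured='$captured'"

# When only the yes/no answer is needed, use the exit status directly.
if echo "process found" | grep -q "process"; then
    echo "match"
fi
```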

Secondly, you don't need the elif. You have two basic conditions. Either things are running, or they are not. If anything is not running, send an email. You could do some additional testing to determine whether it's PROCESS TEST1 or PROCESS TEST2 that died, but that wouldn't strictly be necessary.

Here's how I might write a script to do the same thing.

#!/usr/bin/env bash

# Check if process is running.
# The [P] bracket keeps grep from matching its own command line.
PID1=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST1" | awk '{print $2}')
PID2=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST2" | awk '{print $2}')

err=0

if [ -z "$PID1" ]; then
    # PROCESS TEST1 died
    err=$(( err + 1 ))
else
    echo "$(date) - PROCESS TEST1 is Running" >> /var/tmp/Log.txt
fi

if [ -z "$PID2" ]; then
    # PROCESS TEST2 died
    err=$(( err + 2 ))
else
    echo "$(date) - PROCESS TEST2 is Running" >> /var/tmp/Log.txt
fi

if (( err > 0 )); then
    # identify which PROCESS TEST had the problem.
    if (( err == 1 )); then
        condition="PROCESS TEST1 is down"
    elif (( err == 2 )); then
        condition="PROCESS TEST2 is down"
    else
        condition="PROCESS TEST1 and PROCESS TEST2 are down"
    fi

    # let's send an email to get eyes on the issue, but we will restart the process after
    # we send the email.
    SUBJ="Process Error Detected"
    FROM="Server"
    TO="someone@acme.com"
    (
    cat <<-EOT
To: ${TO}
From: ${FROM}
Subject: ${SUBJ}

$condition at $(date) please login to the server to check that the processes were restarted successfully.

EOT
    ) | sendmail -v "${TO}"

    # we reached an error condition, and we sent mail
    # now let's restart the svc.
    /usr/sbin/svcadm restart Foo
fi

Real time CPU% by process monitoring script in AIX 6.1

I solved this by writing a script that generates 30-second tprof logs, iterates through them, and adds up the thread CPU figures by PID. The resulting sums give a process list that is more or less a real-time CPU load percentage per process.
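
The script itself is not shown, but the aggregation step might look roughly like the sketch below. It assumes the per-thread figures have already been extracted into simple "PID CPU%" lines. That two-column layout is a simplification for illustration, not tprof's actual report format, and sum_cpu_by_pid and the sample numbers are made up.

```shell
#!/bin/sh
# Sketch only: sum_cpu_by_pid is a hypothetical helper, and the input
# format ("<pid> <cpu percent>" per thread) is an assumed simplification.
# Add up the CPU figures per PID and list the busiest process first.
sum_cpu_by_pid() {
    awk '{ cpu[$1] += $2 }
         END { for (pid in cpu) printf "%s %.1f\n", pid, cpu[pid] }' "$@" |
        sort -k2,2nr
}

# Made-up sample: three threads of PID 4012 and two of PID 5120.
sample=$(mktemp)
cat > "$sample" <<'EOF'
4012 12.5
4012 7.5
5120 3.0
4012 1.0
5120 0.5
EOF

sum_cpu_by_pid "$sample"
```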

bash process monitor script echo empty when no process found

The problem is that ps -ef | grep rman | grep -v grep is matching the process name rman_check.sh. So when you run your script, $rmanRunning is not 0, because it's counting itself.
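
The effect is easy to reproduce on canned data. The sketch below uses an invented, simplified ps-style listing (a PID followed by the command line) to show the substring match counting the checking script itself, while an exact comparison against the command name does not. The listing and PIDs are made up for illustration.

```shell
#!/bin/sh
# Invented listing: "<pid> <command line>".
listing='101 rman target /
202 /bin/bash ./rman_check.sh
303 grep rman'

# Substring match, as in ps -ef | grep rman | grep -v grep:
# the rman_check.sh line is counted as well.
substring_count=$(printf '%s\n' "$listing" | grep -v grep | grep -c rman)

# Exact match on the command name only.
exact_count=$(printf '%s\n' "$listing" |
    awk '$2 == "rman" { n++ } END { print n + 0 }')

echo "substring: $substring_count, exact: $exact_count"
```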

Use pgrep with the -x option so that it matches the command name exactly, rather than looking for a substring.

#!/bin/bash

rmanRunning=$(pgrep -x rman | wc -l)

if [ "$rmanRunning" -gt "0" ]
then
    # Parent PIDs of the rman processes, and PIDs of the rman_dd.ksh scripts.
    PPIDs=($( ps -oppid= $(pgrep -x rman)))
    kshddPID=($(pgrep -f 'ksh.*/rman_dd.ksh'))
    for i in "${PPIDs[@]}"
    do
        for j in "${kshddPID[@]}"
        do
            if [ "$i" == "$j" ]
            then
                result="ok"
            else
                result="bad"
                break
            fi
        done
        if [ "$result" == "bad" ]
        then
            break
        fi
    done
    echo "$result"
else
    echo "ok"
fi

However, there's also a problem with your overall logic. If you have two rman_dd.ksh processes and they each have an rman child, you'll report bad when you compare one parent with the other. A simpler way is to just sort the two PID lists and compare them.

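# Parent PIDs of the rman processes, one per line, with ps padding removed
PPIDs=$(ps -oppid= $(pgrep -x rman) | tr -d ' ' | sort)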
kshddPIDs=$(pgrep -f 'ksh.*rman_dd.ksh' | sort)
if [ "$PPIDs" = "$kshddPIDs" ]
then echo "ok"
else echo "bad"
fi
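
The comparison can be tried without any rman processes running. In this sketch the two lists are hard-coded stand-ins for the command output, and same_pids is an illustrative helper name. Sorting both lists first means the order in which the PIDs were reported does not matter.

```shell
#!/bin/sh
# Sketch only: same_pids is a hypothetical helper.
# Succeeds when both newline-separated lists hold the same PIDs,
# regardless of the order in which they are listed.
same_pids() {
    a=$(printf '%s\n' "$1" | sort)
    b=$(printf '%s\n' "$2" | sort)
    [ "$a" = "$b" ]
}

# Made-up PIDs standing in for the rman parents and the rman_dd.ksh scripts.
parents='4021
3977'
scripts='3977
4021'

if same_pids "$parents" "$scripts"; then echo "ok"; else echo "bad"; fi
```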

