pkill returns 255 in combination with another command via remote ssh

The documentation for the pkill -f option says:

-f

The pattern is normally only matched against the process name. When -f is set, the full command line is used.

So pkill -f xyz will kill any process with "xyz" anywhere on its command line.

When you run ssh <remoteHost> 'source /etc/profile; pkill -f xyz', the remote ssh server will run the equivalent of this on your behalf:

$SHELL -c 'source /etc/profile; pkill -f xyz'

The resulting shell instance is a process with "xyz" in its command line. My guess is that pkill is killing it, and ssh is reporting the killed session as exit code 255, like this:

$ ssh localhost 'kill $$'
$ echo $?
255

It doesn't happen when you just run ssh <remoteHost> 'pkill -f xyz', because some shells like bash will optimize for this case. Instead of running pkill as a subprocess, the shell instance will replace itself with the pkill process. So by the time pkill runs, the shell process with "xyz" on its command line is gone.

You can probably work around this by running pkill like this:

ssh <remoteHost> 'source /etc/profile; exec pkill -f xyz'

If that doesn't work, you can write the pkill pattern so that it doesn't match its own text. For example:

ssh <remoteHost> 'source /etc/profile; exec pkill -f "[x]yz"'

The pattern [x]yz matches the text "xyz", so pkill will kill processes where the text "xyz" appears. But the pattern doesn't match itself, so pkill won't kill processes where the pattern appears.
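
You can check the self-non-matching property with plain grep (reusing the placeholder pattern "xyz"):

$ echo 'pkill -f xyz' | grep -c 'xyz'
1
$ echo 'pkill -f [x]yz' | grep -c '[x]yz'
0

The first pattern matches its own command-line text; the bracketed pattern still matches "xyz" but no longer matches itself.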

Why does running pkill -f anything over ssh fail only when branching on its result?

The problem appears to be that pkill is killing itself. Or rather, it is killing the shell that owns it.

First of all, it appears that ssh uses the remote user's shell to execute certain "complicated" commands:

$ ssh user@remote 'ps -F --pid $$'
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
user 9531 9526 0 11862 1616 6 14:36 ? 00:00:00 ps -F --pid 9531

$ ssh user@remote 'ps -F --pid $$ && echo hi'
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
user 9581 9577 0 28316 1588 5 14:36 ? 00:00:00 bash -c ps -F --pid $$ && echo hi
hi

Second, it appears that pkill -f normally knows not to kill itself (otherwise every pkill -f command would kill itself). But when it is run by a wrapper shell whose command line also contains the pattern, that protection doesn't cover the wrapper:

$ pkill -f fake_process; echo $?
1

$ sh -c 'pkill -f fake_process'; echo $?
[1] 14031 terminated sh -c 'pkill -f fake_process'
143

In my case, to fix this I just re-worked some of the code around my ssh/pkill so that I could avoid having a "complicated" remote command. Theoretically I think you could also do something like pgrep -f <cmd> | grep -v $$ | xargs kill.
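
A sketch of that last idea, with <cmd> kept as a placeholder for the real pattern. Matching the shell's PID as a whole line (grep -x) avoids dropping unrelated PIDs that merely contain it as a substring, and xargs -r (GNU xargs) skips kill entirely when nothing is left:

pgrep -f <cmd> | grep -vx "$$" | xargs -r kill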

Timeout a command in bash without unnecessary delay

I think this is precisely what you are asking for:

http://www.bashcookbook.com/bashinfo/source/bash-4.0/examples/scripts/timeout3

#!/bin/bash
#
# The Bash shell script executes a command with a time-out.
# Upon time-out expiration SIGTERM (15) is sent to the process. If the signal
# is blocked, then the subsequent SIGKILL (9) terminates it.
#
# Based on the Bash documentation example.

# Hello Chet,
# please find attached a "little easier" :-) to comprehend
# time-out example. If you find it suitable, feel free to include
# anywhere: the very same logic as in the original examples/scripts, a
# little more transparent implementation to my taste.
#
# Dmitry V Golovashkin <Dmitry.Golovashkin@sas.com>

scriptName="${0##*/}"

declare -i DEFAULT_TIMEOUT=9
declare -i DEFAULT_INTERVAL=1
declare -i DEFAULT_DELAY=1

# Timeout.
declare -i timeout=DEFAULT_TIMEOUT
# Interval between checks if the process is still alive.
declare -i interval=DEFAULT_INTERVAL
# Delay between posting the SIGTERM signal and destroying the process by SIGKILL.
declare -i delay=DEFAULT_DELAY

function printUsage() {
    cat <<EOF

Synopsis
    $scriptName [-t timeout] [-i interval] [-d delay] command
    Execute a command with a time-out.
    Upon time-out expiration SIGTERM (15) is sent to the process. If SIGTERM
    signal is blocked, then the subsequent SIGKILL (9) terminates it.

    -t timeout
        Number of seconds to wait for command completion.
        Default value: $DEFAULT_TIMEOUT seconds.

    -i interval
        Interval between checks if the process is still alive.
        Positive integer, default value: $DEFAULT_INTERVAL seconds.

    -d delay
        Delay between posting the SIGTERM signal and destroying the
        process by SIGKILL. Default value: $DEFAULT_DELAY seconds.

As of today, Bash does not support floating point arithmetic (sleep does),
therefore all delay/time values must be integers.
EOF
}

# Options.
while getopts ":t:i:d:" option; do
case "$option" in
t) timeout=$OPTARG ;;
i) interval=$OPTARG ;;
d) delay=$OPTARG ;;
*) printUsage; exit 1 ;;
esac
done
shift $((OPTIND - 1))

# $# should be at least 1 (the command to execute), however it may be strictly
# greater than 1 if the command itself has options.
if (($# == 0 || interval <= 0)); then
    printUsage
    exit 1
fi

# kill -0 pid    Exit code indicates if a signal may be sent to $pid process.
(
    ((t = timeout))

    while ((t > 0)); do
        sleep $interval
        kill -0 $$ || exit 0
        ((t -= interval))
    done

    # Be nice, post SIGTERM first.
    # The 'exit 0' below will be executed if any preceding command fails.
    kill -s SIGTERM $$ && kill -0 $$ || exit 0
    sleep $delay
    kill -s SIGKILL $$
) 2> /dev/null &

exec "$@"

@(at) sign in file path/string

It has nothing to do with file paths. It changes the escaping behavior of strings.

In a string literal prefixed with @ the escape sequences starting with \ are disabled. This is convenient for filepaths since \ is the path separator and you don't want it to start an escape sequence.

In a normal string you would have to escape \ into \\, so your example would look like "pdf\\". But since the string is prefixed with @, the only character that needs escaping is " (written as ""), and \ can simply appear.

This feature is convenient for string literals containing \, such as file paths or regexes.

For your simple example the gain isn't that big, but imagine you have a full path: @"C:\ABC\CDE\DEF" looks a lot nicer than "C:\\ABC\\CDE\\DEF".

For regular expressions it's almost a must. A regex typically already contains several \ escaping other characters, and it often becomes almost unreadable if you have to escape those backslashes as well.

Running ssh command on server from Jenkins

Finally, I found the solution. One problem was in restart.sh: the log file has to be specified explicitly on the command line, so nohup is dropped and the command becomes:

java -Xms8g -Xmx8g -jar app-name-1.0-allinone.jar </dev/null >>logfile.log 2>&1 &

Another problem was with killing the previous jar process. Be very careful: because the project name is part of the path used in the Jenkins script, the remote shell session carrying that command line also matches the pattern and gets killed by accident when you try to stop your application:

def statusCode = sh returnStatus: true, script: 'sshpass -p "password" ssh -o "StrictHostKeyChecking=no" username@server "cd ../../to/app/path/app-folder; sh redeploy.sh;"'

if (statusCode != 0) {
    currentBuild.result = 'FAILURE'
    echo "FAILURE"
}

stop.sh

if pgrep -u username -f app-name
then
    pkill -u username -f app-name
fi
# (app-name is a string: some words from the command line used to start the application)

Because app-folder in the Jenkins script and app-name in stop.sh are equal (or app-folder even contains the app-name value), when you try to kill the app-name process you accidentally kill the ssh session as well. Jenkins then gets a 255 status code, although the redeploy.sh script on the server finishes successfully because it runs independently.

The solution is simple, but hard to discover: make sure the pattern you search for matches the process id of your application and nothing else.

Finally, stop.sh should look like this:

if pgrep -u username -f my-app-v1.0.jar
then
    pkill -u username -f my-app-v1.0.jar
fi
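
Before wiring this into Jenkins it is worth checking, on the server, exactly what your pattern matches; pgrep -a (procps-ng) prints the PID together with the full command line of every match. The jar name is just the example from above:

pgrep -u username -af my-app-v1.0.jar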

How to add a progress bar to a shell script?

You can implement this by overwriting a line. Use \r to go back to the beginning of the line without writing \n to the terminal.

Write \n when you're done to advance the line.

Use echo -ne to:

  1. suppress the trailing \n (-n), and
  2. interpret escape sequences like \r (-e).

Here's a demo:

echo -ne '#####                     (33%)\r'
sleep 1
echo -ne '#############             (66%)\r'
sleep 1
echo -ne '#######################   (100%)\r'
echo -ne '\n'

In a comment, puk mentions that this "fails" if you start with a long line and then want to write a shorter one: in that case you need to overwrite the remainder of the long line (e.g., with trailing spaces).
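
Here is the same idea in a loop, as a minimal sketch (the sleep only simulates work, and the 20-step bar width is arbitrary); padding the line to a fixed width makes sure a shorter update fully overwrites a longer one:

total=20
for ((i = 1; i <= total; i++)); do
    sleep 0.2                                  # stand-in for real work
    percent=$((100 * i / total))
    bar=$(printf '#%.0s' $(seq 1 "$i"))        # i hash marks
    printf '\r%-25s (%3d%%)' "$bar" "$percent" # pad to a fixed width
done
printf '\n'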


