Parsing Shell Script Arguments

How do I parse command line arguments in Bash?


Bash Space-Separated (e.g., --option argument)

cat >/tmp/demo-space-separated.sh <<'EOF'
#!/bin/bash

POSITIONAL_ARGS=()

while [[ $# -gt 0 ]]; do
  case $1 in
    -e|--extension)
      EXTENSION="$2"
      shift # past argument
      shift # past value
      ;;
    -s|--searchpath)
      SEARCHPATH="$2"
      shift # past argument
      shift # past value
      ;;
    --default)
      DEFAULT=YES
      shift # past argument
      ;;
    -*|--*)
      echo "Unknown option $1"
      exit 1
      ;;
    *)
      POSITIONAL_ARGS+=("$1") # save positional arg
      shift # past argument
      ;;
  esac
done

set -- "${POSITIONAL_ARGS[@]}" # restore positional parameters

echo "FILE EXTENSION = ${EXTENSION}"
echo "SEARCH PATH = ${SEARCHPATH}"
echo "DEFAULT = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)

if [[ -n $1 ]]; then
  echo "Last line of file specified as non-opt/last argument:"
  tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-space-separated.sh

/tmp/demo-space-separated.sh -e conf -s /etc /etc/hosts
Output from copy-pasting the block above
FILE EXTENSION = conf
SEARCH PATH = /etc
DEFAULT =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34 example.com
Usage
demo-space-separated.sh -e conf -s /etc /etc/hosts


Bash Equals-Separated (e.g., --option=argument)

cat >/tmp/demo-equals-separated.sh <<'EOF'
#!/bin/bash

for i in "$@"; do
  case $i in
    -e=*|--extension=*)
      EXTENSION="${i#*=}"
      shift # past argument=value
      ;;
    -s=*|--searchpath=*)
      SEARCHPATH="${i#*=}"
      shift # past argument=value
      ;;
    --default)
      DEFAULT=YES
      shift # past argument with no value
      ;;
    -*|--*)
      echo "Unknown option $i"
      exit 1
      ;;
    *)
      ;;
  esac
done

echo "FILE EXTENSION = ${EXTENSION}"
echo "SEARCH PATH = ${SEARCHPATH}"
echo "DEFAULT = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)

if [[ -n $1 ]]; then
  echo "Last line of file specified as non-opt/last argument:"
  tail -1 "$1" # quoted so that paths containing spaces survive
fi
EOF

chmod +x /tmp/demo-equals-separated.sh

/tmp/demo-equals-separated.sh -e=conf -s=/etc /etc/hosts
Output from copy-pasting the block above
FILE EXTENSION = conf
SEARCH PATH = /etc
DEFAULT =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34 example.com
Usage
demo-equals-separated.sh -e=conf -s=/etc /etc/hosts

To better understand ${i#*=}, search for "Substring Removal" in this guide. It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"`, which calls a needless subprocess, or `echo "$i" | sed 's/[^=]*=//'`, which calls two needless subprocesses.
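As a minimal sketch of what that expansion does, the snippet below splits a made-up argument into its name and value using only parameter expansion. The variable names arg, name, and value are illustrative and do not appear in the scripts above.

```shell
#!/bin/bash
# Hypothetical argument, used only for illustration.
arg="--extension=tar=gz"

# '#' removes the shortest prefix matching the pattern '*=',
# i.e. everything up to and including the first '='.
value="${arg#*=}"

# '%%' removes the longest suffix matching '=*',
# i.e. everything from the first '=' onward.
name="${arg%%=*}"

echo "name=$name value=$value"
```

This prints name=--extension value=tar=gz, so a value that itself contains an equals sign is kept intact.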



Using bash with getopt[s]

getopt(1) limitations (older versions, including some relatively recent ones):

  • can't handle arguments that are empty strings
  • can't handle arguments with embedded whitespace

More recent getopt versions don't have these limitations. For more information, see these docs.
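One way to tell the two generations apart, assuming the enhanced version is the util-linux implementation: its man page documents that getopt --test prints nothing and exits with status 4. The function name has_enhanced_getopt is made up for this sketch.

```shell
#!/bin/bash
# Returns 0 if the getopt found on PATH looks like the enhanced
# (util-linux) version, and 1 otherwise.
has_enhanced_getopt() {
    # Enhanced getopt exits with status 4 for --test and prints nothing.
    # Any other status (including "command not found") is treated as legacy.
    getopt --test >/dev/null 2>&1
    [ "$?" -eq 4 ]
}

if has_enhanced_getopt; then
    echo "enhanced getopt available"
else
    echo "legacy or missing getopt; prefer getopts or a manual loop"
fi
```

A script that has to run on unknown systems could use a check like this to decide whether to fall back to getopts.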



POSIX getopts

Additionally, the POSIX shell and others offer getopts, which doesn't have these limitations. I've included a simplistic getopts example.

cat >/tmp/demo-getopts.sh <<'EOF'
#!/bin/sh

# A POSIX variable
OPTIND=1 # Reset in case getopts has been used previously in the shell.

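# Initialize our own variables:
output_file=""
verbose=0

# Minimal help text, so that -h has something to call.
show_help() {
  echo "Usage: $0 [-h] [-v] [-f output_file] [args...]"
}

while getopts "h?vf:" opt; do
  case "$opt" in
    h|\?)
      show_help
      exit 0
      ;;
    v) verbose=1
      ;;
    f) output_file=$OPTARG
      ;;
  esac
done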

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"
EOF

chmod +x /tmp/demo-getopts.sh

/tmp/demo-getopts.sh -vf /etc/hosts foo bar
Output from copy-pasting the block above
verbose=1, output_file='/etc/hosts', Leftovers: foo bar
Usage
demo-getopts.sh -vf /etc/hosts foo bar

The advantages of getopts are:

  1. It's more portable, and will work in other shells like dash.
  2. It can handle multiple single options like -vf filename in the typical Unix way, automatically.

The disadvantage of getopts is that it can only handle short options (-h, not --help) without additional code.
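The "additional code" can be fairly small. The sketch below uses a common trick: declaring - as an option that takes an argument, so that --name or --name=value arrives in OPTARG. This relies on behaviour that bash and dash provide but that POSIX does not strictly promise, and the function name parse_opts and the long option names are assumptions made for illustration.

```shell
#!/bin/sh
# parse_opts is a hypothetical helper; it sets verbose, output_file and leftovers.
parse_opts() {
    verbose=0
    output_file=""
    OPTIND=1
    while getopts "vf:-:" opt; do
        case "$opt" in
            -)  # Long option: OPTARG holds "name" or "name=value".
                case "$OPTARG" in
                    verbose) verbose=1 ;;
                    file=*)  output_file="${OPTARG#*=}" ;;
                    *) echo "Unknown option --$OPTARG" >&2; return 1 ;;
                esac
                ;;
            v) verbose=1 ;;
            f) output_file="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    leftovers="$*"
}

parse_opts --verbose --file=out.txt foo bar
echo "verbose=$verbose output_file=$output_file leftovers=$leftovers"
```

Note that this form only supports --file=value; a space-separated --file value would need yet more code.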

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts, which might be informative.

Parsing command line arguments in a shell script function

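A function can take arguments of its own, so inside a function $@ refers to the function's arguments, not to the arguments the script received on the command line.

ALL_ARGS_PASSED="$@" is wrong. "$@" is already a properly quoted list of arguments; assigning it to a plain variable joins the list into a single string, so only the first key is parsed and everything else ends up in its value.

"aa=1 bb=2 cc=3"
If you parse this, aa is the key and the value is "1 bb=2 cc=3".

Dropping the quotes (ALL_ARGS_PASSED=$@) does not help, because an assignment to a plain variable always produces one string. The solution is to keep the arguments as a list, either by passing "$@" straight to the parser or, in bash, by storing it in an array:

ALL_ARGS_PASSED=("$@")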
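A minimal bash sketch of keeping a function's arguments separate by copying "$@" into an array, compared with joining them into one string. The function name parse_pairs and the sample key=value arguments are hypothetical.

```shell
#!/bin/bash
# parse_pairs is a hypothetical function name used for illustration.
parse_pairs() {
    local args=("$@")   # one array element per original argument
    local joined="$*"   # a single string, arguments joined by spaces
    local pair
    for pair in "${args[@]}"; do
        echo "key=${pair%%=*} value=${pair#*=}"
    done
    echo "array holds ${#args[@]} items; joined string is '$joined'"
}

parse_pairs aa=1 bb=2 "cc=3 4"
```

The loop prints three key/value lines, and the third value is "3 4": the embedded space survives because the array never flattens the arguments.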

Parsing shell script arguments

There are lots of ways to parse arguments in sh. Getopt is good. Here's a simple script that parses things by hand:

#!/bin/sh
# WARNING: see discussion and caveats below
# this is extremely fragile and insecure

while echo $1 | grep -q ^-; do
    # Evaluating a user entered string!
    # Red flags!!! Don't do this
    eval $( echo $1 | sed 's/^-//' )=$2
    shift
    shift
done

echo host = $host
echo user = $user
echo pass = $pass
echo args = $@

A sample run looks like:

$ ./a.sh -host foo -user me -pass secret some args
host = foo
user = me
pass = secret
args = some args

Note that this is not even remotely robust and is wide open to security holes, since the script evals a string constructed by the user. It is merely meant to serve as an example of one possible way to do things.

A simpler method is to require the user to pass the data in the environment. In a Bourne-style shell (i.e., anything that is not in the csh family):

$ host=blah user=blah pass=blah myscript.sh

works nicely, and the variables $host, $user, $pass will be available in the script.

#!/bin/sh
echo host = ${host:?host empty or unset}
echo user = ${user?user not set}
...
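If the -name value command-line style is still wanted, the eval can be avoided by listing the accepted names explicitly. This is only a sketch under that assumption: the helper name parse_known is hypothetical, and it accepts just the three option names from the sample run above.

```shell
#!/bin/sh
# parse_known is a hypothetical helper. It never calls eval, so user input
# is only ever stored as data. It sets host, user, pass and rest.
parse_known() {
    host= user= pass=
    while [ "$#" -gt 0 ]; do
        case $1 in
            -host|-user|-pass)
                # Refuse an option that has no value after it.
                if [ "$#" -lt 2 ]; then
                    echo "missing value for $1" >&2
                    return 1
                fi
                case $1 in
                    -host) host=$2 ;;
                    -user) user=$2 ;;
                    -pass) pass=$2 ;;
                esac
                shift 2
                ;;
            -*) echo "unknown option: $1" >&2; return 1 ;;
            *)  break ;;
        esac
    done
    rest=$*
}

parse_known -host foo -user me -pass secret some args
echo "host = $host"
echo "user = $user"
echo "pass = $pass"
echo "args = $rest"
```

Unknown option names are rejected instead of being turned into variable assignments, which is the main difference from the eval version.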

Shell script argument parsing

You want to use getopt with long and short options. An example from working code:

# Parse arguments
TEMP=$(getopt -n $PROGRAM_NAME -o p:P:cCkhnvVS \
--long domain-password:,pop3-password:\
,create,cron,kill,help,no-sync-passwords,version,verbose,skip-pop3 \
-- "$@")

# Die if they fat finger arguments, this program will be run as root
[ $? = 0 ] || die "Error parsing arguments. Try $PROGRAM_NAME --help"

eval set -- "$TEMP"
while true; do
    case $1 in
        -c|--create)
            MODE="CREATE"; shift; continue
            ;;
        -C|--cron)
            MODE="CRON"; shift; continue
            ;;
        -k|--kill)
            MODE="KILL"; shift; continue
            ;;
        -h|--help)
            usage
            exit 0
            ;;
        -n|--no-sync-passwords)
            SYNC_VHOST=0; shift; continue
            ;;
        -p|--domain-password)
            DOMAIN_PASS="$2"; shift; shift; continue
            ;;
        -P|--pop3-password)
            POP3_PASS="$2"; shift; shift; continue
            ;;
        -v|--version)
            printf "%s, version %s\n" "$PROGRAM_NAME" "$PROGRAM_VERSION"
            exit 0
            ;;
        -V|--verbose)
            VERBOSE=1; shift; continue
            ;;
        -S|--skip-pop3)
            SKIP_POP=1; shift; continue
            ;;
        --)
            # no more arguments to parse
            break
            ;;
        *)
            printf "Unknown option %s\n" "$1"
            exit 1
            ;;
    esac
done

Note, die is a function that was defined previously (not shown).

The -n option tells getopt to report errors as the name of my program, not as getopt. -o defines a list of short options (: after an option indicates a needed argument) and --long specifies the list of long options (corresponding in order to the short options).

The rest is just a simple switch, calling shift appropriately to advance the argument pointer. Note that calling shift; shift; is just a die-hard habit; in any modern shell, shift 2 would suffice.

The modern getopt is pretty consistent across newer platforms; however, you may encounter some portability problems on older (circa pre-Red Hat 9) systems. See man getopt for information about backwards compatibility, although it's unlikely that you'll need it.

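Finally, when the loop breaks, $1 is still the -- marker that getopt places after the options. Drop it with:

shift

The positional parameters now hold anything else left on the command line after getopt was done parsing options, and you can just shift to keep reading them. (Running eval set -- "$@" a second time at this point is not needed, and it would re-evaluate the remaining arguments.) For instance, if a command looked like this:

./foo --option bar file1.txt file2.txt file3.txt

then $1, $2 and $3 would be file1.txt, file2.txt and file3.txt.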

Don't forget to make a handy -h / --help option to print your new fancy options once you're done. :) If you make that output help2man friendly, you have an instant man page to go with your new tool.

Edit

On most distributions, you can find more example getopt code in /usr/share/doc/util-linux/examples, which should have been installed by default.

Argument parsing in bash


VMMOUNT=""
BOOTSTRAP=""
IMAGE_FILE=""
TARGET_EXE=""
INTERNAL_EXE=""
while : ; do
    case "$1" in
        --vmmount)
            [ -n "${VMMOUNT}" ] && usage
            VMMOUNT="$2"
            shift 2 ;;
        --bootstrap)
            [ -n "${BOOTSTRAP}" ] && usage
            BOOTSTRAP="$2"
            shift 2 ;;
        --image)
            [ -n "${IMAGE_FILE}" ] && usage
            IMAGE_FILE="$2"
            shift 2 ;;
        --target-exe)
            [ -n "${TARGET_EXE}" ] && usage
            TARGET_EXE="$2"
            shift 2 ;;
        --internal-exe)
            [ -n "${INTERNAL_EXE}" ] && usage
            INTERNAL_EXE="true"
            shift ;;
        *)
            break ;;
    esac
done
my_method "${IMAGE_FILE}" "${VMMOUNT}" "${BOOTSTRAP}" "${TARGET_EXE}" "${INTERNAL_EXE}" "$@"

Don't forget to enclose $@ in double quotes.
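A small sketch of why the quotes matter, using hypothetical helper functions that only count the arguments they receive:

```shell
#!/bin/sh
# count_args prints how many arguments it was given.
count_args() { echo "$#"; }

# Forwards the arguments unchanged.
forward_quoted() { count_args "$@"; }

# Unquoted on purpose to show the problem: each argument is split
# again at whitespace (and would also be subject to glob expansion).
forward_unquoted() { count_args $@; }

forward_quoted   "my file.txt" second   # prints 2
forward_unquoted "my file.txt" second   # prints 3
```

With the quotes, "my file.txt" stays a single argument; without them it is forwarded as two.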

Using Python to parse complex arguments to shell script

Edit: I haven't used it (yet), but if I were posting this answer today I would probably recommend https://github.com/docopt/docopts instead of a custom approach like the one described below.


I've put together a short Python script that does most of what I want. I'm not convinced it's production quality yet (notably error handling is lacking), but it's better than nothing. I'd welcome any feedback.

It takes advantage of the set builtin to re-assign the positional arguments, allowing the remainder of the script to still handle them as desired.

bashparse.py

#!/usr/bin/env python

import optparse, sys
try:
    from shlex import quote  # Python 3.3+
except ImportError:
    from pipes import quote  # Python 2; the pipes module was removed in Python 3.13

'''
Uses Python's optparse library to simplify command argument parsing.

Takes in a usage string as argv[1], a set of optparse arguments, separated by newlines, as argv[2], followed by command line arguments as argv[3:],
and outputs a series of bash commands to populate associated variables.
'''

class _ThrowParser(optparse.OptionParser):
    def error(self, msg):
        """Overrides optparse's default error handling
        and instead raises an exception which will be caught upstream
        """
        raise optparse.OptParseError(msg)

def gen_parser(usage, opts_ls):
    '''Takes a list of strings which can be used as the parameters to optparse's add_option function.
    Returns a parser object able to parse those options
    '''
    parser = _ThrowParser(usage=usage)
    for opts in opts_ls:
        if opts:
            # yes, I know it's evil, but it's easy
            eval('parser.add_option(%s)' % opts)
    return parser

def print_bash(opts, args):
    '''Takes the result of optparse and outputs commands to update a shell'''
    for opt, val in opts.items():
        if val:
            # str() so that non-string values such as True (store_true) can be quoted
            print('%s=%s' % (opt, quote(str(val))))
    print("set -- %s" % " ".join(quote(a) for a in args))

if __name__ == "__main__":
    # argv[1] is the usage string and argv[2] the option list, so both are required
    if len(sys.argv) < 3:
        sys.stderr.write("Needs at least a usage string and a set of options to parse\n")
        sys.exit(2)
    parser = gen_parser(sys.argv[1], sys.argv[2].split('\n'))

    (opts, args) = parser.parse_args(sys.argv[3:])
    print_bash(opts.__dict__, args)

Example usage:

#!/bin/bash

usage="[-f FILENAME] [-t|--truncate] [ARGS...]"
opts='
"-f"
"-t", "--truncate",action="store_true"
'

echo "$(./bashparse.py "$usage" "$opts" "$@")"
eval "$(./bashparse.py "$usage" "$opts" "$@")"

echo
echo OUTPUT

echo $f
echo $@
echo $0 $2

Which, if run as: ./run.sh one -f 'a_filename.txt' "two' still two" three outputs the following (notice that the internal positional variables are still correct):

f=a_filename.txt
set -- one 'two'"'"' still two' three

OUTPUT
a_filename.txt
one two' still two three
./run.sh two' still two

Disregarding the debugging output, you're looking at approximately four lines to construct a powerful argument parser. Thoughts?

/bin/sh: parsing command line arguments results in: shift: can't shift that many

The for loop keeps its own private copy of the positional parameter list that you can't alter using shift or set (see Modifying positional parameters while iterating over them in POSIX sh).

Use a while loop instead.

parse_args()
while test $# -gt 0; do
    case $1 in
        (-P)
            p=$2
            shift ;;
        (*)
            f=$1
    esac
    shift
done

p= f=
parse_args "$@"
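The difference between the two loop forms can be seen in a short sketch. Both functions below are hypothetical demos rather than part of the answer's parser: the first shifts inside a for loop, the second consumes arguments with while.

```shell
#!/bin/sh
# for: the list to iterate over is fixed when the loop starts, so
# shift changes "$@" and $# but not what the loop visits.
demo_for() {
    seen=""
    for arg in "$@"; do
        seen="$seen $arg"
        shift
    done
    echo "for saw:$seen; left: $#"
}

# while: each pass looks at the current $1, so an extra shift really
# does skip the option's value.
demo_while() {
    seen=""
    while [ "$#" -gt 0 ]; do
        seen="$seen $1"
        if [ "$1" = "-P" ] && [ "$#" -ge 2 ]; then
            shift   # skip the value that belongs to -P
        fi
        shift
    done
    echo "while saw:$seen; left: $#"
}

demo_for   -P x file   # prints: for saw: -P x file; left: 0
demo_while -P x file   # prints: while saw: -P file; left: 0
```

In the for version the value x is still visited as if it were an argument in its own right, and any extra shift for it would be one shift too many by the end of the loop.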

How does Bash parse multi-flag commands?

The shell does not attempt to parse command arguments; that's the responsibility of the utility. The range of possible command argument syntaxes, both in use and potentially useful, is far too great to attempt that.

On Unix-like systems, the shell identifies individual arguments from the command line, mostly by splitting at whitespace but also taking into account the use of quotes and a variety of other transformations, such as "glob expansion". It then makes a vector of these arguments ("argv") and passes the vector to execve, which hands them to the newly created process.
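To see the argument vector a command actually receives, a tiny sketch helps. The helper name show_argv is made up, and a shell function stands in for an external program here, since it receives its arguments after the same splitting and quote removal.

```shell
#!/bin/sh
# show_argv is a hypothetical helper that prints each argument it receives.
show_argv() {
    i=0
    for a in "$@"; do
        i=$((i + 1))
        printf 'argv[%d]=<%s>\n' "$i" "$a"
    done
}

# Quotes and backslashes are consumed by the shell; the command sees only
# the resulting words. Single quotes also suppress variable expansion.
show_argv -la "two words" plain\ escape '$HOME'
```

This prints four entries: -la, two words, plain escape, and the literal text $HOME. The shell never decides that -la is a flag; that interpretation is left to the command.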

On Windows systems, the shell doesn't even do that. It just hands over the command-line as a string, and leaves it to the command-line tool to do everything. (In order to provide a modicum of compatibility, there's an intermediate layer which is called by the application initialization code, which eventually calls main(). This does some basic argument-splitting, although its quoting algorithm is quite a bit simplified from that used by a Unix shell.)

No command-line shell that I know of attempts to identify command-line flags. And neither should you.

For a bit of extracurricular reading, here's the description of shell parsing from the POSIX standard: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html. Trying to implement all that goes far beyond the requirements given to you for this assignment, and I'm certainly not recommending that you do that. But it might still be interesting, and understanding it will help you immensely if you start using a shell.

Alternatively, you could try reading the Bash manual, which might be easier to understand. Note that Bash implements a lot of extensions to the POSIX standard.


