Parsing Command-Line Arguments as Wildcards

You didn't mention which operating system you are using (it matters).

On Windows, whatever you type on the command line gets passed to the program without modification. So if you type

./script.rb -a /path/*

then the arguments to the program contain "-a" and "/path/*".

On Unix and other systems with similar shells, the shell performs argument expansion, automatically expanding wildcards on the command line. So when you type the same command, the shell finds the files matching /path/* and substitutes their names into the argument list before your program runs. The arguments to your program might then be "-a", "/path/file1", and "/path/file2".

An important point is that the script cannot find out whether argument expansion happened, or whether the user actually typed all those filenames out on the command line.
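To make the point concrete, here is a minimal Python sketch (Python is used for illustration only, and the helper name args_seen_by_child is made up). It starts a child process directly, with no shell in between, so the pattern reaches the child unexpanded; the child just sees whatever argument list it was handed.

```python
import subprocess
import sys

# Child process that only reports the arguments it received.
CHILD = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

def args_seen_by_child(*args):
    """Run the child without a shell and return what it printed."""
    result = subprocess.run(
        CHILD + list(args), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

print(args_seen_by_child("-a", "/path/*"))  # prints: ['-a', '/path/*']
```

Had the same arguments gone through a Unix shell first, the child would have received the matching filenames instead, with no way to tell the difference.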

Passing arguments with wildcards to a Python script

You can use the glob module; that way you won't depend on the behavior of a particular shell (well, you still depend on the shell not expanding the arguments, but at least you can get this to happen in Unix by escaping the wildcards :-) ).

from glob import glob
filelist = glob('*.csv')  # or pass an element of sys.argv, e.g. glob(sys.argv[1]) (needs "import sys")
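A slightly fuller sketch of the same idea, assuming you want every command-line argument treated as a possible pattern. The function name expand_args is hypothetical, and keeping unmatched arguments as typed is a design choice that mirrors what bash does by default.

```python
import glob
import sys

def expand_args(args):
    """Expand any wildcard patterns in args; keep non-matching arguments as typed."""
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # Nothing matched: keep the argument literally, which is what
        # bash does by default with an unmatched pattern.
        expanded.extend(matches or [arg])
    return expanded

if __name__ == "__main__":
    for name in expand_args(sys.argv[1:]):
        print(name)
```

Arguments the shell has already expanded pass through unchanged, so the same script behaves the same way on Windows and Unix. One caveat: a real filename containing characters such as [ would be read as a pattern.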

Using a Wildcard Match as a Command Line Argument in Bash

Does this do what you need?

dest='/desktop/'
for ARG in "$@"; do
  /some/other/script "$ARG" "$dest$ARG.new"
done

EDIT: To remove the path on ARG

dest='/desktop/'
for ARG in "$@"; do
  /some/other/script "$ARG" "$dest$(basename "$ARG").new"
done
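For comparison, the path handling in the second loop can be sketched in Python. The function name destination_for is made up for illustration, and /desktop/ is simply the value used above.

```python
import os.path

def destination_for(src, dest_dir="/desktop/"):
    """Return dest_dir + basename(src) + '.new', like the second loop above."""
    return os.path.join(dest_dir, os.path.basename(src) + ".new")

# Each argument is mapped to a destination without its original directory.
for arg in ["/some/path/report.txt", "notes.md"]:
    print(arg, "->", destination_for(arg))
```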

Stop expanding wildcard symbols in command line arguments to Java

Background: On Linux, it is not Java that is expanding the wildcards in the command arguments. The shell does it before the java command is launched.

The way to stop the shell from expanding wildcards is to quote the arguments. How you do this depends on the shell you are using.
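As an illustration of quoting for POSIX-style shells, Python's shlex module can produce a command line in which the wildcard is protected. This is only a sketch: MyProgram is a placeholder class name, shlex.join needs Python 3.8 or later, and the quoting rules it applies are those of POSIX shells, not cmd.exe.

```python
import shlex

command = ["java", "MyProgram", "*.txt"]  # MyProgram is a placeholder class name

# shlex.join quotes each element that a POSIX shell would otherwise interpret.
print(shlex.join(command))  # prints: java MyProgram '*.txt'
```

Typed at a bash prompt, the printed line passes the literal string *.txt to the program.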


Now for the Windows case ... which is what you are really asking about.

From what I have read, the standard "cmd.exe" shell (in its various versions / flavours) does NOT do wildcard expansion. It is left to the application to do expansion (or not) on an ad-hoc basis.

Obviously this is problematic for the Java "write once, run everywhere" philosophy, so the Java designers have tried to make wild-cards in command line arguments work on Windows like they do on Unix and Linux. But unfortunately, they can't do a perfect job of this ... hence this anomaly.

However, according to this page, putting double quotes around an argument tells Java to not do wild-card expansion.

But if this doesn't help, you are probably out of luck.


Here are some links to Oracle documentation on this topic, taken from Oracle Java bug report #5036373:

Java Wildcard expansion on Windows platform has been documented. See
the following links:

http://docs.oracle.com/javase/7/docs/technotes/tools/windows/java.html

http://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.html

Wildcard expansion does not work in a Windows command shell for a
single element classpath due to the Microsoft bug described in:
http://connect.microsoft.com/VisualStudio/feedback/details/98756/vs2005-setargv-obj-wildcard-handling-broken.

The limitations are also mentioned in 7u10 release notes:
http://www.oracle.com/technetwork/java/javase/7u10-relnotes-1880995.html

However, I think that the Oracle employee who wrote that was being deliberately obtuse, because the wildcard expansion in general is patently NOT documented in those "manual" pages. They only talk about wildcard expansion in the -cp argument.

How do I parse command line arguments in Bash?

Bash Space-Separated (e.g., --option argument)

cat >/tmp/demo-space-separated.sh <<'EOF'
#!/bin/bash

POSITIONAL_ARGS=()

while [[ $# -gt 0 ]]; do
  case $1 in
    -e|--extension)
      EXTENSION="$2"
      shift # past argument
      shift # past value
      ;;
    -s|--searchpath)
      SEARCHPATH="$2"
      shift # past argument
      shift # past value
      ;;
    --default)
      DEFAULT=YES
      shift # past argument
      ;;
    -*|--*)
      echo "Unknown option $1"
      exit 1
      ;;
    *)
      POSITIONAL_ARGS+=("$1") # save positional arg
      shift # past argument
      ;;
  esac
done

set -- "${POSITIONAL_ARGS[@]}" # restore positional parameters

echo "FILE EXTENSION = ${EXTENSION}"
echo "SEARCH PATH = ${SEARCHPATH}"
echo "DEFAULT = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)

if [[ -n $1 ]]; then
  echo "Last line of file specified as non-opt/last argument:"
  tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-space-separated.sh

/tmp/demo-space-separated.sh -e conf -s /etc /etc/hosts
Output from copy-pasting the block above
FILE EXTENSION = conf
SEARCH PATH = /etc
DEFAULT =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34 example.com
Usage
demo-space-separated.sh -e conf -s /etc /etc/hosts
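For comparison only, here is roughly the same option set expressed with Python's argparse module. This is a sketch, not a translation of the script: the build_parser helper and the positional name files are made up, and unlike the bash loop, argparse rejects unknown options and generates --help on its own.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the demo script's options")
    parser.add_argument("-e", "--extension")
    parser.add_argument("-s", "--searchpath")
    parser.add_argument("--default", action="store_true")
    parser.add_argument("files", nargs="*")  # positional arguments
    return parser

args = build_parser().parse_args(["-e", "conf", "-s", "/etc", "/etc/hosts"])
print(args.extension, args.searchpath, args.default, args.files)
# prints: conf /etc False ['/etc/hosts']
```

The long options also accept the equals-separated form, for example --extension=conf.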


Bash Equals-Separated (e.g., --option=argument)

cat >/tmp/demo-equals-separated.sh <<'EOF'
#!/bin/bash

for i in "$@"; do
  case $i in
    -e=*|--extension=*)
      EXTENSION="${i#*=}"
      shift # past argument=value
      ;;
    -s=*|--searchpath=*)
      SEARCHPATH="${i#*=}"
      shift # past argument=value
      ;;
    --default)
      DEFAULT=YES
      shift # past argument with no value
      ;;
    -*|--*)
      echo "Unknown option $i"
      exit 1
      ;;
    *)
      ;;
  esac
done

echo "FILE EXTENSION = ${EXTENSION}"
echo "SEARCH PATH = ${SEARCHPATH}"
echo "DEFAULT = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)

if [[ -n $1 ]]; then
  echo "Last line of file specified as non-opt/last argument:"
  tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-equals-separated.sh

/tmp/demo-equals-separated.sh -e=conf -s=/etc /etc/hosts
Output from copy-pasting the block above
FILE EXTENSION = conf
SEARCH PATH = /etc
DEFAULT =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34 example.com
Usage
demo-equals-separated.sh -e=conf -s=/etc /etc/hosts

To better understand ${i#*=} search for "Substring Removal" in this guide. It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
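If it helps to see the same operation outside the shell, the expansion strips the shortest leading match of "anything followed by =". A rough Python equivalent (the function name value_after_equals is made up for illustration):

```python
def value_after_equals(arg):
    """Return everything after the first '=' in arg, like ${i#*=} in the script."""
    _, sep, value = arg.partition("=")
    # Without any '=', ${i#*=} leaves the string unchanged; mirror that here.
    return value if sep else arg

print(value_after_equals("--extension=conf"))  # prints: conf
print(value_after_equals("--opt=a=b"))         # prints: a=b
```

Only the first = counts, so a value that itself contains = survives intact.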



Using bash with getopt[s]

getopt(1) limitations (older, relatively-recent getopt versions):

  • can't handle arguments that are empty strings
  • can't handle arguments with embedded whitespace

More recent getopt versions don't have these limitations. For more information, see these docs.



POSIX getopts

Additionally, the POSIX shell and others offer getopts, which doesn't have these limitations. I've included a simplistic getopts example.

cat >/tmp/demo-getopts.sh <<'EOF'
#!/bin/sh

# A POSIX variable
OPTIND=1 # Reset in case getopts has been used previously in the shell.

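# Initialize our own variables:
output_file=""
verbose=0

# Minimal usage message for -h and unrecognized options (placeholder text; adjust as needed)
show_help() {
  echo "Usage: $0 [-h] [-v] [-f output_file] [args...]"
}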

while getopts "h?vf:" opt; do
  case "$opt" in
    h|\?)
      show_help
      exit 0
      ;;
    v) verbose=1
      ;;
    f) output_file=$OPTARG
      ;;
  esac
done

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"
EOF

chmod +x /tmp/demo-getopts.sh

/tmp/demo-getopts.sh -vf /etc/hosts foo bar
Output from copy-pasting the block above
verbose=1, output_file='/etc/hosts', Leftovers: foo bar
Usage
demo-getopts.sh -vf /etc/hosts foo bar

The advantages of getopts are:

  1. It's more portable, and will work in other shells like dash.
  2. It can handle multiple single options like -vf filename in the typical Unix way, automatically.

The disadvantage of getopts is that it can only handle short options (-h, not --help) without additional code.

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts, which might be informative.
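The same option-string convention exists outside the shell. As a point of comparison, Python's standard getopt module accepts the same style of option string (a colon after a letter means "takes a value") and handles clustered short options the same way. A short sketch using the arguments from the usage line above:

```python
import getopt

# "hvf:" means: -h and -v are flags, -f takes a value.
opts, leftovers = getopt.getopt(["-vf", "/etc/hosts", "foo", "bar"], "hvf:")

print(opts)       # prints: [('-v', ''), ('-f', '/etc/hosts')]
print(leftovers)  # prints: ['foo', 'bar']
```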

Is there a way to get Perl to support wildcard command-line arguments like *.txt on Windows?

The core module File::DosGlob provides the tools to expand wildcards in the manner a Windows user would expect, so it's just a matter of using the glob provided by this module, as follows:

use File::DosGlob qw( glob );

@ARGV = map glob, @ARGV;

Note that doing this using the builtin glob would break paths that contain spaces, a relatively common occurrence on Windows. It would also mishandle *.*, which is expected to return all files.

Note that it's best to expand the patterns after processing command-line options to avoid risking expanding the pattern into a command-line option.

use File::DosGlob qw( glob );
use Getopt::Long qw( GetOptions );

GetOptions(...)
or die_usage();

@ARGV = map glob, @ARGV;
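The ordering matters in any language. Here is a hedged Python sketch of the same "parse options first, expand patterns second" sequence; the function name parse_then_expand and the -v option are invented for the example.

```python
import argparse
import glob

def parse_then_expand(argv):
    """Parse options first, then expand wildcards in the remaining arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("patterns", nargs="*")
    opts = parser.parse_args(argv)

    files = []
    for pattern in opts.patterns:
        # Expansion happens after parsing, so a file named "-v.txt" that
        # matches "*.txt" can never be mistaken for the -v option.
        files.extend(sorted(glob.glob(pattern)) or [pattern])
    return opts.verbose, files
```

Calling parse_then_expand(["*.txt"]) in a directory that contains -v.txt returns that file as an ordinary name with verbose left off.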

For a one-liner, you could use the following:

perl -MFile::DosGlob=glob -ne"BEGIN { @ARGV = map glob, @ARGV } ..." ...

The BEGIN ensures the code is run before the input-reading loop created by -n starts.

How do I pass a wildcard parameter to a bash file

The parent shell, the one invoking bash show_files.sh *, expands the * for you.

In your script, you need to use:

for dir in "$@"
do
  echo "$dir"
done

The double quotes ensure that file names containing spaces (including runs of multiple spaces) are handled correctly.

See also How to iterate over arguments in a bash shell script.


Potentially confusing addendum

If you're truly sure you want to get the script to expand the *, you have to make sure that * is passed to the script (enclosed in quotes, as in the other answers), and then make sure it is expanded at the right point in the processing (which is not trivial). At that point, I'd use an array.

names=( $@ )
for file in "${names[@]}"
do
  echo "$file"
done

I don't often use $@ without the double quotes, but this is one time when it is more or less the correct thing to do. The tricky part is that it doesn't handle wildcards containing spaces very well.

Consider:

$ > "double  space.c"
$ > "double  space.h"
$ echo double\ \ space.?
double  space.c double  space.h
$

That works fine. But try passing that as a wild-card to the script and ... well, let's just say it gets to be tricky at that point.

If you want to extract $2 separately, then you can use:

names=( $1 )
for file in "${names[@]}"
do
  echo "$file"
done
# ... use $2 ...
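The difficulty with spaces comes from word splitting, not from globbing itself. As an illustration in Python rather than bash, a globbing library that receives the whole pattern as one string has no trouble with the double-space example. The temporary directory here is just scaffolding for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("double  space.c", "double  space.h"):
        open(os.path.join(tmp, name), "w").close()

    # The whole pattern is one string, so the two spaces survive intact.
    pattern = os.path.join(tmp, "double  space.?")
    matches = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(matches)  # prints: ['double  space.c', 'double  space.h']
```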

