What Is File Globbing

What is file globbing?

Globbing refers to the * and ? wildcards and a few other pattern-matching characters you may already know from the shell.

Globbing interprets the standard wild card characters * and ?, character lists in square brackets, and certain other special characters (such as ^ for negating the sense of a match).

When the shell sees a glob, it will perform pathname expansion and replace the glob with matching filenames when it invokes the program.

For an example of the * operator, say you want to copy all files with a .jpg extension in the current directory to somewhere else:

cp *.jpg /some/other/location

Here *.jpg is a glob pattern that matches all files ending in .jpg in the current directory. It's equivalent to (and much easier than) listing the current directory and typing in each file you want manually:

$ ls
cat.jpg dog.jpg drawing.png recipes.txt zebra.jpg

$ cp cat.jpg dog.jpg zebra.jpg /some/other/location
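A quick way to see what the shell hands to a program is to let printf print the expanded words. The following is a minimal sketch that recreates the listing above in a scratch directory; the directory and the variable names are only for illustration.

```shell
# Recreate the example listing in a scratch directory (illustrative setup).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch cat.jpg dog.jpg drawing.png recipes.txt zebra.jpg

# The shell expands each pattern before printf runs; printf only sees file names.
star_matches=$(printf '%s ' *.jpg)         # * matches any run of characters
question_matches=$(printf '%s ' ???.jpg)   # ? matches exactly one character
bracket_matches=$(printf '%s ' [cd]*.jpg)  # [cd] matches one character: c or d
negated_matches=$(printf '%s ' [!c]*.jpg)  # [!c] matches one character that is not c

printf '%s\n' "$star_matches" "$question_matches" "$bracket_matches" "$negated_matches"
```

Swapping printf for cp (and adding a destination) gives the copy command shown earlier; the program never sees the pattern itself, only the names it expanded to.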

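Note that glob patterns may look similar to regular expressions, but they are not the same. For example, in a glob * matches any string by itself, whereas in a regular expression * repeats the preceding element.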


In what scenarios does file globbing not work in bash?

The comments already mentioned this briefly, but here is a proper answer to the question with a more detailed explanation:

In the command

sudo du -sh /var/lib/docker/*

The glob is expanded by your own shell before sudo is executed; only the call to du runs with root permissions. That means that if the directory /var/lib/docker/ is restricted for you as a normal user and you cannot read it (missing r permission), the asterisk will not match anything. By default, bash then leaves the pattern unchanged, so the string remains /var/lib/docker/*.

Then the arguments du, -sh, and /var/lib/docker/* are passed to sudo, which executes du with root permissions and passes it the arguments -sh and /var/lib/docker/*. du then looks for a file with exactly this name and will probably find nothing, because no file in there is named *.
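The "left unchanged" behaviour can be reproduced without sudo. As far as the glob is concerned, a directory you cannot read behaves like an empty one: there is nothing to match. This sketch uses an empty scratch directory and assumes the shell's default settings (no nullglob or failglob); the variable names are illustrative.

```shell
# An empty scratch directory: the pattern below cannot match anything.
empty_dir=$(mktemp -d)

# set -- stores the words produced by the expansion in "$1", "$2", ...
set -- "$empty_dir"/*
unmatched_word=$1
word_count=$#

# With nothing to match, the shell passes the pattern itself as a single word.
printf 'words: %s\nfirst word: %s\n' "$word_count" "$unmatched_word"

# A program receiving that word looks for a file literally named "*".
if [ -e "$unmatched_word" ]; then
    literal_exists=yes
else
    literal_exists=no
fi
printf 'a file with that exact name exists: %s\n' "$literal_exists"
```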

To achieve what you want, the globbing must also be done with root permissions. For this you need to start a shell (only shells perform globbing) with root permissions:

sudo bash -c 'du -sh /var/lib/docker/*'

This way, the arguments bash, -c, and du -sh /var/lib/docker/* are passed to sudo. sudo then starts bash with root permissions and passes it the arguments -c and du -sh /var/lib/docker/*. Because of the -c option, bash knows that it is supposed to evaluate and execute the command du -sh /var/lib/docker/*. It splits the command at the spaces into the "words" du, -sh, and /var/lib/docker/*. It then performs any necessary glob expansion on each of the words, this time with root permissions, so it is allowed to read the contents of /var/lib/docker/. It replaces the last word with /var/lib/docker/aufs, /var/lib/docker/builder, /var/lib/docker/buildkit, and several more. As a last step it calls du with -sh and the result of the glob expansion. du inherits the root permissions, so it also runs with them.
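Quoting is what moves the expansion from one shell to the other. In the sketch below, plain sh stands in for the root shell and a scratch directory stands in for /var/lib/docker; both are assumptions made so the example runs without sudo.

```shell
# Scratch directory standing in for /var/lib/docker (illustrative).
data_dir=$(mktemp -d)
mkdir "$data_dir/aufs" "$data_dir/builder"

# Unquoted: the current shell expands the pattern before the child starts,
# so the child merely counts the arguments it was given.
expanded_by_parent=$(sh -c 'printf "%s\n" "$#"' _ "$data_dir"/*)

# Single-quoted: the pattern reaches the child untouched, and the child
# expands it itself. With sudo in front, that child would run as root.
expanded_by_child=$(sh -c 'set -- "$1"/*; printf "%s\n" "$#"' _ "$data_dir")

printf 'parent expanded to %s words, child expanded to %s words\n' \
    "$expanded_by_parent" "$expanded_by_child"
```

Both counts agree here because both shells can read the directory. The two forms only differ when the shells run with different permissions, which is exactly the sudo case.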

Is there a globbing pattern to match by file extension, both PWD and recursively?

With shell globbing it is possible to match only directories by adding a / at the end of the glob, but there is no way to match only files (zsh being an exception).

Illustration:

With the given tree:

file.php
inc.php/include.php
lib/lib.php

Supposing that the shell supports the non-standard ** glob:

  • **/*.php/ expands to inc.php/

  • **/*.php expands to file.php inc.php inc.php/include.php lib/lib.php

  • For getting file.php inc.php/include.php lib/lib.php, you cannot use a glob.

    => with zsh it would be **/*.php(.)
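In bash 4 or later the ** glob is available once globstar is enabled, and the missing "files only" qualifier can be approximated with a -f test in a loop. A minimal sketch, assuming bash (not plain sh) and rebuilding the tree above in a scratch directory:

```shell
# Requires bash 4+; ** is not recursive unless globstar is enabled.
shopt -s globstar

# Rebuild the example tree in a scratch directory (illustrative setup).
tree_dir=$(mktemp -d)
cd "$tree_dir" || exit 1
mkdir inc.php lib
touch file.php inc.php/include.php lib/lib.php

all_matches=()
regular_files=()
for path in **/*.php; do
    all_matches+=("$path")
    # Keep only regular files; this is the step the glob cannot do by itself.
    if [[ -f $path ]]; then
        regular_files+=("$path")
    fi
done

printf 'glob:  %s\n' "${all_matches[*]}"
printf 'files: %s\n' "${regular_files[*]}"
```

Unlike the zsh (.) qualifier, the -f test follows symbolic links, so a link pointing to a regular file is kept as well.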

Standard work-around (any shell, any OS)

The POSIX way to recursively get the files that match a given standard glob and then apply a command to them is to use find -type f -name ... -exec ...:

  • ls -l <all .php files> would be:
find . -type f -name '*.php' -exec ls -l {} +
  • grep "finde me" <all .php files> would be:
find . -type f -name '*.php' -exec grep "finde me" {} +
  • cp <all .php files> ~/destination/ would be:
find . -type f -name '*.php' -exec sh -c 'cp "$@" ~/destination/' _ {} +

Remark: this one is a little trickier because ~/destination/ has to come after the file arguments, and find's syntax doesn't allow find -exec ... {} ~/destination/ +
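One way to avoid hard-coding the destination inside the sh -c script is to pass it as the first argument and shift it off before the file list. The sketch below tries this on a scratch copy of the example tree; src_dir and dest_dir are illustrative names.

```shell
# Scratch source tree and destination (illustrative setup).
src_dir=$(mktemp -d)
dest_dir=$(mktemp -d)
mkdir "$src_dir/inc.php" "$src_dir/lib"
touch "$src_dir/file.php" "$src_dir/inc.php/include.php" "$src_dir/lib/lib.php"

# Inside sh -c: $0 is the placeholder "_", $1 is the destination, and the
# remaining arguments are the files that find collected for this batch.
find "$src_dir" -type f -name '*.php' \
    -exec sh -c 'dest=$1; shift; cp "$@" "$dest"' _ "$dest_dir" {} +

copied=$(ls "$dest_dir")
printf '%s\n' "$copied"
```

The directory inc.php is skipped because of -type f. Note that all files land flat in the destination, so two files with the same base name in different subdirectories would overwrite each other.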

Linux file names & file globbing


*[89][0-9][0-9].enc

That uses Bash's "pathname expansion" feature (aka "globbing") to match all files whose names end in three digits from 800 to 999 followed by ".enc". (This is not a regular expression.)

For example, using the above expression you can do this in your script:

mv *[89][0-9][0-9].enc path/to/destination/

If you need it to also match a file named like "cp850-1.enc", then you would need to change the expression to:

*[89][0-9][0-9]*.enc
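To check what the two patterns select before moving anything, it helps to count the matches on sample names. The file names below are made up for the test.

```shell
# Sample names in a scratch directory (made up for illustration).
enc_dir=$(mktemp -d)
cd "$enc_dir" || exit 1
touch cp437.enc cp850.enc cp866.enc cp850-1.enc cp1250.enc

# Three digits (the first one 8 or 9) directly before ".enc".
set -- *[89][0-9][0-9].enc
strict_count=$#
printf 'strict pattern:  %s\n' "$*"

# Same digits, but anything may sit between them and ".enc".
set -- *[89][0-9][0-9]*.enc
relaxed_count=$#
printf 'relaxed pattern: %s\n' "$*"
```

With these names the strict pattern selects cp850.enc and cp866.enc, and the relaxed one additionally picks up cp850-1.enc.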

File Glob Patterns in Linux terminal

A nice way to do this is to use extended globs. They let you write regex-like patterns in Bash, although they are still globs and not regular expressions.

To start you have to enable the extglob feature, since it is disabled by default:

shopt -s extglob

Then write a pattern with the required condition: stuff + ka + either v or bh + i + stuff. All together:

ls -l *ka@(v|bh)i*

The syntax is a bit different from normal regular expressions, so note what Extended Globs says about the operator used here:

@(list): Matches one of the given patterns.

Test

$ ls
a.php AABB AAkabhiBB AAkabiBB AAkaviBB s.sh
$ ls *ka@(v|bh)i*
AAkabhiBB AAkaviBB
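Besides @(list), bash's extended globs also offer ?(list) (zero or one), *(list) (zero or more), +(list) (one or more) and !(list) (anything except). A short sketch on the same file names, assuming bash with extglob enabled; the array names are illustrative.

```shell
# extglob must be enabled before bash parses any line that uses it.
shopt -s extglob

# Same names as in the test above, created in a scratch directory.
ext_dir=$(mktemp -d)
cd "$ext_dir" || exit 1
touch a.php AABB AAkabhiBB AAkabiBB AAkaviBB s.sh

matching=( *ka@(v|bh)i* )        # ka, then exactly one of v or bh, then i
not_matching=( !(*ka@(v|bh)i*) ) # every name the pattern above rejects
by_extension=( *.@(php|sh) )     # names ending in .php or .sh

printf 'matching:     %s\n' "${matching[*]}"
printf 'not matching: %s\n' "${not_matching[*]}"
printf 'by extension: %s\n' "${by_extension[*]}"
```

The order in which the names are printed depends on the locale's collation rules, so it may differ from the ls output shown above.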

Filename and File Globbing

You can copy from dir1 to dir2 each file whose name is a number between 0 and 29108273357520896 fairly easily:

#!/bin/bash

declare -i maxval=29108273357520896

## print usage to stderr and exit
function usage {

cat >&2 << TAG

Copy all files from 'srcdir' to 'tgtdir' with numeric names less than 'maxname'.

Usage: "${0//*\//}" srcdir tgtdir [maxname] (maxname default: $maxval)

TAG

    exit 1
}

## test required input
if [ -z "$1" ] || [ -z "$2" ]; then
    printf "\n error: insufficient input.\n"
    usage
fi

## assign variables
srcdir="$1"
tgtdir="$2"
declare -i maxname="${3:-$maxval}" # default maxval

## validate srcdir
if [ ! -d "$srcdir" ]; then
    printf "\n error: source dir does not exist.\n"
    usage
fi

## validate or create tgtdir
[ -d "$tgtdir" ] || mkdir -p "$tgtdir"
if [ ! -d "$tgtdir" ]; then
    printf "\n error: tgtdir does not exist and cannot be created, check permissions.\n"
    usage
fi

## validate maxname
if [ "$maxname" -gt "$maxval" ]; then
    printf "\n error: invalid 'maxname'. value exceeds maximum allowed: %s\n" "$maxval"
    usage
fi

## for 0 - $maxname, check that file exists in srcdir, if so copy to tgtdir
for ((i=0; i<maxname; i++)); do
    [ -f "${srcdir}/${i}" ] && cp -a "${srcdir}/${i}" "${tgtdir}"
done

exit 0

As a one-liner, run in the directory that contains the files:

for ((i=0; i<29108273357520896; i++)); do [ -f "$i" ] && cp -a "$i" "/path/to/new/dir"; done
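Counting from 0 up to 29108273357520896 means roughly 2.9 × 10^16 loop iterations, which is not practical. An alternative is to let a glob list the files that actually exist and compare each name against the limit. The following is a sketch of that idea, not the approach above; copy_numeric_below is a hypothetical helper name.

```shell
# Copy regular files whose names are plain decimal numbers below a limit.
# Usage: copy_numeric_below SRCDIR TGTDIR MAXNAME
copy_numeric_below() {
    src=$1
    dest=$2
    max=$3
    for path in "$src"/*; do
        name=${path##*/}
        case $name in
            '' | *[!0-9]*) continue ;;   # skip names that are not all digits
        esac
        [ -f "$path" ] || continue       # skip anything that is not a regular file
        # Assumption: names fit in the shell's integer range (up to 18 digits is safe).
        if [ "$name" -lt "$max" ]; then
            cp -a "$path" "$dest"/
        fi
    done
}
```

It could be called as copy_numeric_below dir1 dir2 29108273357520896. The loop runs once per existing file instead of once per possible number, and an empty source directory is handled because the unmatched pattern fails the digits-only check.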

Globbing patterns in windows command prompt/ powershell

In PowerShell you can use Resolve-Path, which resolves the wildcard characters in a path and displays the path contents.

Example: I want to locate signtool.exe from the Windows SDK, which typically resides in "c:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\signtool.exe", although other versions may be installed as well.

So I could use: Resolve-Path 'c:\program*\Windows Kits\10\bin\*\x64\signtool.exe'

EDIT:

If you want to execute it directly, you can use the & invocation operator, e.g.:

&(Resolve-Path 'c:\wind?ws\n?tepad.exe')


