Match all files under all nested directories with shell globbing
In Bash 4, with shopt -s globstar
, and zsh you can use **/*
which will include everything except hidden files. You can do shopt -s dotglob
in Bash 4 or setopt dotglob
in zsh to cause hidden files to be included.
In ksh, set -o globstar
enables it. I don't think there's a way to include dot files implicitly, but I think **/{.[^.],}*
works.
Is there a globbing pattern to match by file extension, both PWD and recursively?
With shell globbing it is possible to only get directories by adding a /
at the end of the glob, but there's no way to exclusively get files (zsh
being an exception)
Illustration:
With the given tree:
file.php
inc.php/include.php
lib/lib.php
Supposing that the shell supports the non-standard **
glob:
**/*.php/
expands to inc.php/
**/*.php
expands to file.php inc.php inc.php/include.php lib/lib.php
For getting
file.php inc.php/include.php lib/lib.php
, you cannot use a glob.
=> with zsh it would be **/*.php(.)
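For comparison, the same files-only filter can be sketched in Python. This is only an illustration and not part of the original answer: the helper names php_files and build_example_tree are hypothetical, and the example tree is rebuilt in a temporary directory. Filtering with is_file() does roughly what the zsh (.) qualifier does, except that is_file() also accepts symlinks that point to regular files.

```python
import tempfile
from pathlib import Path


def php_files(root):
    """Return relative paths of regular files ending in .php under root."""
    root = Path(root)
    # rglob('*.php') also yields the directory inc.php, so filter with is_file().
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob('*.php') if p.is_file())


def build_example_tree(root):
    """Recreate the example tree: file.php, inc.php/include.php, lib/lib.php."""
    root = Path(root)
    (root / 'inc.php').mkdir()
    (root / 'lib').mkdir()
    for name in ('file.php', 'inc.php/include.php', 'lib/lib.php'):
        (root / name).touch()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        build_example_tree(tmp)
        print(php_files(tmp))
```

The directory inc.php is matched by the pattern but dropped by the is_file() test, which is the step a plain shell glob cannot express.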
Standard work-around (any shell, any OS)
The POSIX way to recursively get the files that match a given standard glob and then apply a command to them is to use find -type f -name ... -exec ...
:
ls -l <all .php files>
would be:
find . -type f -name '*.php' -exec ls -l {} +
grep "finde me" <all .php files>
would be:
find . -type f -name '*.php' -exec grep "finde me" {} +
cp <all .php files> ~/destination/
would be:
find . -type f -name '*.php' -exec sh -c 'cp "$@" ~/destination/' _ {} +
remark: This one is a little more tricky because you need ~/destination/
to be after the file arguments, and find
's syntax doesn't allow find -exec ... {} ~/destination/ +
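If a script is an option, the copy step could also be sketched in Python, where the destination is just an ordinary argument. The function name copy_matching and the directory layout in the demo are assumptions made for illustration; as with cp, two files with the same base name would overwrite each other in the destination.

```python
import shutil
import tempfile
from pathlib import Path


def copy_matching(src_root, dest_dir, pattern='*.php'):
    """Copy every regular file under src_root matching pattern into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_root).rglob(pattern)):
        if path.is_file():  # skip directories whose name matches the pattern
            shutil.copy(path, dest / path.name)
            copied.append(path.name)
    return copied


if __name__ == '__main__':
    # Hypothetical source and destination directories, created only for the demo.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        (Path(src) / 'lib').mkdir()
        (Path(src) / 'file.php').touch()
        (Path(src) / 'lib' / 'lib.php').touch()
        print(copy_matching(src, dst))
```

The destination should live outside the source tree, otherwise files copied earlier could be picked up again by the traversal.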
What expands to all files in current directory recursively?
This will work in Bash 4:
ls -l {,**/}*.ext
In order for the double-asterisk glob to work, the globstar
option needs to be set (default: off):
shopt -s globstar
From man bash
:
globstar
If set, the pattern ** used in a filename expansion context
will match all files and zero or more directories and
subdirectories. If the pattern is followed by a /, only
directories and subdirectories match.
Now I'm wondering if there might have once been a bug in globstar processing, because now using simply ls **/*.ext
I'm getting correct results.
Regardless, I looked at the analysis kenorb did using the VLC repository and found some problems with that analysis and in my answer immediately above:
The comparisons to the output of the find
command are invalid since specifying -type f
doesn't include other file types (directories in particular) and the ls
commands listed likely do. Also, one of the commands listed, ls -1 {,**/}*.*
- which would seem to be based on mine above, only outputs names that include a dot for those files that are in subdirectories. The OP's question and my answer include a dot since what is being sought is files with a specific extension.
Most importantly, however, is that there is a special issue using the ls
command with the globstar pattern **
. Many duplicates arise since the pattern is expanded by Bash to all file names (and directory names) in the tree being examined. Subsequent to the expansion the ls
command lists each of them and their contents if they are directories.
Example:
In our current directory is the subdirectory A
and its contents:
A
└── AB
    └── ABC
        ├── ABC1
        ├── ABC2
        └── ABCD
            └── ABCD1
In that tree, **
expands to "A A/AB A/AB/ABC A/AB/ABC/ABC1 A/AB/ABC/ABC2 A/AB/ABC/ABCD A/AB/ABC/ABCD/ABCD1" (7 entries). If you do echo **
that's the exact output you'd get and each entry is represented once. However, if you do ls **
it's going to output a listing of each of those entries. So essentially it does ls A
followed by ls A/AB
, etc., so A/AB
gets shown twice. Also, ls
is going to set each subdirectory's output apart:
...
<blank line>
directory name:
content-item
content-item
So using wc -l
counts all those blank lines and directory name section headings, which throws off the count even further.
This is yet another reason why you should not parse ls
.
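The point that the expansion itself contains each entry exactly once can be checked without ls. The sketch below is an illustration in Python, not the method used in the answer: it rebuilds the example tree in a temporary directory (assuming ABC1, ABC2 and ABCD1 are plain files, which the answer does not state) and lists every entry with pathlib. The helper names build_tree and expand_all are hypothetical.

```python
import tempfile
from pathlib import Path


def build_tree(root):
    """Recreate the example tree; ABC1, ABC2 and ABCD1 are assumed to be files."""
    abcd = Path(root, 'A', 'AB', 'ABC', 'ABCD')
    abcd.mkdir(parents=True)
    (abcd / 'ABCD1').touch()
    (abcd.parent / 'ABC1').touch()
    (abcd.parent / 'ABC2').touch()


def expand_all(root):
    """Return every entry below root once, similar to `echo **` with globstar."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob('*'))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        build_tree(tmp)
        entries = expand_all(tmp)
        print(len(entries))
        print(' '.join(entries))
```

Counting the returned list gives the number of entries directly, with no blank lines or directory headings to skew the result.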
As a result of this further analysis, I recommend not using the globstar pattern in any circumstance other than iterating over a tree of files in this manner:
for entry in **
do
    something "$entry"
done
As a final comparison, I used a Bash source repository I had handy and did this:
shopt -s globstar dotglob
diff <(echo ** | tr ' ' '\n') <(find . | sed 's|\./||' | sort)
0a1
> .
I used tr
to change spaces to newlines which is only valid here since no names include spaces. I used sed
to remove the leading ./
from each line of output from find
. I sorted the output of find
since it is normally unsorted and Bash's expansion of globs is already sorted. As you can see, the only output from diff
was the current directory .
output by find
. When I did ls ** | wc -l
the output had almost twice as many lines.
How can I recursively find all files in current and subfolders based on wildcard matching?
Use find
:
find . -name "foo*"
find
needs a starting point, so the .
(dot) points to the current directory.
How to ls all the files in the subdirectories using wildcard?
3 solutions :
Simple glob
ls */*.pdb
Recursive using bash
shopt -s globstar
ls **/*.pdb
Recursive using find
find . -type f -name '*.pdb'
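The difference between the simple and the recursive glob can be made concrete with a small Python sketch. It is an illustration under assumptions: the helper names one_level and recursive and the demo file names are made up, and pathlib, unlike the shell, also matches names that start with a dot.

```python
import tempfile
from pathlib import Path


def one_level(root, pattern='*.pdb'):
    """Like `ls */*.pdb`: matches exactly one directory below root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.glob('*/' + pattern))


def recursive(root, pattern='*.pdb'):
    """Like `ls **/*.pdb` with globstar: matches at any depth, root included."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: top.pdb, a/one.pdb, a/b/two.pdb
        (Path(tmp) / 'a' / 'b').mkdir(parents=True)
        for name in ('top.pdb', 'a/one.pdb', 'a/b/two.pdb'):
            (Path(tmp) / name).touch()
        print(one_level(tmp))
        print(recursive(tmp))
```

Only the recursive form sees top.pdb and a/b/two.pdb; the simple glob stops at one level.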
How can I search sub-folders using glob.glob module?
In Python 3.5 and newer use the new recursive **/
functionality:
configfiles = glob.glob('C:/Users/sam/Desktop/file1/**/*.txt', recursive=True)
When recursive
is set, **
followed by a path separator matches 0 or more subdirectories.
In earlier Python versions, glob.glob()
cannot list files in subdirectories recursively.
In that case I'd use os.walk()
combined with fnmatch.filter()
instead:
import os
import fnmatch
path = 'C:/Users/sam/Desktop/file1'
configfiles = [os.path.join(dirpath, f)
    for dirpath, dirnames, files in os.walk(path)
    for f in fnmatch.filter(files, '*.txt')]
This'll walk your directories recursively and return all absolute pathnames to matching .txt
files. In this specific case the fnmatch.filter()
may be overkill, you could also use a .endswith()
test:
import os
path = 'C:/Users/sam/Desktop/file1'
configfiles = [os.path.join(dirpath, f)
    for dirpath, dirnames, files in os.walk(path)
    for f in files if f.endswith('.txt')]
Globbing pattern to include all files in the intermediate folder
There is no "webapp" in your directory structure :) Maybe you want something like this?
$ find . -wholename "**/web/libs/*"
./src2/web/libs/t
./src2/web/libs/tt
./src/web/libs/ttt
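A similar match on an intermediate folder can be sketched with pathlib in Python. This is an assumption-laden illustration: the helper name under_web_libs and the extra src/web/other directory are made up, and unlike find -wholename, where * also matches slashes, this pattern returns only the direct children of each web/libs directory.

```python
import tempfile
from pathlib import Path


def under_web_libs(root):
    """Entries directly inside any web/libs directory below root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.glob('**/web/libs/*'))


def make_files(root, names):
    """Create empty files (and their parent directories) under root."""
    for name in names:
        path = Path(root, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        make_files(tmp, ['src/web/libs/ttt', 'src2/web/libs/t',
                         'src2/web/libs/tt', 'src/web/other/x'])
        print(under_web_libs(tmp))
```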
Bash - What is a good way to recursively find the type of all files in a directory and its subdirectories?
This may help: How to recursively list subdirectories in Bash without using "find" or "ls" commands?
That said, I modified it to accept user input as follows:
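#!/bin/bash
# Print every directory and regular file below the directory given as $1.
recurse() {
    for i in "$1"/*; do
        if [ -d "$i" ]; then
            echo "dir: $i"
            recurse "$i"
        elif [ -f "$i" ]; then
            echo "file: $i"
        fi
    done
}
# Quote the argument so paths containing spaces are passed intact.
recurse "$1"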
If you didn't want the files portion (which it appears you don't) then just remove the elif and line below it. I left it in as the original post had it also. Hope this helps.
How to use glob() to find files recursively?
pathlib.Path.rglob
Use pathlib.Path.rglob
from the pathlib
module, which was introduced in Python 3.4.
from pathlib import Path
for path in Path('src').rglob('*.c'):
    print(path.name)
If you don't want to use pathlib, you can use glob.glob('**/*.c', recursive=True)
. Don't forget the recursive
keyword argument, and be aware that it can take an inordinate amount of time on large directories.
glob.glob() does not match names beginning with a dot (.), such as hidden files on Unix-based systems. For those cases, use the os.walk
solution below.
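The difference in how dot files are treated can be seen side by side in a short sketch. The helper names glob_c_files and walk_c_files and the demo file names are assumptions made for illustration; the behaviour shown is that of glob.glob() with default arguments.

```python
import glob
import os
import tempfile


def glob_c_files(root):
    """Recursive glob; by default it skips names that start with a dot."""
    # glob.escape() protects any special characters in the root path itself.
    pattern = os.path.join(glob.escape(root), '**', '*.c')
    return sorted(os.path.relpath(p, root)
                  for p in glob.glob(pattern, recursive=True))


def walk_c_files(root):
    """os.walk visits every entry, so dot files are included."""
    return sorted(os.path.relpath(os.path.join(dirpath, name), root)
                  for dirpath, _dirnames, names in os.walk(root)
                  for name in names if name.endswith('.c'))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, 'sub'))
        for name in ('a.c', '.hidden.c', os.path.join('sub', 'b.c')):
            open(os.path.join(tmp, name), 'w').close()
        print(glob_c_files(tmp))
        print(walk_c_files(tmp))
```

Only the os.walk variant reports .hidden.c in this layout.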
os.walk
For older Python versions, use os.walk
to recursively walk a directory and fnmatch.filter
to match against a simple expression:
import fnmatch
import os
matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))