Finding human-readable files on Unix
find and file are your friends here:
find /dir/to/search -type f -exec sh -c 'file -b "$1" | grep -q text' sh {} \; -print
This will find any regular files (NOTE: it will not find symlinks, directories, sockets, etc.) in /dir/to/search and, for each one, run file -b on it and grep the resulting description for "text". If the grep matches, find prints the filename. The filename is passed to sh as "$1" rather than spliced into the command string, so unusual filenames cannot break the shell, and grep -q suppresses output while still returning the match status.
NOTE: using the -b flag to file means that the filename is not printed and therefore cannot interfere with the grep. For example, without the -b flag the binary file /bin/gettext would erroneously be detected as a text file, because its name contains "text".
For example,
root@osdevel-pete# find /bin -exec sh -c 'file -b "$1" | grep -q text' sh {} \; -print
/bin/gunzip
/bin/svnshell.sh
/bin/unicode_stop
/bin/unicode_start
/bin/zcat
/bin/redhat_lsb_init
root@osdevel-pete# find /bin -type f -name '*text*'
/bin/gettext
If you want to look inside compressed files, use the --uncompress flag to file. For more information and flags to file, see man file.
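A minimal, self-contained sketch of the same idea using file's MIME output instead of grepping its free-text description (this assumes a file(1) that supports --mime-type, as GNU and macOS versions do; the directory and filenames are throwaway examples):

```shell
# Create a throwaway directory with one text file and one binary file.
dir=$(mktemp -d)
printf 'hello world\n' > "$dir/notes.txt"
printf '\000\001\002\003' > "$dir/blob.bin"

# Print only regular files whose MIME type starts with "text/".
# The filename is passed as "$1" so odd characters cannot break the shell.
out=$(find "$dir" -type f -exec sh -c \
    'case "$(file -b --mime-type "$1")" in text/*) exit 0;; *) exit 1;; esac' \
    sh {} \; -print)
echo "$out"
```

Matching on text/* is a bit stricter than grepping the description for "text", since descriptions like "Texinfo source text" and MIME types can disagree for some formats.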
Viewing all directories that are world readable
You can use find -perm like this:
find /base/path -type d -perm -o=r
-perm -o=r will only list directories with the world (others) read bit set. (The older +o+r spelling of the mode is deprecated in GNU find.)
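A quick sketch demonstrating -perm -o=r on two freshly created directories (the names are illustrative):

```shell
# One world-readable directory, one private one.
base=$(mktemp -d)
mkdir "$base/public" "$base/private"
chmod 755 "$base/public"    # others have read access
chmod 700 "$base/private"   # others have no access

# -perm -o=r matches when the "others read" bit is set.
# -mindepth 1 skips $base itself (mktemp creates it mode 700 anyway).
found=$(find "$base" -mindepth 1 -type d -perm -o=r)
echo "$found"
```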
Linux command: How to 'find' only text files?
I know this is an old thread, but I stumbled across it and thought I'd share my method, which I have found to be a very fast way to use find to find only non-binary files:
find . -type f -exec grep -Iq . {} \; -print
The -I option to grep tells it to immediately ignore binary files, and the . pattern along with the -q makes it match text files on the first line and stop, so it goes very fast. You can change the -print to a -print0 for piping into an xargs -0 or something if you are concerned about spaces (thanks for the tip, @lucas.werkmeister!).
Also, the first dot is only necessary for certain BSD versions of find, such as on OS X, but it doesn't hurt anything just having it there all the time if you want to put this in an alias or something.
EDIT: As @ruslan correctly pointed out, the -and can be omitted since it is implied.
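The behaviour is easy to check on throwaway files (note that -I is a GNU/BSD grep extension, not POSIX; an empty file is also skipped, since no line matches the . pattern):

```shell
# One text file, one binary file, one empty file.
d=$(mktemp -d)
printf 'some text\n' > "$d/a.txt"
printf '\000\001\002' > "$d/b.bin"
: > "$d/empty"

# grep -Iq . succeeds only on non-binary files with at least one character.
txt=$(find "$d" -type f -exec grep -Iq . {} \; -print)
echo "$txt"
```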
Recursively find files that are not publicly readable
Use the find command:
find . ! -perm -o=r
This will search for files within the current directory and its subdirectories whose permission bits do not grant read access to the "others" class.
The manual page for find gives some examples of these options.
You can run this command as the www-data user:
find . ! -readable
to find all files that are NOT readable by the web server. (Note that -readable is a GNU find extension; it tests actual access for the invoking user, including group membership and ACLs, rather than just the permission bits.)
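A small sketch of the permission-bit variant on example files (it uses ! -perm -o=r rather than ! -readable, because -readable depends on who runs the command — root, for instance, can read everything):

```shell
# One world-readable file, one owner-only file.
w=$(mktemp -d)
printf 'x\n' > "$w/open.txt";   chmod 644 "$w/open.txt"
printf 'y\n' > "$w/secret.txt"; chmod 600 "$w/secret.txt"

# ! -perm -o=r matches files whose "others read" bit is NOT set.
hidden=$(find "$w" -type f ! -perm -o=r)
echo "$hidden"
```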
How to search for a text in specific files in unix
It might be better to use find, since grep's include/exclude options can get a bit confusing:
find -type f -name "*.xml" -exec grep -l 'hello' {} +
This looks for files whose names end in .xml and performs a grep 'hello' on them. With -l (lowercase L) the file name is printed instead of the matching lines.
Explanation
- find -type f: finds files in the given directory structure.
- -name "*.xml": selects those files whose name ends in .xml.
- -exec: executes a command on every result of the find command.
- -exec grep -l 'hello' {} +: executes grep -l 'hello' on the given files. With {} + we refer to the matched names (it is like doing grep 'hello' file, but with the file names provided by the find command). Also, grep -l (lowercase L) returns the file name, not the match itself.
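The steps above can be exercised on a handful of throwaway files (the names are invented for the example; the path argument after find is explicit here, since omitting it only works on GNU find):

```shell
# Two .xml files, only one containing "hello", plus a decoy .txt file.
x=$(mktemp -d)
printf '<a>hello</a>\n' > "$x/greet.xml"
printf '<a>bye</a>\n'   > "$x/other.xml"
printf 'hello\n'        > "$x/note.txt"   # wrong extension, never grepped

# -name filters first, then grep -l prints only the names of matches.
matches=$(find "$x" -type f -name '*.xml' -exec grep -l 'hello' {} +)
echo "$matches"
```

The {} + form passes many filenames to one grep invocation, which is noticeably faster than {} \; (one grep per file) on large trees.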
How to find all files containing specific text (string) on Linux?
Do the following:
grep -rnw '/path/to/somewhere/' -e 'pattern'
-r or -R is recursive,
-n is line number, and
-w stands for match the whole word.
-l (lowercase L) can be added to just give the file name of matching files.
-e is the pattern used during the search.
Along with these, the --exclude, --include, and --exclude-dir flags can be used for efficient searching:
This will only search through those files which have .c or .h extensions:
grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
This will exclude searching all the files ending with .o extension:
grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
For directories it's possible to exclude one or more directories using the --exclude-dir parameter. For example, this will exclude the directories dir1/, dir2/ and all of them matching *.dst/:
grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/search/' -e "pattern"
This works very well for me, achieving almost the same purpose as yours.
For more options, see man grep.
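A compact sketch of -rnw together with --include on invented sample files (--include, --exclude, and --exclude-dir are GNU grep extensions):

```shell
# Only main.c contains "pattern" as a whole word in a .c file.
g=$(mktemp -d)
printf 'int pattern = 1;\n' > "$g/main.c"
printf 'patterns here\n'    > "$g/notes.c"   # "patterns" fails the -w test
printf 'pattern\n'          > "$g/readme.md" # filtered out by --include

# -r recurse, -n line numbers, -w whole-word match, restricted to *.c files.
hits=$(grep --include='*.c' -rnw "$g" -e 'pattern')
echo "$hits"
```

Each hit is printed as file:line:content, which is convenient to feed into editors or further scripts.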