Grep Recursively for a Specific File Type on Linux

How can I grep recursively, but only in files with certain extensions?

Just use the --include parameter, like this:

grep -inr --include \*.h --include \*.cpp CP_Image ~/path[12345] | mailx -s GREP email@domain.example

That should do what you want.

To take the explanation from HoldOffHunger's answer below:

  • grep: command

  • -r: recursively

  • -i: ignore-case

  • -n: each output line is preceded by its relative line number in the file

  • --include \*.cpp: search only files matching *.cpp, i.e. C++ files (the backslash keeps the shell from expanding the asterisk before grep sees it, which matters if the current directory contains matching files)

  • ./: Start at current directory.
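As a quick check of the flags listed above, here is a minimal sketch that builds a throwaway tree and searches it. The directory layout, the file names, and the file contents are made up for illustration, and GNU grep is assumed.

```shell
# Build a scratch tree (hypothetical file names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
printf 'int cp_image;\n' > "$demo/src/a.h"
printf '// nothing here\nCP_Image img;\n' > "$demo/src/sub/b.cpp"
printf 'CP_Image in a text file\n' > "$demo/src/notes.txt"

# -i ignore case, -n line numbers, -r recurse; only *.h and *.cpp are searched.
result=$(cd "$demo" && grep -inr --include='*.h' --include='*.cpp' CP_Image . | LC_ALL=C sort)
printf '%s\n' "$result"
rm -rf "$demo"
```

The notes.txt file contains the search term but is skipped because it matches neither --include pattern.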

grep recursively for a specific file type on Linux

This might also help you: "grep certain file types recursively" on commandlinefu.com.

The command is:

grep -r --include="*.[ch]" pattern .

And in your case it is:

grep -r --include="*.html" "onblur" .

How do I grep recursively in files with a certain extension?

find allows you to run a program on each file it finds using the -exec option:

find . -name '*.out' -exec grep -H pattern {} \;

{} indicates the file name, and \; (a semicolon escaped so that the shell passes it through to find) tells find that that's the end of the arguments to grep. -H tells grep to always print the file name, which it normally does only when there are multiple files to process.
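To make the role of the terminator concrete, the sketch below runs the same search with \; and with +, which is the other terminator find accepts for -exec. The scratch files and their contents are hypothetical.

```shell
# Scratch files with hypothetical names; only the *.out files should be searched.
demo=$(mktemp -d)
printf 'pattern found\n' > "$demo/one.out"
printf 'pattern again\n' > "$demo/two.out"
printf 'pattern ignored\n' > "$demo/three.log"

# "\;" runs grep once per file, so -H is needed to see file names.
per_file=$(cd "$demo" && find . -name '*.out' -exec grep -H pattern {} \; | LC_ALL=C sort)

# "+" passes many file names to a single grep invocation (fewer processes).
batched=$(cd "$demo" && find . -name '*.out' -exec grep -H pattern {} + | LC_ALL=C sort)

printf '%s\n' "$per_file"
rm -rf "$demo"
```

Both forms report the same matches; the + form usually starts fewer grep processes on large trees.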

How do I recursively grep all directories and subdirectories?


grep -r "texthere" .

The first parameter represents the regular expression to search for, while the second one represents the directory that should be searched. In this case, . means the current directory.

Note: This works for GNU grep. On some platforms, such as Solaris, you must specifically use GNU grep rather than the legacy implementation; on Solaris this is the ggrep command.

grep recursive filename matching (grep -ir xyz *.cpp) does not work

Grep will recurse through any directories you match with your glob pattern. (In your case, you probably do not have any directories that match the pattern "*.cpp".) You could explicitly specify them: grep -ir "xyz" *.cpp */*.cpp */*/*.cpp */*/*/*.cpp, etc. You can also use the --include option (see the example below).

If you are using GNU grep, then you can use the following:

grep -ir --include "*.cpp" "xyz" .

The command above says to search recursively starting in current directory ignoring case on the pattern and to only search in files that match the glob pattern "*.cpp".

OR if you are on some other Unix platform, you can use this:

find ./ -type f -name "*.cpp" -print0 | xargs -0 grep -i "xyz"

If you are sure that none of your files have spaces in their names, you can omit the -print0 argument to find and the -0 argument to xargs.

The command above says the following: find all files (-type f) under the current directory (./) that match the name glob/wildcard "*.cpp" (-name "*.cpp") and then print them out delimited by a null (-print0). That list of files found should be written to the stdin of the next command: xargs. xargs should read from stdin (default behavior) and split its input on nulls (-0) and then call the grep command with the specified options (grep -i "xyz") on that list of files.
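The pipeline described above can be tried on a scratch tree whose names contain spaces, which is the case -print0 and -0 exist for. This is only a sketch: the directory and file names are invented, and -H is added here (it is not in the original command) so that the file name is printed even when xargs hands grep a single file.

```shell
# Hypothetical tree with spaces in both the directory and the file name.
demo=$(mktemp -d)
mkdir -p "$demo/my project"
printf 'int XYZ = 1;\n' > "$demo/my project/main file.cpp"
printf 'xyz\n' > "$demo/my project/readme.txt"

# NUL-delimited names survive spaces; -i makes "xyz" match "XYZ".
hits=$(cd "$demo" && find . -type f -name '*.cpp' -print0 | xargs -0 grep -iH 'xyz')
printf '%s\n' "$hits"
rm -rf "$demo"
```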

If you are interested in learning more about why grep -ir "xyz" *.cpp does not work the way you think it should, you should search for "shell globbing". I'll also try to provide a quick explanation.

When you type in the command grep -ir "xyz" *.cpp and hit enter, there are two programs involved in executing your command. The first program is your shell (unless you've done something to customize things, you are probably using the bash shell; if you've never heard of a shell or bash, that's where you should start looking, and there are tons of good articles). Suffice it to say that a shell is just a program designed to let you navigate the filesystem on your computer and run other programs. (In Windows, when you double-click an icon to launch a program, or open a folder to access a file, the program you are running is explorer.exe, the Windows graphical shell.)

So, when you type the command grep -ir "xyz" *.cpp, the shell reads your command and does a few things before grep is run. One of the things it does is expand glob patterns (things like *.txt or [0-9]*.pdf). Like I said, if you want to understand it, go read more about it, but the thing you should take away is that the grep command never sees the *.cpp. The shell looks in the current directory for any files or directories whose names match the pattern *.cpp and puts those names on the command line in place of the pattern BEFORE it runs the grep command. (If it doesn't find anything that matches, it leaves the *.cpp there and grep will see it, but because grep doesn't normally do glob matching on its file arguments, this doesn't do anything for you.)

Alternatively, when you type in grep -ir "xyz" *, the shell replaces the * with the name of every file and directory in the current directory (because * matches anything). Let's say you had a directory that contained file1, file2, dir1, and dir2. The shell would perform its replacements and then execute a command that looked like this: grep -ir "xyz" file1 file2 dir1 dir2. That means grep would search file1 and file2 for a line with the string xyz, and because of the -ir it would also search recursively through dir1 and dir2 and check any files found there for that string as well.

Lastly, if you've followed everything I've said so far, it will make sense to you that grep does have a way to use glob patterns on recursive searches: the --include option, as in the command I described earlier: grep -ir --include "*.cpp" "xyz" . The reason we put the *.cpp in quotes in that command is to prevent the shell from trying to expand the glob pattern before it runs the command.
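One way to see what the shell hands to grep is to put echo in front of the command, so the expanded arguments are printed instead of executed. The sketch below does that in a scratch directory; the file and directory names are hypothetical, and a bash-like shell is assumed.

```shell
# Scratch directory: one .cpp file at the top level, one nested in dir1.
demo=$(mktemp -d)
mkdir "$demo/dir1"
touch "$demo/top.cpp" "$demo/dir1/nested.cpp"

# Unquoted: the shell replaces *.cpp with matching names from the current directory only.
unquoted=$(cd "$demo" && echo grep -ir xyz *.cpp)

# Quoted: the pattern reaches the command untouched, so grep can apply it at every level.
quoted=$(cd "$demo" && echo grep -ir --include '*.cpp' xyz .)

printf '%s\n' "$unquoted" "$quoted"
rm -rf "$demo"
```

The first line printed names only top.cpp, which is why dir1/nested.cpp is never searched by the unquoted form.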

How to grep for a file extension

Test for the end of the line with $ and escape the second . with a backslash so it only matches a period and not any character.

grep ".*\.zip$"

However, ls *.zip is a more natural way to do this if you want to list all the .zip files in the current directory, as is find . -name "*.zip" if you want all .zip files in the current directory and its sub-directories.
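Here grep is filtering lines of text that happen to be file names, not opening files. A small sketch with a made-up list of names shows what the escaped dot and the $ anchor rule out:

```shell
# Hypothetical file names piped in as text; grep filters lines, not files.
names=$(printf '%s\n' archive.zip notes.txt zipper.c backup.zip.old data.zip)

# \. matches a literal dot and $ anchors the match to the end of the line.
zips=$(printf '%s\n' "$names" | grep '\.zip$')
printf '%s\n' "$zips"
```

Only archive.zip and data.zip pass: zipper.c has no ".zip" at all, and backup.zip.old does not end with it.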

How to use grep to search only in a specific file types?


grep -rn --include='*.yml' "MYVAR" your_directory

Please note that grep is case-sensitive by default (pass -i to tell it to ignore case), and that it accepts regular expressions as well as plain strings.

Using grep to recursively search through subdirectories for specific keyword inside specific filename

Continuing from my comment, you can use find to locate the file vsim.log if you do not know its exact location and then use the -execdir option to find to grep the file for the term Elapsed time, e.g.

find path -type f -name "vsim.log" -execdir grep -H 'Elapsed time' '{}' +

That will return the filename along with the matched text which you can simply parse to isolate the filename if desired. You can process all files that match if you anticipate more than one by feeding the results of the find command into a while read -r loop, e.g.

while read -r match; do
    # process "$match" as desired
    echo "Term 'Elapsed time' found in file ${match%:*}"
done < <(find path -type f -name "vsim.log" -execdir grep -H 'Elapsed time' '{}' +)

Where:

  • find is the swiss-army knife for finding files on your system

  • path can be any relative or absolute path to search (e.g. $HOME or /home/dorojevets) to search all files in your home directory

  • the option -type f tells find to only locate files (see man find for link handling)

  • the option -name "foo" tells find to only locate files named foo (wildcards allowed)

  • the -exec and -execdir options allow you to execute the command that follows on each file (represented by '{}')

  • the grep -H 'Elapsed time' '{}' being the command to execute on each filename

  • the + being what tells find it has reached the end of the command; it also lets find pass several file names to one invocation (\; ends the command too, but runs it once per file)

  • finally, the ${match%:*} parameter expansion on the variable $match is used to parse the filename from filename:Elapsed time returned by grep -H (the %:* trims the shortest suffix that starts with a :, that is, everything from the last : in $match onward; if the matched text itself contains a colon, ${match%%:*} trims from the first : instead)

Give that a try and compare the execution time to a recursive grep of the file tree. What you may be missing in this discussion is that you use find if you know some part of the filename (or the file modification time, or the set of permissions, etc.) of the file that contains the information you need. It can search millions of files in a file tree vastly quicker than you can recursively grep every single file. If you have no clue what file may contain the needed info, then use grep and just wait...
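Two details of the commands above are easy to check on a scratch tree: what file name grep -H prints under -execdir compared with -exec, and how the % and %% trims differ when the matched line contains a colon. This is a sketch with an invented run1 directory and log line, and it assumes GNU find.

```shell
# Hypothetical log file whose matching line itself contains a colon.
demo=$(mktemp -d)
mkdir -p "$demo/run1"
printf 'Elapsed time: 42 s\n' > "$demo/run1/vsim.log"

# -execdir runs grep inside each file's own directory, so -H reports a name
# relative to that directory. (GNU find refuses -execdir if PATH contains
# a relative entry such as ".")
in_dir=$(cd "$demo" && find . -type f -name 'vsim.log' -execdir grep -H 'Elapsed time' '{}' +)

# -exec runs grep from the starting directory, so -H reports the path find walked.
from_top=$(cd "$demo" && find . -type f -name 'vsim.log' -exec grep -H 'Elapsed time' '{}' +)

short_trim=${from_top%:*}     # removes everything from the last colon
long_trim=${from_top%%:*}     # removes everything from the first colon

printf '%s\n' "$in_dir" "$from_top" "$short_trim" "$long_trim"
rm -rf "$demo"
```

If the directory of each match matters, the -exec form (or printing the directory separately) may be the more convenient choice, since -execdir output no longer carries the path.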

How to find all files containing specific text (string) on Linux

Do the following:

grep -rnw '/path/to/somewhere/' -e 'pattern'
  • -r or -R is recursive,
  • -n is line number, and
  • -w stands for match the whole word.
  • -l (lower-case L) can be added to just give the file name of matching files.
  • -e introduces the pattern to search for

Along with these, --exclude, --include, --exclude-dir flags could be used for efficient searching:

  • This will only search through those files which have .c or .h extensions:
grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
  • This will exclude searching all the files ending with .o extension:
grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
  • For directories it's possible to exclude one or more directories using the --exclude-dir parameter. For example, this will exclude the dirs dir1/, dir2/ and all of them matching *.dst/:
grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/somewhere/' -e "pattern"

This works very well for me, for almost the same purpose as yours.

For more options, see man grep.
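The options above can be combined. The following sketch, which assumes GNU grep and uses invented file names, shows -w, -l, --include and --exclude-dir narrowing a search to a single file:

```shell
# Hypothetical source tree with an object file and a build directory to skip.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/build"
printf 'int pattern = 0;\n' > "$demo/src/main.c"
printf 'int patterns = 0;\n' > "$demo/src/util.c"
printf 'pattern\n' > "$demo/src/main.o"
printf 'int pattern;\n' > "$demo/build/gen.c"

# -w: "patterns" is not a whole-word match; -l: print only the names of matching files.
files=$(cd "$demo" && grep -rlw --include='*.c' --include='*.h' --exclude-dir=build -e 'pattern' .)
printf '%s\n' "$files"
rm -rf "$demo"
```

Here util.c is dropped by -w, main.o by the --include patterns, and build/gen.c by --exclude-dir, leaving only src/main.c.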


