How to Perform Grep Operation on All Files in a Directory

How to grep all files that contain a keyword and were created today

Use the find command as the source of the list of files to search.

grep -n "pass" $(find . -daystart -ctime 0 -name '*.json')

-ctime 0 says "the inode change time (the closest thing most Unix filesystems have to a creation date) should be 0 days ago", and -daystart says to measure "days ago" from the beginning of today, not from 24 hours ago.
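A quick sandbox check of the -daystart semantics (a sketch; the temporary directory and file name are arbitrary):

```shell
# Sandbox: a file created right now must match "changed today".
tmp=$(mktemp -d)
touch "$tmp/today.json"

# -daystart must precede the tests it modifies; -ctime 0 then means
# "inode change time within the current calendar day", not "within
# the last 24 hours".
found=$(find "$tmp" -daystart -ctime 0 -name '*.json')

echo "$found"
rm -rf "$tmp"   # clean up; $found still holds the result
```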

How to grep all files besides the current dir, the parent dir, and one defined dir?

Better to use find with the -not option instead of an error-prone ls | grep:

find . -maxdepth 1 -mindepth 1 -not -name dist
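A quick demonstration in a throwaway directory (the entry names are arbitrary):

```shell
tmp=$(mktemp -d)
mkdir "$tmp/dist" "$tmp/src"
touch "$tmp/README"

# List direct children only, excluding the dist entry. -mindepth 1
# keeps the starting directory "." itself out of the output.
kept=$(cd "$tmp" && find . -maxdepth 1 -mindepth 1 -not -name dist | sort)

echo "$kept"
rm -rf "$tmp"
```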

By the way, just to fix your attempt, a correct ls | grep would be:

ls -a | grep -Ev '^(dist|\.\.?)$'

How can I grep a directory and write the full line found to a file?

Your -l flag is the culprit here. It suppresses the matched lines and, because of the -r recursive search, prints only the names of the files containing a match. Even without the -l flag you get filename:line output.

So to store just the matched lines in the file, add an awk stage to the pipeline, as below. (Note that a field-based -F: '{ print $2 }' would truncate these lines, since the matched text itself contains a colon; stripping only the leading filename: prefix is safer.)

grep -r "Found an error with id:" . | awk '{ sub(/^[^:]*:/, ""); print }' > results.log

Or, with grep alone, just skip printing the filenames with the -h flag (both the FreeBSD and GNU variants support it):

grep -rh "Found an error with id:" . > results.log
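A minimal check of the -h behaviour in a throwaway directory:

```shell
tmp=$(mktemp -d)
printf 'Found an error with id: 1\n' > "$tmp/a.log"
printf 'all good here\n'             > "$tmp/b.log"

# -r searches recursively, -h drops the "filename:" prefix so only
# the matching lines themselves are written out.
lines=$(grep -rh "Found an error with id:" "$tmp")

echo "$lines"
rm -rf "$tmp"
```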

And by the way, awk is powerful in its own right: it can do the pattern search itself on the file system. So you could do

awk '/Found an error with id:/' * > results.log 2>/dev/null

Grep for string in all files in dir, append output file name to txt if found?

Try putting the directory path in "" marks as follows:

#!/bin/bash

files="/path/to/files/*"
for f in $files
do
    if grep -i "example" "$f"; then
        echo "found"
        echo "$f" >> ~/found.txt
    fi
done

I've just tested the above script, and it worked each time I changed the word grep was searching for.
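If only the file names are needed, the loop can be collapsed into a single grep call; a sketch, assuming a grep that supports -r, -i, and -l (both GNU and BSD grep do):

```shell
tmp=$(mktemp -d)
printf 'an example line\n' > "$tmp/hit.txt"
printf 'nothing here\n'    > "$tmp/miss.txt"

# -r recurse, -i match case-insensitively, -l print only the names
# of files that contain at least one match.
names=$(grep -ril "example" "$tmp")

echo "$names"
rm -rf "$tmp"
```

Appending to the log then becomes grep -ril "example" /path/to/files >> ~/found.txt.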

Grep a word within all files in a list that contains $HOME

You will need a little bit of magic from envsubst:

$ grep word $(envsubst < list.txt)
...

Grep inside all files created within date range

This is a little different from Banthar's solution, but it will work with versions of find that don't support -newermt and it shows how to use the xargs command, which is a very useful tool.

You can use the find command to locate files "of a certain age". This will find all files modified between 5 and 10 days ago:

 find /directory -type f -mtime -10 -mtime +5

To then search those files for a string:

 find /directory -type f -mtime -10 -mtime +5 -print0 |
     xargs -0 grep -l expression

You can also use the -exec switch, but I find xargs more readable (and it will often perform better, too, but possibly not in this case).

(Note that the -0 flag is there to let this command operate on file names with embedded spaces, such as "this is my filename".)
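A quick demonstration that the NUL-delimited pipeline survives a space-laden name (a sketch; file names and contents are arbitrary):

```shell
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/this is my filename"
printf 'hay\n'    > "$tmp/plain"

# -print0 / -0 pass the names NUL-delimited, so the embedded spaces
# survive the trip through xargs intact.
hits=$(find "$tmp" -type f -print0 | xargs -0 grep -l needle)

echo "$hits"
rm -rf "$tmp"
```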

Update for question in comments

When you provide multiple expressions to find, they are ANDed together. E.g., if you ask for:

find . -name foo -size +10k

...find will only return files that are both (a) named foo and (b) larger than 10 kbytes. Similarly, if you specify:

find . -mtime -10 -mtime +5

...find will only return files that are (a) newer than 10 days ago and (b) older than 5 days ago.

For example, on my system it is currently:

$ date
Fri Aug 19 12:55:21 EDT 2016

I have the following files:

$ ls -l
total 0
-rw-rw-r--. 1 lars lars 0 Aug 15 00:00 file1
-rw-rw-r--. 1 lars lars 0 Aug 10 00:00 file2
-rw-rw-r--. 1 lars lars 0 Aug 5 00:00 file3

If I ask for "files modified more than 5 days ago" (-mtime +5), I get:

$ find . -mtime +5
./file3
./file2

But if I ask for "files modified more than 5 days ago but less than 10 days ago" (-mtime +5 -mtime -10), I get:

$ find . -mtime +5 -mtime -10
./file2

perform an operation for *each* item listed by grep

If I understand your specification, you want:

grep --null -l '<pattern>' directory/*.extension1 | \
xargs -n 1 -0 -I{} bash -c 'rm "$1" "${1%.*}.extension2"' -- {}

This is essentially the same as what @triplee's comment describes, except that it is safe for file names containing newlines.

What's going on here?

grep with --null will return output delimited with nulls instead of newline. Since file names can have newlines in them delimiting with newline makes it impossible to parse the output of grep safely, but null is not a valid character in a file name and thus makes a nice delimiter.

xargs will take a stream of newline-delimited items and execute a given command, passing as many of those items as possible (each as a separate parameter) to that command (or to echo if no command is given). Thus if you said:

printf 'one\ntwo three \nfour\n' | xargs echo

xargs would execute echo one 'two three' four. This is not safe for file names because, again, file names might contain embedded newlines.

The -0 switch to xargs changes it from looking for a newline delimiter to a null delimiter. This makes it match the output we got from grep --null and makes it safe for processing a list of file names.
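The earlier printf experiment, NUL-delimited this time (printf '[%s]' merely brackets each argument xargs hands it):

```shell
# Two items, one containing a space, delimited by NUL bytes.
# xargs -0 receives exactly two arguments: "two three" and "four".
out=$(printf 'two three\0four\0' | xargs -0 printf '[%s]')

echo "$out"
```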

Normally xargs simply appends the input to the end of the command line. The -I switch tells xargs instead to substitute the input for the specified replacement string. To get the idea, try this experiment:

printf 'one\ntwo three \nfour\n' | xargs -I{} echo foo {} bar

And note the difference from the earlier printf | xargs command.

In the case of my solution the command I execute is bash, to which I pass -c. The -c switch causes bash to execute the commands in the following argument (and then terminate) instead of starting an interactive shell. The next block, 'rm "$1" "${1%.*}.extension2"', is the first argument to -c and is the script which will be executed by bash. Any arguments following the script argument to -c are assigned as the arguments to the script. Thus, if I were to say:

bash -c 'echo $0' "Hello, world"

Then Hello, world would be assigned to $0 (the first argument to the script) and inside the script I could echo it back.

Since $0 is normally reserved for the script name I pass a dummy value (in this case --) as the first argument and, then, in place of the second argument I write {}, which is the replacement string I specified for xargs. This will be replaced by xargs with each file name parsed from grep's output before bash is executed.
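To see the whole pipeline in action without deleting anything, here is a dry run with echo substituted for rm (a sketch in a throwaway directory; <pattern> is matched literally here, and --null assumes a GNU or recent BSD grep):

```shell
tmp=$(mktemp -d)
printf '<pattern>\n' > "$tmp/report.extension1"
touch "$tmp/report.extension2"

# Same pipeline as above, but with echo in place of rm: each matched
# .extension1 file and its derived .extension2 sibling are printed
# instead of removed.
out=$(grep --null -l '<pattern>' "$tmp"/*.extension1 |
      xargs -n 1 -0 -I{} bash -c 'echo "$1" "${1%.*}.extension2"' -- {})

echo "$out"
rm -rf "$tmp"
```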

The mini shell script might look complicated but it's rather trivial. First, the entire script is single-quoted to prevent the calling shell from interpreting it. Inside the script I invoke rm and pass it two file names to remove: the $1 argument, which was the file name passed in when the replacement string was substituted above, and ${1%.*}.extension2. The latter is a parameter expansion on the $1 variable. The important part is %.*, which says:

  • % — Match from the end of the variable and remove the shortest string matching the pattern.
  • .* — The pattern is a single period followed by anything.

This effectively strips the extension, if any, from the file name. You can observe the effect yourself:

foo='my file.txt'
bar='this.is.a.file.txt'
baz='no extension'
printf '%s\n' "${foo%.*}" "${bar%.*}" "${baz%.*}"

Since the extension has been stripped I concatenate the desired alternate extension .extension2 to the stripped file name to obtain the alternate file name.
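Putting the two pieces together, deriving the sibling name is a one-liner (the path here is just an illustration):

```shell
f='logs/report.extension1'

# ${f%.*} drops the shortest trailing ".something"; the desired
# alternate extension is then appended to the stripped name.
sibling="${f%.*}.extension2"

echo "$sibling"
```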

Grep and extract specific data in multiple log files

grep can return the whole line or the string which matched. For extracting a different piece of data from the matching lines, turn to sed or Awk.

awk -v search="/libs/granite/omnisearch" '$0 ~ search { s = $0; sub(/.*fulltext=/, "", s); sub(/&.*/, "", s); print $1, s }' ~/Downloads/ReqLogs/*

or

sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/\1/p' ~/Downloads/ReqLogs/*

The sed version is more succinct, but also somewhat more oblique.

\%...% uses the alternate delimiter % so that we can use literal slashes in our search expression.

The s/ .../\1/p then says to replace everything on the matching lines after the first space, capturing anything between fulltext= and &, and replace with the captured substring, then print the resulting line.

The -n flag turns off the default printing action, so that we only print the lines where the search expression matched.

The wildcard ~/Downloads/ReqLogs/* matches all files in that directory; if you really need to traverse subdirectories, too, perhaps add find to the mix.

find ~/Downloads/ReqLogs -type f -exec sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/\1/p' {} +

or similarly with the Awk command after -exec. The placeholder {} tells find where to add the name of the found file(s) and + says to put as many as possible in one go, rather than running a separate -exec for each found file. (If you want that, use \; instead of +.)
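To sanity-check the extraction, here is the Awk version run against a fabricated log file (the request format below is only an assumption about what the real logs look like):

```shell
tmp=$(mktemp -d)
cat > "$tmp/req.log" <<'EOF'
09:12:01 GET /libs/granite/omnisearch.json?fulltext=hello&rows=20
09:12:05 GET /content/page.html
EOF

# Same Awk program as above: on matching lines, strip everything up
# to fulltext= and everything from the next &, then print the first
# field (the timestamp here) and the extracted search term.
out=$(awk -v search="/libs/granite/omnisearch" \
  '$0 ~ search { s = $0; sub(/.*fulltext=/, "", s); sub(/&.*/, "", s); print $1, s }' \
  "$tmp/req.log")

echo "$out"
rm -rf "$tmp"
```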


