Grep Files Based on Time Stamp

grep files based on time stamp

To find 'pattern' in all files newer than some_file in the current directory and its sub-directories recursively:

find -newer some_file -type f -exec grep 'pattern' {} +

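With GNU find you could also specify the timestamp directly, using -newermt with a string in date -d format (for example -newermt '2024-01-31 12:00'), and combine it with other find tests such as -name or -mmin.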
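As a rough illustration of the timestamp variant, here is a minimal sketch that assumes GNU find and GNU touch. The temporary directory, the file names and the dates are made up for the demonstration.

```shell
# Sketch: grep only files modified after a given date (GNU find/touch assumed).
tmp=$(mktemp -d)
printf 'pattern here\n' > "$tmp/old.txt"
printf 'pattern here\n' > "$tmp/new.txt"
touch -d '2020-01-01 00:00:00' "$tmp/old.txt"   # backdate one file
touch -d '2023-06-01 00:00:00' "$tmp/new.txt"

# -newermt compares each file's modification time with the given date string.
matches=$(find "$tmp" -type f -newermt '2022-01-01' -exec grep -l 'pattern' {} +)
echo "$matches"   # only the path of new.txt is expected here
rm -rf "$tmp"
```

Both files contain the pattern, so any difference in the result comes from the time test alone.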

The file list could also be generated by your build system if find is too slow.

More specific tools such as ack, etags, GCCSense might be used instead of grep.

How to grep a directory based on the timestamp of the file?

find . -type f -mtime -2 -exec grep -H 0xdataH20 {} \;

How to print file details of files matching grep pattern

grep -Zl 3456 * | xargs -0 ls -l

with GNU grep. The options are:

  • grep -Z and xargs -0: separate output names by a NUL byte instead of by whitespace. This way you can handle filenames that include spaces.
  • grep -l: print only the filenames that match
  • ls -l: Standard ls long output, which appears to be what you are asking for.

Tested on latest cygwin.
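One detail worth checking: if grep matches nothing, xargs may still run ls -l once with no arguments, which lists the current directory. The sketch below assumes GNU xargs and adds its -r option to skip that run; the directory and file names are invented for the demonstration.

```shell
# Sketch: long listing of matching files, safe for names with spaces.
tmp=$(mktemp -d)
printf 'id 3456\n' > "$tmp/with space.txt"
printf 'nothing\n' > "$tmp/other.txt"

# -Z ends each name with NUL, -0 reads NUL-separated names,
# -r (GNU xargs) does not run ls at all when grep prints nothing.
listing=$(cd "$tmp" && grep -Zl 3456 -- * | xargs -0 -r ls -l)
echo "$listing"
rm -rf "$tmp"
```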

How to grep files in date order

You may use this pipeline to achieve this with GNU utilities:

find . -maxdepth 1 -name '*.py' -printf '%T@:%p\0' |
sort -z -t : -rnk1 |
cut -z -d : -f2- |
xargs -0 grep 'pattern'

This will handle filenames with special characters such as spaces, newlines and glob characters.

  1. find finds all *.py files in current directory and prints modification time (epoch value) + : + filename + NUL byte
  2. sort command performs reverse numeric sort on first column that is timestamp
  3. cut command removes 1st column (timestamp) from output
  4. xargs -0 grep command searches pattern in each file
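To see the ordering in action, the four steps above can be tried on two throwaway files. This is only a sketch assuming GNU find, sort, cut and touch; the file names and dates are made up, and -H is added to grep so the file name is always printed.

```shell
# Sketch: grep *.py files newest first (GNU find/sort/cut/touch assumed).
tmp=$(mktemp -d)
printf 'x = "pattern"\n' > "$tmp/a.py"
printf 'y = "pattern"\n' > "$tmp/b c.py"
touch -d '2021-01-01' "$tmp/a.py"      # older file
touch -d '2022-01-01' "$tmp/b c.py"    # newer file, name contains a space

ordered=$(cd "$tmp" &&
  find . -maxdepth 1 -name '*.py' -printf '%T@:%p\0' |
  sort -z -t : -rnk1 |
  cut -z -d : -f2- |
  xargs -0 grep -H 'pattern')
echo "$ordered"    # the match from "b c.py" should come first
rm -rf "$tmp"
```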

How do I get the dates of files in my output from a grep search?

The problem is that grep is outputting the lines that match your string, not the file names, so in your second example you're trying to call stat on a string, not on a file!

You should add a -l parameter to your grep command in order to not output the matching line but the file that contains it. Try this:

grep -lrnw '/my_path/' -e 'search_string' | stat -c %n':'%z > list.txt

[EDIT] Anyway, this would not work, because stat takes file names as arguments and does not read them from a pipe. The solution is then:

stat -c %n':'%z $(grep -lrnw '/my_path/' -e 'search_string') > list.txt
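The command substitution above is split on whitespace by the shell, so it may misbehave for paths that contain spaces. A possible alternative, assuming GNU grep, xargs and stat, passes the names NUL-separated instead. The directory and file names below are made up, and -n is dropped because it has no effect together with -l.

```shell
# Sketch: print "name:change time" for each matching file, NUL-safe.
tmp=$(mktemp -d)
printf 'a search_string b\n' > "$tmp/my file.txt"
printf 'unrelated\n' > "$tmp/skip.txt"

# grep -Z emits NUL-terminated names; xargs -0 hands them to stat intact.
out=$(grep -lrwZ -e 'search_string' "$tmp" | xargs -0 -r stat -c '%n:%z')
echo "$out"
rm -rf "$tmp"
```

Redirecting the pipeline to list.txt instead of capturing it in a variable gives the same file as in the original command.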

GREP date from email header and make it the files creation date

What follows assumes you are using the default macOS utilities (touch, date...). As they are completely outdated, some adjustments will be needed if you use more recent versions (e.g. MacPorts or brew). It also assumes that you are using bash.

If you have sub-folders ls is not the right tool. And anyway, the output of ls is not for computers, it is for humans. So, the first thing to do is find all email files. Guess what? The utility that does this is named find:

$ find . -type f -name '*.emlx'
./foo/bar.emlx
./baz.emlx
...

This searches for regular files (-type f), starting from the current directory (.), whose names match *.emlx (-name '*.emlx'). Adapt to your situation. If all files are email files you can skip the -name ... part.

Next we need to loop over all these files and process each of them. This is a bit more complex than for f in ... for several reasons (large number of files, file names with spaces...). A robust way to do this is to redirect the output of a find command to a while loop:

while IFS= read -r -d '' f; do
    <process file "$f">
done < <(find . -type f -name '*.emlx' -print0)

The -print0 option of find is used to separate the file names with a null character instead of the default newline character. The < <(find...) part is a way to redirect the output of find to the input of the while loop. The while IFS= read -r -d '' f; do reads each file name produced by find, stores it in shell variable f, preserving the leading and trailing spaces if any (IFS=), the backslashes (-r) and using the null character as separator (-d '').
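If bash is not available, a different but roughly equivalent pattern is to let find start a small sh script and pass it the file names as arguments. This is a sketch of that alternative, not the loop described above; the directory layout is invented and the printf stands in for the real per-file work.

```shell
# Sketch: process every *.emlx file via find -exec sh -c (POSIX sh).
tmp=$(mktemp -d)
mkdir "$tmp/sub dir"
: > "$tmp/one.emlx"
: > "$tmp/sub dir/two words.emlx"
: > "$tmp/ignored.txt"

processed=$(find "$tmp" -type f -name '*.emlx' -exec sh -c '
  for f do                        # loops over the file names given as arguments
    printf "%s\n" "${f##*/}"      # placeholder: print the base name only
  done' sh {} + | sort)
echo "$processed"
rm -rf "$tmp"
```

Because the names travel as arguments rather than as text on a pipe, spaces and newlines in them need no special separator.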

Now we must code the processing of each file. Let's first retrieve the delivery time, assuming it is always the second word of the last line starting with X-Delivery-Time::

awk '/^X-Delivery-Time:/ {t = $2} END {print t}' "$f"

does that. If you don't know awk already it's time to learn a bit of it. It's one of the very useful Swiss knives of text processing (sed is another). But let's improve it a bit so that it returns the first delivery time encountered instead of the last, stops as soon as it finds it, and also checks that the timestamp is a real timestamp (digits only):

awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f"

The [[:space:]]+ part of the regular expression matches 1 or more spaces, tabs,... and the [[:digit:]]+ matches 1 or more digits. ^ and $ match the beginning and the end of the line, respectively. The result can be assigned to a shell variable:

t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"

Note that if there was no match the t variable will store the empty string. We will use this later to skip such files.
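To check both cases, here is a small sketch that runs the same awk command on a made-up header with a valid timestamp and then on a file without one. The header contents are invented, and an awk with POSIX character class support is assumed.

```shell
# Sketch: extract the first numeric X-Delivery-Time, or nothing at all.
f=$(mktemp)
cat > "$f" <<'EOF'
From: someone@example.com
X-Delivery-Time: 1535436541
Subject: hello
X-Delivery-Time: not-a-number
EOF
t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
echo "t=[$t]"

# Same command on a file that has no such header: the result is empty.
printf 'Subject: no timestamp\n' > "$f"
t_missing="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
echo "t_missing=[$t_missing]"
rm -f "$f"
```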

Once we have this delivery time, which looks like a UNIX timestamp (seconds since 1970/01/01) in your example, we must use it to change the last modification time of the email file. The command that does this is touch:

$ man touch
...
touch [-A [-][[hh]mm]SS] [-acfhm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]] file ...
...

Unfortunately touch wants a time in the CCYYMMDDhhmm.SS format. No worry, the date utility can be used to convert a UNIX timestamp to any format we like. For instance, with your example timestamp (1535436541):

$ date -r 1535436541 +%Y%m%d%H%M.%S
201808280809.01
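Note that date -r takes a timestamp only in the macOS (BSD) date; with GNU date, -r expects a reference file. Assuming GNU date, the equivalent conversion would look roughly like this. TZ=UTC is set here only to make the result independent of the local time zone, which is why the hour differs from the output above.

```shell
# Sketch: convert a UNIX timestamp with GNU date (the @ prefix means epoch seconds).
ts=1535436541
stamp=$(TZ=UTC date -d "@$ts" +%Y%m%d%H%M.%S)
echo "$stamp"    # 201808280609.01 (UTC)
```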

We are almost done:

while IFS= read -r -d '' f; do
    # uncomment for debugging
    # echo "processing $f"
    t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
    if [ -z "$t" ]; then
        echo "no delivery time found in $f"
        continue
    fi
    # uncomment for debugging
    # echo touch -t "$(date -r "$t" +%Y%m%d%H%M.%S)" "$f"
    touch -t "$(date -r "$t" +%Y%m%d%H%M.%S)" "$f"
done < <(find . -type f -name '*.emlx' -print0)

Note how we test if t is the empty string (if [ -z "$t" ]). If it is, we print a message and jump to the next file (continue). Just put all this in a file with a shebang line and run...
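With GNU touch the date conversion step may not be needed, since touch -d accepts an epoch value written as @seconds. The following sketch handles a single made-up file under that assumption and reads the result back with GNU stat to confirm it.

```shell
# Sketch: set a file's mtime from its X-Delivery-Time header (GNU touch/stat assumed).
f=$(mktemp)
printf 'X-Delivery-Time: 1535436541\n\nbody\n' > "$f"

t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
if [ -n "$t" ]; then
    touch -d "@$t" "$f"       # @N means N seconds since the epoch
fi

mtime=$(stat -c %Y "$f")      # modification time as epoch seconds
echo "$mtime"
rm -f "$f"
```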

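If, instead of the X-Delivery-Time field, you must use a Date field with a more complex and variable format (e.g. Date: Mon, 11 Jun 2018 10:36:14 +0200), the best would be to install a decently recent version of touch with the coreutils package of MacPorts or Homebrew (these typically install the GNU tools with a g prefix, e.g. gtouch). The gensub function used below is a GNU awk extension, so GNU awk (gawk) is needed as well. Then: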

while IFS= read -r -d '' f; do
    # gensub() is a GNU awk extension; [[:space:]]+ matches the blanks after "Date:"
    t="$(awk '/^Date:/ {print gensub(/^Date:[[:space:]]+(.*)$/,"\\1","1"); exit}' "$f")"
    if [ -z "$t" ]; then
        echo "no Date field found in $f"
        continue
    fi
    touch -d "$t" "$f"
done < <(find . -type f -name '*.emlx' -print0)

The awk command is slightly more complex. It prints the matching line without the Date: prefix. The following sed command (GNU sed syntax) would do the same in a more compact form but would not really be more readable:

t="$(sed -rn 's/^Date:\s*(.*)/\1/p;Ta;q;:a' "$f")"
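If neither gawk nor GNU sed is at hand, the header can probably be extracted with plain POSIX sed plus head. The sketch below does that on a made-up file and still assumes GNU touch and GNU stat for setting and reading back the time.

```shell
# Sketch: set mtime from the first Date: header (POSIX sed, GNU touch/stat assumed).
f=$(mktemp)
printf 'Date: Mon, 11 Jun 2018 10:36:14 +0200\nSubject: x\n' > "$f"

# Strip the "Date:" prefix and keep only the first such line.
d=$(sed -n 's/^Date:[[:space:]]*//p' "$f" | head -n 1)
if [ -n "$d" ]; then
    touch -d "$d" "$f"        # GNU touch parses this mail-style date string
fi

mtime=$(stat -c %Y "$f")
echo "$mtime"                 # 1528706174, i.e. 2018-06-11 08:36:14 UTC
rm -f "$f"
```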

grep for timestamp and word

The main problem with your code is that you're grepping for date stamps that are exactly 30 minutes old instead of any time in the last 30 minutes.

You could accomplish what you want with GNU Awk by passing the current date in seconds from the shell as a variable. Then you can convert the datestamp in the log to seconds and subtract that from the current date variable to see if it's in the last 30 minutes (1800 seconds).

awk -v current="$(date +%s)" -v IGNORECASE=1 -F\. 'BEGIN {
    n = 0
}
/^[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2}./ && /exception/ {
    time = $1
    format = "\\1 \\2 \\3 \\4 \\5 \\6"
    # third argument 1: replace the first match (an empty string triggers a gawk warning)
    seconds = mktime(gensub(/(....)-(..)-(..) (..):(..):(..)/, format, 1, time))
    if ((current - seconds) <= 1800)
        n++
}
END {
    print n
}' /path/to/the/derp
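Where gawk is not installed, the same count could be approximated with grep and GNU date in a plain shell loop. This is a slower, different technique and only a sketch: the log lines are generated for the demonstration, and the timestamp is assumed to be everything before the first dot on the line.

```shell
# Sketch: count "exception" lines from the last 30 minutes (GNU date assumed).
log=$(mktemp)
{
  printf '%s.123 ERROR NullPointerException in foo\n' "$(date -d '10 minutes ago' '+%Y-%m-%d %H:%M:%S')"
  printf '%s.456 ERROR Exception in bar\n' "$(date -d '2 hours ago' '+%Y-%m-%d %H:%M:%S')"
  printf '%s.789 INFO all good\n' "$(date -d '5 minutes ago' '+%Y-%m-%d %H:%M:%S')"
} > "$log"

now=$(date +%s)
recent=$(grep -i 'exception' "$log" | {
  n=0
  while IFS= read -r line; do
    stamp=${line%%.*}                          # text before the first dot
    secs=$(date -d "$stamp" +%s) || continue   # skip lines date cannot parse
    if [ $((now - secs)) -le 1800 ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
})
echo "$recent"    # only the 10-minute-old exception should be counted
rm -f "$log"
```

It starts one date process per matching line, so for large logs the awk version is likely to be much faster.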

Grep a time stamp in the H:MM:SS format

Here is the fix:

grep '[0-9]:[0-9][0-9]:[0-9][0-9]'

If you need only the timestamp itself and your grep is GNU grep, add -o:

grep -o '[0-9]:[0-9][0-9]:[0-9][0-9]'

To be stricter, restrict the digits to valid time values (note that this variant expects a two-digit hour, i.e. HH:MM:SS):

grep '[0-2][0-9]:[0-5][0-9]:[0-5][0-9]'
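One thing to watch with the single-digit-hour pattern: it also matches the tail of a two-digit hour, so 12:34:56 yields 2:34:56. The sketch below, assuming GNU grep, shows this on made-up sample lines and uses -w as one possible way to require a whole-word match.

```shell
# Sketch: loose vs. whole-word matching of H:MM:SS (GNU grep assumed).
sample=$(printf '%s\n' 'job took 1:23:45 today' 'clock says 12:34:56' 'no time here')

# Loose: also picks up the tail of the two-digit hour on the second line.
loose=$(printf '%s\n' "$sample" | grep -o '[0-9]:[0-9][0-9]:[0-9][0-9]')

# Strict: -w rejects matches that are preceded or followed by a word character.
strict=$(printf '%s\n' "$sample" | grep -ow '[0-9]:[0-5][0-9]:[0-5][0-9]')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```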

