Pass Command-Line Arguments to Grep as Search Patterns and Print Lines Which Match Them All

Pass command-line arguments to grep as search patterns and print lines which match them all

I suggest using awk pattern logic:

 awk '/RegExp-pattern-1/ && /RegExp-pattern-2/ && /RegExp-pattern-3/' input.txt

The advantages: you can combine RegExp patterns with the logical operators && and ||, and the whole file is scanned only once.

The disadvantages: you must provide the list of files (awk can't traverse subdirectories by itself), and awk's RegExp syntax is more limited than that of grep -E or grep -P.
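To tie this back to the question, the patterns can come from command-line arguments instead of being hard-coded. Here is a minimal sketch under that assumption, not a definitive implementation. The function name match_all_awk is hypothetical, and it reads its input from stdin:

```shell
#!/bin/bash
# match_all_awk PATTERN... : print the stdin lines that match every PATTERN.
# (Hypothetical helper name, for illustration only.)
match_all_awk() {
    awk '
        BEGIN {
            # Copy the patterns out of ARGV, then delete them so that awk
            # does not try to open them as input files and reads stdin instead.
            n = ARGC - 1
            for (i = 1; i <= n; i++) { pat[i] = ARGV[i]; delete ARGV[i] }
        }
        {
            # Skip the line as soon as one pattern fails to match.
            for (i = 1; i <= n; i++) if ($0 !~ pat[i]) next
            print
        }
    ' "$@"
}
```

Usage would look like match_all_awk foo bar < input.txt. The file is still scanned only once, no matter how many patterns are given.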

Creating a shell script that does multiple grep operations (AND operation)

You need this:

#!/bin/bash
myArray=( "$@" ) # store all parameters as an array
# use the first param as the first grep; printf %q quotes it safely for eval
grep="grep -e $(printf '%q' "${myArray[0]}") somefile"
unset 'myArray[0]' # then unset the first param
for arg in "${myArray[@]}"; do
    # cycle through the rest of the params to build the AND grep logic
    grep="$grep | grep -e $(printf '%q' "$arg")"
done
eval "$grep" # and finally execute the built line
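If you would rather not build a command string for eval, the same AND logic can be written as a small recursive filter. This is only a sketch of an alternative technique, not the script above. The function name filter_all is hypothetical, and it reads from stdin instead of a fixed file:

```shell
#!/bin/bash
# filter_all PATTERN... : pass stdin through one grep per PATTERN (logical AND).
# (Hypothetical helper name, for illustration only.)
filter_all() {
    if [ "$#" -eq 0 ]; then
        cat                      # no patterns left: pass the lines through
    else
        local pattern=$1
        shift
        # -e keeps patterns that start with "-" from being read as options
        grep -e "$pattern" | filter_all "$@"
    fi
}
```

Calling filter_all "$@" < somefile from a script gives the same result as the chained greps, and patterns containing quotes or spaces need no special handling because nothing is re-parsed by the shell.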

How to grep for two words existing on the same line?

Why do you pass -c? That will just show the number of matching lines. Similarly, there is no reason to use -r. I suggest you read man grep.

To grep for 2 words existing on the same line, simply do:

grep "word1" FILE | grep "word2"

grep "word1" FILE prints all lines of FILE that contain word1, and grep "word2" then keeps only those of them that also contain word2. Hence, combining the two with a pipe shows the lines containing both word1 and word2.

If you just want a count of how many lines had the 2 words on the same line, do:

grep "word1" FILE | grep -c "word2"

Also, to address your question of why it gets stuck: in grep -c "word1", you did not specify a file. Therefore, grep expects input from stdin, which is why it seems to hang. You can press Ctrl+D to send an EOF (end-of-file) so that it quits.
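To see both forms side by side, here is a small self-contained sketch. The sample file and its contents are made up for the demonstration:

```shell
#!/bin/bash
# Hypothetical sample data, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'word1 and word2' 'only word1' 'only word2' 'word2 then word1' > "$sample"

both=$(grep "word1" "$sample" | grep "word2")       # lines containing both words
count=$(grep "word1" "$sample" | grep -c "word2")   # how many such lines

printf '%s\n' "$both"
printf 'count: %s\n' "$count"
rm -f "$sample"
```

Note that the order of the two words on the line does not matter, since each grep tests the line independently.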

How to grep and execute a command (for every match)

grep file foo | while IFS= read -r line ; do echo "$line" | date +%s.%N ; done

More readably in a script:

grep file foo | while IFS= read -r line
do
    echo "$line" | date +%s.%N
done

For each line of input, read puts the value into the variable $line, and the while statement executes the loop body between do and done. Since the value is now in a variable and not on stdin, I've used echo to push it back into stdin, but you could just do date +%s.%N "$line", assuming date works that way. (date itself ignores stdin, so in practice substitute the command you actually want to run for each match.)

Avoid using for line in `grep file foo`, which looks similar, because for splits its input on any whitespace rather than on line boundaries, and this becomes a nightmare when reading lists of files:

 find . -iname "*blah*.dat" | while read filename; do ....

would fail with for as soon as a file name contains a space.
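The difference is easy to demonstrate. The sketch below uses two made-up file names, one of which contains a space, and counts how many items each loop sees:

```shell
#!/bin/bash
# Hypothetical input: two lines, the first of which contains a space.
lines=$(printf '%s\n' 'my file.dat' 'other.dat')

for_count=0
for item in $lines; do            # unquoted on purpose: splits on every space
    for_count=$((for_count + 1))
done

while_count=0
while IFS= read -r line; do       # reads exactly one line per iteration
    while_count=$((while_count + 1))
done <<< "$lines"

echo "for saw $for_count items, while read saw $while_count lines"
```

The while loop is fed with a here-string rather than a pipe only so that the counter survives the loop; in a pipeline the loop body runs in a subshell and its variable changes are lost afterwards.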

Print only a part of a match with grep

You can use a Perl one-liner to match each line of the file against a single regex with an appropriate capture group, and for each line that matches you can print the submatch.

There are several ways to use Perl for this task. I suggest going with the perl -ne {program} idiom, which implicitly loops over the lines of stdin and executes the one-liner {program} once for each line, with the current line made available as the $_ special variable. (Note: The -n option does not cause the final value of $_ to be automatically printed at the end of each iteration of the implicit loop, which is what the -p option would do; that is, perl -pe {program}.)

Below is the solution. Note that I decided to pass the target hostname using the obscure -s option, which enables parsing of variable assignment specifications after the {program} argument, similar to awk's -v option. (It is not possible to pass normal command-line arguments with the -n option because the implicit while (<>) { ... } loop treats all such arguments as file names, but the -s mechanism provides an excellent solution. See "Is it possible to pass command-line arguments to @ARGV when using the -n or -p options?") This design avoids the need to embed the $DHCP_HOSTNAME variable in the {program} string itself, which allows us to single-quote it and save a few (actually 8) backslashes.

DHCP_HOSTNAME='client3';
perl -nse 'print($1) if m(^\s*host\s*$host\s*\{.*\bhardware\s*ethernet\s*(..:..:..:..:..:..));' -- -host="$DHCP_HOSTNAME" <dhcpd.cfg;
## AB:CD:EF:01:23:45

I often prefer Perl to sed for the following reasons:

  • Perl provides a complete general-purpose programming environment, whereas sed is more limited.
  • Perl has an enormous repository of publicly available modules on CPAN which can easily be installed and then used with the -M{module} option. sed is not extensible.
  • Perl has a much more powerful regular expression engine than sed, with lookaround assertions, backtracking control verbs, within-regex and replacement Perl code, many more options and special escapes, embedded group options, and more. See perlre.
  • Counter-intuitively, despite its greater sophistication, Perl is often much faster than sed due to its two-pass process and highly optimized opcode implementation. See http://rc3.org/2014/08/28/surprisingly-perl-outperforms-sed-and-awk/ for example.
  • I often find that the equivalent Perl implementation is more intuitive than that of sed, since sed has a more primitive set of commands for manipulating the underlying text.

How to print the file names from which I grep some lines

the problem is that I don't know which line comes from which file

Well no, you don't, because you have concatenated the contents of all the files into a single stream. If you want to be able to identify at the point of pattern matching which file each line comes from then you have to give that information to grep in the first place. Like this, for example:

find ./*/*/folderA/*DTI*.json |
xargs grep -i -E -H '(phaseencodingdirection|phaseencodingaxis)' > phase_direction

The xargs program converts lines read from its standard input into arguments to the specified command (grep in this case). The -H option to grep causes it to list the filename of each match along with the matching line itself.

Alternatively, this variation on the same thing is a little simpler, and closer in some senses to the original:

grep -i -E -H '(phaseencodingdirection|phaseencodingaxis)' \
$(find ./*/*/folderA/*DTI*.json) > phase_direction

That takes xargs out of the picture, and moves the command substitution directly to the argument list of grep.

But now observe that if the pattern ./*/*/folderA/*DTI*.json does not match any directories then find isn't actually doing anything useful for you. There is then no directory recursion to be done, and you haven't specified any tests, so the command substitution will simply expand to all the paths that match the pattern, just like the pattern would do if expanded without find. Thus, this is probably best of all:

grep -i -E -H '(phaseencodingdirection|phaseencodingaxis)' \
./*/*/folderA/*DTI*.json > phase_direction
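If, on the other hand, the files do sit at varying depths and you really need find, one option is to let find run grep itself. The following is a sketch under that assumption; the directory tree and file contents are made up for the demonstration:

```shell
#!/bin/bash
# Hypothetical directory tree, created only for this demonstration.
top=$(mktemp -d)
mkdir -p "$top/sub 1/ses/folderA" "$top/sub2/folderA"
printf '%s\n' '"PhaseEncodingDirection": "j-",' > "$top/sub 1/ses/folderA/a_DTI_1.json"
printf '%s\n' '"PhaseEncodingAxis": "j",'       > "$top/sub2/folderA/b_DTI_2.json"

# find recurses to any depth; -exec ... {} + hands the paths to grep directly,
# so names containing spaces stay intact. -H prefixes each match with its file.
result=$(find "$top" -path '*/folderA/*DTI*.json' -type f \
    -exec grep -i -E -H '(phaseencodingdirection|phaseencodingaxis)' {} +)

printf '%s\n' "$result"
rm -rf "$top"
```

Unlike the xargs and $(find ...) variants, this form does not split paths on whitespace, which matters as soon as a directory name contains a space.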

Match two strings in one line with grep

You can use

grep 'string1' filename | grep 'string2'

Or

grep 'string1.*string2\|string2.*string1' filename
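One caveat: the \| alternation in a basic regular expression is not guaranteed by POSIX, so it may not work in every grep. A more portable spelling uses -E with a plain |. The sketch below, with made-up sample lines, checks that it selects the same lines as the piped form:

```shell
#!/bin/bash
# Hypothetical sample lines, for illustration only.
input=$(printf '%s\n' 'string1 then string2' 'string2 then string1' 'string1 alone')

piped=$(printf '%s\n' "$input" | grep 'string1' | grep 'string2')
single=$(printf '%s\n' "$input" | grep -E 'string1.*string2|string2.*string1')

[ "$piped" = "$single" ] && echo "both forms select the same lines"
```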

Always include first line in grep

You could include an alternate pattern that matches one of the column names. If a column was called COL then this would work:

$ grep -E 'COL|pattern' file.csv
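The alternation relies on knowing a column name, and it would also keep any data row that happens to contain COL. If that is a concern, a possible alternative is to select the first line by its line number with awk. This is a sketch with made-up CSV content, not a grep solution:

```shell
#!/bin/bash
# Hypothetical CSV content, for illustration only.
csv=$(printf '%s\n' 'COL,VALUE' 'a,1' 'pattern,2' 'b,3')

# NR == 1 keeps the first line unconditionally; /pattern/ keeps matching lines.
kept=$(printf '%s\n' "$csv" | awk 'NR == 1 || /pattern/')
printf '%s\n' "$kept"
```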

grep: show lines surrounding each match

For BSD or GNU grep you can use -B num to set how many lines before the match and -A num for the number of lines after the match.

grep -B 3 -A 2 foo README.txt

If you want the same number of lines before and after you can use -C num.

grep -C 3 foo README.txt

This will show 3 lines before and 3 lines after.
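Here is a small sketch to try the options on. The nine-line file is made up, and -C 1 is used instead of -C 3 to keep the output short:

```shell
#!/bin/bash
# Hypothetical nine-line file, for illustration only.
readme=$(mktemp)
printf '%s\n' l1 l2 l3 l4 foo l6 l7 l8 l9 > "$readme"

before_after=$(grep -B 3 -A 2 foo "$readme")   # l2 l3 l4 foo l6 l7
centered=$(grep -C 1 foo "$readme")            # l4 foo l6
printf '%s\n' "$before_after" '==' "$centered"
rm -f "$readme"
```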


