Pipe Output to Use as the Search Specification for Grep on Linux

If you're using Bash, you can use backticks:

> grep -e "`grep ... ...`" files

The -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't interpreted as an option by the second grep.

Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.

Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
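As a concrete illustration of the pattern (a minimal sketch; config.txt and app.log are hypothetical file names, and the inner grep is assumed to produce a single line of output): take the line matching "serial" in config.txt and use it as the search pattern for app.log:

grep -e "`grep serial config.txt`" app.log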

grep from the beginning of a file (grep -f)

grep -f expects a file containing the patterns.

You cannot call grep -f '^-', because grep will not take '^-' as a pattern but as a file name.

If you don't want to use a file, you can take the patterns from a pipe:

grep -f -, where the - tells grep to read the patterns from stdin (the pipe) rather than from a file.

Here is an example

echo ^a | grep -f - file.txt is the same as grep '^a' file.txt

A more useful approach is to take only some of the patterns from a file and use those patterns against your own file:

grep '^PAT' patterns.txt | grep -f - myfile

This takes all patterns in patterns.txt that start with PAT and uses them in the second grep to search myfile.

So you can keep a dictionary in the file patterns.txt and use it for searching in myfile.

If you have some kind of dictionary (a list of strings in a file, separated by newlines), want to use those strings as patterns matching the beginning of the line, and the dictionary doesn't contain the ^ anchor, you can add it with sed:

grep '^abc' dict.txt | sed 's/^/^/g' | grep -f - myfile

So, given the file dict.txt

a
abc
abcd
fbdf

the first grep will pick out "abc" and "abcd", sed will prefix them with ^, and the final grep will effectively run something like grep -e '^abc' -e '^abcd' myfile.

Note that anything matched by ^abcd is also matched by ^abc. So you would probably want a space (or another delimiter) at the end of each pattern:

grep '^abc' dict.txt | sed 's/^/^/;s/$/\ /' | grep -f - myfile
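To see what the intermediate stage produces, run only the first two commands:

grep '^abc' dict.txt | sed 's/^/^/;s/$/\ /'

This prints ^abc and ^abcd, each followed by a trailing space, which is exactly the pattern list the final grep reads from stdin.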

Linux commands pipe: using iwlist with grep to display two pieces of information

You can modify the regular expression to just catch multiple words, like this:

sudo iwlist wlp2s0 scan | grep 'ESSID\|Signal level'
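Equivalently, with extended regular expressions (grep -E) the alternation does not need to be backslash-escaped:

sudo iwlist wlp2s0 scan | grep -E 'ESSID|Signal level'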

See the documentation of grep online or using man grep in your terminal.

Using grep in Linux to pipe all URLs contained in an XML file to a separate file

If you'd like to try GNU awk (needed here because of the multi-character RS):

awk -v RS="url" -F\" 'NR>1{print $2}' file >newfile
http://www.google.com
http://www.yahoo.com
http://www.bing.com

A simple awk

awk -F\" '/url/{print $4}' file
http://www.google.com
http://www.yahoo.com
http://www.bing.com

This works only if the format is the same throughout the file.
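For reference, both one-liners above would produce that output for an input file shaped roughly like this hypothetical example, where the URL is the second quoted value on each line that mentions url:

<entry name="google" url="http://www.google.com"/>
<entry name="yahoo" url="http://www.yahoo.com"/>
<entry name="bing" url="http://www.bing.com"/>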

How to use grep with glob patterns? (like we use them in the find command)

Your problem is that find uses "globs" while grep uses regular expressions. With find, * means "a string of any length". With grep, * means "any number of repetitions of the preceding element (character)". Thus, your command:

grep -rni "gst*Node"  ./

searches for any string that starts with gs, followed by any number of t characters, followed by Node (which is presumably not what you want). Try instead:

grep -rni "gst.*Node"  ./

The . means "any character", so .* really means "a string of any length".
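As a rough side-by-side (the names are purely illustrative, and note that the find line matches file names while grep -r matches file contents):

find . -iname '*gst*node*'    # glob: * matches any string of characters
grep -rni 'gst.*Node' ./      # regex: .* matches any string of characters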

perform an operation for *each* item listed by grep

If I understand your specification, you want:

grep --null -l '<pattern>' directory/*.extension1 | \
xargs -n 1 -0 -I{} bash -c 'rm "$1" "${1%.*}.extension2"' -- {}

This is essentially the same as what @triplee's comment describes, except that it's newline-safe.
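For concreteness, a sketch with hypothetical names: delete every .txt file under notes/ that contains TODO, together with its matching .bak companion:

grep --null -l 'TODO' notes/*.txt | \
xargs -n 1 -0 -I{} bash -c 'rm "$1" "${1%.*}.bak"' -- {}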

What's going on here?

grep with --null will return output delimited with nulls instead of newlines. Since file names can have newlines in them, delimiting with newlines makes it impossible to parse the output of grep safely; null, however, is not a valid character in a file name and thus makes a nice delimiter.

xargs takes a stream of whitespace-delimited items (blanks or newlines, unless quoted) and executes a given command, passing as many of those items as it can (one per parameter) to that command (or to echo if no command is given). Thus if you said:

printf 'one\ntwo three \nfour\n' | xargs echo

xargs would execute echo one two three four, since by default it splits items on blanks as well as newlines. This is not safe for file names because file names can contain spaces and even embedded newlines.

The -0 switch to xargs changes it from splitting on whitespace to splitting on null bytes. This makes it match the output we got from grep --null and makes it safe for processing a list of file names.
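You can see the difference with a quick experiment (in bash, printf's \0 emits a literal null byte):

printf 'one\0two three\0four\0' | xargs -0 echo

Here echo receives exactly three arguments: one, two three, and four.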

Normally xargs simply appends the input items to the end of the command. The -I switch changes this: xargs instead substitutes each input item for every occurrence of the specified replacement string, and it treats each input line as a single item. To get the idea, try this experiment:

printf 'one\ntwo three \nfour\n' | xargs -I{} echo foo {} bar

And note the difference from the earlier printf | xargs command.

In the case of my solution the command I execute is bash, to which I pass -c. The -c switch causes bash to execute the commands in the following argument (and then terminate) instead of starting an interactive shell. The next block 'rm "$1" "${1%.*}.extension2"' is the first argument to -c and is the script which will be executed by bash. Any arguments following the script argument to -c are assigned as the arguments to the script. Thus, if I were to say:

bash -c 'echo $0' "Hello, world"

Then Hello, world would be assigned to $0 (the first argument to the script) and inside the script I could echo it back.

Since $0 is normally reserved for the script name I pass a dummy value (in this case --) as the first argument and, then, in place of the second argument I write {}, which is the replacement string I specified for xargs. This will be replaced by xargs with each file name parsed from grep's output before bash is executed.
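Putting it together, for a single matching file (a hypothetical name, just to illustrate the expansion) called directory/report 1.extension1, xargs ends up running roughly:

bash -c 'rm "$1" "${1%.*}.extension2"' -- 'directory/report 1.extension1'

Inside the script $1 is directory/report 1.extension1, so rm removes that file along with directory/report 1.extension2.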

The mini shell script might look complicated but it's rather trivial. First, the entire script is single-quoted to prevent the calling shell from interpreting it. Inside the script I invoke rm and pass it two file names to remove: the $1 argument, which was the file name passed when the replacement string was substituted above, and ${1%.*}.extension2. This latter is a parameter substitution on the $1 variable. The important part is %.* which says

  • % "Match from the end of the variable and remove the shortest string matching the pattern.
  • .* The pattern is a single period followed by anything.

This effectively strips the extension, if any, from the file name. You can observe the effect yourself:

foo='my file.txt'
bar='this.is.a.file.txt'
baz='no extension'
printf '%s\n' "${foo%.*}" "${bar%.*}" "${baz%.*}"
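which prints the stripped names (note that "no extension" is unchanged because it contains no dot):

my file
this.is.a.file
no extension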

Since the extension has been stripped I concatenate the desired alternate extension .extension2 to the stripped file name to obtain the alternate file name.

Use result of pipeline as argument for another command

You can use xargs to pass the input to a new command. In your example you need to include curly braces in your awk argument as well.

./command1 | grep '^\[' | awk '{ print $2 } ' | xargs ./command2

Or more concisely

./command1 | awk '/^\[/ { print $2 }' | xargs ./command2

Example:

echo "[1000]  3000" | awk '/^\[/ { print $2 }' | xargs echo

Output:

3000

Print only a part of a match with grep

You can use a Perl one-liner to match each line of the file against a single regex with an appropriate capture group, and for each line that matches you can print the submatch.

There are several ways to use Perl for this task. I suggest going with the perl -ne {program} idiom, which implicitly loops over the lines of stdin and executes the one-liner {program} once for each line, with the current line made available as the $_ special variable. (Note: The -n option does not cause the final value of $_ to be automatically printed at the end of each iteration of the implicit loop, which is what the -p option would do; that is, perl -pe {program}.)

Below is the solution. Note that I decided to pass the target hostname using the obscure -s option, which enables parsing of variable assignment specifications after the {program} argument, similar to awk's -v option. (It is not possible to pass normal command-line arguments with the -n option because the implicit while (<>) { ... } loop gobbles up all such arguments for file names, but the -s mechanism provides an excellent solution. See Is it possible to pass command-line arguments to @ARGV when using the -n or -p options?.) This design prevents the need to embed the $DHCP_HOSTNAME variable in the {program} string itself, which allows us to single-quote it and save a few (actually 8) backslashes.

DHCP_HOSTNAME='client3';
perl -nse 'print($1) if m(^\s*host\s*$host\s*\{.*\bhardware\s*ethernet\s*(..:..:..:..:..:..));' -- -host="$DHCP_HOSTNAME" <dhcpd.cfg;
## AB:CD:EF:01:23:45
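For reference, this matches a host declaration written on a single line (the regex is applied to one line of input at a time), something like the following hypothetical entry in dhcpd.cfg:

host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 10.0.0.3; }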

I often prefer Perl to sed for the following reasons:

  • Perl provides a complete general-purpose programming environment, whereas sed is more limited.
  • Perl has an enormous repository of publicly available modules on CPAN which can easily be installed and then used with the -M{module} option. sed is not extensible.
  • Perl has a much more powerful regular expression engine than sed, with lookaround assertions, backtracking control verbs, within-regex and replacement Perl code, many more options and special escapes, embedded group options, and more. See perlre.
  • Counter-intuitively, despite its greater sophistication, Perl is often much faster than sed due to its two-pass process and highly optimized opcode implementation. See http://rc3.org/2014/08/28/surprisingly-perl-outperforms-sed-and-awk/ for example.
  • I often find that the equivalent Perl implementation is more intuitive than that of sed, since sed has a more primitive set of commands for manipulating the underlying text.

