What does grep -Po '...\K...' do? How else can that effect be achieved?

  • grep -P enables PCRE syntax. (This is a non-standard extension -- not even all builds of GNU grep support it, as it depends on the optional libpcre library, and whether to link this in is a compile-time option).
  • grep -o emits only matched text, and not the entire line containing said text, in output. (This too is nonstandard, though more widely available than -P).
  • \K is a PCRE extension to regex syntax that excludes everything matched before that point from the reported match.
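As a small illustration of how these three pieces combine, the sketch below runs two commands against one made-up input line. The sample text and the variable names are assumptions for the demonstration, and the second command needs a GNU grep built with PCRE support.

```shell
#!/usr/bin/env bash
# Hypothetical input line, invented for this illustration.
sample='<input type="hidden" name="token" value="abc123">'

# -o alone prints the whole match, prefix included.
with_prefix=$(printf '%s\n' "$sample" | grep -o 'value="[^"]*')

# With -P, \K discards everything matched before it, so only the value remains.
without_prefix=$(printf '%s\n' "$sample" | grep -Po 'value="\K[^"]*')

printf '%s\n' "$with_prefix"     # value="abc123
printf '%s\n' "$without_prefix"  # abc123
```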

Since your shell is bash, you have ERE support built in. As an alternative that uses only built-in functionality (no external tools, grep, awk or otherwise):

#!/usr/bin/env bash
regex='value="([^"]*)"'                 # store regex (w/ match group) in a variable
results=( )                             # define an empty array to store results
while IFS= read -r line; do             # iterate over lines on input
  if [[ $line =~ $regex ]]; then        # ...and, when one matches the regex...
    results+=( "${BASH_REMATCH[1]}" )   # ...put the group's contents in the array
  fi
done <"$1"                              # with stdin coming from the file named in $1
printf '%s\n' "${results[*]}"           # combine array results with spaces and print

See http://wiki.bash-hackers.org/syntax/ccmd/conditional_expression for a discussion of =~, and http://wiki.bash-hackers.org/syntax/shellvars#bash_rematch for a discussion of BASH_REMATCH. See BashFAQ #1 for a discussion of reading files line-by-line with a while read loop.
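If an external tool is acceptable but grep -P is not available, a similar effect can be had with plain sed. This is a minimal sketch, not part of the original answer: extract_value is a hypothetical helper name and the input lines are made up.

```shell
#!/usr/bin/env bash
# extract_value is a hypothetical helper; it reads lines on stdin.
extract_value() {
  # -n suppresses the default output; the p flag prints only lines where
  # the substitution succeeded, leaving just the captured group.
  sed -n 's/.*value="\([^"]*\)".*/\1/p'
}

printf '%s\n' \
  '<input name="a" value="first">' \
  'no match here' \
  '<input value="second">' | extract_value
```

Unlike grep -o, this prints at most one value per line, and because the leading .* is greedy it picks the last value="..." on a line that has several.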

Grep for substring from piped output lookbehind assertion

With GNU grep:

xinput | grep -Po 'AlpsPS/2 ALPS DualPoint Stick *id=\K[0-9]+'

Output:

11

See: What does grep -Po '…\K…' do? How else can that effect be achieved?
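The command can be tried without the hardware by feeding it sample text. The sketch below is an assumption-laden variant: device_id is a hypothetical helper, the sample lines only imitate xinput's layout, and it uses POSIX sed rather than grep -P so it also works where PCRE support is missing. The device name is interpolated into the pattern, so regex metacharacters in it are not escaped.

```shell
#!/usr/bin/env bash
# device_id is a hypothetical helper: prints the numeric id that follows the
# device name given in $1, reading xinput-style text on stdin.
device_id() {
  # "|" is the delimiter because the device name contains a slash.
  sed -n "s|.*$1[[:space:]]*id=\([0-9][0-9]*\).*|\1|p"
}

# Made-up lines imitating the shape of xinput output.
sample_xinput='Virtual core pointer                id=2   [master pointer  (3)]
AlpsPS/2 ALPS DualPoint Stick       id=11  [slave  pointer  (2)]
AlpsPS/2 ALPS DualPoint TouchPad    id=12  [slave  pointer  (2)]'

printf '%s\n' "$sample_xinput" | device_id 'AlpsPS/2 ALPS DualPoint Stick'
```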

How do you grep a file and get the next 5 lines

You want:

grep -A 5 '19:55' file

From man grep:

Context Line Control

-A NUM, --after-context=NUM

Print NUM lines of trailing context after matching lines.
Places a line containing a group separator (described under --group-separator)
between contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.

-B NUM, --before-context=NUM

Print NUM lines of leading context before matching lines.
Places a line containing a group separator (described under --group-separator)
between contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.

-C NUM, -NUM, --context=NUM

Print NUM lines of output context. Places a line containing a group separator
(described under --group-separator) between contiguous groups of matches.
With the -o or --only-matching option, this has no effect and a warning
is given.

--group-separator=SEP

Use SEP as a group separator. By default SEP is double hyphen (--).

--no-group-separator

Use empty string as a group separator.
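To see the context and separator options in action, here is a small sketch against a throwaway file. The file contents and variable names are invented for the demonstration, and --no-group-separator assumes GNU grep.

```shell
#!/usr/bin/env bash
# Hypothetical log file, created only for this demonstration.
log=$(mktemp)
printf '%s\n' '19:55 start' 'a' 'b' 'c' '20:10 other' 'd' '19:55 again' 'e' > "$log"

# One line of trailing context after each match; groups that are not
# adjacent are separated by a line containing "--".
with_sep=$(grep -A 1 '19:55' "$log")

# Same search with the separator line suppressed.
without_sep=$(grep -A 1 --no-group-separator '19:55' "$log")

printf '%s\n' "$with_sep"
rm -f "$log"
```

The first result has five lines (two groups of two plus the "--" line); the second has the same four content lines with no separator.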

Get JSON value using grep, sed, or awk

Instead of GNU grep, you can use sed like this:

sed -nE 's/^ *"private_key": "([^"]+)".*/\1/p' file.json

Output:

-----BEGIN PRIVATE KEY-----\nMyKey\n-----END PRIVATE KEY-----\n
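Note that the extracted value still contains literal backslash-n sequences. The sketch below, which assumes a made-up file with the same shape as file.json, shows one way to turn them into real newlines with printf's %b format. It is only an illustration; it does not handle escaped quotes inside the value, and a real JSON parser is more robust for anything beyond simple cases.

```shell
#!/usr/bin/env bash
# Hypothetical file contents, written to a temporary file for the demo.
json=$(mktemp)
cat > "$json" <<'EOF'
{
  "type": "service_account",
  "private_key": "-----BEGIN PRIVATE KEY-----\nMyKey\n-----END PRIVATE KEY-----\n",
  "client_id": "123"
}
EOF

# Same sed command as above; the value still holds literal backslash-n pairs.
key=$(sed -nE 's/^ *"private_key": "([^"]+)".*/\1/p' "$json")

# printf %b expands backslash escapes such as \n into real newlines.
printf '%b' "$key"
rm -f "$json"
```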

Using only the 'grep' command to get a specific column

You can use GNU grep with a PCRE pattern:

grep -Po '^([^,]*,){2}\K[^,]*' file

Here,

  • ^ - start of the line (grep matches one line at a time)
  • ([^,]*,){2} - two occurrences of any zero or more chars other than , and then a ,
  • \K - match reset operator discarding all text matched so far
  • [^,]* - zero or more chars other than a comma.
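The same pattern generalizes to any column by changing the repetition count. The following is a sketch under that assumption: nth_field is a hypothetical helper name, the CSV lines are made up, and GNU grep with -P is required.

```shell
#!/usr/bin/env bash
# nth_field is a hypothetical helper: prints field N (1-based) of each
# comma-separated line on stdin, using only grep.
nth_field() {
  local skip=$(( $1 - 1 ))   # number of "field," prefixes to discard
  grep -Po "^([^,]*,){$skip}\K[^,]*"
}

printf '%s\n' 'id,name,city,zip' '1,Ann,Oslo,0150' | nth_field 3
```

For the two sample lines this prints city and Oslo. Lines with fewer than N fields produce no output, and quoted fields containing commas are not handled.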

Extract specific number from command output

To get the last number, you can add a .* in front, which matches as much as possible and consumes all the earlier numbers. To exclude that part from the output, however, you need GNU grep (with -P), pcregrep, or sed.

grep -Po '.* \K[0-9.]+'

Or

sed -En 's/.* ([0-9.]+).*/\1/p'
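Both commands can be checked side by side on a sample line. The input below is invented for the demonstration (the question's actual command output is not shown here), and the variable names are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical command output, invented for the demonstration.
sample='load average: 0.52, 0.58, 0.59'

# The greedy ".* " consumes as much as it can, so only the last number is left.
last_grep=$(printf '%s\n' "$sample" | grep -Po '.* \K[0-9.]+')
last_sed=$(printf '%s\n' "$sample" | sed -En 's/.* ([0-9.]+).*/\1/p')

printf '%s\n' "$last_grep" "$last_sed"   # prints 0.59 twice
```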

How to parse the following date using grep command in bash

With the samples you have shown, and using GNU grep's PCRE option, you could try the following regex to match both of the timestamps.

grep -oP '(?:"ts":)?"\d{4}-\d{2}-\d{2}T(?:[01][0-9]|2[0-3]):(?:[0-4][0-9]|5[0-9])[+:](?:[0-4][0-9]|5[0-9])(?:Z"|\+(?:[0-4][0-9]|5[0-9]):(?:[0-4][0-9]|5[0-9])")' Input_file

Explanation: A detailed breakdown of the regex above.

(?:"ts":)?             ##A non-capturing group matching "ts":, kept optional.
"\d{4}-\d{2}-\d{2}T    ##Matching " followed by 4 digits-2 digits-2 digits and T.
(?:                    ##Starting 1st non-capturing group here.
  [01][0-9]|2[0-3]     ##Matching 00 to 19 or 20 to 23 to cover the 24 hours.
):                     ##Closing 1st non-capturing group, followed by a colon.
(?:                    ##Starting 2nd non-capturing group here.
  [0-4][0-9]|5[0-9]    ##Matching 00 to 59 for minutes.
)                      ##Closing 2nd non-capturing group here.
[+:]                   ##Matching either + or : here.
(?:                    ##Starting 3rd non-capturing group here.
  [0-4][0-9]|5[0-9]    ##Matching 00 to 59 for seconds.
)                      ##Closing 3rd non-capturing group here.
(?:                    ##Starting 4th non-capturing group here.
  Z"|\+                ##Matching Z" OR a literal + here.
  (?:                  ##Starting non-capturing group here.
    [0-4][0-9]|5[0-9]  ##Matching 00 to 59 here.
  )                    ##Closing non-capturing group here.
  :                    ##Matching a colon here.
  (?:                  ##Starting non-capturing group here.
    [0-4][0-9]|5[0-9]  ##Matching 00 to 59 here.
  )"                   ##Closing non-capturing group here, followed by ".
)                      ##Closing 4th non-capturing group here.
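When the surrounding quotes and the "ts": prefix are not wanted in the output, \K and a lookahead can trim them. The sketch below is a deliberately looser pattern than the one above (it uses \d{2} and does not validate hour, minute, or second ranges). The helper name extract_timestamps and the sample lines are assumptions, since the original sample data is not reproduced here.

```shell
#!/usr/bin/env bash
# extract_timestamps is a hypothetical helper; it reads lines on stdin.
extract_timestamps() {
  # "\K drops the opening quote from the match; (?=") requires a closing
  # quote without including it in the output.
  grep -Po '"\K\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})(?=")'
}

# Made-up sample lines: one with a Z suffix, one with a numeric offset.
printf '%s\n' \
  '{"ts":"2021-03-04T10:20:30Z","msg":"started"}' \
  '"2021-03-04T23:59:59+05:30" finished' \
  'no timestamp on this line' | extract_timestamps
```

For these three lines it prints 2021-03-04T10:20:30Z and 2021-03-04T23:59:59+05:30, and nothing for the last line.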

