How to grep for value in a key-value store from plain text
Use a lookbehind:
$ grep -Po '(?<=^FOO=)\w*$' file
foo
I also like awk for it:
$ awk -v FS="FOO=" 'NF>1{print $2}' file
foo
Or even better:
$ awk -F= -v key="FOO" '$1==key {print $2}' file
foo
With sed:
$ sed -n 's/^FOO=//p' file
foo
Or even with Bash, you can source the file and echo the required value. Do this ONLY if you are confident the file does not contain any weird values, because sourcing executes the file as shell code:
$ (source file; echo "$FOO")
foo
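The awk variants above can be wrapped in a small helper. This is only a sketch: `get_value` is a hypothetical name, and it assumes one `KEY=value` pair per line. It splits on the first `=` only, so values that themselves contain `=` are printed whole.

```shell
# Hypothetical helper: get_value KEY [FILE...]; reads stdin when no file is given.
get_value() {
  key=$1
  shift
  awk -v key="$key" '
    {
      pos = index($0, "=")                      # position of the first "="
      if (pos > 0 && substr($0, 1, pos - 1) == key) {
        print substr($0, pos + 1)               # everything after the first "="
      }
    }
  ' "$@"
}

printf '%s\n' 'BAR=1' 'FOO=a=b' | get_value FOO
# prints a=b
```

Unlike `source`, nothing in the file is executed, so this is safe to run on untrusted input.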
grep a file to read a key:value
awk is the right tool for this, as your data is delimited by a common character and structured in columns and rows. You may use this awk command:
awk -F: '$1 == "1234-A0"{print $2}' file
1234_12345678_987
Extract value from a list of key-value pairs using grep
You may use
grep -oP "(?:^|,)$KEY:\K[^,]+"
The -o option outputs only the matched text. -P enables the PCRE engine. The double quotes are necessary so that the shell interpolates $KEY.
The pattern matches:
(?:^|,) - start of string or comma
$KEY - the KEY variable
: - colon
\K - match reset operator that discards the whole text matched so far
[^,]+ - 1+ chars other than ,
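As a worked example of the pattern above, here is a minimal sketch that assumes GNU grep built with PCRE support. `lookup` is a hypothetical helper name, and the `\Q...\E` quoting is an addition, not part of the original answer: it makes PCRE treat the key literally, which matters when the key contains regex metacharacters such as `.` (it would still break on a key containing `\E`).

```shell
# Hypothetical helper: look up key $1 in a "k1:v1,k2:v2" list read from stdin.
lookup() {
  # \Q...\E quotes the key so that "a.b" does not also match "axb"
  grep -oP "(?:^|,)\Q$1\E:\K[^,]+"
}

printf '%s\n' 'name:alice,a.b:1,axb:2' | lookup a.b
# prints 1
```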
How to extract specific key value pairs from a grep output
I would do it using GNU AWK in the following way. Let the content of file.txt be
./Data1/TEST_Data1.xml:<def-query collection="FT_R1Event" count="-1" desc="" durationEnd="1" durationStart="0" durationType="CAL" fromWS="Data1" id="_q1" timeUnit="D">
./Data2/TEST_Data2.xml:<def-query collection="FT_R2Event" count="-1" desc="" durationEnd="2" durationStart="0" durationType="ABS" fromWS="Data2" id="_q1" timeUnit="M">
then
awk 'BEGIN{OFS=", ";FPAT="(^[^ ]+xml)|((durationEnd|timeUnit)=\"[^\"]+\")"}{gsub(/\.([/]|xml)/, "", $1);print}' file.txt
output
Data1/TEST_Data1, durationEnd="1", timeUnit="D"
Data2/TEST_Data2, durationEnd="2", timeUnit="M"
Explanation: I used FPAT to extract the interesting elements of the input, namely those which start at the beginning of the line, contain no spaces and end in xml, or which consist of durationEnd or timeUnit followed by =, a ", one or more non-" characters and a closing ". Then I remove every . followed by / or xml from the first field (note that . has to be a literal ., so it is escaped). Then I print everything, joined by a comma and a space, as I set that as the output field separator (OFS).
Disclaimer: I tested it only with shown samples.
(tested in gawk 4.2.1)
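FPAT is a gawk extension. Where gawk is not available, roughly the same result can be sketched in POSIX awk with match(). This is a different technique from the FPAT answer above, `extract_attrs` is a hypothetical name, and it assumes the file path never contains a colon.

```shell
# Hypothetical helper: print "path, durationEnd=..., timeUnit=..." per input line.
extract_attrs() {
  awk '
    # Return the first name="value" pair for the given attribute name, or "".
    function attr(name,    re) {
      re = name "=\"[^\"]*\""
      if (match($0, re)) return substr($0, RSTART, RLENGTH)
      return ""
    }
    {
      path = $0
      sub(/:.*/, "", path)      # keep the text before the first colon
      sub(/^\.\//, "", path)    # drop the leading ./
      sub(/\.xml$/, "", path)   # drop the trailing .xml
      print path ", " attr("durationEnd") ", " attr("timeUnit")
    }
  ' "$@"
}
```

Called as `extract_attrs file.txt`, it should give the same two lines as the gawk command for the sample shown, though like the original it has only been reasoned through against that sample.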
Find the value of key from JSON
If you have a grep that can do Perl compatible regular expressions (PCRE):
$ grep -Po '"id": *\K"[^"]*"' infile.json
"4dCYd4W9i6gHQHvd"
-P enables PCRE
-o retains nothing but the match
"id": * matches "id": followed by an arbitrary number of spaces
\K throws away everything to its left ("variable size positive look-behind")
"[^"]*" matches two quotes and all the non-quotes between them
If your grep can't do that, you can use
$ grep -o '"id": *"[^"]*"' infile.json | grep -o '"[^"]*"$'
"4dCYd4W9i6gHQHvd"
This uses grep twice. The result of the first command is "id": "4dCYd4W9i6gHQHvd"; the second command removes everything but a pair of quotes and the non-quotes between them, anchored at the end of the string ($).
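The two-step grep can be parameterised on the key name. The sketch below uses a hypothetical helper, `json_string_field`, and assumes the key is a plain word with no regex metacharacters and that the value is a simple string without escaped quotes. It is still pattern matching, not JSON parsing.

```shell
# Hypothetical helper: json_string_field KEY [FILE...]; prints the quoted value.
json_string_field() {
  key=$1
  shift
  # First grep isolates "key": "value"; second keeps only the trailing "value".
  grep -o "\"$key\": *\"[^\"]*\"" "$@" | grep -o '"[^"]*"$'
}
```

Because the pattern starts with a quote, looking up id does not also match keys such as serverid.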
But, as pointed out, you shouldn't use grep for this, but a tool that can parse JSON – for example jq:
$ jq '.data.id' infile.json
"4dCYd4W9i6gHQHvd"
This is just a simple filter for the id key in the data object. To get rid of the double quotes, you can use the -r ("raw output") option:
$ jq -r '.data.id' infile.json
4dCYd4W9i6gHQHvd
jq can also neatly pretty print your JSON:
$ jq . infile.json
{
  "data": {
    "name": "test",
    "id": "4dCYd4W9i6gHQHvd",
    "domains": [
      "www.test.domain.com",
      "test.domain.com"
    ],
    "serverid": "bbBdbbHF8PajW221",
    "ssl": null,
    "runtime": "php5.6",
    "sysuserid": "4gm4K3lUerbSPfxz",
    "datecreated": 1474597357
  },
  "actionid": "WXVAAHQDCSILMYTV"
}
Grep the entire text after a certain word using grep/awk/sed
With your shown attempts, please try the following code.
your_API_command |
awk -v RS= 'match($0,/-+BEGIN.*END RSA PRIVATE KEY-+/){print substr($0,RSTART,RLENGTH)}'
Explanation: Run your API command and send its output as standard input to the awk command. There, RS is set to an empty value (paragraph mode, so records are separated by blank lines), and the match function matches the text from one or more occurrences of - followed by BEGIN up to the string END RSA PRIVATE KEY followed by one or more occurrences of -.
2nd solution: A slightly tweaked form of the 1st solution, written and tested in GNU awk.
your_API_command | awk -v RS='-+BEGIN.*END RSA PRIVATE KEY-+' 'RT{print RT}'
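If the key block sits on lines of its own, a sed range is a simpler alternative to the awk solutions above. This sketch is a different technique, and it assumes the header reads exactly -----BEGIN RSA PRIVATE KEY-----, which the original pattern does not confirm. `extract_rsa_key` is a hypothetical name. Unlike match(), it prints whole lines, so any text sharing a line with the markers is kept.

```shell
# Hypothetical helper: print the lines from the BEGIN marker through the END marker.
extract_rsa_key() {
  sed -n '/-----BEGIN RSA PRIVATE KEY-----/,/-----END RSA PRIVATE KEY-----/p'
}
```

Usage would then be `your_API_command | extract_rsa_key`.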
awk command to read a key value pair from a file
Since there are multiple : in your input, getting $2 will not work in awk because it will just give you the 2nd field. You actually need an equivalent of cut -d: -f2-, but you also need to check the key name that comes before the first :.
This awk should work for you:
awk -F: '$1 == "GOOGLE_URL" {sub(/^[^:]+:/, ""); print}' input.txt
https://www.google.com/
Or this non-regex awk approach that allows you to pass the key name from the command line:
awk -F: -v k='GOOGLE_URL' '$1==k{print substr($0, length(k FS)+1)}' input.txt
Or using GNU grep:
grep -oP '^GOOGLE_URL:\K.+' input.txt
https://www.google.com/
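The same "split on the first : only" behaviour is also available in plain shell, because read assigns the rest of the line to its last variable. This is a minimal sketch with a hypothetical helper name, `value_for`, and it is not one of the original answers.

```shell
# Hypothetical helper: print the value for key $1 from "key:value" lines on stdin.
value_for() {
  while IFS=: read -r k v; do
    # k holds the text before the first colon; v holds the rest, colons included
    if [ "$k" = "$1" ]; then
      printf '%s\n' "$v"
    fi
  done
}

printf '%s\n' 'NAME:demo' 'GOOGLE_URL:https://www.google.com/' | value_for GOOGLE_URL
# prints https://www.google.com/
```

For large files the awk and grep versions are likely to be faster, since the shell loop handles one line at a time.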