Extract one word after a specific word on the same line
With awk:
awk '{for(i=1;i<=NF;i++) if ($i=="--pe_cnt") print $(i+1)}' inputFile
Basically, loop over each word of the line. When you find the word you are looking for, grab the next word and print it.
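For comparison, the same token loop can be sketched in Python. The function name `words_after` and the sample line are hypothetical, chosen only for illustration. Like the awk one-liner it splits on whitespace and reports every occurrence, but it returns nothing when the keyword is the last word on the line, whereas awk would print an empty line.

```python
def words_after(line, keyword):
    """Return every whitespace-separated word that directly follows keyword."""
    tokens = line.split()
    # Pair each token with its successor; the last token has no successor.
    return [nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == keyword]


# Hypothetical sample line, not taken from the original question.
sample = "run_job --queue fast --pe_cnt 8 --name demo"
print(words_after(sample, "--pe_cnt"))
```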
With grep:
grep -oP "(?<=--pe_cnt )[^ ]+" inputFile
Extract one word after a specific word on the same line but there is no space between them
You may use sed:
sed -n '/\/files\// s~.*/files/\([^.]*\)\..*~\1~p' file
You may also use this awk command:
awk -F/ '$(NF-1) == "files"{sub(/\..*/, "", $NF); print $NF}' file
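The field-splitting idea behind the awk command can also be sketched in Python, under the same assumption that the wanted name is the last path component and that its parent directory is `files`. The function name and the sample paths are hypothetical.

```python
def stem_if_in_files(line):
    """Return the last path component up to its first dot, if its parent is 'files'."""
    parts = line.rstrip("\n").split("/")
    # Mirrors awk's $(NF-1) == "files" check on the second-to-last field.
    if len(parts) >= 2 and parts[-2] == "files":
        # Mirrors sub(/\..*/, "", $NF): drop everything from the first dot on.
        return parts[-1].split(".", 1)[0]
    return None


# Hypothetical sample paths, not taken from the original question.
print(stem_if_in_files("/srv/data/files/report.tar.gz"))
print(stem_if_in_files("/srv/files/sub/notes.txt"))
```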
Read the word after a specific word on the same line when there is no space between them
You could use GNU grep with PCRE:
grep -oP '/files/\K[^.]+' file
The -P flag makes grep use PCRE, the -o flag makes it display only the matched part rather than the full line, and the \K in the regex omits what precedes it from the displayed matched part.
Alternatively, if you don't have access to GNU grep, the following perl command will have the same effect:
perl -nle 'print $& if m{/files/\K[^.]+}' file
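If neither tool is available, a rough Python equivalent is sketched below. Python's standard `re` module does not support `\K`, so this version uses a capture group instead. The function name and the sample path are hypothetical.

```python
import re

# Capture the run of non-dot characters that follows "/files/".
PATTERN = re.compile(r"/files/([^.]+)")


def name_after_files(line):
    """Return the text between "/files/" and the next dot, or None if absent."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


# Hypothetical sample path, not taken from the original question.
print(name_after_files("/srv/data/files/report.tar.gz"))
```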
Extract a string or value based on specific word before and a % sign after in R
gsub solution
I think your gsub solution was pretty close, but it didn't bring along the percentage sign because the sign was outside the parentheses of the capture group. Something like this should work (the result is assigned to the capacity column):
aa$capacity <- gsub(".*[Capacity]([^.]+%).*", "\\1", aa$TEXT)
Alternative method
The gsub approach returns the whole string unchanged when the pattern does not match. To avoid this, we can use the stringr package with a more specific regular expression:
library(magrittr)
library(dplyr)
library(stringr)
aa %>%
  mutate(capacity = str_extract(TEXT, "(?<=Capacity\\s)\\W\\s?\\d+\\s?%")) %>%
  mutate(capacity = str_squish(capacity)) # Remove excess white space
This code will give NA when there is no match, which I believe is your desired behaviour.
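To see what that regular expression matches outside R, here is a minimal Python sketch using the same pattern. The function name and the sample strings are hypothetical, and `None` stands in for R's `NA`.

```python
import re

# Same pattern as the str_extract call: an operator-like character after
# "Capacity ", an optional space, digits, an optional space, and a % sign.
PATTERN = re.compile(r"(?<=Capacity\s)\W\s?\d+\s?%")


def extract_capacity(text):
    """Return the matched capacity expression, or None when nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Collapse runs of whitespace, roughly what str_squish does.
    return " ".join(match.group(0).split())


# Hypothetical sample rows, not taken from the original question.
rows = ["Capacity > 80 % at site A", "no figure reported"]
print([extract_capacity(row) for row in rows])
```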
Extract first word after specific word
Very close:
my ($w_after) = ($words =~ /anywhere\s+(\S+)/);
   ^        ^                       ^^^
   +--------+                        |
     Note 1                       Note 2
Note 1: in list context, =~ returns the list of captured items, so the assignment target needs to be a list.
Note 2: \s+ allows one or more whitespace characters after anywhere.
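For readers more at home in Python, the same capture can be sketched as follows. The sample sentence is hypothetical. `re.search` returns a match object rather than a list, so the captured word is read with `group(1)`.

```python
import re

# Hypothetical sample text with several blanks after the keyword.
words = "you can find it anywhere   here in the text"

# \s+ skips one or more whitespace characters; (\S+) captures the next word.
match = re.search(r"anywhere\s+(\S+)", words)
w_after = match.group(1) if match else None
print(w_after)
```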
R extract specific word after keyword
You can combine the text into one string and extract the values based on the patterns in the data. This approach works irrespective of the line number in the data, provided the pattern is valid for all the files.
my_txt <- readLines(paste(path, "/input.txt", sep = ""))
#Collapse data in one string
text <- paste0(my_txt, collapse = '\n')
#Extract text after FirstName till '\n'
fName <- sub('.*FirstName (.*?)\n.*', '\\1', text)
fName
#[1] "John Woo"
#Extract text after Surname till '\n'
SName <- sub('.*Surname (.*?)\n.*', '\\1', text)
SName
#[1] "T"
#Extract text after Father's Name till '\n'
FatherNm <- sub(".*Father's Name (.*?)\n.*", '\\1', text)
FatherNm
#[1] "Bill Woo"
#Extract numbers which come after Date of Birth.
dob <- sub(".*Date of Birth (\\d+/\\d+/\\d+).*", '\\1', text)
dob
#[1] "13/07/1970"
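The same label-based extraction can be sketched in Python. The input text below is a hypothetical reconstruction based only on the outputs shown above, and `field_after` is an illustrative helper name, not part of the original answer.

```python
import re

# Hypothetical input, reconstructed from the outputs shown above.
text = "\n".join([
    "FirstName John Woo",
    "Surname T",
    "Father's Name Bill Woo",
    "Date of Birth 13/07/1970",
])


def field_after(label, text, value_pattern=r"[^\n]*"):
    """Return the value that follows 'label ' up to the end of its line."""
    match = re.search(re.escape(label) + r" (" + value_pattern + r")", text)
    return match.group(1) if match else None


print(field_after("FirstName", text))
print(field_after("Surname", text))
print(field_after("Father's Name", text))
# A stricter value pattern keeps only the date itself.
print(field_after("Date of Birth", text, r"\d+/\d+/\d+"))
```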
Regex, extract word before and after another one
^([0-9a-zA-Z]{3})\s+limk$|^limk\s+([0-9a-zA-Z]{3})$
- ^ Matches the beginning of the line
- ([0-9a-zA-Z]{3}) Captures exactly 3 ASCII letters or digits
- \s+ Matches 1 or more whitespace characters
- limk Matches the literal text limk
- $ Matches the end of the line
- | Starts the second alternative:
- ^ Matches the beginning of the line
- limk Matches the literal text limk
- \s+ Matches 1 or more whitespace characters
- ([0-9a-zA-Z]{3}) Captures exactly 3 ASCII letters or digits
- $ Matches the end of the line
The code:
import re
s = """limk ab1
limk ab2 helo
rest helo
ab3 limk helo
ab4 limk"""
matches = [x[0] if x[0] != '' else x[1] for x in re.findall(r'(?m)^([0-9a-zA-Z]{3})\s+limk$|^limk\s+([0-9a-zA-Z]{3})$', s)]
for match in matches:
    print(match)
Prints:
ab1
ab4
How to extract the number after specific word using awk?
Using sed
$ sed '1s/$/\tGPUAllocated/;s~.*gres/gpu=\([0-9]\).*~& \t\1~;1!{\~gres/gpu=[0-9]~!s/$/ \t0/}' input_file
Index  AllocTres                              CPUTotal  GPUAllocated
1      cpu=1,mem=256G                         18        0
2      cpu=2,mem=1024M                        16        0
3                                             4         0
4      cpu=12,gres/gpu=3                      12        3
5                                             8         0
6                                             9         0
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20        4
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21        3
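The core of that sed command, pulling the number after `gres/gpu=` and falling back to 0, can be sketched in Python. The function name is hypothetical. Unlike the `[0-9]` in the sed expression, `\d+` here also accepts counts with more than one digit.

```python
import re

# Matches "gres/gpu=<count>" but not typed entries such as "gres/gpu:1080ti=1".
GPU_COUNT = re.compile(r"gres/gpu=(\d+)")


def gpu_allocated(alloc_tres):
    """Return the GPU count from an AllocTres string, or 0 if none is listed."""
    match = GPU_COUNT.search(alloc_tres)
    return int(match.group(1)) if match else 0


# AllocTres values taken from the table above, plus an empty field.
for tres in ["cpu=1,mem=256G", "cpu=13,gres/gpu=4,gres/gpu:ret6000=2", ""]:
    print(repr(tres), gpu_allocated(tres))
```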
How to Extract Words Following a Key Word
You need to make sure "our" is delimited by whitespace, like this:
our = r'(^|\s+)our(\s+)?\W+(?P<after>(?:\w+\W+){,4})'
Specifically, (^|\s+)our(\s+)? is the part you need to adjust. The example only handles whitespace and the start of the string, but you might need to extend it to handle quotes or other special characters.
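A short usage sketch of this pattern follows. The sample sentence and the helper name `words_after_our` are hypothetical. Because of the whitespace boundary, "your" in the sample is not treated as "our".

```python
import re

# Same pattern as above, compiled once; "after" holds up to four following words.
OUR = re.compile(r"(^|\s+)our(\s+)?\W+(?P<after>(?:\w+\W+){,4})")


def words_after_our(text):
    """Return up to four words that follow a standalone 'our', or an empty list."""
    match = OUR.search(text)
    return match.group("after").split() if match else []


# Hypothetical sample sentence, not taken from the original question.
sample = "Bring your bag to our shop for fresh bread today."
print(words_after_our(sample))
```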