Extract One Word After a Specific Word on the Same Line

Extract one word after a specific word on the same line

With awk:

awk '{for(i=1;i<=NF;i++) if ($i=="--pe_cnt") print $(i+1)}' inputFile

Basically, loop over each word of the line. When you find the word you are looking for, grab the next word and print it.
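
For comparison, the same loop can be sketched in Python. This is only an illustration and not part of the original answer: the function name words_after and the sample lines are made up, and the list stands in for the contents of inputFile.

```python
def words_after(lines, keyword):
    """Yield the word that follows each occurrence of `keyword` on a line."""
    for line in lines:
        words = line.split()
        # Pair each word with its successor; the last word has no successor.
        for current, following in zip(words, words[1:]):
            if current == keyword:
                yield following


# Hypothetical input lines, standing in for the contents of inputFile.
sample = [
    "run --pe_cnt 8 --mode fast",
    "no match on this line",
    "run --mode slow --pe_cnt 16",
]
print(list(words_after(sample, "--pe_cnt")))
```

One difference from the awk one-liner: when the keyword is the last word on a line, awk prints an empty line, while this sketch yields nothing.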

With grep:

grep -oP "(?<=--pe_cnt )[^ ]+" inputFile

Extract one word after a specific word on the same line but there is no space between them

You may use sed:

sed -n '/\/files\// s~.*/files/\([^.]*\)\..*~\1~p' file

You may also use awk command:

awk -F/ '$(NF-1) == "files"{sub(/\..*/, "", $NF); print $NF}' file

Read the word after a specific word on the same line when there is no space between them

You could use GNU grep with PCRE:

grep -oP '/files/\K[^.]+' file

The -P flag makes grep use PCRE, the -o makes it display only the matched part rather than the full line, and the \K in the regex omits what precedes from the displayed matched part.

Alternatively, if you don't have access to GNU grep, the following Perl command has the same effect:

perl -nle 'print $& if m{/files/\K[^.]+}' file
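
If neither tool is available, a rough Python equivalent is sketched below. Python's built-in re module does not support \K, so this version uses a capture group instead. The function name name_after_files and the sample paths are assumptions made for illustration.

```python
import re

# Capture everything between "/files/" and the next dot.
PATTERN = re.compile(r"/files/([^.]+)")


def name_after_files(line):
    """Return the text between '/files/' and the next dot, or None."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


# Hypothetical sample lines.
for line in ["/var/www/files/report.tar.gz", "/var/www/other/readme.txt"]:
    print(name_after_files(line))
```

The capture group plays the role of \K here: only group 1 is returned, so the /files/ prefix is matched but not reported.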

Extract a string or value based on specific word before and a % sign after in R

gsub solution

I think your gsub solution was pretty close, but it didn't bring along the percentage sign because that was outside the parentheses (the capture group). So something like this should work (the result is assigned to the capacity column):

aa$capacity <- gsub(".*Capacity([^.]+%).*", "\\1", aa$TEXT)

Alternative method

The gsub approach returns the whole string unchanged when there is no operator match. To avoid this, we can use the stringr package with a more specific regular expression:

library(magrittr)
library(dplyr)
library(stringr)

aa %>%
  mutate(capacity = str_extract(TEXT, "(?<=Capacity\\s)\\W\\s?\\d+\\s?%")) %>%
  mutate(capacity = str_squish(capacity)) # Remove excess white space

This code will give NA when there is no match, which I believe is your desired behaviour.
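
To make the NA behaviour concrete, here is a hedged Python sketch of the same extraction using the same regular expression. It is not the R code above: the function name extract_capacity and the sample strings are made up, None stands in for NA, and the whitespace squeeze imitates str_squish.

```python
import re

# Same pattern as the str_extract call: an operator-like character after
# "Capacity ", an optional space, digits, an optional space, then "%".
PATTERN = re.compile(r"(?<=Capacity\s)\W\s?\d+\s?%")


def extract_capacity(text):
    """Return the matched operator and percentage, or None if absent."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Collapse runs of whitespace, roughly what str_squish does.
    return " ".join(match.group(0).split())


# Hypothetical TEXT values.
for text in ["Storage Capacity > 80% reached", "Capacity <5 %", "No limit set"]:
    print(extract_capacity(text))
```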

Extract first word after specific word

Very close:

my ($w_after) = ($words =~ /anywhere\s+(\S+)/);
   ^        ^                       ^^^
   +--------+                        |
     Note 1                       Note 2

Note 1: In list context, =~ returns the list of captured items, so the assignment target needs to be a list (hence the parentheses around $w_after).

Note 2: \s+ allows one or more whitespace characters after anywhere.
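
The same idea carries over to other regex flavours. Below is a small Python sketch, offered only for comparison with the Perl line; the function name word_after and the sample sentence are hypothetical.

```python
import re


def word_after(text, keyword):
    """Return the first run of non-whitespace after `keyword`, or None."""
    # re.escape keeps any special characters in the keyword literal.
    match = re.search(re.escape(keyword) + r"\s+(\S+)", text)
    return match.group(1) if match else None


print(word_after("search anywhere   here please", "anywhere"))
```

As in the Perl version, \s+ tolerates several blanks and the capture group holds the word that follows.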

R extract specific word after keyword

You can combine the text together as one string and extract the values based on pattern in the data. This approach will work irrespective of the line number in the data provided the pattern in the data is always valid for all the files.

my_txt <- readLines(paste(path, "/input.txt", sep = ""))
#Collapse data in one string
text <- paste0(my_txt, collapse = '\n')
#Extract text after FirstName till '\n'
fName <- sub('.*FirstName (.*?)\n.*', '\\1', text)
fName
#[1] "John Woo"

#Extract text after Surname till '\n'
SName <- sub('.*Surname (.*?)\n.*', '\\1', text)
SName
#[1] "T"

#Extract text after Father's Name till '\n'
FatherNm <- sub(".*Father's Name (.*?)\n.*", '\\1', text)
FatherNm
#[1] "Bill Woo"

#Extract numbers which come after Date of Birth.
dob <- sub(".*Date of Birth (\\d+/\\d+/\\d+).*", '\\1', text)
dob
#[1] "13/07/1970"
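
For readers who want to try the same idea outside R, here is a hedged Python sketch. The original input file is not shown, so the sample text below is an assumption reconstructed from the outputs above, and the helper name field is made up.

```python
import re

# Hypothetical file contents, chosen to be consistent with the outputs above.
text = "\n".join([
    "FirstName John Woo",
    "Surname T",
    "Father's Name Bill Woo",
    "Date of Birth 13/07/1970",
])


def field(text, label, value_pattern=r"[^\n]*"):
    """Return what follows `label` and a space, up to the end of the line."""
    match = re.search(re.escape(label) + r" (" + value_pattern + r")", text)
    return match.group(1) if match else None


print(field(text, "FirstName"))
print(field(text, "Surname"))
print(field(text, "Father's Name"))
print(field(text, "Date of Birth", r"\d+/\d+/\d+"))
```

Unlike sub, which returns the whole input when the pattern does not match, this sketch returns None for a missing label.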

Regex, extract word before and after another one

^([0-9a-zA-Z]{3})\s+limk$|^limk\s+([0-9a-zA-Z]{3})$
  1. ^ Matches the beginning of the line
  2. [0-9a-zA-Z]{3} Matches exactly 3 ASCII letters (upper or lower case) or digits
  3. \s+ Matches one or more whitespace characters
  4. limk Matches the literal text limk
  5. $ Matches the end of the line
  6. | Starts the second alternative:
  7. ^ Matches the beginning of the line
  8. limk Matches the literal text limk
  9. \s+ Matches one or more whitespace characters
  10. [0-9a-zA-Z]{3} Matches exactly 3 ASCII letters (upper or lower case) or digits
  11. $ Matches the end of the line

The code:

import re

s = """limk ab1
limk ab2 helo
rest helo
ab3 limk helo
ab4 limk"""

matches = [x[0] if x[0] != '' else x[1] for x in re.findall(r'(?m)^([0-9a-zA-Z]{3})\s+limk$|^limk\s+([0-9a-zA-Z]{3})$', s)]
for match in matches:
    print(match)

Prints:

ab1
ab4

How to extract the number after specific word using awk?

Using sed

$ sed '1s/$/\tGPUAllocated/;s~.*gres/gpu=\([0-9]\).*~& \t\1~;1!{\~gres/gpu=[0-9]~!s/$/ \t0/}' input_file
Index  AllocTres                             CPUTotal  GPUAllocated
1      cpu=1,mem=256G                        18        0
2      cpu=2,mem=1024M                       16        0
3                                            4         0
4      cpu=12,gres/gpu=3                     12        3
5                                            8         0
6                                            9         0
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2  20        4
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1  21        3
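
The core of the extraction (take the number after gres/gpu=, otherwise use 0) can also be sketched in Python. This is not a translation of the whole sed command: it handles only a single AllocTres value, the function name gpu_allocated is made up, and it accepts multi-digit counts where the sed pattern captures one digit.

```python
import re


def gpu_allocated(alloc_tres):
    """Return the number after 'gres/gpu=' in a TRES string, or 0 if absent."""
    match = re.search(r"gres/gpu=(\d+)", alloc_tres)
    return int(match.group(1)) if match else 0


# AllocTres values taken from the sample output above; "" is an empty field.
for tres in ["cpu=1,mem=256G", "", "cpu=12,gres/gpu=3",
             "cpu=13,gres/gpu=4,gres/gpu:ret6000=2"]:
    print(repr(tres), gpu_allocated(tres))
```

Entries such as gres/gpu:ret6000=2 are ignored because the pattern requires an equals sign directly after gpu.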

How to Extract Words Following a Key Word

You need to make sure "our" is with space boundaries, like this:

our = r'(^|\s+)our(\s+)?\W+(?P<after>(?:\w+\W+){,4})'

Specifically, (^|\s+)our(\s+)? is the part you need to adjust. The example only handles whitespace and the start of the string, so you might need to extend it to allow quotes or other special characters.
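
A short usage sketch follows, assuming the pattern is used with Python's re module. The function name words_after_our and the sample sentences are hypothetical.

```python
import re

# The pattern from the answer, written as a raw string.
OUR = re.compile(r"(^|\s+)our(\s+)?\W+(?P<after>(?:\w+\W+){,4})")


def words_after_our(text):
    """Return up to four words that follow a standalone 'our'."""
    match = OUR.search(text)
    if match is None:
        return []
    # The named group keeps the separators, so pull out just the words.
    return re.findall(r"\w+", match.group("after"))


print(words_after_our("We love our new office space very much indeed."))
print(words_after_our("The hour is late."))
```

The second call returns an empty list because the "our" inside "hour" is not preceded by whitespace or the start of the string. Note that each captured word must be followed by at least one non-word character, so a final word with nothing after it is not included.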


