Find string between two substrings
import re
s = 'asdf=5;iwantthis123jasd'
result = re.search('asdf=5;(.*)123jasd', s)
print(result.group(1))
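The snippet above hard-codes the delimiters and uses a greedy (.*). A minimal sketch of a reusable variant follows, assuming the delimiters should be treated as literal text and the shortest match is wanted. The helper name between is hypothetical, not part of the original answer.

```python
import re

def between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    # re.escape makes the delimiters literal; (.*?) is non-greedy, so the
    # match stops at the first `right` that follows `left`.
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    m = re.search(pattern, text, flags=re.DOTALL)
    return m.group(1) if m else None

print(between('asdf=5;iwantthis123jasd', 'asdf=5;', '123jasd'))  # iwantthis
```

Returning None instead of raising keeps the caller from hitting an AttributeError when re.search finds nothing, which the original print(result.group(1)) would do.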
Extracting a string between other two strings in R
You may use str_match with STR1 (.*?) STR2 (note the spaces are "meaningful"; if you want to just match anything in between STR1 and STR2, use STR1(.*?)STR2, or use STR1\\s*(.*?)\\s*STR2 to trim the value you need). If you have multiple occurrences, use str_match_all.
Also, if you need to match strings that span across line breaks/newlines, add (?s) at the start of the pattern: (?s)STR1(.*?)STR2 / (?s)STR1\\s*(.*?)\\s*STR2.
library(stringr)
a <- " anything goes here, STR1 GET_ME STR2, anything goes here"
res <- str_match(a, "STR1\\s*(.*?)\\s*STR2")
res[,2]
[1] "GET_ME"
Another way, using base R regexec (to get the first match):
test <- " anything goes here, STR1 GET_ME STR2, anything goes here STR1 GET_ME2 STR2"
pattern <- "STR1\\s*(.*?)\\s*STR2"
result <- regmatches(test, regexec(pattern, test))
result[[1]][2]
[1] "GET_ME"
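For readers comparing with Python, the same three pattern variants can be sketched with the re module. This is an illustrative analogue, not part of the R answer: re.search roughly plays the role of str_match, and re.findall that of str_match_all.

```python
import re

a = " anything goes here, STR1 GET_ME STR2, anything goes here STR1 GET_ME2 STR2"

# Spaces inside the pattern are literal, so this captures exactly what sits
# between "STR1 " and " STR2" (first match only).
first = re.search(r"STR1 (.*?) STR2", a)
print(first.group(1))   # GET_ME

# \s* outside the group trims surrounding whitespace; findall returns every match.
all_matches = re.findall(r"STR1\s*(.*?)\s*STR2", a)
print(all_matches)      # ['GET_ME', 'GET_ME2']

# (?s) lets . match newlines, so the captured value may span several lines.
multiline = re.findall(r"(?s)STR1\s*(.*?)\s*STR2", "STR1 line one\nline two STR2")
print(multiline)        # ['line one\nline two']
```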
Extract all strings between two strings
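// Requires: using System; using System.Collections.Generic;
private static List<string> ExtractFromBody(string body, string start, string end)
{
    List<string> matched = new List<string>();
    int searchFrom = 0;
    while (true)
    {
        int indexStart = body.IndexOf(start, searchFrom, StringComparison.Ordinal);
        if (indexStart == -1)
        {
            break; // no further start delimiter
        }
        // Look for the end delimiter only after the start delimiter.
        int contentStart = indexStart + start.Length;
        int indexEnd = body.IndexOf(end, contentStart, StringComparison.Ordinal);
        if (indexEnd == -1)
        {
            break; // unmatched start delimiter: stop instead of throwing
        }
        matched.Add(body.Substring(contentStart, indexEnd - contentStart));
        // Continue scanning after the end delimiter without copying the string.
        searchFrom = indexEnd + end.Length;
    }
    return matched;
}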
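The same index-based scan can be sketched without regular expressions in Python using str.find. The function name extract_all is hypothetical; the guard against empty delimiters is an added assumption, since an empty start and end would otherwise never advance the scan position.

```python
def extract_all(body, start, end):
    """Collect every substring that sits between `start` and the next `end`."""
    if not start or not end:
        raise ValueError("delimiters must be non-empty")
    matched = []
    pos = 0
    while True:
        i = body.find(start, pos)
        if i == -1:
            break  # no further start delimiter
        content_start = i + len(start)
        j = body.find(end, content_start)
        if j == -1:
            break  # unmatched start delimiter: stop scanning
        matched.append(body[content_start:j])
        pos = j + len(end)  # resume after the end delimiter
    return matched

print(extract_all("a[one]b[two]c[", "[", "]"))  # ['one', 'two']
```

Tracking a position instead of repeatedly slicing the input avoids copying the remainder of the string on every iteration.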
Extract text between two strings if a substring exists between the two strings using Regex in Python
You can fix the code using
pat1 = r'{0}\s*((?:(?!{0}).)*?{1}.*?)\s*{2}'.format(target1,target2,target3)
The pattern is
StartString\s*((?:(?!StartString).)*?substring 1.*?)\s*EndString
Details
StartString - left-hand delimiter
\s* - 0+ whitespaces
((?:(?!StartString).)*?substring 1.*?) - Group 1:
    (?:(?!StartString).)*? - any char, 0 or more occurrences but as few as possible, that does not start a StartString sequence (so the match cannot run past another left-hand delimiter)
    substring 1 - the required substring
    .*? - any 0+ chars, as few as possible
\s*EndString - 0+ whitespaces and the right-hand delimiter.
See the Python demo:
import re
text_data='ghsauaigyssts twh\n\nghguy hja StartString I want this text (1) if substring 1 lies in between the two strings EndString bhghk [jhbn] xxzh StartString I want this text (2) as a different variable if substring 2 lies in between the two strings EndString ghjyjgu'
target1 = 'StartString'
target2 = 'substring 1'
target3 = 'EndString'
pat1 = r'{0}\s*((?:(?!{0}).)*?{1}.*?)\s*{2}'.format(target1,target2,target3)
pattern = re.compile(pat1, flags=re.DOTALL)
print(pattern.findall(text_data))
# => ['I want this text (1) if substring 1 lies in between the two strings']
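If the single-pattern approach feels hard to maintain, one alternative (not the method from the answer above) is a two-step sketch: first extract every start/end block, then keep only the blocks that contain the required substring. The function name blocks_containing and the sample text are hypothetical.

```python
import re

def blocks_containing(text, start, end, needle):
    """Return start..end blocks (delimiters excluded) that contain `needle`."""
    s, e = re.escape(start), re.escape(end)
    # The tempered token (?:(?!start).)*? keeps a block from running past
    # another start delimiter; \s* trims whitespace around the block.
    pattern = re.compile(s + r"\s*((?:(?!" + s + r").)*?)\s*" + e, flags=re.DOTALL)
    return [block for block in pattern.findall(text) if needle in block]

sample = ("x StartString A substring 1 B EndString y "
          "StartString C substring 2 D EndString z")
print(blocks_containing(sample, "StartString", "EndString", "substring 1"))
# ['A substring 1 B']
```

The filtering step is plain Python, so the same extraction can be reused for substring 2 or any other condition without building a new pattern.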
Regex extract string between 2 strings, that contains 3rd string
Try this pattern:
TG00[^#]*TG40 155963[^#]*#
This pattern just says to find the string TG40 155963 in between TG00 and an ending #. For the sample data in your demo there were 3 matches.
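To see why [^#]* keeps each match inside a single record, here is a small Python sketch. The sample data is hypothetical (the original demo data is not shown here) and assumes every record starts with TG00 and ends with #.

```python
import re

# Hypothetical records: each starts with TG00 and ends with '#'.
data = (
    "TG00 header TG40 155963 payload#"
    "TG00 header TG40 999999 payload#"
    "TG00 other TG40 155963 more#"
)

# [^#]* cannot cross a '#', so a match never spills into the next record.
pattern = re.compile(r"TG00[^#]*TG40 155963[^#]*#")
matches = pattern.findall(data)
print(matches)
# ['TG00 header TG40 155963 payload#', 'TG00 other TG40 155963 more#']
```

The middle record is skipped because TG40 155963 does not occur before its closing #.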
Find all strings in between two strings in Go
In Go, since its RE2-based regexp package does not support lookarounds, you need to use a capturing group with the regexp.FindAllStringSubmatch function:
left := "LEFT_DELIMITER_TEXT_HERE"
right := "RIGHT_DELIMITER_TEXT_HERE"
rx := regexp.MustCompile(`(?s)` + regexp.QuoteMeta(left) + `(.*?)` + regexp.QuoteMeta(right))
matches := rx.FindAllStringSubmatch(str, -1)
Note the use of regexp.QuoteMeta, which automatically escapes all special regex metacharacters in the left- and right-hand delimiters.
The (?s) makes . match across lines, and (.*?) captures everything between the left- and right-hand delimiters into Group 1.
So, here you can use
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "Movies: A B C Food: 1 2 3"
	r := regexp.MustCompile(`Movies:\s*(.*?)\s*Food`)
	matches := r.FindAllStringSubmatch(str, -1)
	for _, v := range matches {
		fmt.Println(v[1])
	}
}
Output: A B C
Find all strings that are in between two sub strings
Use re.findall() to get every occurrence of your substring. $ is a special character in regular expressions (the "end of the string" anchor), so you need to escape it as \$ to match a literal $ character.
>>> import re
>>> s = '@@ cat $$ @@dog$^'
>>> re.findall(r'@@(.*?)\$', s)
[' cat ', 'dog']
To remove the leading and trailing whitespace, you can simply match it outside of the capture group.
>>> re.findall(r'@@\s*(.*?)\s*\$', s)
['cat', 'dog']
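Also, if the text between the delimiters may span newlines, you may consider using a negated character class, since [^$] also matches line breaks. Note that the greedy [^$]* keeps any whitespace that precedes the $.
>>> re.findall(r'@@\s*([^$]*)\s*\$', s)
['cat ', 'dog']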
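The difference only shows up once the input actually contains a line break. A short sketch with a hypothetical multi-line sample string compares the plain dot, the negated class, and re.DOTALL:

```python
import re

s = "@@ first\nline $ @@second$"

# Without DOTALL, . stops at the newline, so the first block is missed.
plain = re.findall(r"@@\s*(.*?)\s*\$", s)
print(plain)    # ['second']

# [^$] matches any character except '$', newlines included; being greedy,
# it also keeps the space before the '$'.
negated = re.findall(r"@@\s*([^$]*)\s*\$", s)
print(negated)  # ['first\nline ', 'second']

# re.DOTALL keeps the lazy .*?, so the trailing \s* can still trim whitespace.
dotall = re.findall(r"@@\s*(.*?)\s*\$", s, flags=re.DOTALL)
print(dotall)   # ['first\nline', 'second']
```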
Regular expression to get a string between two strings in Javascript
A lookahead (that (?= part) does not consume any input. It is a zero-width assertion (as are boundary checks and lookbehinds).
You want a regular match here, to consume the cow portion. To capture the portion in between, use a capturing group (just put the part of the pattern you want to capture inside parentheses):
cow(.*)milk
No lookaheads are needed at all.
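The distinction between consuming and asserting can be sketched in Python, whose re module has the same capturing-group and lookaround semantics for this case. The sample sentence is hypothetical.

```python
import re

s = "My cow always gives milk"

# Capturing group: the delimiters are consumed as part of the match,
# and the text in between is available as group 1.
m = re.search(r"cow(.*)milk", s)
print(repr(m.group(0)))   # 'cow always gives milk'
print(repr(m.group(1)))   # ' always gives '

# Lookarounds: the delimiters are only asserted, not consumed,
# so the whole match is already the text in between.
m2 = re.search(r"(?<=cow).*(?=milk)", s)
print(repr(m2.group(0)))  # ' always gives '
```

Both forms yield the same middle text; the capturing group is simply the more portable choice, since some engines restrict or lack lookbehind.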