Java - Best Way to Grab All Strings Between Two Strings? (Regex)

Java Regex Extract String between two Strings

I wrote a test based on "Extract string between two strings in Java", and it works. I suspect your input string does not match the pattern:

@Test
public void regex() {
    String str = "Nom for 3 Oscar, dom for 234235 Oscars";
    Pattern pattern = Pattern.compile("for(.*?)Oscar");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}

Output:

    3 
234235

After I answered, you edited your question, and now I see the problem: in your input string "oscar" starts with a lowercase "o", while the pattern uses an uppercase "O".
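
If the input may use either capitalization, one option is to make the pattern case-insensitive instead of editing the string. The following is a minimal sketch of that idea; the class name, the method name, and the sample input with a lowercase "oscar" are assumptions made up for illustration, since the edited question text is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveBetween {
    // Returns the trimmed text found between "for" and "oscar", ignoring case.
    static List<String> between(String input) {
        Pattern pattern = Pattern.compile("for(.*?)oscar", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(input);
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            results.add(matcher.group(1).trim());
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical input mixing "oscar" and "Oscars".
        System.out.println(between("Nom for 3 oscar, dom for 234235 Oscars")); // prints [3, 234235]
    }
}
```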

Java - Best way to grab ALL Strings between two Strings? (regex?)

You can construct the regex to do this for you:

// pattern1 and pattern2 are String objects
String regexString = Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2);

This treats pattern1 and pattern2 as literal text, and the text between them is captured in the first capturing group. You can remove Pattern.quote() if you want pattern1 and pattern2 to be interpreted as regular expressions, but then I can't guarantee the result, since any metacharacters in them will change what the regex matches.

You can customize how matching is done by adding flags to regexString.

  • If you want Unicode-aware case-insensitive matching, add (?iu) at the beginning of regexString, or pass the Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE flags to the Pattern.compile method.
  • If you want to capture the content even when the two delimiting strings are on different lines, add (?s) before (.*?), i.e. "(?s)(.*?)", or pass the Pattern.DOTALL flag to the Pattern.compile method.
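
As a rough illustration of the two flag options above, the sketch below passes all three flags to Pattern.compile. The class name, method name, and sample delimiters are hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlaggedBetween {
    // Collects the text between two literal delimiters, ignoring case and
    // letting '.' match line terminators as well.
    static List<String> between(String start, String end, String text) {
        String regexString = Pattern.quote(start) + "(.*?)" + Pattern.quote(end);
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.DOTALL;
        Matcher matcher = Pattern.compile(regexString, flags).matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // The delimiters differ in case from the text, and the content spans two lines.
        List<String> found = between("<b>", "</b>", "<B>first\nsecond</B>");
        System.out.println(found.size());  // prints 1
        System.out.println(found.get(0));  // prints "first" and "second" on separate lines
    }
}
```

Without Pattern.DOTALL the same call would return an empty list here, because . would stop at the line break.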

Then compile the regex, obtain a Matcher object, iterate through the matches and save them into a List (or any Collection, it's up to you).

Pattern pattern = Pattern.compile(regexString);
// text contains the full text that you want to extract data
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    String textInBetween = matcher.group(1); // Since (.*?) is capturing group 1
    // You can insert match into a List/Collection here
}
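
The same loop can also be written as a stream pipeline. This is only a sketch of an alternative, not part of the answer above: it assumes Java 9 or later (where Matcher.results() is available), and the class name, method name, and sample input are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class BetweenStream {
    // Returns every piece of text found between the two literal delimiters.
    static List<String> between(String pattern1, String pattern2, String text) {
        String regexString = Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2);
        return Pattern.compile(regexString)
                .matcher(text)
                .results()                       // Stream<MatchResult>, Java 9+
                .map(result -> result.group(1))  // keep only capturing group 1
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(between("[", "]", "a[1]b[22]c")); // prints [1, 22]
    }
}
```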

Testing code:

String pattern1 = "hgb";
String pattern2 = "|";
String text = "sdfjsdkhfkjsdf hgb sdjfkhsdkfsdf |sdfjksdhfjksd sdf sdkjfhsdkf | sdkjfh hgb sdkjfdshfks|";

Pattern p = Pattern.compile(Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2));
Matcher m = p.matcher(text);
while (m.find()) {
    System.out.println(m.group(1));
}

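Note that if you use the method above to search for the text between foo and bar in the input "foo text foo text bar text bar", you get a single match: " text foo text ". The match starts at the first foo and ends at the first bar after it; the regex does not skip ahead to the innermost foo.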
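
The behavior with repeated delimiters can be checked with a small sketch. The second pattern is an addition that is not part of the answer above: it uses a negative lookahead so that the captured text cannot contain another foo, which is one possible way to get only the innermost pair. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedDelimiters {
    // The construction from the answer: literal delimiters, lazy capture.
    static final String PLAIN = Pattern.quote("foo") + "(.*?)" + Pattern.quote("bar");
    // Assumed variant: each captured character must not begin another "foo".
    static final String TEMPERED = "foo((?:(?!foo).)*?)bar";

    static List<String> find(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "foo text foo text bar text bar";
        System.out.println(find(PLAIN, text));    // prints [ text foo text ]
        System.out.println(find(TEMPERED, text)); // prints [ text ]
    }
}
```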

Regex Match all characters between two strings

For example

(?<=This is)(.*)(?=sentence)

I used a lookbehind (?<=) and a lookahead (?=) so that "This is" and "sentence" are not included in the match, but this depends on your use case; you can also simply write This is(.*)sentence and read the first capturing group.

The important thing here is to activate the "dotall" mode of your regex engine, so that . also matches newlines. How you do this depends on your regex engine.

The next question is whether to use .* or .*?. The first is greedy and matches up to the last "sentence" in your string; the second is lazy and matches up to the next "sentence" in your string.
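
The difference between the two quantifiers is easy to see on a string that contains "sentence" twice. A minimal Java sketch, with a made-up class name, helper, and sample text:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Returns capturing group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "This is a sentence and another sentence";
        // Greedy: runs to the last "sentence".
        System.out.println("[" + firstGroup("This is(.*)sentence", text) + "]");
        // prints [ a sentence and another ]
        // Lazy: stops at the next "sentence".
        System.out.println("[" + firstGroup("This is(.*?)sentence", text) + "]");
        // prints [ a ]
    }
}
```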

Update

This is(?s)(.*)sentence

Here (?s) turns on the dotall modifier, so that . also matches newline characters.
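
In Java, which accepts an inline flag in the middle of a pattern, the effect can be checked as follows. This is a small sketch with a hypothetical class name and sample text; other regex engines may require the flag at the start of the pattern or use a different mechanism.

```java
import java.util.regex.Pattern;

class DotallDemo {
    // True if the regex matches somewhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "This is\na simple\nsentence";
        // Without dotall, '.' cannot cross the line breaks.
        System.out.println(found("This is(.*)sentence", text));     // prints false
        // With (?s), '.' also matches '\n'.
        System.out.println(found("This is(?s)(.*)sentence", text)); // prints true
    }
}
```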

Update 2:

(?<=is \()(.*?)(?=\s*\))

This matches "a simple" in your example "This is (a simple) sentence".

Extract a complex String from between two Strings

Pattern p = Pattern.compile("\\[Text:(.*?)\\]");
Matcher m = p.matcher("[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]");
if (m.find()) { // group(1) is only valid after a successful find()
    System.out.println(m.group(1));
}

Gives:

PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1

The \\[ and \\] are to escape the brackets, which are special characters in regexes. The .*? is a non-greedy quantifier, so it stops gobbling up characters when it reaches the closing bracket. This part of the regex is given inside a capturing group (), which you can access with m.group(1).

Java Get String Between Two Strings

You can use regex to accomplish this, by searching for the ~PHONE_characters=digits pattern, like so:

String str = "~PHONE_IDX=200~PHONE_DD=100~PHONE_KK=50~";
Pattern p = Pattern.compile("~PHONE_(?<attribute>\\w+)=(?<value>\\d+)");
Matcher m = p.matcher(str); // matcher for the string
while (m.find()) {
    System.out.println("Next group: " + m.group());
    System.out.println("Attribute: " + m.group("attribute"));
    System.out.println("Value: " + m.group("value"));
}

This code will output the following:

Next group: ~PHONE_IDX=200
Attribute: IDX
Value: 200
Next group: ~PHONE_DD=100
Attribute: DD
Value: 100
Next group: ~PHONE_KK=50
Attribute: KK
Value: 50

Regular expression to get a string between two strings in Javascript

A lookahead (that (?= part) does not consume any input. It is a zero-width assertion (as are boundary checks and lookbehinds).

You want a regular match here, to consume the cow portion. To capture the portion in between, use a capturing group (just put the part of the pattern you want to capture inside parentheses):

cow(.*)milk

No lookaheads are needed at all.
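
The pattern behaves the same way in Java, the language used elsewhere on this page. The sketch below is only an illustration; the class name, method name, and sample sentence are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CowMilk {
    // Returns the text between "cow" and the last "milk", or null if there is no match.
    static String between(String text) {
        Matcher matcher = Pattern.compile("cow(.*)milk").matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + between("My cow always gives milk") + "]"); // prints [ always gives ]
    }
}
```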

Regular Expression to find a string included between two characters while EXCLUDING the delimiters

Easily done:

(?<=\[)(.*?)(?=\])

Technically that's using lookaheads and lookbehinds. See Lookahead and Lookbehind Zero-Width Assertions. The pattern consists of:

  • is preceded by a [ that is not captured (lookbehind);
  • a non-greedy captured group. It's non-greedy to stop at the first ]; and
  • is followed by a ] that is not captured (lookahead).

Alternatively you can just capture what's between the square brackets:

\[(.*?)\]

and return the first captured group instead of the entire match.
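
Both variants can be compared side by side in Java. In this sketch the lookaround version reads the whole match with group(), while the capturing version reads group(1); the class name, method names, and sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketContents {
    // Lookbehind/lookahead: the brackets are asserted but not part of the match.
    static List<String> withLookaround(String text) {
        Matcher matcher = Pattern.compile("(?<=\\[)(.*?)(?=\\])").matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group()); // the entire match
        }
        return found;
    }

    // Plain capturing group: the brackets are matched, the content is group 1.
    static List<String> withCaptureGroup(String text) {
        Matcher matcher = Pattern.compile("\\[(.*?)\\]").matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(withLookaround("a [b] c [d]"));   // prints [b, d]
        System.out.println(withCaptureGroup("a [b] c [d]")); // prints [b, d]
    }
}
```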

Extract a substring between ; and WORD in Java using regex

You need to find a ; and then match any 0+ chars other than ; as few as possible up to the first occurrence of WORD. You may do that using

;([^;]*?)WORD

Note that leading and trailing whitespace can easily be trimmed off with .trim() after a match is found.

See the Java demo below:

// Requires java.util.Arrays, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern
List<String> strs = Arrays.asList(
        "(XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)",
        "(XcdcdRR: dTff 5.4.1; TVS A3003 WORD/UJHKKKHH fpp)",
        "(LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp) );");
Pattern pattern = Pattern.compile(";([^;]*?)WORD");
for (String s : strs) { // an enhanced for loop; "while" does not compile here
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        System.out.println(matcher.group(1).trim());
    }
}

Output:

TVS A3003
TVS A3003
TVS A3003

Regex Help for text/pattern between two keywords

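Use (?s) (dotall mode) so that . also matches newline characters; see "Regex Match all characters between two strings" above.

import re

# str1 holds the input text. The inline flag goes first: recent Python
# versions reject a global flag placed in the middle of a pattern.
print(re.findall(r'(?s)RIASWIX(.*?)Sky Access', str1))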

