Using Java to Find Substring of a Bigger String Using Regular Expression

You should be able to use a non-greedy (reluctant) quantifier, specifically *?. You probably want the following:

Pattern MY_PATTERN = Pattern.compile("\\[(.*?)\\]");

This will give you a pattern that will match your string and put the text within the square brackets in the first group. Have a look at the Pattern API Documentation for more information.

To extract the string, you could use something like the following:

Matcher m = MY_PATTERN.matcher("FOO[BAR]");
while (m.find()) {
    String s = m.group(1);
    // s now contains "BAR"
}
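
Because find() advances through the input on each call, the same loop collects every bracketed value when the string contains more than one. A minimal sketch of that idea; the class name BracketExtractor and the sample input are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketExtractor {
    // Non-greedy group: captures the shortest text between '[' and ']'.
    private static final Pattern BRACKETED = Pattern.compile("\\[(.*?)\\]");

    // Returns the contents of every [...] pair, in order of appearance.
    static List<String> extractAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher m = BRACKETED.matcher(input);
        while (m.find()) {
            results.add(m.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(extractAll("FOO[BAR] and BAZ[QUX]")); // prints [BAR, QUX]
    }
}
```

An input without brackets simply yields an empty list, since find() never succeeds.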

How to grab piece of a string from a bigger string using regex or any method

This should do the trick:

String s = "if value=StartTopic topic=testParser, multiCopy=false, required=true, all=false, path=/Return/ReturnData/IRSW2";
// "path=" followed by everything up to the next comma (or the end of the string)
String regex = "path=[^,]*";

Pattern p = Pattern.compile(regex);

Matcher m = p.matcher(s);

if (m.find()) {
    System.out.println(m.group());
}
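
The snippet above prints the whole match, including the "path=" prefix. If only the value is wanted, a capturing group around the character class is one option. This is a sketch; the class name PathValue and the choice to return null when the key is missing are assumptions, not part of the original answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathValue {
    // Group 1 captures everything after "path=" up to the next comma or the end.
    private static final Pattern PATH = Pattern.compile("path=([^,]*)");

    // Returns the value of the path key, or null if the key is absent.
    static String extract(String input) {
        Matcher m = PATH.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "if value=StartTopic topic=testParser, multiCopy=false, "
                + "required=true, all=false, path=/Return/ReturnData/IRSW2";
        System.out.println(extract(s)); // prints /Return/ReturnData/IRSW2
    }
}
```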

Search substring in a string using regex

To find regex matches, you should use the regex classes Pattern and Matcher.

String term = "term";
ArrayList<String> a = new ArrayList<String>();
a.add("123term456"); // matches
a.add("A123Term5"); // no match
a.add("term456"); // matches
a.add("123term"); // matches
Pattern p = Pattern.compile("^[^A-Za-z]*(" + term + ")[^A-Za-z]*$");
for (String text : a) {
    Matcher m = p.matcher(text);
    if (m.find()) {
        // the term is the only capturing group in the pattern, so it is group(1)
        System.out.println("Found: " + m.group(1));
    } else {
        System.out.println("No match for: " + term);
    }
}

In the example above, we create an instance of Pattern (https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) and use it to obtain a Matcher for each text you are matching against.

Note that I adjusted the regex a bit. The character class [^A-Za-z] excludes all letters, upper- and lowercase, from the text before and after the term. Because of the * quantifier, it also allows for no characters at all before or after the term; if you need at least one character there, use + instead of *. I also anchored the regex with ^ and $ so that the whole text must consist of exactly these three parts (prefix, term, suffix). If this doesn't fit your use case, you may need to adjust it.

To demonstrate using this with a variety of different terms:

ArrayList<String> terms = new ArrayList<String>();
terms.add("term");
terms.add("the book is on the table");
terms.add("1981 was the best year ever!");
ArrayList<String> a = new ArrayList<String>();
a.add("123term456");
a.add("A123Term5");
a.add("the book is on the table456");
a.add("1@#!231981 was the best year ever!9#");
for (String term : terms) {
    Pattern p = Pattern.compile("^[^A-Za-z]*(" + term + ")[^A-Za-z]*$");

    for (String text : a) {
        Matcher m = p.matcher(text);
        if (m.find()) {
            // the term is the only capturing group in the pattern, so it is group(1)
            System.out.println("Found: " + m.group(1) + " in " + text);
        } else {
            System.out.println("No match for: " + term + " in " + text);
        }
    }
}

Output for this is:
Found: term in 123term456
No match for: term in A123Term5
No match for: term in the book is on the table456....
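
One caveat with building the pattern by string concatenation: any regex metacharacter inside the term (for example . or +) is interpreted as regex syntax rather than matched literally. A sketch of one way to guard against that with Pattern.quote; the class name TermMatcher and the sample strings are illustrative only:

```java
import java.util.regex.Pattern;

class TermMatcher {
    // Builds the same anchored pattern, but treats the term as literal text.
    static Pattern literalTermPattern(String term) {
        return Pattern.compile("^[^A-Za-z]*(" + Pattern.quote(term) + ")[^A-Za-z]*$");
    }

    static boolean matches(String term, String text) {
        return literalTermPattern(term).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("a.c", "12a.c34")); // true: the dot matches itself
        System.out.println(matches("a.c", "12abc34")); // false: the dot is no longer a wildcard
    }
}
```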

In response to the question about making String term case insensitive, here's a way to build the regex string by using java.lang.Character to generate both the upper- and lowercase option for each letter.

String term = "This iS the teRm.";
String matchText = "123This is the term.";
StringBuilder str = new StringBuilder();
str.append("^[^A-Za-z]*(");
for (int i = 0; i < term.length(); i++) {
    char c = term.charAt(i);
    if (Character.isLetter(c)) {
        str.append("(" + Character.toLowerCase(c) + "|" + Character.toUpperCase(c) + ")");
    } else {
        str.append(c);
    }
}
str.append(")[^A-Za-z]*$");

System.out.println(str.toString());

Pattern p = Pattern.compile(str.toString());
Matcher m = p.matcher(matchText);
if (m.find()) {
    System.out.println("Found!");
} else {
    System.out.println("Not Found!");
}

This code outputs two lines. The first line is the regex string that is compiled into the Pattern: "^[^A-Za-z]*((t|T)(h|H)(i|I)(s|S) (i|I)(s|S) (t|T)(h|H)(e|E) (t|T)(e|E)(r|R)(m|M).)[^A-Za-z]*$". This adjusted regex allows the letters in the term to be matched regardless of case. The second output line is "Found!" because the mixed-case term is found within matchText.
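
As an alternative to expanding each letter by hand, java.util.regex also provides the Pattern.CASE_INSENSITIVE flag. This is a different technique from the builder above, shown only as a sketch: CaseInsensitiveTerm is a hypothetical class name, and Pattern.quote is an addition here so that the trailing dot in the term is matched literally.

```java
import java.util.regex.Pattern;

class CaseInsensitiveTerm {
    // Same anchored shape as before; the flag makes the term match in any case.
    static boolean matches(String term, String text) {
        Pattern p = Pattern.compile(
                "^[^A-Za-z]*(" + Pattern.quote(term) + ")[^A-Za-z]*$",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("This iS the teRm.", "123This is the term.")); // true
        System.out.println(matches("term", "A123Term5")); // false: a letter precedes the term
    }
}
```

Note that the surrounding [^A-Za-z] classes still reject any letters before or after the term, so "A123Term5" fails even though the case of "Term" no longer matters.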

How to extract a substring using regex

Assuming you want the part between single quotes, use this regular expression with a Matcher:

"'(.*?)'"

Example:

String mydata = "some string with 'the data i want' inside";
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Result:


the data i want

Regular Expression To Find a Substring from a text with a particular matches, the sequence may come multiple times in the text

Since I didn't get the answer I was looking for (and got several downvotes instead), I am accepting this one:
(\$\{\w+\.\w+\})
It matches each placeholder of the form ${word.word}. Thank you all for your answers.
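
For use from Java, every backslash in that expression has to be doubled inside the string literal, and a find() loop picks up each occurrence. A sketch under those assumptions; the class name PlaceholderFinder and the sample sentence are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderFinder {
    // Java string form of (\$\{\w+\.\w+\})
    private static final Pattern PLACEHOLDER = Pattern.compile("(\\$\\{\\w+\\.\\w+\\})");

    // Returns every ${word.word} sequence found in the text, in order.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Hello ${user.name}, order ${order.id} shipped; ${broken} is skipped";
        System.out.println(findAll(text)); // prints [${user.name}, ${order.id}]
    }
}
```

${broken} is not reported because the expression requires a dot between two word sequences.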

Extract all substrings beginning and ending with a regex from large string

Something like:

Pattern r = Pattern.compile("abcdef[\\s\\S]*?fedcba");
Matcher m = r.matcher(sInput);
// loop with while (not if) so that every occurrence is reported, not just the first
while (m.find()) {
    System.out.println("Found value: " + m.group());
}

where sInput is your string to search.

[\s\S]*? will match any number of any character up to the following fedcba. Thanks to the ? it's a non-greedy match, which means it won't continue until the last fedcba (as it would if it was greedy), thus giving you the separate strings.
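
To make the difference concrete, the following sketch runs the non-greedy and the greedy form of the pattern over the same input. GreedyVsLazy and the sample string are hypothetical names and data chosen for the demonstration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Collects every match of the given regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "abcdef ONE fedcba junk abcdef TWO fedcba";
        // Non-greedy: stops at the first fedcba after each abcdef.
        System.out.println(findAll("abcdef[\\s\\S]*?fedcba", input));
        // prints [abcdef ONE fedcba, abcdef TWO fedcba]
        // Greedy: runs on to the last fedcba in the input.
        System.out.println(findAll("abcdef[\\s\\S]*fedcba", input));
        // prints [abcdef ONE fedcba junk abcdef TWO fedcba]
    }
}
```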

How to extract specific substring from a bigger string java

It can be done in one line:

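String[] parts = input.replaceAll("^[^(]*\\(|\\)[^)]*$", "").split("\\)\\(");

The call to replaceAll() strips off the leading and trailing brackets (plus any other junk characters before the first bracket and after the last one), then you just split() on bracket pairs. The negated character classes [^(]* and [^)]* keep the strip from running past the first "(" or starting before the last ")"; a greedy .* would swallow every bracket pair but the last.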
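
A worked example of the strip-then-split idea. This sketch uses negated character classes ([^(]* and [^)]*) for the strip step so that it stops at the first opening and the last closing bracket; the class name BracketSplitter and the sample input are hypothetical:

```java
import java.util.Arrays;

class BracketSplitter {
    static String[] split(String input) {
        // Step 1: remove everything up to the first "(" and from the last ")" onward.
        // Step 2: split what is left on each ")(" boundary.
        return input.replaceAll("^[^(]*\\(|\\)[^)]*$", "").split("\\)\\(");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("junk(a)(b)(c)junk"))); // prints [a, b, c]
    }
}
```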

Java Regex: getting multiple parts of a string as substrings

I see no reason to use regex to replace any parts in the matched String. Just extract the values from the corresponding Matcher groups.

String file = m.group(4) + m.group(5) + m.group(7) + m.group(8)
+ "_" + m.group(9);
String path = m.group(13);

System.out.println(file);
System.out.println(path);

prints

MAIL_20140320_0000000002
XYZ

Using Java+regex, I want to find repeating characters in a string and replace that substring(s) with character found and # of times it was found

You may wrap the quantified backreference with a capturing group to be able to access this value later, and use a Matcher#appendReplacement to actually modify the matches inside the string:

String text = "fghhhhjkjkljhdd";
String regex = "(\\w)(\\1+)";
Pattern r = Pattern.compile(regex);
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // group(2) holds the repeats after the first character, hence the +1
    m.appendReplacement(sb, m.group(1) + (m.group(2).length() + 1));
}
m.appendTail(sb);
System.out.println(sb); // => fgh4jkjkljhd2
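
On Java 9 and later, Matcher.replaceAll also accepts a function from MatchResult to the replacement string, which avoids the explicit StringBuffer loop. A sketch of the same transformation written that way; RunLengthCollapser is a made-up class name, and the version requirement is the main assumption:

```java
import java.util.regex.Pattern;

class RunLengthCollapser {
    // A word character followed by one or more copies of itself.
    private static final Pattern RUN = Pattern.compile("(\\w)\\1+");

    static String collapse(String text) {
        // match.group() is the whole run, so its length is the repeat count.
        return RUN.matcher(text)
                .replaceAll(match -> match.group(1) + match.group().length());
    }

    public static void main(String[] args) {
        System.out.println(collapse("fghhhhjkjkljhdd")); // prints fgh4jkjkljhd2
    }
}
```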



