Using Java to find substring of a bigger string using Regular Expression
You should be able to use non-greedy quantifiers, specifically *?. You'll probably want the following:
Pattern MY_PATTERN = Pattern.compile("\\[(.*?)\\]");
This will give you a pattern that will match your string and put the text within the square brackets in the first group. Have a look at the Pattern API Documentation for more information.
To extract the string, you could use something like the following:
Matcher m = MY_PATTERN.matcher("FOO[BAR]");
while (m.find()) {
String s = m.group(1);
// s now contains "BAR"
}
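To see why the non-greedy quantifier matters, here is a minimal sketch that runs the same extraction with *? and with a plain *. The class name BracketExtractor, the helper groups() and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketExtractor {
    // Collects the text of capturing group 1 for every match of regex in input.
    static List<String> groups(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "FOO[BAR] and BAZ[QUX]";
        // Non-greedy: each match stops at the first closing bracket,
        // so the list holds two elements, "BAR" and "QUX".
        System.out.println(groups("\\[(.*?)\\]", input));
        // Greedy: the match runs on to the last closing bracket,
        // so the list holds one element, "BAR] and BAZ[QUX".
        System.out.println(groups("\\[(.*)\\]", input));
    }
}
```

With a single pair of brackets both patterns give the same result; the difference only shows once the input contains more than one pair.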
How to grab piece of a string from a bigger string using regex or any method
This should do the trick
String s = "if value=StartTopic topic=testParser, multiCopy=false, required=true, all=false, path=/Return/ReturnData/IRSW2";
String regex= "path=[^\\,]*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s);
if(m.find()) {
System.out.println(m.group());
}
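If you only want the value and not the leading "path=", a capturing group around the character class should do it. This is a sketch under that assumption; the class PathValue and the method extractPath() are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathValue {
    // Returns the text after "path=" up to the next comma (or end of input),
    // or null when the input contains no "path=" at all.
    static String extractPath(String s) {
        Matcher m = Pattern.compile("path=([^,]*)").matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "if value=StartTopic topic=testParser, multiCopy=false, "
                 + "required=true, all=false, path=/Return/ReturnData/IRSW2";
        System.out.println(extractPath(s)); // /Return/ReturnData/IRSW2
    }
}
```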
Search substring in a string using regex
To find regex matches, use the regex classes Pattern and Matcher.
String term = "term";
ArrayList<String> a = new ArrayList<String>();
a.add("123term456"); //true
a.add("A123Term5"); //false
a.add("term456"); //true
a.add("123term"); //true
Pattern p = Pattern.compile("^[^A-Za-z]*(" + term + ")[^A-Za-z]*$");
for(String text : a) {
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println("Found: " + m.group(1) );
//the term sits inside the only capturing group, so read it with group(1)
}
else System.out.println("No match for: " + term + " in " + text);
}
In the example above, we create an instance of Pattern (https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) to find matches in the text you are matching against.
Note that I adjusted the regex a bit. The character class in this code excludes all letters A-Z and their lowercase versions from the parts before and after the term. It also allows for situations where there are no characters at all before or after the match term. If you need to have something there, use + instead of *. I also anchored the regex with ^ and $ so that the whole text, from start to end, must consist of exactly these three parts. If this doesn't fit your use case, you may need to adjust.
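One thing to watch when concatenating a term into a regex: any metacharacters in the term (+, ., ?, and so on) are interpreted as regex syntax. A possible safeguard, not part of the code above, is to wrap the term in Pattern.quote() so it is matched literally. The class QuotedTerm and the sample strings are invented for this sketch.

```java
import java.util.regex.Pattern;

class QuotedTerm {
    // Same surrounding pattern as above, but the term is matched literally.
    static boolean matches(String term, String text) {
        Pattern p = Pattern.compile(
                "^[^A-Za-z]*(" + Pattern.quote(term) + ")[^A-Za-z]*$");
        return p.matcher(text).find();
    }

    // Plain concatenation, for comparison: "+" in the term acts as a quantifier.
    static boolean matchesUnquoted(String term, String text) {
        Pattern p = Pattern.compile("^[^A-Za-z]*(" + term + ")[^A-Za-z]*$");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("1+1", "##1+1##"));         // true
        System.out.println(matchesUnquoted("1+1", "##1+1##")); // false
        System.out.println(matches("term", "123term456"));     // true
    }
}
```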
To demonstrate using this with a variety of different terms:
ArrayList<String> terms = new ArrayList<String>();
terms.add("term");
terms.add("the book is on the table");
terms.add("1981 was the best year ever!");
ArrayList<String> a = new ArrayList<String>();
a.add("123term456");
a.add("A123Term5");
a.add("the book is on the table456");
a.add("1@#!231981 was the best year ever!9#");
for (String term: terms) {
Pattern p = Pattern.compile("^[^A-Za-z]*(" + term + ")[^A-Za-z]*$");
for(String text : a) {
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println("Found: " + m.group(1) + " in " + text);
//the term sits inside the only capturing group, so read it with group(1)
}
else System.out.println("No match for: " + term + " in " + text);
}
}
Output for this is:
Found: term in 123term456
No match for: term in A123Term5
No match for: term in the book is on the table456....
In response to the question about making String term case insensitive, here's a way to build the regex string by using java.lang.Character to generate upper- and lowercase options for each letter.
String term = "This iS the teRm.";
String matchText = "123This is the term.";
StringBuilder str = new StringBuilder();
str.append("^[^A-Za-z]*(");
for (int i = 0; i < term.length(); i++) {
char c = term.charAt(i);
if (Character.isLetter(c))
str.append("(" + Character.toLowerCase(c) + "|" + Character.toUpperCase(c) + ")");
else str.append(c);
}
str.append(")[^A-Za-z]*$");
System.out.println(str.toString());
Pattern p = Pattern.compile(str.toString());
Matcher m = p.matcher(matchText);
if (m.find()) System.out.println("Found!");
else System.out.println("Not Found!");
This code outputs two lines. The first line is the regex string that's being compiled in the Pattern: "^[^A-Za-z]*((t|T)(h|H)(i|I)(s|S) (i|I)(s|S) (t|T)(h|H)(e|E) (t|T)(e|E)(r|R)(m|M).)[^A-Za-z]*$"
This adjusted regex allows for letters in the term to be matched regardless of case. The second output line is "Found!" because the mixed case term is found within matchText.
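As an alternative sketch, the Pattern.CASE_INSENSITIVE flag gets a similar effect without building the alternations by hand. This version also wraps the term in Pattern.quote(), which is my assumption about the desired behavior: it makes the trailing "." match only a literal dot, whereas in the hand-built regex it matches any character. The class name CaseInsensitiveTerm is made up.

```java
import java.util.regex.Pattern;

class CaseInsensitiveTerm {
    // Matches the term regardless of letter case, surrounded only by non-letters.
    static boolean matches(String term, String text) {
        Pattern p = Pattern.compile(
                "^[^A-Za-z]*(" + Pattern.quote(term) + ")[^A-Za-z]*$",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("This iS the teRm.", "123This is the term.")
                ? "Found!" : "Not Found!"); // Found!
    }
}
```

By default the flag only covers US-ASCII letters; for other alphabets you would also need Pattern.UNICODE_CASE.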
How to extract a substring using regex
Assuming you want the part between single quotes, use this regular expression with a Matcher:
"'(.*?)'"
Example:
String mydata = "some string with 'the data i want' inside";
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
Result:
the data i want
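A variation you may come across uses a negated character class instead of the non-greedy quantifier. The sketch below assumes the quoted text never contains a single quote itself, and loops with find() to collect every quoted part; the class SingleQuoted and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleQuoted {
    // Returns every run of text found between a pair of single quotes.
    static List<String> extractAll(String input) {
        List<String> parts = new ArrayList<>();
        // [^']* can never run past a quote, so no non-greedy quantifier is needed.
        Matcher m = Pattern.compile("'([^']*)'").matcher(input);
        while (m.find()) {
            parts.add(m.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(extractAll("some string with 'the data i want' inside"));
        // [the data i want]
    }
}
```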
Regular Expression To Find a Substring from a text with a particular matches, the sequence may come multiple times in the text
Since I didn't get my desired answer, and got multiple downvotes instead, I am accepting this one: (\$\{\w+\.\w+\})
Thank you all for your answers.
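For reference, a minimal sketch of how that regex could be used from Java to collect every occurrence. The sample text and the class name PlaceholderFinder are assumptions, since the original input isn't shown here; note the doubled backslashes needed inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderFinder {
    // Matches sequences shaped like ${word.word}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("(\\$\\{\\w+\\.\\w+\\})");

    // Returns every placeholder in the order it appears in the text.
    static List<String> find(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Hello ${user.name}, your id is ${user.id} and ${bad}";
        System.out.println(find(text)); // [${user.name}, ${user.id}]
    }
}
```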
Extract all substrings beginning and ending with a regex from large string
Something like
Pattern r = Pattern.compile("abcdef[\\s\\S]*?fedcba");
Matcher m = r.matcher(sInput);
while (m.find()) {
System.out.println("Found value: " + m.group() );
}
where sInput is your string to search. [\s\S]*? will match any number of any character up to the following fedcba. Thanks to the ? it's a non-greedy match, which means it won't continue until the last fedcba (as it would if it was greedy), thus giving you the separate strings.
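An equivalent way to let the match span line breaks, sketched here as an alternative, is to use . together with the Pattern.DOTALL flag instead of [\s\S]. The class DelimitedBlocks and the sample input are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitedBlocks {
    // With DOTALL, "." also matches line terminators, like [\s\S] does.
    private static final Pattern BLOCK =
            Pattern.compile("abcdef.*?fedcba", Pattern.DOTALL);

    // Returns every substring that starts with abcdef and ends at the next fedcba.
    static List<String> findAll(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sInput = "abcdef1\n2fedcba xx abcdef3fedcba";
        // Prints two blocks; the first one contains a line break.
        for (String block : findAll(sInput)) {
            System.out.println("Found value: " + block);
        }
    }
}
```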
How to extract specific substring from a bigger string java
It can be done in one line:
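String[] parts = input.replaceAll("(^[^(]*\\()|(\\)[^)]*$)", "").split("\\)\\(");
The call to replaceAll() strips off the leading and trailing brackets (plus any other junk characters before/after those first/last brackets), then you just split() on bracket pairs. The negated character classes [^(]* and [^)]* keep the stripping from running past the first opening or the last closing bracket.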
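Here is a small worked sketch of the strip-then-split idea on an invented input. It uses negated character classes ([^(]* and [^)]*) so that stripping stops at the first opening and the last closing bracket; the class BracketParts and the sample string are assumptions.

```java
import java.util.Arrays;

class BracketParts {
    // Strips everything up to the first "(" and from the last ")" onward,
    // then splits what remains on each ")(" pair.
    static String[] parts(String input) {
        return input
                .replaceAll("(^[^(]*\\()|(\\)[^)]*$)", "")
                .split("\\)\\(");
    }

    public static void main(String[] args) {
        // After stripping: "a)(b)(c"; after splitting: a, b, c
        System.out.println(Arrays.toString(parts("junk(a)(b)(c)junk")));
        // [a, b, c]
    }
}
```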
Java Regex: getting multiple parts of a string as substrings
I see no reason to use regex to replace any parts in the matched String. Just extract the values from the corresponding Matcher groups.
String file = m.group(4) + m.group(5) + m.group(7) + m.group(8)
+ "_" + m.group(9);
String path = m.group(13);
System.out.println(file);
System.out.println(path);
prints
MAIL_20140320_0000000002
XYZ
Using Java+regex, I want to find repeating characters in a string and replace that substring(s) with character found and # of times it was found
You may wrap the quantified backreference with a capturing group to be able to access this value later, and use Matcher#appendReplacement to actually modify the matches inside the string:
String text = "fghhhhjkjkljhdd";
String regex = "(\\w)(\\1+)";
Pattern r = Pattern.compile(regex);
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group(1) + (m.group(2).length()+1));
}
m.appendTail(sb);
System.out.println(sb); // => fgh4jkjkljhd2
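On Java 9 or later, the same result can probably be had more compactly with the Matcher.replaceAll overload that takes a function. This is a sketch assuming that Java version; the class name RunLengthCollapse is made up. Since \w never matches $ or \, the computed replacement needs no escaping here.

```java
import java.util.regex.Pattern;

class RunLengthCollapse {
    // Replaces each run of a repeated word character with the character
    // followed by the length of the run. Requires Java 9+.
    static String collapse(String text) {
        return Pattern.compile("(\\w)\\1+")
                .matcher(text)
                .replaceAll(mr -> mr.group(1) + mr.group().length());
    }

    public static void main(String[] args) {
        System.out.println(collapse("fghhhhjkjkljhdd")); // fgh4jkjkljhd2
    }
}
```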