Java: Split string when an uppercase letter is found
You can use a regex with a zero-width positive lookahead. It finds uppercase letters but doesn't include them in the delimiter:
String s = "thisIsMyString";
String[] r = s.split("(?=\\p{Upper})");
Y(?=X) matches Y followed by X, but doesn't include X in the match. So (?=\\p{Upper}) matches an empty sequence followed by an uppercase letter, and split uses it as a delimiter.
See the javadoc for java.util.regex.Pattern for more info on Java regex syntax.
EDIT: By the way, it doesn't work with thisIsMyÜberString. For non-ASCII uppercase letters you need the Unicode uppercase letter category instead of the POSIX class:
String[] r = s.split("(?=\\p{Lu})");
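To see the difference between the two classes, here is a minimal sketch that runs both patterns on the string above. The class name UpperSplitDemo and its two helper methods are made up for illustration. It relies on the default behaviour of java.util.regex, where \p{Upper} covers only A-Z and \p{Lu} is the Unicode uppercase letter category.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class UpperSplitDemo {

    // Splits before ASCII uppercase letters only (POSIX class, default mode).
    static String[] splitPosix(String s) {
        return s.split("(?=\\p{Upper})");
    }

    // Splits before any Unicode uppercase letter (general category Lu).
    static String[] splitUnicode(String s) {
        return s.split("(?=\\p{Lu})");
    }

    public static void main(String[] args) {
        String s = "thisIsMy\u00DCberString"; // \u00DC is the letter Ü
        System.out.println(Arrays.toString(splitPosix(s)));
        System.out.println(Arrays.toString(splitUnicode(s)));
    }
}
```

With the POSIX class the word Über stays glued to the preceding My, while the Unicode category splits before it.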
Split by capital letters in Java
It looks like you want to convert camelcase into readable language. Is that the case?
If so, this solution should work for you - How do I convert CamelCase into human-readable names in Java?
If you want the subsequent words lowercased, you'll have to handle that yourself after the split.
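One way to handle the lowercasing yourself could look like the sketch below. It is not the solution from the linked question. The class CamelCaseWords and the method toReadable are hypothetical names, and the choice to leave the first word untouched and lowercase only the following ones is an assumption about what is wanted.

```java
import java.util.Locale;

// Hypothetical helper; not taken from the linked answer.
final class CamelCaseWords {

    // Splits before each uppercase letter, keeps the first word as is,
    // and lowercases every following word before joining with spaces.
    static String toReadable(String camel) {
        String[] words = camel.split("(?=\\p{Lu})");
        StringBuilder sb = new StringBuilder(words[0]);
        for (int i = 1; i < words.length; i++) {
            sb.append(' ').append(words[i].toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toReadable("thisIsMyString"));
        System.out.println(toReadable("MyCamelCase"));
    }
}
```

Locale.ROOT is used so the lowercasing does not depend on the default locale of the machine.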
java regular expression: conditionally split string by capital letters
Since you can have multiple consecutive upper case letters, you want to have lookbehind for lower case as well as lookahead for upper case:
(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])
If you want support for other languages, use the Unicode letter categories. The POSIX classes \p{Lower} and \p{Upper} match only ASCII letters by default in Java:
(?<=\\p{Ll})(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}\\p{Ll})
The first alternative matches between a lowercase and an uppercase letter. The second matches between an uppercase letter and another uppercase letter that is followed by a lowercase one.
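As a quick check of how the two alternatives behave on runs of capitals, here is a small sketch. The class name AcronymSplit and the sample identifiers are made up for illustration, and the pattern is written with the Unicode categories \p{Ll} and \p{Lu}, which is an assumption about the variant you want.

```java
import java.util.Arrays;

// Hypothetical demo class; names and sample inputs are illustrative only.
final class AcronymSplit {

    // Boundary between lower and upper case, or between an upper-case run
    // and the upper-case letter that starts the next word.
    private static final String BOUNDARY =
            "(?<=\\p{Ll})(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}\\p{Ll})";

    static String[] split(String s) {
        return s.split(BOUNDARY);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("parseHTTPResponse")));
        System.out.println(Arrays.toString(split("XMLHttpRequest")));
    }
}
```

The acronyms HTTP and XML stay in one piece because no split happens between two capitals unless the second one is followed by a lowercase letter.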
Split String based on Uppercase and Numbers
public static void main(String[] args) {
    String s = "HeFNeO2H3Be1H";
    // Split before every uppercase letter or digit (zero-width lookahead).
    String[] r = s.split("(?=[A-Z0-9])");
    for (String part : r) {
        System.out.println(part);
    }
}
Split the string from first Upper case letter
// `text` is the input string, declared elsewhere.
boolean hadDot = false; // makes sure we don't split before finding the file extension
String file = "", date = "";
for (int i = 0; i < text.length(); i++) {
    if (text.charAt(i) == '.') {
        hadDot = true;
        continue;
    }
    // Split at the first uppercase letter that comes after the dot.
    if (hadDot && Character.isUpperCase(text.charAt(i))) {
        file = text.substring(0, i);
        date = text.substring(i);
        break;
    }
}
How to split a string based on capital letters?
You need to match these chunks with /[A-Z]+[^A-Z]*|[^A-Z]+/g instead of splitting with a zero-width assertion pattern. The latter (in your case, a regex consisting only of a positive lookahead) has to check each position inside the string, and there is no way to tell the regex to skip a position once the lookaround pattern has matched.
const s = 'and some text hereOzievRQ7O37SB5qG3eLB';
console.log(s.match(/[A-Z]+[^A-Z]*|[^A-Z]+/g));
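The same match-instead-of-split idea can be sketched in Java with java.util.regex.Matcher. The class ChunkMatcher and the method chunks are hypothetical names. The pattern is the one from the answer above without the JavaScript /g flag, since repeated Matcher.find() calls play that role.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class ChunkMatcher {

    // A run of capitals plus the non-capitals after it,
    // or a leading run of non-capitals.
    private static final Pattern CHUNK = Pattern.compile("[A-Z]+[^A-Z]*|[^A-Z]+");

    // Collects every successive match instead of splitting on boundaries.
    static List<String> chunks(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = CHUNK.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(chunks("and some text hereOzievRQ7O37SB5qG3eLB"));
    }
}
```

Because each match consumes a whole run of capitals, consecutive capitals such as RQ or SB end up in the same chunk rather than being separated.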
split a string based on pattern in java - capital letters and numbers
You can actually do this with a regex alone, using lookahead and lookbehind (see "Special constructs" on this page: http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html )
/**
 * We'll use this pattern as divider to split the string into an array.
 * Usage: myString.split(DIVIDER_PATTERN);
 */
private static final String DIVIDER_PATTERN =
        // either there is anything that is not an uppercase character
        // followed by an uppercase character
        "(?<=[^\\p{Lu}])(?=\\p{Lu})"
        // or there is a lowercase character followed by a digit
        + "|(?<=[\\p{Ll}])(?=\\d)";
@Test
public void testStringSplitting() {
    assertEquals(2, "3/4Word".split(DIVIDER_PATTERN).length);
    assertEquals(7, "ManyManyWordsInThisBigThing".split(DIVIDER_PATTERN).length);
    assertEquals(7, "This123/4Mixed567ThingIsDifficult"
            .split(DIVIDER_PATTERN).length);
}
So what you can do is something like this:
for (String word : myString.split(DIVIDER_PATTERN)) {
    System.out.println(word);
}
Sean
Split a string at uppercase letters, but only if a lowercase letter follows in Python
We can try using re.sub here for a regex approach:
import re

inp = "2018Annual ReportInvesting for Growth and Market LeadershipOur CEO will provide you with all further details below."
inp = re.sub(r'(?<![A-Z\W])(?=[A-Z])', ' ', inp)
print(inp)
This prints:
2018 Annual Report Investing for Growth and Market Leadership Our CEO will provide you with all further details below.
The regex used here says to insert a space at any point for which:
(?<![A-Z\W]) what precedes is a word character other than a capital letter (the start of the string also passes this check)
(?=[A-Z]) and what follows is a capital letter
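Since most of this page deals with Java, here is a sketch of the same substitution using String.replaceAll. The class SpaceInserter and the method insertSpaces are hypothetical names. The pattern is the one from the Python answer, with the backslash doubled for a Java string literal. Note that Java's \W is ASCII-based by default, so behaviour on non-ASCII text may differ from Python.

```java
// Hypothetical demo class; names are illustrative only.
final class SpaceInserter {

    // Inserts a space at every zero-width position that is preceded by a
    // word character other than a capital and followed by a capital.
    static String insertSpaces(String s) {
        return s.replaceAll("(?<![A-Z\\W])(?=[A-Z])", " ");
    }

    public static void main(String[] args) {
        String inp = "2018Annual ReportInvesting for Growth and Market "
                + "LeadershipOur CEO will provide you with all further details below.";
        System.out.println(insertSpaces(inp));
    }
}
```

As in the Python version, an input that begins with a capital letter would get a leading space, because the negative lookbehind also succeeds at the start of the string.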