Split String With Dot as Delimiter

Java string split with . (dot)

You need to escape the dot if you want to split on a literal dot:

String extensionRemoved = filename.split("\\.")[0];

Otherwise you are splitting on the regex ., which means "any character".

Note the double backslash needed to create a single backslash in the regex.


You're getting an ArrayIndexOutOfBoundsException because your input string is just a dot, i.e. ".". This is an edge case that produces an empty array when split on a dot: split(regex) removes all trailing empty strings from the result, and splitting a dot on a dot yields only two empty strings, so once the trailing empty strings are removed nothing is left.

To avoid getting an ArrayIndexOutOfBoundsException for this edge case, use the overload split(regex, limit), whose second parameter limits the size of the resulting array. When limit is negative, trailing empty strings are no longer removed from the resulting array:

".".split("\\.", -1) // returns an array of two blanks, ie ["", ""]

That is, when filename is just a dot ".", calling filename.split("\\.", -1)[0] returns an empty string, but calling filename.split("\\.")[0] throws an ArrayIndexOutOfBoundsException.
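The difference between the two overloads can be checked with a short sketch. The class name DotSplitEdgeCase and the helper beforeFirstDot are hypothetical names used only for illustration; they are not part of the original answer.

```java
import java.util.Arrays;

public class DotSplitEdgeCase {
    // Hypothetical helper: returns the text before the first dot.
    // The negative limit keeps trailing empty strings, so index 0 always exists.
    static String beforeFirstDot(String filename) {
        return filename.split("\\.", -1)[0];
    }

    public static void main(String[] args) {
        // Default overload: trailing empty strings are dropped, leaving nothing.
        System.out.println(Arrays.toString(".".split("\\.")));     // []
        // Negative limit: both empty strings are kept.
        System.out.println(Arrays.toString(".".split("\\.", -1))); // [, ]

        System.out.println(beforeFirstDot("report.txt"));          // report
        System.out.println(beforeFirstDot(".").isEmpty());         // true
    }
}
```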

Split string with dot as delimiter

split() accepts a regular expression, so you need to escape . to keep it from being treated as a regex metacharacter. Here's an example:

String[] fn = filename.split("\\."); 
return fn[0];

How to split a string using the period/dot/decimal point '.' as a delimiter in R

We need to either escape the dot (\\.) or use fixed = TRUE, because . is a regex metacharacter that matches any character.

strsplit(s, '.', fixed = TRUE)[[1]][2]
[1] "334"

According to ?strsplit

split - character vector (or object which can be coerced to such) containing regular expression(s) (unless fixed = TRUE) to use for splitting. If empty matches occur, in particular if split has length 0, x is split into single characters. If split has length greater than 1, it is re-cycled along x.

Also, as strsplit returns a list, extract the list element with [[ and then get the second element ([2]).


Or wrap with fixed in str_split

library(stringr)
str_split(s, fixed('.'))[[1]][2]
[1] "334"

We can also get the output with trimws

trimws(s, whitespace = ".*\\.")
[1] "334"

Or with sub

sub(".*\\.", "", s)
[1] "334"

String split with Dot character not working

Did you escape the dot? string.split("\\.") or string.split("[.]")
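As a quick check, the two escaped forms behave identically, while the unescaped dot matches every character and leaves nothing behind. The class name EscapeDotDemo, the helper parts, and the sample string "1.2.3" are assumptions made for this sketch.

```java
import java.util.Arrays;

public class EscapeDotDemo {
    // Hypothetical helper: splits on a literal dot using a character class.
    static String[] parts(String s) {
        return s.split("[.]");
    }

    public static void main(String[] args) {
        String version = "1.2.3";

        // Backslash-escaped dot and character-class dot give the same result.
        System.out.println(Arrays.toString(version.split("\\."))); // [1, 2, 3]
        System.out.println(Arrays.toString(parts(version)));       // [1, 2, 3]

        // Unescaped dot: every character is a delimiter, so only empty strings
        // are produced, and all of them are removed as trailing empties.
        System.out.println(version.split(".").length);             // 0
    }
}
```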

In shell, split a portion of a string with dot as delimiter

First, note that you don't use $ when assigning to a parameter in the shell. Your first line should be just this:

AU_NAME=AU_MSM3-3.7-00.01.02.03

The $ is used to get the value of the parameter once assigned. And the bit after the $ can be an expression in curly braces with extra stuff besides just the name, allowing you to perform various operations on the value. For example, you can do something like this:

IFS=. read major minor micro build <<EOF
${AU_NAME##*-}
EOF

where the ##*- strips off everything from the beginning of the string through the last '-', leaving just "00.01.02.03", and the IFS (Internal Field Separator) parameter tells the shell where to break the string into fields.

In bash, zsh, and ksh93+, you can get that onto one line by shortening the here-document to a here-string:

IFS=. read major minor micro build <<<"${AU_NAME##*-}"

More generally, in those same shells, you can split into an arbitrarily-sized array instead of distinct variables:

IFS=. components=(${AU_NAME##*-})

(Though that syntax won't work in especially-ancient versions of ksh; in them you have to do this instead:

IFS=. set -A components ${AU_NAME##*-}

)

That gets you this equivalence (except in zsh, which by default numbers the elements 1-4 instead of 0-3):

major=${components[0]}
minor=${components[1]}
micro=${components[2]}
build=${components[3]}

Splitting a string by space, dot, and comma at the same time

The split method takes a regex pattern, so you can match more elaborate cases when splitting your string.

A matching pattern for your case would be:

[ .,]+

Regex explanation:

[ .,] - The brackets create a character set, which matches any single character in the set (here a space, a dot, or a comma). Inside the brackets the dot is literal, so it does not need escaping.

+ - The plus sign is a quantifier; it matches the previous token (the character set) one or more times. This handles delimiters that follow one another, which would otherwise create empty strings in the array.

You can test it with the following code:

class Main {
    public static void main(String[] args) {
        String str = "Hello, World!, StackOverflow. Test Regex";
        // Split on runs of spaces, dots, and commas.
        String[] split = str.split("[ .,]+");
        for (String s : split) {
            System.out.println(s);
        }
    }
}

The output is:

Hello
World!
StackOverflow
Test
Regex

String split by dot - Java

This handles all delim values:

String str = "21.12.2015";
String delim = "."; // or "-" or "?" or ...
String[] st = str.split(java.util.regex.Pattern.quote(delim));
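A small sketch of the same idea with a few delimiters that are regex metacharacters. The class name QuoteDelimiterDemo, the helper splitLiteral, and the extra sample strings are hypothetical and used only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDelimiterDemo {
    // Hypothetical helper: treats delim as literal text, whatever it contains.
    static String[] splitLiteral(String text, String delim) {
        return text.split(Pattern.quote(delim));
    }

    public static void main(String[] args) {
        // ".", "?" and "|" are all regex metacharacters; quoting makes them literal.
        System.out.println(Arrays.toString(splitLiteral("21.12.2015", "."))); // [21, 12, 2015]
        System.out.println(Arrays.toString(splitLiteral("21?12?2015", "?"))); // [21, 12, 2015]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));      // [a, b, c]
    }
}
```

Quoting the delimiter is convenient when it comes from user input or configuration, since the caller does not have to know which characters are special in a regex.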

Split a string with dot as delimiter only when the following substring starts with a capital letter

So long as you can guarantee sentences will start with capital letters, you can use a lookahead for [A-Z]. You'll probably also want the split to consume the whitespace after the dot, which you can do by including \s*? in the pattern:

import re
s = 'The fox is running. The cat is drinking. The phone runs on Android 4.3. How man days are left this month'

re.split(r'\.\s*?(?=[A-Z])', s)

Results:

['The fox is running',
'The cat is drinking',
'The phone runs on Android 4.3',
'How man days are left this month']
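For comparison, the same lookahead pattern should also work with Java's String.split. This is a sketch rather than part of the original answer; the class name SentenceSplitDemo and the helper sentences are hypothetical, and the input is shortened for illustration.

```java
import java.util.Arrays;

public class SentenceSplitDemo {
    // Hypothetical helper: splits on a dot (plus any following whitespace)
    // only when the next character is an uppercase ASCII letter.
    static String[] sentences(String text) {
        return text.split("\\.\\s*?(?=[A-Z])");
    }

    public static void main(String[] args) {
        String s = "The fox is running. The phone runs on Android 4.3. How many days are left";
        // The dot in "4.3" is followed by a digit, so it is not a split point.
        System.out.println(Arrays.toString(sentences(s)));
        // [The fox is running, The phone runs on Android 4.3, How many days are left]
    }
}
```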

