Why Is StringTokenizer Deprecated

Why is StringTokenizer deprecated?

From the javadoc for StringTokenizer:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

If you look at String.split() and compare it to StringTokenizer, the relevant difference is that String.split() uses a regular expression, whereas StringTokenizer just uses verbatim split characters. So if I wanted to tokenize a string with more complex logic than single characters (e.g. split on \r\n), I can't use StringTokenizer but I can use String.split().
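To make that difference concrete, here is a minimal sketch (the class name, method names and sample text are made up for illustration). String.split() matches "\r\n" as one two-character sequence, while StringTokenizer treats the same argument as a set of two independent delimiter characters:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class LineSplitDemo {
    // split() treats "\r\n" as a regex, so only the full CR+LF sequence is a delimiter.
    static List<String> withSplit(String text) {
        return Arrays.asList(text.split("\r\n"));
    }

    // StringTokenizer treats "\r\n" as two separate delimiter characters, CR and LF.
    static List<String> withTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, "\r\n");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical input: the second line contains a lone CR.
        String text = "first\r\nsecond\rstill second\r\nthird";
        System.out.println(withSplit(text).size());     // 3
        System.out.println(withTokenizer(text).size()); // 4
    }
}
```

The lone "\r" inside the second line is left alone by split() but is treated as a delimiter by StringTokenizer, which is why the counts differ.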

Is StringTokenizer more efficient in splitting strings in Java?

String.split() is more flexible and easier to use than StringTokenizer. StringTokenizer predates Java's regular-expression support, whereas String.split() accepts regular expressions, which makes it a good deal more powerful. The result of String.split() is also a string array, which is usually the form we want. StringTokenizer is indeed faster than String.split(), but for most practical purposes String.split() is fast enough.

Check the answers to this question for more details: Scanner vs. StringTokenizer vs. String.Split

String Tokenizer in Java

As many of the answers have suggested, String.split() will solve your problem. Because split() takes a regular expression, you will have to escape the ')' in the sequence you're tokenizing on, like this:

str.split("\\);");
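As a small sketch of that call (the input string, class name and method names are invented for illustration), the hand-escaped pattern can be compared with Pattern.quote, which escapes the whole literal delimiter for you and should give the same result here:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class EscapeDelimiterDemo {
    // Escapes the ')' by hand; ';' has no special meaning in a regex.
    static String[] manualEscape(String input) {
        return input.split("\\);");
    }

    // Pattern.quote turns the delimiter into a literal pattern, so nothing needs hand-escaping.
    static String[] quoted(String input) {
        return input.split(Pattern.quote(");"));
    }

    public static void main(String[] args) {
        String input = "f(a);g(b);h(c)"; // hypothetical input
        System.out.println(Arrays.toString(manualEscape(input))); // [f(a, g(b, h(c)]
        System.out.println(Arrays.toString(quoted(input)));       // [f(a, g(b, h(c)]
    }
}
```

Pattern.quote is probably the safer habit when the delimiter comes from a variable, since you don't have to know which of its characters are regex metacharacters.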

Strange Behaviour of String Tokenizer

I assume you used new StringTokenizer(str,"~")

StringTokenizer uses this definition of a token: a token is a maximal non-empty sequence of characters between delimiters.

Since the string between ~~ is empty, it cannot be a token (by this definition).

I used the following code to verify that:

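import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerCheck {
    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        String str = "ABC~DEF~GHI~JKL~~MNO"; // input string
        StringTokenizer stk = new StringTokenizer(str, "~");
        while (stk.hasMoreTokens()) {
            tokens.add(stk.nextToken()); // the empty run between "~~" is never returned
        }
        for (String token : tokens) {
            System.out.print(token + "~>");
        }
    }
}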

Actual output (consistent with the definition of a token):

ABC~>DEF~>GHI~>JKL~>MNO~>

If the question is why a token is defined this way, look at this example:

String str = "ABC  DEF  GHI"; // two spaces between each word

StringTokenizer finds 3 tokens. If tokens were not forced to be non-empty, this would return 5 tokens (2 of them ""). If you are writing a simple parser, the current behaviour is preferable.
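Here is a short sketch of that comparison (class and method names are made up; the second input assumes two spaces between each word). It runs both inputs from this answer through StringTokenizer and through String.split(), which does return the empty strings:

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class EmptyTokenDemo {
    // Number of tokens StringTokenizer reports; empty runs between delimiters are skipped.
    static int tokenizerCount(String s, String delims) {
        return new StringTokenizer(s, delims).countTokens();
    }

    // split() with a negative limit keeps every empty string, including trailing ones.
    static String[] splitKeepingEmpties(String s, String regex) {
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        String str = "ABC~DEF~GHI~JKL~~MNO";
        System.out.println(tokenizerCount(str, "~"));                       // 5
        System.out.println(Arrays.toString(splitKeepingEmpties(str, "~"))); // [ABC, DEF, GHI, JKL, , MNO]

        String spaced = "ABC  DEF  GHI"; // two spaces between each word
        System.out.println(tokenizerCount(spaced, " "));                    // 3
        System.out.println(splitKeepingEmpties(spaced, " ").length);        // 5
    }
}
```

Note that the one-argument split(regex) keeps empty strings in the middle but drops trailing ones, which is why the sketch passes -1 as the limit.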

java string delimiter, keeping delimiter in token

From the docs, you can use StringTokenizer st = new StringTokenizer(str, "L", true); The last parameter is a boolean that specifies whether the delimiters are returned as tokens too.
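A minimal sketch of what that looks like in practice (the input "HELLO WORLD", the class name and the helper method are just illustrative choices): with the third argument set to true, each delimiter comes back as its own one-character token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {
    // Collects all tokens, with delimiters included because returnDelims is true.
    static List<String> tokenizeKeepingDelims(String s, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, delims, true);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input; "L" is the only delimiter character.
        System.out.println(tokenizeKeepingDelims("HELLO WORLD", "L"));
        // [HE, L, L, O WOR, L, D]
    }
}
```

Because every character ends up in exactly one token, concatenating the tokens gives back the original string.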

Scanner vs. StringTokenizer vs. String.Split

They're essentially horses for courses.

  • Scanner is designed for cases where you need to parse a string, pulling out data of different types. It's very flexible, but arguably doesn't give you the simplest API for simply getting an array of strings delimited by a particular expression.
  • String.split() and Pattern.split() give you an easy syntax for doing the latter, but that's essentially all that they do. If you want to parse the resulting strings, or change the delimiter halfway through depending on a particular token, they won't help you with that.
  • StringTokenizer is even more restrictive than String.split(), and also a bit fiddlier to use. It is essentially designed for pulling out tokens delimited by a fixed set of delimiter characters. Because of this restriction, it's about twice as fast as String.split(). (See my comparison of String.split() and StringTokenizer.) It also predates the regular expressions API, of which String.split() is a part.

You'll note from my timings that String.split() can still tokenize thousands of strings in a few milliseconds on a typical machine. In addition, it has the advantage over StringTokenizer that it gives you the output as a string array, which is usually what you want. Using an Enumeration, as provided by StringTokenizer, is too "syntactically fussy" most of the time. From this point of view, StringTokenizer is a bit of a waste of space nowadays, and you may as well just use String.split().
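To illustrate the three styles side by side, here is a rough sketch over one made-up input (the "name quantity" format, the class name and the method names are assumptions for the example, not anything from the timings mentioned above):

```java
import java.util.Scanner;
import java.util.StringTokenizer;

class ThreeWaysDemo {
    // Scanner: pulls typed values out of the text (a word, then an int, repeatedly).
    static int sumQuantities(String text) {
        int total = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                sc.next();             // item name, ignored here
                total += sc.nextInt(); // quantity, parsed as an int
            }
        }
        return total;
    }

    // String.split(): one call, result is already a String[].
    static String[] viaSplit(String text) {
        return text.split(" ");
    }

    // StringTokenizer: Enumeration-style loop, array has to be filled by hand.
    static String[] viaTokenizer(String text) {
        StringTokenizer st = new StringTokenizer(text, " ");
        String[] out = new String[st.countTokens()];
        for (int i = 0; i < out.length; i++) {
            out[i] = st.nextToken();
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "apples 3 pears 5"; // hypothetical input
        System.out.println(sumQuantities(text));       // 8
        System.out.println(viaSplit(text).length);     // 4
        System.out.println(viaTokenizer(text).length); // 4
    }
}
```

The Scanner version is the only one that does any parsing; the other two just hand back strings, and the StringTokenizer version needs the extra loop to get an array.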

StringTokenizer returns null instead of String

StringTokenizer is a legacy class whose use is discouraged, so you should use the String.split() method instead. It works for your code:

import java.util.Arrays;

public class MainClass {
    public static void main(String[] args) {
        String rawTxt = "Chicken|None|Beast|Any|0|1|1|Hey Chicken!";
        // '|' is a regex metacharacter (alternation), so it must be escaped
        String[] split = rawTxt.split("\\|");
        System.out.println(Arrays.toString(split));
    }
}

replace StringTokenizer by String.split(..)

Considering that the documentation for split doesn't specify this behavior, and that it has only one optional parameter, which limits how large the array can be: no, you can't.

Also, looking at the only other class I can think of that could have this feature, Scanner, it doesn't either. So I think the easiest option is to continue using the tokenizer, even though its use is discouraged. That's better than writing your own class: while that shouldn't be too hard (quite trivial, really), I can think of better ways to spend one's time.
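For reference, the optional parameter mentioned above is the limit argument of split(regex, limit). A small sketch of what it does (the inputs, class name and method name are made up for illustration):

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Thin wrapper so the effect of the limit argument is easy to compare.
    static String[] splitWithLimit(String s, String regex, int limit) {
        return s.split(regex, limit);
    }

    public static void main(String[] args) {
        // A positive limit caps the array length; the last element holds the rest.
        System.out.println(Arrays.toString(splitWithLimit("a,b,c,d", ",", 2))); // [a, b,c,d]

        // Without a limit, trailing empty strings are dropped.
        System.out.println(Arrays.toString("a,b,,".split(",")));                // [a, b]

        // A negative limit keeps them.
        System.out.println(Arrays.toString(splitWithLimit("a,b,,", ",", -1)));  // [a, b, , ]
    }
}
```

So the limit only controls how many pieces come back and whether trailing empty strings survive; it doesn't change what counts as a delimiter.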


