Setting the default Java character encoding
Unfortunately, the file.encoding property has to be specified as the JVM starts up; by the time your main method is entered, the character encoding used by String.getBytes() and the default constructors of InputStreamReader and OutputStreamWriter has been permanently cached.
As Edward Grech points out, in a special case like this, the environment variable JAVA_TOOL_OPTIONS can be used to specify this property, but it's normally done like this:
java -Dfile.encoding=UTF-8 … com.x.Main
Charset.defaultCharset() will reflect changes to the file.encoding property, but most of the code in the core Java libraries that needs to determine the default character encoding does not use this mechanism.
When you are encoding or decoding, you can query the file.encoding property or Charset.defaultCharset() to find the current default encoding, and use the appropriate method or constructor overload to specify it.
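As an illustration of the "appropriate overload" advice, here is a minimal sketch that queries the current default and then passes a charset explicitly when encoding and decoding. The class name ExplicitCharsetDemo and the helpers encode and decode are hypothetical names chosen for this example; the JDK calls they wrap (the Charset-taking constructors of OutputStreamWriter and InputStreamReader, and String.getBytes(Charset)) are standard.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExplicitCharsetDemo {

    // Hypothetical helper: encodes text with a named charset, not the JVM default.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Hypothetical helper: decodes bytes with a named charset, not the JVM default.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two ways to see what the JVM currently treats as the default.
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        System.out.println("Charset.defaultCharset(): " + Charset.defaultCharset());

        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        System.out.println("UTF-8 byte count: " + utf8.length);
        System.out.println("Round trip ok: " + text.equals(decode(utf8, StandardCharsets.UTF_8)));
        // String.getBytes(Charset) is the explicit overload of String.getBytes().
        System.out.println("getBytes matches: "
                + Arrays.equals(utf8, text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the charset is passed in every call, the result should not depend on how the JVM was started.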
Default charset for file encoding - Java
When I run your program on my Mac with -Dfile.encoding=UTF-16, I get the following output (as a hex dump):
0000000 fe ff 00 54 00 68 00 65 00 20 00 64 00 65 00 66
0000020 00 61 00 75 00 6c 00 74 00 20 00 63 00 68 00 61
0000040 00 72 00 73 00 65 00 74 00 20 00 69 00 73 00 3a
0000060 00 20 00 55 00 54 00 46 00 2d 00 31 00 36 00 0a
So what is probably happening in your case is this: setting file.encoding to UTF-16 causes Java to write UTF-16 sequences to the console, and your console is not set up to handle UTF-16 output. The first two bytes (fe ff, which together form the Unicode byte order mark) don't display properly (probably due to your console font and/or driver), and the remaining output is truncated at the first null byte (again, due to your console software).
You can try redirecting the output of your program to a file and looking at it with a hex editor or similar tool to get a better idea of what's happening.
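If a hex editor is not at hand, the same bytes can be inspected from inside Java. The following is a small sketch, not the original poster's program: it encodes a string with the JDK's UTF-16 charset and prints the bytes as hex, so the leading byte order mark and the interleaved null bytes are visible without involving the console encoding. The class name Utf16BomDemo and the helper toHex are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class Utf16BomDemo {

    // Hypothetical helper: formats bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The JDK's "UTF-16" charset encodes big-endian and prepends a BOM (fe ff).
        byte[] encoded = "The".getBytes(StandardCharsets.UTF_16);
        System.out.println(toHex(encoded)); // prints: fe ff 00 54 00 68 00 65
    }
}
```

The printed bytes match the start of the hex dump above: the BOM first, then each ASCII character preceded by a null byte, which is what a console that stops at the first null byte would choke on.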
Finding the default file encoding of a remote jvm
Here is what I ended up doing... (roughly)
// conn is an already-connected javax.management.remote.JMXConnector
MBeanServerConnection mbs = conn.getMBeanServerConnection();
ObjectName runtime = new ObjectName(ManagementFactory.RUNTIME_MXBEAN_NAME);
// SystemProperties arrives as tabular data: one row per property
TabularDataSupport foo =
    (TabularDataSupport) mbs.getAttribute(runtime, "SystemProperties");
String retVal = null;
for (Iterator<Object> it = foo.values().iterator();
        it.hasNext() && null == retVal; ) {
    CompositeDataSupport cds = (CompositeDataSupport) it.next();
    // Each row holds the key followed by the value; stop at the first match
    for (Iterator<?> iter = cds.values().iterator();
            iter.hasNext() && null == retVal; ) {
        if ("file.encoding".equals(iter.next()) && iter.hasNext())
            retVal = iter.next().toString();
    }
}
I connected to the MBeanServer and then worked through the SystemProperties to find the file.encoding for the process on the other end of the connection.
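A shorter variant of the same lookup is sketched below. It is an alternative to the manual iteration, not the method used in the answer: it asks ManagementFactory.newPlatformMXBeanProxy for a RuntimeMXBean proxy and reads the property from the Map returned by getSystemProperties(). The class name RemoteEncodingLookup and the method fileEncodingOf are hypothetical, and so that the example runs on its own, the local platform MBeanServer stands in for the remote connection; in real use you would pass the MBeanServerConnection obtained from your JMXConnector.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import javax.management.MBeanServerConnection;

class RemoteEncodingLookup {

    // Hypothetical helper: reads file.encoding from whichever JVM the connection points at.
    static String fileEncodingOf(MBeanServerConnection mbs) {
        try {
            RuntimeMXBean runtime = ManagementFactory.newPlatformMXBeanProxy(
                    mbs, ManagementFactory.RUNTIME_MXBEAN_NAME, RuntimeMXBean.class);
            // The proxy converts the SystemProperties tabular data into a Map.
            return runtime.getSystemProperties().get("file.encoding");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the local platform MBeanServer is used in place of a remote connection.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        System.out.println("file.encoding = " + fileEncodingOf(local));
    }
}
```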
How to better set up JVM encoding properties to UTF-8
We can set the source encoding and the report output encoding by passing properties on the Maven command line as follows:
mvn -Dproject.build.sourceEncoding=UTF-8 -Dproject.reporting.outputEncoding=UTF-8 clean deploy
Or by adding the source encoding property to the properties section of pom.xml:
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<redis.version>1.3.5.RELEASE</redis.version>
</properties>
JVM property -Dfile.encoding=UTF8 or UTF-8?
[INFO] BUILD SUCCESS
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
Anyway, it works for me:)
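One way to check how the two spellings relate is to ask the JVM directly. The sketch below is only an illustration: it shows how Charset.forName resolves the labels, which is suggestive but is not a trace of how the JVM parses -Dfile.encoding at startup. The class name CharsetAliasDemo and the helper canonicalName are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetAliasDemo {

    // Hypothetical helper: returns the canonical name a charset label resolves to.
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("UTF8"));  // prints: UTF-8
        System.out.println(canonicalName("UTF-8")); // prints: UTF-8
        // Both labels resolve to the same Charset object as StandardCharsets.UTF_8.
        System.out.println(Charset.forName("UTF8").equals(StandardCharsets.UTF_8)); // prints: true
    }
}
```

On standard JDKs, UTF8 is a registered alias of UTF-8, which is consistent with the build above succeeding with -Dfile.encoding=UTF8.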