What Is the Default Encoding of the JVM

Setting the default Java character encoding

Unfortunately, the file.encoding property has to be specified as the JVM starts up; by the time your main method is entered, the character encoding used by String.getBytes() and the default constructors of InputStreamReader and OutputStreamWriter has been permanently cached.

As Edward Grech points out, in a special case like this, the environment variable JAVA_TOOL_OPTIONS can be used to specify this property, but it's normally done like this:

java -Dfile.encoding=UTF-8 … com.x.Main

Charset.defaultCharset() will reflect changes to the file.encoding property, but most of the code in the core Java libraries that needs to determine the default character encoding does not use this mechanism.

When you are encoding or decoding, you can query the file.encoding property or Charset.defaultCharset() to find the current default encoding, and use the appropriate method or constructor overload to specify it.
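As a minimal sketch of that advice, the example below prints the two values you can query and then encodes and decodes through the constructor overloads that take an explicit Charset. The class name ExplicitCharsetDemo and its encode/decode helpers are hypothetical names chosen for illustration; only the JDK calls themselves come from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows the charset-taking overloads in use.
class ExplicitCharsetDemo {

    // Encode text through a Writer whose charset is passed explicitly,
    // so the result does not depend on the JVM default.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, charset)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode bytes through a Reader whose charset is passed explicitly.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two ways of querying the current default mentioned above.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset() = " + Charset.defaultCharset());

        // \u00e9 is e-acute; it occupies two bytes in UTF-8.
        byte[] utf8 = encode("h\u00e9llo", StandardCharsets.UTF_8);
        System.out.println(utf8.length + " bytes, round trip ok: "
                + "h\u00e9llo".equals(decode(utf8, StandardCharsets.UTF_8)));
    }
}
```

Passing the charset explicitly means the result does not change when the program is started with a different -Dfile.encoding value.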

Default charset for file encoding - Java

When I run your program on my Mac with -Dfile.encoding=UTF-16, I get the following output (as a hex dump):

0000000 fe ff 00 54 00 68 00 65 00 20 00 64 00 65 00 66
0000020 00 61 00 75 00 6c 00 74 00 20 00 63 00 68 00 61
0000040 00 72 00 73 00 65 00 74 00 20 00 69 00 73 00 3a
0000060 00 20 00 55 00 54 00 46 00 2d 00 31 00 36 00 0a

So what is probably happening with you is: setting file.encoding to UTF-16 causes Java to write UTF-16 sequences to the console and your console is not set up to handle UTF-16 output. The first two bytes (which together form the Unicode BYTE ORDER MARK) don't display properly (probably due to your console font and/or driver) and the remaining output is truncated at the first null byte (again, due to your console software).

You can try directing the output of your program to a file and looking at it with a hex editor or something similar to get a better idea of what's happening.
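If a hex editor is not at hand, the same bytes can be inspected from Java itself. The sketch below encodes the message with the UTF-16 charset and prints the bytes as hex. The class name Utf16DumpDemo and the helper hexOf are hypothetical, and the message text is the one that the dump above decodes to.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: renders the UTF-16 encoding of a string as hex.
class Utf16DumpDemo {

    // Encode with Java's "UTF-16" charset, which writes a big-endian
    // byte order mark (fe ff) followed by big-endian code units.
    static String hexOf(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_16);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Output starts with "fe ff 00 54 00 68 00 65", as in the dump above.
        System.out.println(hexOf("The default charset is: UTF-16\n"));
    }
}
```

Every ASCII character is preceded by a 00 byte in this encoding, which is why a console that treats a null byte as a terminator stops printing almost immediately.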

Finding the default file encoding of a remote jvm

Here is what I ended up doing... (roughly)

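import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.openmbean.TabularData;

// conn is assumed to be an already-connected javax.management.remote.JMXConnector
MBeanServerConnection mbs = conn.getMBeanServerConnection();
ObjectName runtime = new ObjectName(ManagementFactory.RUNTIME_MXBEAN_NAME);
// SystemProperties arrives as tabular data, one row per property
TabularData props = (TabularData) mbs.getAttribute(runtime, "SystemProperties");
String retVal = null;
for (Object row : props.values()) {
    CompositeData entry = (CompositeData) row;
    // each row holds two items, named "key" and "value"
    if ("file.encoding".equals(entry.get("key"))) {
        retVal = (String) entry.get("value");
        break;
    }
}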

I connected to the MBeanServer and then worked through the SystemProperties to find the file.encoding for the process on the other end of the connection.
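An alternative that avoids walking the tabular data by hand is to ask for a typed RuntimeMXBean proxy and read the property from the map it returns. This is only a sketch: the class name RemoteFileEncoding and the helper fileEncodingOf are hypothetical, and the main method uses the local platform MBean server as a stand-in for a real remote connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import javax.management.MBeanServerConnection;

// Hypothetical helper class: reads file.encoding through a RuntimeMXBean proxy.
class RemoteFileEncoding {

    // Returns the file.encoding system property of the JVM behind the connection.
    static String fileEncodingOf(MBeanServerConnection mbs) {
        try {
            RuntimeMXBean runtime = ManagementFactory.newPlatformMXBeanProxy(
                    mbs, ManagementFactory.RUNTIME_MXBEAN_NAME, RuntimeMXBean.class);
            // The proxy converts the SystemProperties attribute back to a Map.
            return runtime.getSystemProperties().get("file.encoding");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a remote connection: in real use this would come from
        // JMXConnector.getMBeanServerConnection().
        MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
        System.out.println("file.encoding = " + fileEncodingOf(mbs));
    }
}
```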

How to better set up JVM encoding properties for UTF-8

We can set the source encoding and the report output encoding by passing properties on the Maven command line as follows:

mvn -Dproject.build.sourceEncoding=UTF-8 -Dproject.reporting.outputEncoding=UTF-8 clean deploy 

Or by adding the property to pom.xml:

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<redis.version>1.3.5.RELEASE</redis.version>
</properties>

JVM property -Dfile.encoding=UTF8 or UTF-8?


[INFO] BUILD SUCCESS

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8

Anyway, it works for me:)


