Setting the Default Java Character Encoding

Setting the default Java character encoding

Unfortunately, the file.encoding property has to be specified as the JVM starts up; by the time your main method is entered, the character encoding used by String.getBytes() and the default constructors of InputStreamReader and OutputStreamWriter has been permanently cached.

As Edward Grech points out, in a special case like this, the environment variable JAVA_TOOL_OPTIONS can be used to specify this property, but it's normally done like this:

java -Dfile.encoding=UTF-8 … com.x.Main

Charset.defaultCharset() will reflect changes to the file.encoding property, but most of the code in the core Java libraries that needs to determine the default character encoding does not use this mechanism.

When you are encoding or decoding, you can query the file.encoding property or Charset.defaultCharset() to find the current default encoding, and use the appropriate method or constructor overload to specify it.
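As a rough illustration of that advice, the sketch below prints the current default and then encodes and decodes with an explicitly named charset, so the result does not depend on file.encoding. The class name ExplicitCharsetDemo and its encode/decode helpers are made up for this example; the constructor overloads of OutputStreamWriter and InputStreamReader that take a Charset are the standard ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: names are illustrative, not from the original answer.
class ExplicitCharsetDemo {

    // Encode text through an OutputStreamWriter that is given an explicit charset.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode bytes through an InputStreamReader that is given an explicit charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // What the JVM currently considers the default (varies by platform and flags).
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        String text = "caf\u00e9"; // "café"
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes:      " + utf8.length);   // 5
        System.out.println("ISO-8859-1 bytes: " + latin1.length); // 4
        System.out.println("round trip ok:    " + decode(utf8, StandardCharsets.UTF_8).equals(text));
    }
}
```

Passing the charset at every call site is more verbose than relying on the default, but the byte counts above stay the same no matter how the JVM was started.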

Default charset for file encoding - Java

When I run your program on my Mac with -Dfile.encoding=UTF-16, I get the following output (as a hex dump):

0000000 fe ff 00 54 00 68 00 65 00 20 00 64 00 65 00 66
0000020 00 61 00 75 00 6c 00 74 00 20 00 63 00 68 00 61
0000040 00 72 00 73 00 65 00 74 00 20 00 69 00 73 00 3a
0000060 00 20 00 55 00 54 00 46 00 2d 00 31 00 36 00 0a

So what is probably happening with you is: setting file.encoding to UTF-16 causes Java to write UTF-16 sequences to the console and your console is not set up to handle UTF-16 output. The first two bytes (which together form the Unicode BYTE ORDER MARK) don't display properly (probably due to your console font and/or driver) and the remaining output is truncated at the first null byte (again, due to your console software).

You can try redirecting the output of your program to a file and looking at it with a hex editor or similar tool to get a better idea of what's happening.
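If no hex editor is at hand, the same bytes can be inspected from Java itself. The sketch below is only an illustration (the class Utf16HexDump and its toHex helper are invented for this example): it encodes a short string with Java's UTF-16 charset and prints the bytes, showing the byte order mark followed by a null byte before each ASCII character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class Utf16HexDump {

    // Format bytes as space-separated, two-digit lowercase hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Java's "UTF-16" charset encodes big-endian and prepends the BOM fe ff.
        byte[] encoded = "The".getBytes(StandardCharsets.UTF_16);
        System.out.println(toHex(encoded)); // prints: fe ff 00 54 00 68 00 65
    }
}
```

The first eight bytes match the start of the dump above, and the 00 bytes are the nulls that a console may treat as end of output.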

Setting default character encoding for all jsps in a Java Web Application

You can configure the default character encoding for all JSPs in the web.xml file, so that it is set globally:

<jsp-config>
    <jsp-property-group id="defaultUtf8Encoder">
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

You can also create a Filter that sets the response character encoding (and, if needed, the content type). The example below sets only the character encoding:

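import java.io.IOException;

// Servlet API imports (newer containers use the jakarta.servlet namespace instead).
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CharsetFilter implements Filter {

    private final String encoding = "UTF-8";

    @Override
    public void init(FilterConfig config) throws ServletException {
        /* Do nothing */
    }

    @Override
    public void doFilter(ServletRequest request,
                         ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // Set the encoding before the rest of the chain starts writing the response.
        response.setCharacterEncoding(encoding);
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        /* Do nothing */
    }
}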

Then you define the filter in the web.xml file:

<filter>
    <filter-name>charsetFilter</filter-name>
    <filter-class>your.filter.package.CharsetFilter</filter-class>
</filter>

<filter-mapping>
    <filter-name>charsetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

Notice that I'm applying the filter to /*, which matches every resource in the web app. This is handy if you want the filter to affect every single web resource.

Hopefully that should sort you out

How to change a default charset for Java machine in Eclipse?

You can change the workspace default encoding for all files on the Workspace preference page (Window > Preferences > General > Workspace), near the bottom left of the dialog. All new projects and files will have that encoding.

However, if you are using existing projects, be wary of three things:

  1. Existing files saved with a different encoding might need conversion; it will not happen automatically.
  2. Projects (or files or folders in projects) might have alternative default encodings (set on the Resource page of the Properties dialog). Those settings are more specific, and will not be affected by the generic workspace settings. Furthermore, these settings are shared with the project, e.g. through version control, so if you collaborate with others, make sure to agree on the encoding together.
  3. This is not Java specific; there is no supported way to do it only for Java projects.

Edit: as you want to edit the default character encoding for your application, see the answer Setting the default Java character encoding?

In short, you have to set it up as a JVM parameter, like

java -Dfile.encoding=UTF-8 … com.x.Main

JVM property -Dfile.encoding=UTF8 or UTF-8?


[INFO] BUILD SUCCESS

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8

Anyway, it works for me:)


