UTF-8 charset doesn't work with javax.mail

For all e-mails

A couple of system properties related to mailing can probably simplify your code. The one that matters here is mail.mime.charset. Its documentation says:

The mail.mime.charset System property can be used to specify the default MIME charset to use for encoded words and text parts that don't otherwise specify a charset. Normally, the default MIME charset is derived from the default Java charset, as specified in the file.encoding System property. Most applications will have no need to explicitly set the default MIME charset. In cases where the default MIME charset to be used for mail messages is different than the charset used for files stored on the system, this property should be set.

As you can read above, mail.mime.charset has no value by default, and the file encoding (the file.encoding property) is used instead.
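A minimal sketch of setting that property from code, using only the JDK. The class and method names (MailCharsetDefaults, ensureMimeCharset) are made up for illustration. It assumes the property has to be in place before any mail code reads it, so it would normally run at startup; passing -Dmail.mime.charset=UTF-8 on the command line has the same effect.

```java
class MailCharsetDefaults {

    /**
     * Sets mail.mime.charset unless a value was already supplied
     * (for example with -Dmail.mime.charset=... on the command line).
     * Returns the value that is in effect afterwards.
     */
    static String ensureMimeCharset(String charsetName) {
        if (System.getProperty("mail.mime.charset") == null) {
            System.setProperty("mail.mime.charset", charsetName);
        }
        return System.getProperty("mail.mime.charset");
    }

    public static void main(String[] args) {
        // Call this before the first message is built.
        System.out.println("mail.mime.charset = " + ensureMimeCharset("UTF-8"));
        // The fallback the documentation mentions:
        System.out.println("file.encoding     = " + System.getProperty("file.encoding"));
    }
}
```

The helper deliberately does not overwrite an existing value, so an administrator can still override the default from outside the application.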

For a specific e-mail

However, if you want to set the encoding for one particular e-mail, you should probably use the two-parameter methods setSubject(subject, charset) and setText(text, charset).

If that doesn't work, your input was probably already corrupted before it reached this point. In other words, you probably used the wrong encoding when you collected your data.
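To illustrate how input can be damaged before it ever reaches the mail code, here is a small JDK-only sketch. The names InputDecoding and readAll are hypothetical. It reads the same UTF-8 bytes twice, once with the right charset and once with the wrong one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class InputDecoding {

    /** Reads the whole stream as text, decoding with the charset given by the caller. */
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written as escapes so the
        // encoding of this source file does not matter.
        String original = "Gr\u00fc\u00dfe";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String good = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String bad = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches:      " + good.equals(original));
        System.out.println("decoded as ISO-8859-1 matches: " + bad.equals(original));
    }
}
```

Once the text has been decoded with the wrong charset, no charset argument passed to setSubject or setText later on will repair it.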

MIME types are complicated

Calling setContent(content, "UTF-8"), as other sources suggest, will simply not work. Just look at the signature of this method: setContent(Object content, String mimetype). A MIME type and a charset are two totally different things. IMHO, you should really be using one of the setText(...) methods with a charset parameter.

But if you insist on setting the charset through the MIME type with setContent(content, mimetype), then use the correct format: not just "UTF-8", but something like "text/plain; charset=UTF-8". More importantly, be aware that every MIME type has its own way of handling charsets.

  • As specified in RFC 2046, the default charset for text/plain is US-ASCII, but it can be overridden with an explicit charset parameter.
  • RFC 6657, however, makes clear that the text/xml type determines the charset from the content of the message. The charset parameter will just be ignored here.
  • RFC 2854 states that text/html should really always specify a charset. If you don't, ISO-8859-1 (Latin-1) is assumed.
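As a rough sketch of the difference between a bare charset name and a MIME type with a charset parameter, the hypothetical helper below (ContentTypes.withCharset, not part of any mail API) builds the string you would pass as the second argument of setContent. It rejects anything that does not look like type/subtype.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypes {

    /**
     * Builds a value such as "text/plain; charset=UTF-8".
     * The check is intentionally simple: a MIME type must contain a '/'.
     */
    static String withCharset(String mimeType, Charset charset) {
        if (mimeType == null || mimeType.indexOf('/') < 0) {
            throw new IllegalArgumentException(
                    "Not a MIME type (expected type/subtype): " + mimeType);
        }
        return mimeType + "; charset=" + charset.name();
    }

    public static void main(String[] args) {
        System.out.println(withCharset("text/plain", StandardCharsets.UTF_8));
        try {
            // A charset name on its own is not a MIME type.
            withCharset("UTF-8", StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```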

Java Mail API - Encoding problems

Basically, my code works just fine, as it's supposed to. It was cmd (the Windows command prompt) that could not handle non-ASCII letters; I used a .bat file to launch the jar. I think I'm just going to make a little GUI then... Thanks everyone for answering.

JavaMail API getSubject(), subject has multiple =?utf-8?B?~?=, how can I parse?

The problem is that the mailer that encoded this text encoded it incorrectly. What mailer was used to create this message?

The 16-bit Korean Unicode characters are converted to a stream of 8-bit bytes in UTF-8 format. The 8-bit bytes are then encoded using base64 encoding.

The MIME spec (RFC 2047) requires that each encoded word contain complete characters:

   Each 'encoded-word' MUST represent an integral number of characters.
   A multi-octet character may not be split across adjacent 'encoded-word's.

In your example above, the bytes representing one of the Korean characters are split across multiple encoded words. Combining them into one encoded word, as you have done, allows the text to be decoded correctly.
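The effect can be reproduced with plain JDK classes. The sketch below is only an illustration of the byte-level problem, not something JavaMail does for you. The class SplitEncodedWords and its methods are invented for this example, and the two base64 payloads are generated by cutting the UTF-8 bytes of two Korean syllables in the middle of the second character, the way the broken mailer did.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

class SplitEncodedWords {

    /** Decodes every payload to text on its own; a split character is lost. */
    static String decodeEachThenJoin(List<String> base64Payloads) {
        StringBuilder sb = new StringBuilder();
        for (String payload : base64Payloads) {
            byte[] chunk = Base64.getDecoder().decode(payload);
            sb.append(new String(chunk, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Concatenates the raw bytes first and decodes to text only once. */
    static String joinBytesThenDecode(List<String> base64Payloads) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (String payload : base64Payloads) {
            byte[] chunk = Base64.getDecoder().decode(payload);
            bytes.write(chunk, 0, chunk.length);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Two payloads that split a 3-byte character across the boundary. */
    static List<String> brokenPayloads() {
        // U+D55C and U+AE00: two Hangul syllables, 3 bytes each in UTF-8.
        byte[] utf8 = "\uD55C\uAE00".getBytes(StandardCharsets.UTF_8);
        String first = Base64.getEncoder().encodeToString(Arrays.copyOfRange(utf8, 0, 4));
        String second = Base64.getEncoder().encodeToString(Arrays.copyOfRange(utf8, 4, 6));
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        List<String> payloads = brokenPayloads();
        String separately = decodeEachThenJoin(payloads);
        String joined = joinBytesThenDecode(payloads);
        System.out.println("per-word decoding intact: " + separately.equals("\uD55C\uAE00"));
        System.out.println("joined decoding intact:   " + joined.equals("\uD55C\uAE00"));
    }
}
```

Decoding each word separately leaves replacement characters (U+FFFD) where the split character was, which is why a decoder that follows RFC 2047 word by word cannot recover the text.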

This is a bug in the mailer that created the message and should be reported to the owner of that mailer.

Unfortunately, there's no good workaround in JavaMail for such a broken mailer.

Java Mail: Setting Chinese Encoding

I have found the problem.
The problem was NOT in the Email class and was only tangentially related to JavaMail
(so the code above is correct).

The problem was in how I extracted the text from an autowired instance of Spring's MessageSource in my Controller.

I INCORRECTLY used the following code so that my logging class could log the Strings pulled from the message bundle.

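// WRONG: encodes to UTF-8 bytes, then decodes them again with the platform default charset.
byte[] barray = messageSource.getMessage(code, null, LocaleContextHolder.getLocale()).getBytes(Charset.forName("UTF-8"));
String s = new String(barray); // no charset argument, so the JVM default is used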

Log4j could read the Strings in the subject and message (built with StringBuilder), which led me to believe that the Strings were correct UTF-8. However, javax.mail garbled them in transmission.

What I should have done is this:

String s = messageSource.getMessage(code, null, LocaleContextHolder.getLocale());

Now my logger just gets ???, but the email sends just fine.
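The byte round trip described above can be reproduced without Spring or JavaMail. In this sketch the hypothetical method roundTrip takes the decoding charset as a parameter, standing in for whatever the JVM default happens to be when new String(bytes) is called without one. ISO-8859-1 is used as an example of a non-UTF-8 default; the real default depends on the platform and the Java version (UTF-8 is the default from Java 18 on).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTrip {

    /**
     * Encodes the text as UTF-8 and decodes it again with decodeWith.
     * The result equals the input only if decodeWith is UTF-8 as well.
     */
    static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeWith);
    }

    public static void main(String[] args) {
        String chinese = "\u4E2D\u6587"; // two Chinese characters, as escapes

        String same = roundTrip(chinese, StandardCharsets.UTF_8);
        String garbled = roundTrip(chinese, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact:      " + same.equals(chinese));
        System.out.println("ISO-8859-1 round trip intact: " + garbled.equals(chinese));
    }
}
```

A Java String is already Unicode, so the text returned by getMessage needs no conversion at all; the charset only matters at the point where the String is turned into bytes, which the mail API does when it is given the charset.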

So: Keep It Simple, Stupid.


