HttpServletRequest - setCharacterEncoding Seems to Do Nothing

HttpServletRequest - setCharacterEncoding seems to do nothing

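If you are using Tomcat, you should also set URIEncoding to UTF-8 on your connectors in server.xml, because that attribute controls how the request URI (and with it the query string) is decoded: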

<Server port="8105" shutdown="SHUTDOWN">
  ...
  <Service name="Catalina">
    <Connector port="8180" URIEncoding="UTF-8" />
    <Engine name="Catalina" defaultHost="localhost">
      <Host name="localhost" appBase="webapps" />
    </Engine>
  </Service>
</Server>

request.setCharacterEncoding() method seems to do nothing in the source code

This method is implemented by the servlet container. For example, in Tomcat 8.5 the implementation resides in
org.apache.catalina.connector.Request#setCharacterEncoding
and looks like this:

public void setCharacterEncoding(String enc) throws UnsupportedEncodingException {
    if (!this.usingReader) {
        B2CConverter.getCharset(enc);
        this.coyoteRequest.setCharacterEncoding(enc);
    }
}

As you can see, it just validates the encoding name and sets the field in the internal request implementation where the encoding is stored. Note that the call is ignored once the reader is in use (usingReader). You can search your servlet container's source code for the class that implements HttpServletRequest and look at its implementation.
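To make the "validate, then store" behaviour concrete, here is a minimal sketch that uses only the JDK. SketchRequest, markReaderUsed and the field names are hypothetical stand-ins, not Tomcat classes, and Charset.forName is used in place of Tomcat's B2CConverter.getCharset as an assumption about what the validation amounts to.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

/** Hypothetical stand-in for a container's request class; not Tomcat's real code. */
final class SketchRequest {
    private String characterEncoding; // the internal field that stores the encoding
    private boolean usingReader;      // true once the body reader has been handed out

    void setCharacterEncoding(String enc) throws UnsupportedEncodingException {
        if (usingReader) {
            return; // too late: silently ignored, so the call "seems to do nothing"
        }
        try {
            Charset.forName(enc); // validation only; the result is discarded
        } catch (IllegalArgumentException e) {
            // covers illegal names, unsupported charsets and null
            throw new UnsupportedEncodingException(enc);
        }
        characterEncoding = enc; // nothing is decoded here, the name is just remembered
    }

    String getCharacterEncoding() {
        return characterEncoding;
    }

    /** Simulates the state after getReader() was called. */
    void markReaderUsed() {
        usingReader = true;
    }

    public static void main(String[] args) throws Exception {
        SketchRequest request = new SketchRequest();
        request.setCharacterEncoding("UTF-8");
        request.markReaderUsed();
        request.setCharacterEncoding("ISO-8859-1"); // ignored
        System.out.println(request.getCharacterEncoding());
    }
}
```

The point of the sketch is that the setter has no visible effect on its own. The stored name only matters later, when the container decodes the body or the parameters.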

HttpServletRequest UTF-8 Encoding

Paul's suggestion seems like the best course of action, but if you're going to work around it, you don't need URLEncoder or URLDecoder at all:

String item = request.getParameter("param"); 

byte[] bytes = item.getBytes(StandardCharsets.ISO_8859_1);
item = new String(bytes, StandardCharsets.UTF_8);

// Java 6:
// byte[] bytes = item.getBytes("ISO-8859-1");
// item = new String(bytes, "UTF-8");

Update: Since this is getting a lot of votes, I want to stress BalusC's point that this definitely is not a solution; it is a workaround at best. People should not be doing this.

I don't know exactly what caused the original issue, but I suspect the URL was already UTF-8 encoded, and then was UTF-8 encoded again.
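To see why the re-decoding trick works, and why it is only a workaround, here is a small self-contained sketch. MojibakeDemo, misdecode and repair are hypothetical names; misdecode simulates, as an assumption, a container that read UTF-8 bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Simulates a container that decodes UTF-8 bytes as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** The workaround: recover the raw bytes, then decode them as UTF-8. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues

        String garbled = misdecode(original);
        System.out.println(garbled.length());                  // one char per UTF-8 byte
        System.out.println(repair(garbled).equals(original));  // the round trip restores the text

        // Applied to a value that was decoded correctly, the same "repair" damages it.
        System.out.println(repair(original).equals(original));
    }
}
```

The last line is the reason this should stay a workaround: once the container is configured correctly, the extra conversion corrupts values that were already fine.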

Why 'ServletContext#setRequestCharacterEncoding' does not have an effect on 'HttpServletRequest#getReader'?

It is an Apache Tomcat bug (specific to getReader()) that will be fixed in 9.0.21 onwards thanks to your report on the Tomcat users mailing list.

For the curious, here is the fix.

request.getQueryString() seems to need some encoding

I've run into this same problem before. Not sure what Java servlet container you're using, but at least in Tomcat 5.x (not sure about 6.x) the request.setCharacterEncoding() method doesn't really have an effect on GET parameters. By the time your servlet runs, GET parameters have already been decoded by Tomcat, so setCharacterEncoding won't do anything.

Two ways to get around this:

  1. Change the URIEncoding setting for your connector to UTF-8. See http://tomcat.apache.org/tomcat-5.5-doc/config/http.html.

  2. As BalusC suggests, take the raw query string from request.getQueryString(), split it into name/value pairs, and decode each one yourself into a parameter map (as opposed to using the ServletRequest parameter APIs).
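The second option could look roughly like the sketch below. QueryStringParser and parse are hypothetical names, UTF-8 is assumed as the encoding the client used, and in a servlet you would pass in request.getQueryString(), which returns the query string without decoding it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class QueryStringParser {
    /** Splits a raw (still percent-encoded) query string and decodes it as UTF-8. */
    static Map<String, List<String>> parse(String rawQuery) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            // Split BEFORE decoding so an encoded '=' or '&' inside a value stays intact.
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            params.computeIfAbsent(decode(rawName), k -> new ArrayList<>())
                  .add(decode(rawValue));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            // Also turns '+' into a space; throws IllegalArgumentException on malformed %xx.
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("param=caf%C3%A9&x=a+b&x=2"));
    }
}
```

Repeated names are collected into a list, similar to what getParameterValues() exposes. This is only worth doing when you cannot change the connector configuration.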

Hope this helps!


