JSP: Encoding of input parameter in form: Difference between IE and Firefox
Two problems were interfering with each other:
1) With a normal form POST, the page has to declare its encoding via
<%@page contentType="text/html;charset=utf-8" %>
and the parameter has to be decoded correctly via
String encodedLastName = new String(request.getParameter("lastName").getBytes("iso-8859-1"), "UTF-8");
2) With jQuery, the request has to declare its encoding by adding
contentType: 'application/x-www-form-urlencoded; charset=UTF-8'
to the $.ajax call.
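The re-decoding trick from step 1 can be tried outside a servlet container. The following is a minimal sketch, assuming the server decoded UTF-8 bytes as ISO-8859-1; the class name `RedecodeDemo` and the helper `redecode` are made up for illustration and are not part of any API.

```java
import java.nio.charset.StandardCharsets;

class RedecodeDemo {
    // Turn the wrongly decoded string back into its raw bytes (ISO-8859-1 maps
    // every byte to exactly one char, so nothing is lost) and decode them as UTF-8.
    static String redecode(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "M\u00fcller"; // "Müller"

        // Simulate the problem: the browser sent UTF-8 bytes,
        // but the server decoded them as ISO-8859-1.
        String misdecoded = new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(misdecoded.equals(original));           // false
        System.out.println(redecode(misdecoded).equals(original)); // true
    }
}
```

This only repairs the value when the server really did use ISO-8859-1; if the request encoding is already set to UTF-8 elsewhere, applying the trick would corrupt the value instead.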
HTML : Form does not send UTF-8 format inputs
I added the meta tag: nothing changed.
It indeed has no effect when the page is served over HTTP rather than from, e.g., the local disk file system (i.e. the page's URL is http://... instead of e.g. file://...). Over HTTP, the charset in the HTTP response header is used. You've already set it as below:
<%@page pageEncoding="UTF-8"%>
This will not only write out the HTTP response using UTF-8, but also set the charset attribute in the Content-Type response header. The web browser uses that charset to interpret the response and to encode any HTML form parameters.
I added the accept-charset attribute in the form: nothing changed.
It only has an effect in Microsoft Internet Explorer, and even there it does it wrongly. Never use it. All real web browsers will instead use the charset attribute specified in the Content-Type header of the response. Even MSIE will do it the right way as long as you do not specify the accept-charset attribute. As said before, you have already properly set it via pageEncoding.
Get rid of both the meta tag and the accept-charset attribute. They have no useful effect; they will only confuse you in the long term and even make things worse when the end user uses MSIE. Just stick to pageEncoding. Instead of repeating pageEncoding in every JSP page, you could also set it globally in web.xml as below:
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
As said, this will tell the JSP engine to write HTTP response output using UTF-8 and set it in the HTTP response header too. The webbrowser will use the same charset to encode the HTTP request parameters before sending back to server.
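To see why the page charset matters for what reaches the server, here is a small sketch that mimics how a form field value is percent-encoded under two different charsets. It uses the JDK's `URLEncoder` as a stand-in for the browser; the class name `FormEncodingDemo` and the helper `encodeField` are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormEncodingDemo {
    // Percent-encode a form value the way application/x-www-form-urlencoded does,
    // using the given charset to turn characters into bytes first.
    static String encodeField(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + pageCharset, e);
        }
    }

    public static void main(String[] args) {
        String value = "\u00e9"; // "é"
        System.out.println(encodeField(value, "UTF-8"));      // %C3%A9
        System.out.println(encodeField(value, "ISO-8859-1")); // %E9
    }
}
```

The same character arrives as different bytes depending on the charset, which is why the server has to decode with the charset the page was served in.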
Your only missing step is to tell the server that it must use UTF-8 to decode the HTTP request parameters before returning them in getParameterXxx() calls. How to do that globally depends on the HTTP request method. Given that you're using the POST method, this is relatively easy to achieve with the servlet filter class below, which automatically hooks into all requests:
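import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
// On Jakarta EE 9+ (e.g. Tomcat 10+), the package prefix is jakarta.servlet instead of javax.servlet.

@WebFilter("/*")
public class CharacterEncodingFilter implements Filter {

    @Override
    public void init(FilterConfig config) throws ServletException {
        // NOOP.
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        // Must run before anything reads the request parameters.
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // NOOP.
    }
}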
That's all. In Servlet 3.0+ (Tomcat 7 and newer) you don't need additional web.xml configuration.
You only need to keep in mind that it's very important that the setCharacterEncoding() method is called before the POST request parameters are obtained for the first time using any of the getParameterXxx() methods. This is because they are parsed only once, on first access, and then cached in server memory.
So e.g. below sequence is wrong:
String foo = request.getParameter("foo"); // Wrong encoding.
// ...
request.setCharacterEncoding("UTF-8"); // Attempt to set it.
String bar = request.getParameter("bar"); // STILL wrong encoding!
Doing the setCharacterEncoding() job in a servlet filter guarantees that it runs in time (at least, before any servlet).
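The parse-once-then-cache behavior can be modeled without a servlet container. The class below is a toy stand-in, not the real `HttpServletRequest`: the name `LazyParamRequest`, its ISO-8859-1 default, and its parsing logic are assumptions made purely to illustrate why the order of calls matters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

class LazyParamRequest {
    private final String rawBody;            // URL-encoded POST body as received
    private String encoding = "ISO-8859-1";  // assumed default for this model
    private Map<String, String> params;      // stays null until first access

    LazyParamRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    void setCharacterEncoding(String encoding) {
        this.encoding = encoding;
    }

    String getParameter(String name) {
        if (params == null) {
            params = parse(); // parsed once with whatever encoding is set right now
        }
        return params.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> result = new HashMap<>();
        try {
            for (String pair : rawBody.split("&")) {
                String[] kv = pair.split("=", 2);
                String value = kv.length > 1 ? URLDecoder.decode(kv[1], encoding) : "";
                result.put(URLDecoder.decode(kv[0], encoding), value);
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "foo=%C3%A9&bar=%C3%BC"; // "é" and "ü" sent as UTF-8

        LazyParamRequest late = new LazyParamRequest(body);
        late.getParameter("foo");           // triggers parsing with the default
        late.setCharacterEncoding("UTF-8"); // too late, the map is already cached
        System.out.println("\u00fc".equals(late.getParameter("bar")));  // false

        LazyParamRequest early = new LazyParamRequest(body);
        early.setCharacterEncoding("UTF-8"); // set before the first access
        System.out.println("\u00fc".equals(early.getParameter("bar"))); // true
    }
}
```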
In case you'd like to instruct the server to decode GET (not POST) request parameters using UTF-8 too (the parameters you see after the ? character in the URL), then you'd basically need to configure it on the server end. It's not possible to configure it via the servlet API. If you're using, for example, Tomcat as the server, then it's a matter of adding the URIEncoding="UTF-8" attribute to the <Connector> element of Tomcat's own /conf/server.xml.
In case you're still seeing Mojibake in the console output of System.out.println() calls, then chances are that stdout itself is not configured to use UTF-8. How to do that depends on who's responsible for interpreting and presenting the stdout. If you're using, for example, Eclipse as the IDE, then it's a matter of setting Window > Preferences > General > Workspace > Text File Encoding to UTF-8.
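On the Java side, the bytes written to stdout can also be pinned to UTF-8 explicitly. The sketch below wraps an output stream in a `PrintStream` with a fixed encoding; the class name `Utf8Stdout` and its helpers are hypothetical, and the console or IDE still has to interpret those bytes as UTF-8 for the text to display correctly.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Stdout {
    // Wrap any stream so that printed text is always encoded as UTF-8,
    // independent of the platform default charset.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available in a JVM
        }
    }

    // Helper to inspect the bytes that would be written for a given text.
    static byte[] render(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = utf8(buffer);
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("M\u00fcller"); // emitted as UTF-8 bytes
    }
}
```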
See also:
- Unicode - How to get the characters right?
How to enforce internet explorer to use encoding given in meta tag?
Looks like there is a small typo in your meta tag. It should say:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
^ you had 'content' here, and forgot to close the tag here ^
I don't have IE7 handy, so I can't check whether that's the reason. Both versions work fine on IE8.
How to send form-post with iso-encoding, when no control of server
A bit disappointed, but I went with the form-to-iframe solution and set the accept-charset as stated in "Setting the character encoding in form submit for Internet Explorer":
<form accept-charset="ISO-8859-1" ...
Parcel character encoding
According to the Mozilla docs, the charset attribute of the <script> tag is deprecated:
If present, its value must be an ASCII case-insensitive match for "utf-8". It’s unnecessary to specify the charset attribute, because documents must use UTF-8, and the script element inherits its character encoding from the document.
So I think you can just use special characters in your .js files by default, and Parcel will respect them.
See this example in the parcel repl
Is form charset required?
Almost every decent browser ignores the accept-charset attribute in favour of the encoding of the page containing the form, as defined in the charset param of the Content-Type response header. So far the attribute works only in MSIE, and even then MSIE uses it wrongly. In MSIE running on Windows, any value other than UTF-8 would be interpreted as CP-1252.
Don't use this attribute. It's useless.
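The CP-1252 remark matters because CP-1252 and ISO-8859-1 are not the same charset. The sketch below does not reproduce any browser behavior; it only shows, under the assumption that the JVM supports the `windows-1252` charset name (standard JDKs do), how one character ends up as different bytes. The class name `Cp1252Demo` and the helper `bytesOf` are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Cp1252Demo {
    // Encode the text and return the bytes as unsigned values for readability.
    static int[] bytesOf(String text, Charset charset) {
        byte[] raw = text.getBytes(charset);
        int[] unsigned = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            unsigned[i] = raw[i] & 0xFF;
        }
        return unsigned;
    }

    public static void main(String[] args) {
        String euro = "\u20ac"; // the euro sign

        // CP-1252 has the euro sign at byte 0x80.
        System.out.println(Arrays.toString(bytesOf(euro, Charset.forName("windows-1252")))); // [128]

        // ISO-8859-1 cannot represent it; Java substitutes '?' (0x3F).
        System.out.println(Arrays.toString(bytesOf(euro, StandardCharsets.ISO_8859_1)));     // [63]

        // UTF-8 uses three bytes.
        System.out.println(Arrays.toString(bytesOf(euro, StandardCharsets.UTF_8)));          // [226, 130, 172]
    }
}
```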