URLEncoder not able to translate space character
This behaves as expected. URLEncoder implements the HTML specification's rules
for encoding form data, not general URL percent-encoding.
From the javadocs:
This class contains static methods for
converting a String to the
application/x-www-form-urlencoded MIME
format.
and from the HTML Specification:
application/x-www-form-urlencoded
Forms submitted with this content type
must be encoded as follows:
- Control names and values are escaped. Space characters are replaced
by `+'
If you need %20 instead, replace the + after encoding (a literal + in the input has already become %2B by then), e.g.:
System.out.println(java.net.URLEncoder.encode("Hello World", "UTF-8").replace("+", "%20"));
URL encoding the space character: + or %20?
From Wikipedia:
When data that has been entered into HTML forms is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using method GET or POST, or, historically, via email. The encoding used by default is based on a very early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with "+" instead of "%20". The MIME type of data encoded this way is application/x-www-form-urlencoded, and it is currently defined (still in a very outdated manner) in the HTML and XForms specifications.
So, the real percent encoding uses %20, while form data in URLs is in a modified form that uses +. So you're most likely to only see + in URLs in the query string after a ?.
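The difference can be seen from Java itself. The following is a minimal sketch, assuming Java 10 or later (for the Charset overload of URLEncoder.encode); the class and method names are made up for illustration. It contrasts form encoding with the percent-encoding that the multi-argument java.net.URI constructor applies to a path:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class SpaceEncodingDemo {

    // Form encoding (application/x-www-form-urlencoded): a space becomes "+".
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // General URI percent-encoding of a path: the multi-argument URI
    // constructor quotes illegal characters, so a space becomes "%20".
    static String pathEncode(String path) {
        try {
            return new URI(null, null, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid path: " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Hello World"));   // Hello+World
        System.out.println(pathEncode("/Hello World"));  // /Hello%20World
    }
}
```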
When should space be encoded to plus (+) or %20?
+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:
http://www.example.com/path/foo+bar/path?query+name=query+value
In this URL, the parameter name is "query name" with a space and the value is "query value" with a space, but the folder name in the path is literally "foo+bar", not "foo bar".
%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP (rawurlencode is safer).
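In Java, one way to get this "safe in either context" behaviour is to wrap URLEncoder. This is only a sketch, assuming Java 10 or later; encodeComponent is a hypothetical helper name, and its output is not guaranteed to match encodeURIComponent() for every punctuation character (URLEncoder escapes ~ but leaves * alone, for example):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class ComponentEncoder {

    // URLEncoder turns a literal "+" into "%2B" and a space into "+",
    // so every "+" left in its output stands for a space and can be
    // rewritten as "%20" without touching real plus signs.
    static String encodeComponent(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(encodeComponent("foo+bar baz")); // foo%2Bbar%20baz
    }
}
```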
See Also
HTML 4.01 Specification application/x-www-form-urlencoded
URLEncoder not able to translate ) (bracket) character
I suggest building the query part of the URL first and encoding each parameter value with URLEncoder.
Example:
String query = "?login=" + URLEncoder.encode(login, "UTF-8") + "&password="+ URLEncoder.encode(pass, "UTF-8");
String url = REGISTER_URL + query;
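The same idea extends to any number of parameters. Below is a rough sketch, assuming Java 10 or later; QueryStringBuilder and buildQuery are invented names for illustration. Each name and value is encoded separately, so characters such as ), & and = inside a value cannot break the query string:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class QueryStringBuilder {

    // Encodes every name and value on its own, then joins the pairs
    // with "&" and prefixes the result with "?".
    static String buildQuery(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&", "?", ""));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("login", "john doe");
        params.put("password", "p(a)ss&1");
        System.out.println(buildQuery(params));
        // ?login=john+doe&password=p%28a%29ss%261
    }
}
```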
URLDecoder is converting '+' into space
According to the HTML URL Encoding Reference:
URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.
and the + sign itself must be encoded as %2B. So if you want to pass your hash as a GET parameter in a URL, you should replace plus signs with %2B in your hash. Do not replace every + in the entire URL, because you might ruin other string parameters that are supposed to contain spaces.
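A small round trip shows the problem and the fix. This is a sketch only, assuming Java 10 or later; the class name, method names and sample hash value are made up for illustration:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class PlusRoundTrip {

    // Encode a single parameter value before putting it into the URL.
    static String encodeParam(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    // Decode a single parameter value taken from the query string.
    static String decodeParam(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String hash = "ab+/cd=="; // made-up Base64-like value

        // Sent unencoded, the "+" is decoded as a space and the hash is corrupted.
        System.out.println(decodeParam(hash));              // ab /cd==

        // Encoded first, the "+" travels as %2B and survives decoding.
        String encoded = encodeParam(hash);
        System.out.println(encoded);                        // ab%2B%2Fcd%3D%3D
        System.out.println(decodeParam(encoded));           // ab+/cd==
    }
}
```

Only the hash value is encoded here; the rest of the URL is left untouched, so any + that stands for a space in other parameters keeps its meaning.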