URLEncoder Not Able to Translate Space Character

URLEncoder not able to translate space character

This behaves as expected. URLEncoder implements the HTML specification's rules for encoding HTML form data, not general-purpose URL percent-encoding.

From the javadocs:

This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format.

and from the HTML Specification:

application/x-www-form-urlencoded

Forms submitted with this content type must be encoded as follows:

  1. Control names and values are escaped. Space characters are replaced by `+'

If you need %20 instead, replace the + in the encoded output yourself, e.g.:

System.out.println(java.net.URLEncoder.encode("Hello World", "UTF-8").replace("+", "%20"));
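The one-liner above can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation; the class name SpaceEncoding and the method names formEncode and percentEncode are made up for illustration. The replacement is safe because URLEncoder has already turned any literal + in the input into %2B, so every + left in the output stands for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SpaceEncoding {
    // Form-style encoding (application/x-www-form-urlencoded): spaces become '+'.
    public static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Percent-style encoding: spaces become "%20".
    // Any '+' still present after formEncode() represents a space,
    // because a literal '+' in the input was already encoded as "%2B".
    public static String percentEncode(String s) {
        return formEncode(s).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Hello World"));    // Hello+World
        System.out.println(percentEncode("Hello World")); // Hello%20World
        System.out.println(percentEncode("1 + 1"));       // 1%20%2B%201
    }
}
```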

URL encoding the space character: + or %20?

From Wikipedia:

When data that has been entered into HTML forms is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using method GET or POST, or, historically, via email. The encoding used by default is based on a very early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with "+" instead of "%20". The MIME type of data encoded this way is application/x-www-form-urlencoded, and it is currently defined (still in a very outdated manner) in the HTML and XForms specifications.

So real percent-encoding uses %20, while form data in URLs uses a modified form that encodes a space as +. You are therefore most likely to see + only in the query string of a URL, after the ?.

When should space be encoded to plus (+) or %20?

+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:

http://www.example.com/path/foo+bar/path?query+name=query+value

In this URL, the parameter name is query name with a space and the value is query value with a space, but the folder name in the path is literally foo+bar, not foo bar.
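The difference can be seen from Java with the example URL above. This is a rough sketch under stated assumptions: the class name PlusInPathVsQuery and its two helper methods are hypothetical, and decoding the whole query string in one call is a simplification (a real parser would split on & and = before decoding each piece). java.net.URI leaves + untouched in the path, while URLDecoder applies the form rules and turns + into a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

public class PlusInPathVsQuery {
    // Returns the decoded path. URI only decodes %XX escapes,
    // so a '+' in the path stays a literal '+'.
    public static String path(String url) {
        return URI.create(url).getPath();
    }

    // Returns the query decoded with form rules, where '+' means a space.
    // Simplification: decodes the whole query at once instead of
    // splitting it into name/value pairs first.
    public static String decodedQuery(String url) {
        String raw = URI.create(url).getRawQuery();
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/path/foo+bar/path?query+name=query+value";
        System.out.println(path(url));         // /path/foo+bar/path
        System.out.println(decodedQuery(url)); // query name=query value
    }
}
```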

%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP (rawurlencode is safer).

See Also

HTML 4.01 Specification application/x-www-form-urlencoded

URLEncoder not able to translate ) (bracket) character

I suggest building the query part of the URL first, encoding each parameter value with URLEncoder.

Example:

String query = "?login=" + URLEncoder.encode(login, "UTF-8") + "&password="+ URLEncoder.encode(pass, "UTF-8");
String url = REGISTER_URL + query;

URLDecoder is converting '+' into space

According to HTML URL Encoding Reference:

URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.

The + sign itself must be encoded as %2B. So if you want to pass your hash as a GET parameter in a URL, replace the plus signs in the hash with %2B. Do not replace every + in the entire URL, because that could corrupt other string parameters that are supposed to contain spaces.
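To illustrate, here is a small sketch of what happens to a value containing + when it is decoded with and without prior encoding. The class name PlusRoundTrip and the sample value "a+b/c=" are invented for the example; the actual hash in the question is not shown in the source. Encoding only the parameter value with URLEncoder produces %2B for each plus, which leaves the rest of the URL alone.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusRoundTrip {
    public static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hash = "a+b/c="; // hypothetical hash-like value containing '+'

        // Sent unencoded, the '+' is read back as a space.
        System.out.println(decode(hash));         // a b/c=

        // Encoded first, the '+' travels as %2B and survives decoding.
        String encoded = encode(hash);
        System.out.println(encoded);              // a%2Bb%2Fc%3D
        System.out.println(decode(encoded));      // a+b/c=
    }
}
```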


