How to Encode a URL to Avoid Special Characters in Java

How to encode URL to avoid special characters in Java?

URL construction is tricky because different parts of the URL have different rules for what characters are allowed: for example, the plus sign is reserved in the query component of a URL because it represents a space, but in the path component of the URL, a plus sign has no special meaning and spaces are encoded as "%20".

RFC 2396 explains (in section 2.4.2) that a complete URL is always in its encoded form: you take the strings for the individual components (scheme, authority, path, etc.), encode each according to its own rules, and then combine them into the complete URL string. Trying to build a complete unencoded URL string and then encode it separately leads to subtle bugs, like spaces in the path being incorrectly changed to plus signs (which an RFC-compliant server will interpret as real plus signs, not encoded spaces).

In Java, the correct way to build a URL is with the URI class. Use one of the multi-argument constructors that takes the URL components as separate strings, and it'll escape each component correctly according to that component's rules. The toASCIIString() method gives you a properly-escaped and encoded string that you can send to a server. To decode a URL, construct a URI object using the single-string constructor and then use the accessor methods (such as getPath()) to retrieve the decoded components.
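As a minimal sketch of the approach described above: the host, path, and query values here are hypothetical, and the helper names buildUrl and decodedPath are made up for illustration. It uses the five-argument URI constructor (scheme, authority, path, query, fragment) to encode each component, and URI.create, which wraps the single-string constructor, to decode.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriBuildExample {

    // Builds an encoded URL string from UNENCODED components.
    // Each component is escaped according to its own rules.
    public static String buildUrl(String host, String path, String query) {
        try {
            URI uri = new URI("http", host, path, query, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses an already-encoded URL and returns its decoded path.
    public static String decodedPath(String encodedUrl) {
        return URI.create(encodedUrl).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical components containing a space and a plus sign.
        String url = buildUrl("example.com", "/docs/my file+notes.txt", "q=a b");
        System.out.println(url);              // http://example.com/docs/my%20file+notes.txt?q=a%20b
        System.out.println(decodedPath(url)); // /docs/my file+notes.txt
    }
}
```

Note that the space in the path becomes "%20" while the plus sign is left alone, which matches the path rules described earlier.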

Don't use the URLEncoder class! Despite the name, that class actually does HTML form encoding, not URL encoding. It's not correct to concatenate unencoded strings to make an "unencoded" URL and then pass it through a URLEncoder. Doing so will result in problems (particularly the aforementioned one regarding spaces and plus signs in the path).
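The difference is easy to see side by side. The following is only an illustrative sketch with a hypothetical path and made-up helper names (formEncode, uriEncodePath); it compares URLEncoder's form encoding with the path encoding that URI applies.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class PathEncodingPitfall {

    // HTML form encoding: space becomes '+', and '/' and '+' are percent-encoded.
    public static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // URI path encoding: space becomes "%20", '/' and '+' are kept as-is.
    public static String uriEncodePath(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/my file+notes.txt"; // hypothetical path
        System.out.println(formEncode(path));    // %2Fmy+file%2Bnotes.txt
        System.out.println(uriEncodePath(path)); // /my%20file+notes.txt

        // A '+' in the path of an encoded URL is decoded as a literal plus, not a space.
        System.out.println(URI.create("http://example.com/my+file.txt").getPath()); // /my+file.txt
    }
}
```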

How to handle special characters in url as parameter values?

Use URLEncoder to encode each parameter value (not the whole URL) before adding it to the query string. When encoding a String, the following rules apply:

  • The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
  • The special characters ".", "-", "*", and "_" remain the same.
  • The space character " " is converted into a plus sign "+".
  • All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.

For example, using UTF-8 as the encoding scheme, the string "The string ü@foo-bar" would get converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes, C3 (hex) and BC (hex), and the character @ is encoded as one byte, 40 (hex).
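A small sketch of these rules applied to query parameters, assuming the goal is to encode each name and value separately and then join them with "=" and "&". The parameter names and the helpers encodeValue and buildQuery are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParamExample {

    // Form-encodes a single parameter name or value as UTF-8.
    public static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Joins encoded name=value pairs with '&', preserving map order.
    public static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encodeValue(e.getKey()))
              .append('=')
              .append(encodeValue(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "The string \u00fc@foo-bar"); // \u00fc is the character ü
        params.put("lang", "en");
        System.out.println(buildQuery(params)); // q=The+string+%C3%BC%40foo-bar&lang=en
    }
}
```

Encoding the values one at a time keeps the "=" and "&" separators intact, which would not be the case if the whole query string were passed through URLEncoder at once.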

Java: handle special character in URI

When you pass a path as part of a URI, you should encode it first. If the request is made from Ajax (JavaScript), the encodeURIComponent() method is probably the right choice, like this:

encodeURIComponent("file://C:/6-6+hf.1-181/db/mssql-ddl.sql")
//output
"file%3A%2F%2FC%3A%2F6-6%2Bhf.1-181%2Fdb%2Fmssql-ddl.sql"

If you are in Java, the URLEncoder.encode(String s, String enc) method is the one to use:

    String path = "file://C:/6-6+hf.1-181/db/mssql-ddl.sql";
    String path1 = URLEncoder.encode(path, "UTF-8");
    System.out.println(path1);
    String path2 = URLDecoder.decode(path1, "UTF-8");
    System.out.println(path2);

//output
file%3A%2F%2FC%3A%2F6-6%2Bhf.1-181%2Fdb%2Fmssql-ddl.sql
file://C:/6-6+hf.1-181/db/mssql-ddl.sql

URL Encode and Decode Special character in Java

Sadly, URLEncoder will not solve your problem. I had the same problem and used a custom utility, which I remember finding through Google ;).

http://www.javapractices.com/topic/TopicAction.do?Id=96


