Java equivalent to JavaScript's encodeURIComponent that produces identical output?
Looking at the implementation differences, I see that:
MDC on encodeURIComponent():
- literal characters (regex representation): [-a-zA-Z0-9._*~'()!]
Java 1.5.0 documentation on URLEncoder:
- literal characters (regex representation): [-a-zA-Z0-9._*]
- the space character " " is converted into a plus sign "+".
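To make the difference concrete, here is a minimal sketch that prints what URLEncoder produces on its own for a string containing a space and the characters in question. The class name RawUrlEncoderDemo and the sample input are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of any library.
class RawUrlEncoderDemo {
    // Returns URLEncoder's raw output, without any post-processing.
    static String rawEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every JVM is required to support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // JavaScript: encodeURIComponent("a b~'()!*") gives a%20b~'()!*
        System.out.println(rawEncode("a b~'()!*")); // a+b%7E%27%28%29%21*
    }
}
```

The space comes out as "+", and each of ~'()! is percent-encoded, while "*" is left alone by both implementations.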
So basically, to get the desired result, use URLEncoder.encode(s, "UTF-8") and then do some post-processing:
- replace all occurrences of "+" with "%20"
- replace all occurrences of "%xx" representing any of [~'()!] back to their literal counterparts
Java Equivalent of JS encodeURI
URLEncoder is for encoding form data. To create an escaped URL or URI, use the java.net.URI class:
URI uri = new URI("file", "///10.10.10.10/Yev Pri - Ruın Gözüyle Ortadoğu.pdf", null);
String escapedURI = uri.toASCIIString();
Note: You cannot use new URI("file://///10.10.10.10/Yev Pri - Ruın Gözüyle Ortadoğu.pdf") because that constructor does not perform percent-escaping of characters which may not legally appear directly in URIs. The class documentation explicitly specifies that the one-argument constructor expects the argument to already have proper escaping.
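As a rough illustration of that difference, the sketch below escapes a path through the three-argument constructor and then checks whether the one-argument constructor accepts an unescaped string. The class UriEscapingDemo, its helper methods, and the shortened sample paths are assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; the paths used below are made-up examples.
class UriEscapingDemo {
    // Builds a file URI from an unescaped scheme-specific part.
    static String escape(String schemeSpecificPart) {
        try {
            // The multi-argument constructors quote illegal characters such as
            // spaces; toASCIIString() also percent-encodes non-ASCII characters
            // as UTF-8.
            return new URI("file", schemeSpecificPart, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reports whether the one-argument constructor accepts the string as-is.
    static boolean parsesUnescaped(String uri) {
        try {
            new URI(uri);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00f6 is the letter o with diaeresis, written as an escape so the
        // result does not depend on the source file encoding.
        System.out.println(escape("///10.10.10.10/a b/\u00f6.pdf"));
        System.out.println(parsesUnescaped("file://///10.10.10.10/a b.pdf"));
    }
}
```

The first call should yield a string in which the space has become %20 and the non-ASCII letter has become %C3%B6; the second reports false because the space is rejected.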
How to get Java to match JavaScript encodeURIComponent() method?
According to the Mozilla developer docs, encodeURIComponent() uses UTF-8 to encode. When I run this on your string I get tester%C3%A6%C3%B8%C3%A5, as expected. When I run the following Java code:
System.out.println(URLEncoder.encode("testeræøå", "UTF-8"));
It also prints tester%C3%A6%C3%B8%C3%A5. I also ran your test and got:
------ START TESTING WITH USER ID = 'dummy' ----------------------
Test URLEncoder.encode(userId): dummy
Test URLEncoder.encode(userId,"UTF-8"): dummy
Test URLEncoder.encode(userId,"UTF-16"): dummy
Test URLEncoder.encode(userId,"UTF-16LE"): dummy
Test URLEncoder.encode(userId,"UTF-16BE"): dummy
Test engine.eval("encodeURIComponent(\""+userId+"\")"): dummy
Test encodeURIComponent(userId): dummy
TEST new URI(userId).toASCIIString(): dummy
------ END TESTING WITH USER ID = 'dummy' ----------------------
------ START TESTING WITH USER ID = 'testeræøå' ----------------------
Test URLEncoder.encode(userId): tester%C3%A6%C3%B8%C3%A5
Test URLEncoder.encode(userId,"UTF-8"): tester%C3%A6%C3%B8%C3%A5
Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%E6%00%F8%00%E5
Test URLEncoder.encode(userId,"UTF-16LE"): tester%E6%00%F8%00%E5%00
Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%E6%00%F8%00%E5
Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%C3%A6%C3%B8%C3%A5
Test encodeURIComponent(userId): tester%C3%A6%C3%B8%C3%A5
TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
------ END TESTING WITH USER ID = 'testeræøå' ----------------------
------ START TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------
Test URLEncoder.encode(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test URLEncoder.encode(userId,"UTF-8"): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%25C3%FE%FF%00%25A6%FE%FF%00%25C3%FE%FF%00%25B8%FE%FF%00%25C3%FE%FF%00%25A5
Test URLEncoder.encode(userId,"UTF-16LE"): tester%25%00C3%25%00A6%25%00C3%25%00B8%25%00C3%25%00A5
Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%25C3%00%25A6%00%25C3%00%25B8%00%25C3%00%25A5
Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test encodeURIComponent(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
------ END TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------
This is what I would expect.
I think you need to check the file encoding for your Java source file. If you are using Eclipse it defaults to cp1252 for some reason. The first thing I do when I install Eclipse is change the default encoding to UTF-8.
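To show what a wrong source encoding does to the result, here is a small sketch. It writes the three non-ASCII letters as Unicode escapes (so the file encoding cannot interfere) and then simulates the misreading by decoding the UTF-8 bytes with a single-byte charset. ISO-8859-1 stands in for cp1252 here, which is an assumption that holds for these particular bytes; the class and method names are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not taken from the original test program.
class SourceEncodingDemo {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Simulates a compiler reading UTF-8 source bytes as a single-byte charset.
    static String misread(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The user id from the question, with the non-ASCII letters escaped.
        String userId = "tester\u00e6\u00f8\u00e5";
        System.out.println(encode(userId));          // tester%C3%A6%C3%B8%C3%A5
        System.out.println(encode(misread(userId))); // tester%C3%83%C2%A6%C3%83%C2%B8%C3%83%C2%A5
    }
}
```

If your Java output looks like the second line, each letter has been split into two characters before URLEncoder ever saw it, which points at the source file encoding rather than at URLEncoder.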
Decode url which has been encoded in javascript
Use java.net.URLDecoder.
But note that there are several differences between the Java and JavaScript implementations.
For details, take a look at:
Difference in URL decode/encode UTF-8 between Java and JS/AS3 (bug!?)
Java equivalent to JavaScript's encodeURIComponent that produces identical output?
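One difference that is easy to run into: URLDecoder follows the form-encoding convention and turns "+" into a space, while JavaScript's encodeURI leaves a literal "+" unescaped (encodeURIComponent writes it as %2B). A minimal sketch, with a hypothetical helper decodeKeepingPlus for input where "+" should stay a plus:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; decodeKeepingPlus is an illustrative helper.
class JsDecodeDemo {
    // Plain URLDecoder call: "%xx" sequences are decoded, "+" becomes a space.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Protects literal "+" characters before decoding, for input that was
    // produced by encodeURI or encodeURIComponent rather than by a form.
    static String decodeKeepingPlus(String s) {
        return decode(s.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        System.out.println(decode("a%20b%2Bc"));      // a b+c
        System.out.println(decode("a+b"));            // a b
        System.out.println(decodeKeepingPlus("a+b")); // a+b
    }
}
```

Which variant is appropriate depends on how the string was encoded on the JavaScript side, so treat the helper as a starting point rather than a drop-in fix.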
Java URL encoding: URLEncoder vs. URI
Although I think the answer from @fge is the right one, I was using a third-party web service that relied on the encoding outlined in the W3Schools article, so I followed the answer from "Java equivalent to JavaScript's encodeURIComponent that produces identical output?":
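// Requires java.net.URLEncoder and java.io.UnsupportedEncodingException.
// Mimics JavaScript's encodeURIComponent on top of URLEncoder.
public static String encodeURIComponent(String s) {
    try {
        return URLEncoder.encode(s, "UTF-8")
                .replace("+", "%20")  // URLEncoder emits "+" for a space
                .replace("%21", "!")  // restore the characters that
                .replace("%27", "'")  // encodeURIComponent leaves literal
                .replace("%28", "(")
                .replace("%29", ")")
                .replace("%7E", "~");
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always supported, so fail loudly instead of
        // silently returning the unencoded input.
        throw new IllegalStateException(e);
    }
}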