How to URL Encode a String

Should I url encode a query string parameter that's a URL?

According to RFC 3986:

The query component is indicated by the first question mark ("?")
character and terminated by a number sign ("#") character or by the
end of the URI.

So the following URI is valid:

http://www.example.com?next=http://www.example.com

The following excerpt from the RFC makes this clear:

... as query components are often used to carry identifying
information in the form of "key=value" pairs and one frequently used
value is a reference to another URI, it is sometimes better for
usability to avoid percent-encoding those characters.

It is worth noting that RFC 3986 makes RFC 2396 obsolete.
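
To see where leaving the nested URL unencoded is harmless and where it is not, here is a small Python sketch using the standard urllib.parse module. The nested URL with its own query and fragment is a made-up example, not something taken from the RFC.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The URI from the answer: the nested URL contains no "&" or "#",
# so the query component survives intact.
plain = "http://www.example.com?next=http://www.example.com"
plain_params = parse_qs(urlsplit(plain).query)

# Hypothetical nested URL that carries its own query and fragment.
nested = "http://www.example.com/page?a=1&b=2#top"

# Unencoded: "&b=2" becomes a separate parameter of the outer URL and
# "#top" becomes the fragment of the outer URL.
risky = "http://www.example.com?next=" + nested
risky_parts = urlsplit(risky)
risky_params = parse_qs(risky_parts.query)

# Encoded: the whole nested URL travels as the value of "next".
safe = "http://www.example.com?" + urlencode({"next": nested})
safe_params = parse_qs(urlsplit(safe).query)

print(plain_params)
print(risky_params, risky_parts.fragment)
print(safe_params)
```

So the unencoded form is fine for simple values, but once the nested URL contains & or #, encoding it is the only way to keep it in one piece.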

Java URL encoding of query string parameters

URLEncoder is the way to go. Just remember to encode only the individual query string parameter names and/or values, not the entire URL, and certainly not the parameter separator character & or the name-value separator character =.

// Needs java.net.URLEncoder and java.nio.charset.StandardCharsets.
String q = "random word £500 bank $";
String url = "https://example.com?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);

If you're not yet on Java 10 or newer, use StandardCharsets.UTF_8.toString() as the charset argument; if you're not yet on Java 7 or newer, use "UTF-8".

Note that spaces in query parameters are encoded as +, not %20, which is perfectly valid. %20 is normally used to represent spaces in the URI itself (the part before the query string separator character ?), not in the query string (the part after ?).
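
The difference between the two space conventions can be illustrated outside Java as well. The following is a Python sketch using the standard urllib.parse module, not part of the original answer; quote_plus follows the same + convention for spaces, although its set of untouched characters is not identical to URLEncoder's.

```python
from urllib.parse import quote, quote_plus

q = "random word £500 bank $"

# Form-style encoding for query parameters: space -> "+".
form_style = quote_plus(q)

# Plain percent-encoding: space -> "%20". safe="" also encodes "/".
percent_style = quote(q, safe="")

print(form_style)     # random+word+%C2%A3500+bank+%24
print(percent_style)  # random%20word%20%C2%A3500%20bank%20%24
```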

Also note that there are three encode() methods: one with a Charset as the second argument, one with a String as the second argument (which throws a checked exception), and one without a charset argument, which is deprecated. Never use the deprecated one; always specify the charset. The javadoc even explicitly recommends using the UTF-8 encoding, as mandated by RFC 3986 and W3C:

All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.

See also:

  • What every web developer must know about URL encoding

String url-encoded twice, I can't get the initial string

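Decode it first, then encode it:

String id = "http://gastro-huc.org.pt/index.php?view=article&catid=41%3Ateses-de-mestrado&id=41%3Ateses-de-mestrado&format=pdf&option=com_content&Itemid=68";
// The value is already percent-encoded once (%3A), so decode that layer first.
id = URLDecoder.decode(id, "UTF-8");
// Then encode exactly once. Both calls take the charset explicitly; the one-argument overloads are deprecated.
String urlEncoded = URLEncoder.encode(id, "UTF-8");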
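
Why a second encoding pass makes the original hard to recover can be seen in a short sketch. It is written in Python with the standard urllib.parse module rather than Java, and the sample value is just the catid fragment from the URL above.

```python
from urllib.parse import quote, unquote

original = "41:teses-de-mestrado"

once = quote(original, safe="")   # ":" becomes "%3A"
twice = quote(once, safe="")      # the "%" itself becomes "%25"

# Decoding as many times as the value was encoded recovers the original.
recovered = unquote(unquote(twice))

# Decode-then-encode brings an already-encoded value back to one layer.
normalised = quote(unquote(once), safe="")

print(once)    # 41%3Ateses-de-mestrado
print(twice)   # 41%253Ateses-de-mestrado
```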

How to urlencode data for curl command?

Use curl --data-urlencode; from man curl:

This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.

Example usage:

curl \
--data-urlencode "paramName=value" \
--data-urlencode "secondParam=value" \
http://example.com

See the man page for more info.

This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.

You can also URL-encode the query string by adding --get, which appends the data to the URL instead of posting it:

curl --get \
--data-urlencode "p1=value 1" \
--data-urlencode "p2=value 2" \
http://example.com
# http://example.com?p1=value%201&p2=value%202
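
For comparison, the same query string can be built programmatically. This is a Python sketch using the standard urllib.parse module, not a description of how curl works internally; it only reproduces the URL shown in the comment above.

```python
from urllib.parse import quote, urlencode

params = {"p1": "value 1", "p2": "value 2"}

# quote_via=quote encodes spaces as "%20"; the default (quote_plus) uses "+".
query = urlencode(params, quote_via=quote)
url = "http://example.com?" + query

print(url)  # http://example.com?p1=value%201&p2=value%202
```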

PHP urlencode issue with a parameter in my string: &notify_url incorrectly returns ¬ify_url

urlencode does not usually replace &not at all, but does replace & with %26. See example here: http://sandbox.onlinephpfunctions.com/code/e9d62797d01f8162170e5ad5181e14fc339faa52

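You could try replacing & with %26 yourself. Do this instead of calling urlencode on the same string; running urlencode afterwards would encode the % again and produce %2526.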

$urlString = str_replace('&', '%26', $urlString);
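
The effect of percent-encoding the ampersand, and of combining a manual replacement with a full encoding pass, can be sketched as follows. This uses Python's urllib.parse.quote as a stand-in for PHP's urlencode, and the sample value is invented for illustration.

```python
from urllib.parse import quote

value = "a&notify_url=b"

# A single encoding pass turns "&" into "%26", so no "&not" sequence remains.
encoded = quote(value, safe="")

# Replacing "&" by hand and then encoding again re-encodes the "%" as "%25".
pre_replaced = quote(value.replace("&", "%26"), safe="")

print(encoded)       # a%26notify_url%3Db
print(pre_replaced)  # a%2526notify_url%3Db
```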

How do I urlencode all the characters in a string, including safe characters?

This gist reveals a very nice answer to this problem. The final function code is as follows:

def encode_all(string):
    return "".join("%{0:0>2}".format(format(ord(char), "x")) for char in string)

Let's break this down.

The first thing to notice is that the return value is a generator expression (... for char in string) wrapped in a str.join call ("".join(...)). This means we will be performing an operation for each character in the string, then finally joining each outputted string together (with the empty string, "").

The operation performed on each character in the string is "%{0:0>2}".format(format(ord(char), "x")). This can be broken down into the following:

  • ord(char): Convert each character to its Unicode code point (a number).
  • format(..., "x"): Convert the number to a hexadecimal value.
  • "%{0:0>2}".format(...): Left-pad the hexadecimal value with zeros to two digits and prefix it with "%".

Taken as a whole, the function converts each character to a number, converts that number to hexadecimal, then joins all the hexadecimal values into a single string (which is then returned).
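
One caveat: because the function formats the code point rather than the encoded bytes, it only gives decodable output for characters up to U+00FF, and for U+0080 to U+00FF the result is Latin-1 style rather than UTF-8. For example, "€" (U+20AC) comes out as "%20ac", which a decoder reads as a space followed by "ac". A possible variant, assuming UTF-8 output is wanted, encodes the bytes instead; encode_all_utf8 is a name made up for this sketch.

```python
from urllib.parse import unquote

def encode_all_utf8(text):
    """Percent-encode every byte of the UTF-8 form of text."""
    return "".join("%{:02X}".format(byte) for byte in text.encode("utf-8"))

print(encode_all_utf8("a b"))  # %61%20%62
print(encode_all_utf8("£"))    # %C2%A3

# The result decodes back to the original with a standard decoder.
print(unquote(encode_all_utf8("£500 bank")))
```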

I need to URL-encode a string in AppleScript

Note: The solution no longer works as of Big Sur (macOS 11) - it sounds like a bug; do tell us if you have more information.

Try the following:

set search to text returned of (display dialog "Enter song you wish to find" default answer "" buttons {"Search", "Cancel"} default button 1)
do shell script "open 'http://www.mp3juices.com/search/'" & quoted form of search

What you need is URL encoding (i.e., encoding of a string for safe inclusion in a URL), which involves more than just replacing spaces.
The open command-line utility, thankfully, performs this encoding for you, so you can just pass it the string directly; you need do shell script to invoke open, and quoted form of ensures that the string is passed through unmodified (to be URI-encoded by open later).

As you'll see, the kind of URL encoding open performs replaces spaces with %20, not underscores, but that should still work.
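
If you would rather not depend on open doing the encoding, the search term can be percent-encoded explicitly before the URL is built. The following is a sketch in Python rather than AppleScript; search_url is a hypothetical helper, and the URL prefix is the one from the script above.

```python
from urllib.parse import quote

def search_url(term):
    # safe="" also encodes "/", so the term stays a single path segment.
    return "http://www.mp3juices.com/search/" + quote(term, safe="")

print(search_url("around the world"))
# http://www.mp3juices.com/search/around%20the%20world
```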

How to URL encode strings in C#

According to RFC 1738:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

Neither HttpUtility.UrlEncode nor WebUtility.UrlEncode will encode those characters since the standard says the parentheses () can be used unencoded.

I don't know why the URL Encoder / Decoder you linked encodes them, since it also lists them as characters that can be used in a URL.
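
Different encoders do make different choices here. As an illustration outside .NET, Python's urllib.parse.quote percent-encodes parentheses unless they are listed as safe; this sketch only shows that both outputs are in use and says nothing about how the linked tool is implemented.

```python
from urllib.parse import quote

text = "a(b)c"

strict = quote(text, safe="")      # parentheses are percent-encoded
lenient = quote(text, safe="()")   # parentheses are left as they are

print(strict)   # a%28b%29c
print(lenient)  # a(b)c
```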


