urlencode vs rawurlencode?
It will depend on your purpose. If interoperability with other systems is important then it seems rawurlencode is the way to go. The one exception is legacy systems which expect the query string to follow form-encoding style of spaces encoded as + instead of %20 (in which case you need urlencode).
rawurlencode follows RFC 1738 prior to PHP 5.3.0 and RFC 3986 afterwards (see http://us2.php.net/manual/en/function.rawurlencode.php)
Returns a string in which all non-alphanumeric characters except -_.~ have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).
Note on RFC 3986 vs 1738: rawurlencode prior to PHP 5.3 encoded the tilde character (~) according to RFC 1738. As of PHP 5.3, however, rawurlencode follows RFC 3986, which does not require encoding tilde characters.
urlencode encodes spaces as plus signs (not as %20, as rawurlencode does) (see http://us2.php.net/manual/en/function.urlencode.php)
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
This corresponds to the definition for application/x-www-form-urlencoded in RFC 1866.
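To make the documented differences concrete, here is a minimal sketch that runs both functions on one sample string. The sample string is an assumption chosen to contain a space, a tilde, and a plus sign; the tilde behavior shown assumes PHP 5.3 or later.

```php
<?php
// Sample chosen to hit every documented difference: space, tilde, plus.
$sample = "a b~c+d";

$form = urlencode($sample);    // form style: space => "+", "~" => "%7E"
$raw  = rawurlencode($sample); // RFC 3986 style: space => "%20", "~" kept

echo $form, PHP_EOL; // a+b%7Ec%2Bd
echo $raw, PHP_EOL;  // a%20b~c%2Bd
```

Note that a literal + becomes %2B in both outputs; only the space and the tilde are treated differently.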
Additional Reading:
You may also want to see the discussion at http://bytes.com/groups/php/5624-urlencode-vs-rawurlencode.
Also, RFC 2396 is worth a look. RFC 2396 defines valid URI syntax. The main part we're interested in is from 3.4 Query Component:
Within a query component, the characters ";", "/", "?", ":", "@", "&", "=", "+", ",", and "$" are reserved.
As you can see, the + is a reserved character in the query string and thus would need to be encoded as per RFC 3986 (as in rawurlencode).
What is the difference between urlencode and rawurlencode?
It depends on what you are after. The two functions differ in the standard they encode to and, most visibly, in how they handle spaces.
urlencode encodes the same way that form data is encoded (application/x-www-form-urlencoded).
urlencode encodes spaces as + symbols, while rawurlencode encodes them as %20.
Therefore when dealing with form data, urlencode would be preferable (as forms encode spaces as + signs too). Otherwise rawurlencode is a wiser choice in my opinion.
For example, if you want to mimic form data being submitted via a URL, you would use urlencode.
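As a rough illustration of mimicking a form submission, the sketch below builds a query string by hand with urlencode and compares it with PHP's built-in http_build_query, whose default encoding type (PHP_QUERY_RFC1738) encodes spaces the same way. The field names and values are hypothetical.

```php
<?php
// Hypothetical form fields, used only for illustration.
$fields = ['q' => 'foo bar', 'lang' => 'en'];

// Manual approach: urlencode() each name and value, as a browser form would.
$pairs = [];
foreach ($fields as $name => $value) {
    $pairs[] = urlencode($name) . '=' . urlencode($value);
}
$manual = implode('&', $pairs);

// Built-in approach: the default encoding type matches urlencode().
// The separator is passed explicitly so the result does not depend on php.ini.
$builtin = http_build_query($fields, '', '&');

echo $manual, PHP_EOL;  // q=foo+bar&lang=en
echo $builtin, PHP_EOL; // q=foo+bar&lang=en
```

If the receiving system expects %20 instead, passing PHP_QUERY_RFC3986 as the fourth argument to http_build_query makes it encode like rawurlencode.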
urlencode/rawurlencode and automatic decoding
The + character is encoded by both functions as %2B, so no confusion is possible.
To safely decode either version, PHP only has to transform each %XX into its corresponding character and transform each + to a space. This is what urldecode does.
rawurlencode shouldn't cause issues, as all it does is encode a wider range of chars into their %XX counterparts. Those will be decoded safely by either decoding function.
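A small round-trip sketch of the point above, assuming a sample string that contains both spaces and a literal plus sign: urldecode recovers the original from either encoding.

```php
<?php
// The literal "+" must survive; each space must come back as a space.
$original = "1 + 1 = 2";

$form = urlencode($original);    // 1+%2B+1+%3D+2
$raw  = rawurlencode($original); // 1%20%2B%201%20%3D%202

var_dump(urldecode($form) === $original); // bool(true)
var_dump(urldecode($raw) === $original);  // bool(true)
```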
Substituting whitespaces with %20 in PHP. urlencode and rawurlencode does not work
rawurlencode() is what you're looking for. However, if your Content-Type is set to text/html (which is the default), then you will see the space character instead of the encoded entity.
header('Content-Type: text/plain');
$str = "my string";
echo rawurlencode($str); // => my%20string
Note: I'm not suggesting that you should change the Content-Type header in your original script. It's just to show that your rawurlencode() call is working and to explain why you're not seeing it.
PHP - auto detect (raw)urlencode
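Both functions leave ASCII letters, digits, and the characters -_. untouched and convert every other byte to a percent sign followed by its two-digit hexadecimal value. In regular-expression terms, urlencode() encodes anything matching [^0-9A-Za-z_.-], and rawurlencode() (as of PHP 5.3) anything matching [^0-9A-Za-z_.~-]. Apart from the tilde, the only difference between the two encoding methods is that rawurlencode() uses %20 for a space instead of the + used by urlencode().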
For decoding, this means that any sequence that matches the regular expression %[0-9A-F]{2} will be properly decoded by either function. That only leaves a + to worry about, which rawurldecode() will not turn back into a space. So, you can use urldecode() on the server side and not worry about any testing.
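<?php
$str = "foo bar baz";
$raw = rawurlencode($str); // foo%20bar%20baz
$enc = urlencode($str);    // foo+bar+baz
// PHP_EOL puts each result on its own line, matching the output below.
echo rawurldecode($raw), PHP_EOL;
echo rawurldecode($enc), PHP_EOL; // "+" is left as-is
echo urldecode($raw), PHP_EOL;
echo urldecode($enc), PHP_EOL;
?>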
Output:
foo bar baz
foo+bar+baz
foo bar baz
foo bar baz
Urlencode everything but slashes?
- Split by /
- urlencode() each part
- Join with /
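The three steps above can be sketched as follows. encodePath() is a hypothetical helper name, not a PHP built-in, and this sketch swaps in rawurlencode() for urlencode() so that spaces in path segments become %20; pass 'urlencode' to array_map instead if + is what the receiving side expects.

```php
<?php
// encodePath() is a hypothetical helper name, not a PHP built-in.
function encodePath(string $path): string
{
    $segments = explode('/', $path);                  // 1. split by "/"
    $encoded  = array_map('rawurlencode', $segments); // 2. encode each part
    return implode('/', $encoded);                    // 3. join with "/"
}

echo encodePath('docs/my file/a&b.txt'), PHP_EOL; // docs/my%20file/a%26b.txt
```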
rawurlencode() and urlencode() not working in CodeIgniter
I know this is an old question. But I was dealing with the same issue. What I have done is:
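Encode
<?php echo urlencode(base64_encode('http://kchason.com')); ?>
Decode
<?php echo base64_decode(urldecode('aHR0cDovL2tjaGFzb24uY29t')); ?>
You use base64_encode to get rid of any URL parts that will cause problems with CodeIgniter, and then you use urlencode to encode any = that base64_encode adds to the end of its output (as well as any + or / it produces). Decoding reverses the steps: urldecode first, then base64_decode, applied to the encoded string rather than the original URL.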