What's the difference between URI.escape and CGI.escape?
There were some small differences, but the important point is that URI.escape has been deprecated since Ruby 1.9.2 (and was removed entirely in Ruby 3.0), so use CGI::escape or ERB::Util.url_encode.
There is a long discussion on ruby-core for those interested which also mentions WEBrick::HTTPUtils.escape and WEBrick::HTTPUtils.escape_form.
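As a rough illustration of how the two suggested replacements differ, the sketch below encodes the same string with each. The sample string is made up for this example. The most visible difference is the space: CGI.escape emits +, while ERB::Util.url_encode emits %20.

```ruby
require 'cgi'
require 'erb'

sample = 'a b&c=d' # arbitrary sample input, not from the original answer

form_style    = CGI.escape(sample)           # space becomes "+"
percent_style = ERB::Util.url_encode(sample) # space becomes "%20"

puts form_style    # a+b%26c%3Dd
puts percent_style # a%20b%26c%3Dd
```

Which one is appropriate depends on where the value ends up: + for a space is a form-encoding convention, so %20 is usually the safer choice outside of query strings.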
What is the difference between URI.escape and URI.encode in Ruby?
There is no difference. In Ruby 1.9.3, encode is simply an alias for escape.
[Edit] Note that both methods accept an optional second argument, an "unsafe" pattern describing which characters to encode:
URI.encode('http://my.web.com', /\W/) # => "http%3A%2F%2Fmy%2Eweb%2Ecom"
Thanks @muistooshort! =)
What's the difference between CGI.unescape and URI.decode_www_form_component?
These methods are very similar. Both accept a string and an encoding, and return a string in the specified encoding with the % escapes decoded. But there are differences:
Invalid escapes
URI.decode_www_form_component raises an ArgumentError if the string contains invalid escape sequences.
URI.decode_www_form_component('%xz')
# ArgumentError: invalid %-encoding (%xz)
CGI.unescape simply ignores them.
CGI.unescape('%xz')
# "%xz"
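If you want the strict validation without an exception escaping from your code, one option is to wrap the call. This is only a sketch; strict_decode is a hypothetical helper name, not part of the standard library.

```ruby
require 'uri'
require 'cgi'

# Hypothetical helper: decode strictly, but return nil on malformed input
# instead of letting the ArgumentError propagate.
def strict_decode(str)
  URI.decode_www_form_component(str)
rescue ArgumentError
  nil
end

valid   = strict_decode('a%20b') # decoded normally
invalid = strict_decode('%xz')   # malformed escape, so nil
lenient = CGI.unescape('%xz')    # CGI leaves the bad sequence untouched

p valid   # "a b"
p invalid # nil
p lenient # "%xz"
```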
Invalid encodings
CGI.unescape ignores your specified encoding if the result is invalid:
p CGI.unescape("\u263a", 'ASCII')
# "☺"
URI.decode_www_form_component doesn't check; it applies the requested encoding regardless:
p URI.decode_www_form_component("\u263a", 'ASCII')
# "\xE2\x98\xBA"
Lastly (and I hesitate to even mention this), URI.decode_www_form_component is slightly faster because it uses a precomputed Hash to decode all 485 valid escape codes (it's case-sensitive), whereas CGI.unescape actually interprets the hex code and repacks it as a character.
Ruby 2.7 says URI.escape is obsolete, what replaces it?
There is no official RFC 3986-compliant URI escaper in the Ruby standard library today.
See Why is URI.escape() marked as obsolete and where is this REGEXP::UNSAFE constant? for background.
Several methods exist, but each has issues, as you discovered and pointed out in the comment:
- They produce deprecation warnings
- They do not claim standards compliance
- They are not escaping in accordance with RFC 3986
- They are implemented in tangentially related libraries
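Given those gaps, one workaround is a small hand-rolled escaper. The sketch below percent-encodes every byte outside RFC 3986's "unreserved" set (letters, digits, -, ., _ and ~). The method name rfc3986_escape_component is hypothetical, and the code is meant for a single URI component (such as one query value), not for a whole URL.

```ruby
# Hypothetical helper, not a standard library API.
# Works on raw bytes so multibyte characters become one %XX per byte.
def rfc3986_escape_component(str)
  str.to_s.b.gsub(/[^A-Za-z0-9\-._~]/n) { |byte| format('%%%02X', byte.ord) }
end

plain     = rfc3986_escape_component('a b/c~d') # space and slash are encoded
multibyte = rfc3986_escape_component("\u00e9")  # "é" is two bytes in UTF-8

puts plain     # a%20b%2Fc~d
puts multibyte # %C3%A9
```

Because reserved characters such as / and ? are encoded too, applying this to a full URL would break its structure; escape each component separately and then assemble the URL.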
CGI.escape and URLEncoder.encode result are not matching
The %0A denotes a line break ("\n").
Perhaps you got the text from some input source (like user input, or a file), and you need to chomp the line break?
require 'cgi'

hash = "gFH6B8aN+yReGkBL2QS7X4O7d98=\n"
puts "hash: " + hash
# => hash: gFH6B8aN+yReGkBL2QS7X4O7d98=
puts "escaped hash: " + CGI.escape(hash.chomp)
# => escaped hash: gFH6B8aN%2ByReGkBL2QS7X4O7d98%3D
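To confirm that the trailing line break is the only source of the mismatch, a quick check like the following may help. It reuses the sample value from above; the variable names are just for illustration.

```ruby
require 'cgi'

raw = "gFH6B8aN+yReGkBL2QS7X4O7d98=\n"

with_newline    = CGI.escape(raw)       # the "\n" is encoded as %0A
without_newline = CGI.escape(raw.chomp) # chomp removes the trailing "\n" first

puts with_newline
puts without_newline
puts with_newline == without_newline + '%0A' # the outputs differ only by %0A
```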
Why doesn't URI.escape escape single quotes?
For the same reason it doesn't escape ? or / or :, and so forth. URI.escape() only escapes characters that cannot be used in URLs at all, not characters that have a special meaning.
What you're looking for is CGI.escape():
require "cgi"
CGI.escape("foo'bar\" baz")
# => "foo%27bar%22+baz"
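If the value is going into a query string, URI.encode_www_form can escape all the parameters in one step instead of calling CGI.escape on each value by hand. A minimal sketch, with a made-up host and parameter names:

```ruby
require 'uri'

# Made-up parameters for illustration only.
params = { 'q' => "foo'bar\" baz", 'page' => 2 }

query = URI.encode_www_form(params) # escapes each key and value, joins with "&"
url   = "http://example.com/search?#{query}"

puts url # http://example.com/search?q=foo%27bar%22+baz&page=2
```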