Percent encoding in Ruby
As I mentioned in my comment, equating the character ä with the single byte 228 (0xE4) implies that you're dealing with the ISO 8859-1 character encoding.
So, you need to tell Ruby what encoding you want for your string.
str1 = "Hullo ängstrom" # uses whatever encoding is current, generally utf-8
str2 = str1.encode('iso-8859-1')
Then you can encode it as you like:
require 'cgi'
s2c = CGI.escape str2
#=> "Hullo+%E4ngstrom"
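require 'uri'
# URI.escape was removed in Ruby 3.0; the RFC 2396 parser it delegated to is still available
s2u = URI::RFC2396_Parser.new.escape str2
#=> "Hullo%20%E4ngstrom"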
Then, to reverse it, you must first (a) unescape the value, and then (b) turn the encoding back into what you're used to (likely UTF-8), telling Ruby what character encoding it should interpret the codepoints as:
s3a = CGI.unescape(s2c) #=> "Hullo \xE4ngstrom"
puts s3a.encode('utf-8','iso-8859-1')
#=> "Hullo ängstrom"
s3b = URI::RFC2396_Parser.new.unescape(s2u) #=> "Hullo \xE4ngstrom" (URI.unescape on Ruby < 3.0)
puts s3b.encode('utf-8','iso-8859-1')
#=> "Hullo ängstrom"
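The same round trip can be wrapped in two small helpers. This is only a minimal sketch that sticks to CGI; latin1_escape and latin1_unescape are hypothetical names, and it assumes the input is UTF-8 text whose characters all exist in ISO 8859-1.

```ruby
require 'cgi'

# Hypothetical helper: percent-encode UTF-8 text as ISO 8859-1 bytes.
def latin1_escape(text)
  CGI.escape(text.encode('iso-8859-1'))
end

# Hypothetical helper: reverse of latin1_escape.
def latin1_unescape(escaped)
  raw = CGI.unescape(escaped).dup
  # Tell Ruby the raw bytes are ISO 8859-1, then transcode them to UTF-8.
  raw.force_encoding('iso-8859-1').encode('utf-8')
end

escaped = latin1_escape("Hullo ängstrom")
puts escaped                   # Hullo+%E4ngstrom
puts latin1_unescape(escaped)  # Hullo ängstrom
```

Calling force_encoding explicitly means the helper does not depend on which encoding CGI.unescape happens to tag its result with.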
How to URL encode a string in Ruby
require 'cgi'
str = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a".force_encoding('ASCII-8BIT')
puts CGI.escape str
# prints %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
URL encode every possible character
URI.escape was deprecated and replaced by CGI::escape, which is RFC compliant: it matches every run of characters other than letters, digits, space, _, . and -, percent-encodes each byte, and then turns spaces into +. This is the method that does it:
# https://ruby-doc.org/stdlib-2.4.3/libdoc/cgi/rdoc/CGI/Util.html
# File cgi/util.rb, line 11
def escape(string)
  encoding = string.encoding
  string.b.gsub(/([^ a-zA-Z0-9_.-]+)/) do |m|
    '%' + m.unpack('H2' * m.bytesize).join('%').upcase
  end.tr(' ', '+').force_encoding(encoding)
end
At the end of the day, it's the server that needs fixing, not your code. You can monkeypatch or fork CGI and remove the - from the regex, or gsub() the character.
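The gsub() route might look like the sketch below. escape_with_hyphen and escape_every_byte are hypothetical names, and the second helper is only one reading of "every possible character": it percent-encodes each byte whether or not that is necessary.

```ruby
require 'cgi'

# Hypothetical helper: escape as usual, then also encode the hyphen
# that CGI.escape deliberately leaves alone.
def escape_with_hyphen(string)
  CGI.escape(string).gsub('-', '%2D')
end

# Hypothetical helper: percent-encode every single byte of the input.
def escape_every_byte(string)
  string.bytes.map { |byte| format('%%%02X', byte) }.join
end

puts escape_with_hyphen('a-b c')  # a%2Db+c
puts escape_every_byte('a-b')     # %61%2D%62
```

Post-processing the escaped string avoids patching CGI itself, because every hyphen left in the output of CGI.escape came from a literal hyphen in the input.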
Ruby - how to encode URL without re-encoding already encoded characters
I can't think of a way to do this that isn't a little bit of a kludge. So I propose a little bit of a kludge.
URI.escape appears to work the way you want in all cases except when characters are already encoded. With that in mind, we can take the result of URI.encode (an alias of URI.escape) and use String#gsub to "un-encode" only those characters.
The regular expression below looks for %25 (an encoded %) followed by two hex digits, turning e.g. %252f back into %2f:
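require "uri"
DOUBLE_ESCAPED_EXPR = /%25([0-9a-f]{2})/i
# URI.encode was removed in Ruby 3.0; this is the parser it delegated to.
URI_PARSER = URI::RFC2396_Parser.new
def escape_uri(uri)
  URI_PARSER.escape(uri).gsub(DOUBLE_ESCAPED_EXPR, '%\1')
end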
puts escape_uri("https://www.example.com/url-déjà-vu")
# => https://www.example.com/url-d%C3%A9j%C3%A0-vu
puts escape_uri("https://somesite.com/page?stuff=stuff&%20")
# => https://somesite.com/page?stuff=stuff&%20
puts escape_uri("http://example.com/a%2fb")
# => http://example.com/a%2fb
I don't promise that this is foolproof, but hopefully it helps.
Issue with percent encoding in paperclip document.url on s3
Use URI.unescape (available up to Ruby 2.7; it was removed in Ruby 3.0):
<%= URI.unescape(client.document.url) %>
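On Ruby 3.0 and later, where URI.unescape no longer exists, a small helper can do the same job. This is a sketch, not Paperclip's own API: unescape_url and URL_PARSER are hypothetical names, and the URL is a made-up example.

```ruby
require 'uri'

# The RFC 2396 parser is what URI.unescape used to delegate to.
URL_PARSER = URI::RFC2396_Parser.new

# Hypothetical helper: turn %XX sequences back into the bytes they stand for.
def unescape_url(url)
  URL_PARSER.unescape(url)
end

puts unescape_url('https://example.com/docs/annual%20report.pdf')
# https://example.com/docs/annual report.pdf
```

In a view, the helper would be called the same way as before, for example as unescape_url(client.document.url).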
Ruby - URL encoding
What about CGI::escape?
You only need to encode the parameters, though.
url = "http://xyz.com/hello?"
params = "name=john&msg=hello\nJohn\n\rgoodmorning&note=last\night I went to \roger"
puts "#{url}#{CGI::escape(params)}"
# => "http://xyz.com/hello?name%3Djohn%26msg%3Dhello%0AJohn%0A%0Dgoodmorning%26note%3Dlast%0Aight+I+went+to+%0Doger"
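Note that the call above also escapes the = and & separators. If the goal is to keep them and encode only the names and values, URI.encode_www_form is one option. A minimal sketch, where build_url is a hypothetical helper and the parameters are made up:

```ruby
require 'uri'

# Hypothetical helper: append a query string whose names and values are
# encoded individually, leaving the = and & separators intact.
def build_url(base, params)
  "#{base}?#{URI.encode_www_form(params)}"
end

puts build_url('http://xyz.com/hello', 'name' => 'john', 'msg' => "hello\nJohn & co")
# http://xyz.com/hello?name=john&msg=hello%0AJohn+%26+co
```

Passing the parameters as a hash also sidesteps the problem of a value that itself contains an & or =.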
Is there a function to url encode dot ('.') in ruby
You actually don't need to encode the dot. After the ? in the URL, / and . don't have any special meaning.
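A quick way to see this is to check what CGI.escape does with a dot. The sketch below also shows how to force the encoding if some server insists on it; escape_with_dots is a hypothetical helper name.

```ruby
require 'cgi'

# Hypothetical helper: escape as usual, then encode dots as well.
def escape_with_dots(value)
  CGI.escape(value).gsub('.', '%2E')
end

puts CGI.escape('v1.2/notes')  # v1.2%2Fnotes (dot untouched, slash encoded)
puts escape_with_dots('v1.2')  # v1%2E2
```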
How to encode Email if it contains + to %2B in Ruby
The uri standard library has a method for that: URI::Escape#escape. URI extends the URI::Escape module, so it also has this method. Note that it is only available up to Ruby 2.7.
URI.escape('test+@gmail.com', '+') # second argument: the characters to escape with URL encoding
#=> "test%2B@gmail.com"
However, as @spickermann says in the comments:
Why do you want to encode the + in the URL but not the @? @ must be encoded too.
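Following that comment, a sketch that encodes both characters could use URI.encode_www_form_component, which also works on Ruby 3.0 and later. escape_email is a hypothetical helper name.

```ruby
require 'uri'

# Hypothetical helper: encode an email address for use as a query value.
def escape_email(email)
  URI.encode_www_form_component(email)
end

puts escape_email('test+@gmail.com')
# test%2B%40gmail.com
puts URI.decode_www_form_component('test%2B%40gmail.com')
# test+@gmail.com
```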
Parsing string to add to URL-encoded URL
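As of 2019, URI.encode is obsolete and should not be used in new code; it was removed entirely in Ruby 3.0. For reference, this is how it behaves on Ruby 2.x: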
require 'uri'
URI.encode("Hello there world")
#=> "Hello%20there%20world"
URI.encode("hello there: world, how are you")
#=> "hello%20there:%20world,%20how%20are%20you"
URI.decode("Hello%20there%20world")
#=> "Hello there world"
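One possible replacement that still produces %20 for spaces is ERB::Util.url_encode. The sketch below is an assumption about what fits this use case rather than a drop-in equivalent: unlike URI.encode it also encodes reserved characters such as : and , so it suits a single component, not a whole URL. encode_component and decode_component are hypothetical names.

```ruby
require 'erb'
require 'uri'

# Hypothetical helper: percent-encode one URL component, spaces as %20.
def encode_component(text)
  ERB::Util.url_encode(text)
end

# Hypothetical helper: decode a component (this also turns + into a space).
def decode_component(text)
  URI.decode_www_form_component(text)
end

puts encode_component('Hello there world')
# Hello%20there%20world
puts encode_component('hello there: world, how are you')
# hello%20there%3A%20world%2C%20how%20are%20you
puts decode_component('Hello%20there%20world')
# Hello there world
```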