Ruby Equivalent to JavaScript's Encodeuricomponent That Produces Identical Output

Ruby equivalent to JavaScript’s encodeURIComponent that produces identical output?

URI.escape(foo, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))

found here: How do I raw URL encode/decode in JavaScript and Ruby to get the same values in both?

URI.escape is obslete since last I posted. As per suggested comment, now use: ERB::Util.url_encode or CGI.escape

How do I raw URL encode/decode in JavaScript and Ruby to get the same values in both?

Use

URI.escape(foo, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))

in ruby, and

encodeURIComponent(foo); 

in javascript

Both these will behave equally and encode space as %20.

How to encode periods for URLs in Javascript?

Periods shouldn't break the url, but I don't know how you are using the period, so I can't really say. None of the functions I know of encode the '.' for a url, meaning you will have to use your own function to encode the '.' .

You could base64 encode the data, but I don't believe there is a native way to do that in js. You could also replace all periods with their ASCII equivalent (%2E) on both the client and server side.

Basically, it's not generally necessary to encode '.', so if you need to do it, you'll need to come up with your own solution. You may want to also do further testing to be sure the '.' will actually break the url.

hth

How to URL encode a string in Ruby

str = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a".force_encoding('ASCII-8BIT')
puts CGI.escape str

=> "%124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A"

Is a slash (/) equivalent to an encoded slash (%2F) in the path portion of an HTTP URL

From the data you gathered, I would tend to say that encoded / in an URI are meant to be seen as / again at the application or CGI level.

That's to say, that if you're using Apache with mod_rewrite for instance, it will not match pattern expecting slashes against URI with encoded slashes in it.
However, once the appropriate module/cgi/... is called to handle the request, it's up to it to do the decoding and, for instance, retrieve a parameter including slashes as the first component of the URI.

If your application is then using this data to retrieve a file (whose filename contains a slash), that's probably a bad thing.

To sum up, I find it perfectly normal to see a difference of behaviour in / or %2F as their interpretation will be done at different levels.

How to HTML encode/escape a string? Is there a built-in?

The h helper method:

<%=h "<p> will be preserved" %>

How to urlencode data for curl command?

Use curl --data-urlencode; from man curl:

This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.

Example usage:

curl \
--data-urlencode "paramName=value" \
--data-urlencode "secondParam=value" \
http://example.com

See the man page for more info.

This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.

You can as well encode the query string:

curl --get \
--data-urlencode "p1=value 1" \
--data-urlencode "p2=value 2" \
http://example.com
# http://example.com?p1=value%201&p2=value%202

Encode HTML entities in JavaScript

You can use regex to replace any character in a given unicode range with its html entity equivalent. The code would look something like this:

var encodedStr = rawStr.replace(/[\u00A0-\u9999<>\&]/g, function(i) {
return '&#'+i.charCodeAt(0)+';';
});

This code will replace all characters in the given range (unicode 00A0 - 9999, as well as ampersand, greater & less than) with their html entity equivalents, which is simply &#nnn; where nnn is the unicode value we get from charCodeAt.

See it in action here: http://jsfiddle.net/E3EqX/13/ (this example uses jQuery for element selectors used in the example. The base code itself, above, does not use jQuery)

Making these conversions does not solve all the problems -- make sure you're using UTF8 character encoding, make sure your database is storing the strings in UTF8. You still may see instances where the characters do not display correctly, depending on system font configuration and other issues out of your control.

Documentation

  • String.charCodeAt - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt
  • HTML Character entities - http://www.chucke.com/entities.html


Related Topics



Leave a reply



Submit