Ruby 1.9 Doesn't Support Unicode Normalization Yet

If you are aware of the consequences (accented characters will not be transliterated under Ruby 1.9.1 with Rails 2.3.x), place this in a file under config/initializers to silence the warning:

# http://stackoverflow.com/questions/2135247/ruby-1-9-doesnt-support-unicode-normalization-yet
module ActiveSupport
  module Inflector
    # Calling String#parameterize prints a warning under Ruby 1.9,
    # even if the data in the string doesn't need transliterating.
    if Rails.version =~ /^2\.3/
      undef_method :transliterate
      def transliterate(string)
        string.dup
      end
    end
  end
end

Rails 3 does indeed solve this issue, so a more future-proof solution would be to migrate towards that.

Should I use Ruby 1.9.2 for my Rails 2.3.10 app?

I would recommend doing both upgrades, to Ruby 1.9.2 and to Rails 3; the order in which you do them, or whether you do them both at once, is really a personal preference. If you have a strong test suite with good coverage, that is a great starting point for the transition. The main roadblocks you'll run into are the following:

  • Many newer gem versions only support Rails 3, so if you are doing one step at a time, make sure that your gems are supported. For example, you don't want to get stuck in a situation where a gem requires an upgrade because you are using Ruby 1.9.2, but the new version of the gem is only available for Rails 3.
  • There are some syntax changes in Ruby 1.9.2, and compiled C extensions need to be recompiled or the gems reinstalled.
  • There are major application configuration changes from Rails 2.3 to Rails 3.0. They take some time to complete but there is lots of support.

In general Ruby 1.9.2 will be faster than Ruby 1.8.7 and will provide some cool new syntax. If you are seeing the opposite, I would benchmark your code to confirm that it really is slower, and that it's not just failing tests slowing your suite down.
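One rough way to check this is to time the same piece of code under each interpreter with the standard Benchmark library. The sketch below is only illustrative: `sample_workload` is a hypothetical stand-in for whatever part of your app feels slow.

```ruby
require 'benchmark'

# Hypothetical workload standing in for a slow part of the application.
def sample_workload
  (1..50_000).map { |i| i.to_s.reverse }.sort.length
end

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime { sample_workload }
puts format('ruby %s: %.3fs', RUBY_VERSION, elapsed)
```

Run the same script under both Ruby versions and compare the printed times; a single run is noisy, so repeat it a few times before drawing conclusions.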

Ruby 1.9.x replace sets of characters with specific cleaned up characters in a string

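Here is a version that is easy to adapt. String#encode accepts a :fallback option that supplies a replacement for each character it cannot convert to the target encoding: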

#encoding: UTF-8
t = 'ŠšÐŽžÀÁÂÃÄAÆAÇÈÉÊËÌÎÑNÒOÓOÔOÕOÖOØOUÚUUÜUÝYÞBßSàaáaâäaaæaçcèéêëìîðñòóôõöùûýýþÿƒ'
fallback = {
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f'
}

p t.encode('us-ascii', :fallback => fallback)
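With a plain Hash as the fallback, a character that has no entry raises Encoding::UndefinedConversionError. If you would rather substitute a placeholder, :fallback also accepts a Proc. A minimal sketch, where `SMALL_MAP` and `to_ascii` are hypothetical names and the mapping is deliberately tiny:

```ruby
# encoding: UTF-8

# Hypothetical, deliberately small mapping for illustration only.
SMALL_MAP = {
  "\u0160" => 'S', # Š
  "\u017E" => 'z', # ž
  "\u00E9" => 'e'  # é
}

# Hypothetical helper: characters without a mapping become '?'.
def to_ascii(string)
  string.encode('us-ascii',
                :fallback => lambda { |char| SMALL_MAP.fetch(char, '?') })
end

# U+2192 (rightwards arrow) has no entry in SMALL_MAP.
p to_ascii("\u0160\u017E\u00E9\u2192x") # prints "Sze?x"
```

In practice you would pass the full `fallback` hash from above to `fetch` instead of the small one.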

incompatible character encodings: ASCII-8BIT and UTF-8

I suspect that you either copied and pasted part of your Haml template into the file, or you're working with an editor that is not Unicode/UTF-8 friendly.

See if you can recreate that file from scratch in a UTF-8 friendly editor (there are plenty for any platform) and check whether this fixes your problem. Start by erasing the line with #content and retyping it manually.
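Before retyping anything, it may help to find out which lines actually contain bytes that are not valid UTF-8. A rough sketch, assuming you read the file as raw bytes; `invalid_utf8_lines` is a hypothetical helper name:

```ruby
# Hypothetical helper: returns the 1-based numbers of the lines
# whose bytes are not valid UTF-8.
def invalid_utf8_lines(bytes)
  bad = []
  bytes.each_line.with_index(1) do |line, number|
    bad << number unless line.dup.force_encoding('UTF-8').valid_encoding?
  end
  bad
end

# Line 2 holds "é" encoded as UTF-8 (C3 A9); line 3 holds it as Latin-1 (E9).
sample = "#content\n  caf\xC3\xA9\n  caf\xE9\n".b
p invalid_utf8_lines(sample) # prints [3]
```

For a real template you could pass `File.binread(path)` instead of the sample string.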

Filter ruby warnings

You may not be able to monkey patch before the main file is read, but you can make your main file call subfiles after doing monkeypatching.

myruby (executable)

#!/usr/bin/env ruby

module Kernel
  def warn(*args)
    args # => captured warnings
  end
end

load ARGV[0]

Usage is:

myruby foo.rb
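The patch above swallows every warning. If you only want to drop specific messages, one option is to keep a reference to the original method and forward everything that does not match a pattern. This is a sketch under assumptions: `SILENCED_WARNINGS` and `unfiltered_warn` are hypothetical names, and the keyword splat needs Ruby 2.0 or later.

```ruby
# Hypothetical pattern: messages matching it are dropped.
SILENCED_WARNINGS = /Unicode normalization/

module Kernel
  # Keep the original implementation reachable under another name.
  alias_method :unfiltered_warn, :warn

  def warn(*messages, **options)
    kept = messages.flatten.reject { |m| m.to_s =~ SILENCED_WARNINGS }
    # Forward whatever is left, together with any keyword options.
    unfiltered_warn(*kept, **options) unless kept.empty?
  end
end

warn "Ruby 1.9 doesn't support Unicode normalization yet" # dropped
warn "something else"                                     # still printed
```

In the wrapper script this would replace the Kernel patch, with `load ARGV[0]` still as the last line.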

How do I replace accented Latin characters in Ruby?

Rails already has a built-in for normalization: normalize your string to form KD, then remove the remaining non-ASCII characters (i.e. the accent marks), like this:

>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"
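Outside Rails, the same idea can be sketched with plain Ruby, assuming Ruby 2.2 or later, where String#unicode_normalize is available. `strip_accents` is a hypothetical helper name:

```ruby
# encoding: UTF-8

# Hypothetical helper: decompose to NFKD so each accent becomes a separate
# combining character, then drop everything outside the ASCII range.
def strip_accents(string)
  string.unicode_normalize(:nfkd).gsub(/[^\x00-\x7F]/, '')
end

p strip_accents("\u00E0\u00E1\u00E2\u00E3\u00E4\u00E5") # àáâãäå, prints "aaaaaa"
```

Note that this removes any character with no ASCII decomposition (for example "ø" or "ß") rather than transliterating it.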

Ruby method to remove accents from UTF-8 international characters

I generally use I18n to handle this:

1.9.3p392 :001 > require "i18n"
=> true
1.9.3p392 :002 > I18n.transliterate("Hé les mecs!")
=> "He les mecs!"
