String.force_encoding() in Ruby 1.8.7 (or Rails 2.x)

In 1.9, force_encoding only changes the encoding field of the string; it does not modify the string's bytes.
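A minimal sketch of that behaviour on Ruby 1.9 or later. The variable names are illustrative only: the same five bytes are first tagged as binary and then relabeled as UTF-8, and only the interpretation changes.

```ruby
# The two bytes C3 A9 are the UTF-8 encoding of "é".
raw = "caf\xC3\xA9".dup.force_encoding("ASCII-8BIT")
bytes_before = raw.bytes.to_a

# Relabel a copy as UTF-8; no transcoding takes place.
relabeled = raw.dup.force_encoding("UTF-8")

puts relabeled.bytes.to_a == bytes_before  # the bytes are identical
puts raw.length        # 5, counted as bytes
puts relabeled.length  # 4, counted as characters
```

To actually convert bytes from one encoding to another, String#encode is the method to reach for instead.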

Ruby 1.8 doesn't have the concept of string encodings, so force_encoding would be a no-op. You can add it yourself like this if you want to be able to run the same code in 1.8 and 1.9:

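class String
  # Ruby 1.8 strings carry no encoding, so the shim just returns self.
  # The guard leaves the real method untouched on 1.9 and later.
  unless method_defined?(:force_encoding)
    def force_encoding(enc)
      self
    end
  end
end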

There will of course be other things that you have to do to make encodings work the same across 1.8 and 1.9, since they handle this issue very differently.

Making character-range Regexp work with Ruby 1.9

To match a character range expressed as code points, use \u notation together with the UTF-8 encoding header (the "# encoding: utf-8" magic comment):

#!/usr/bin/env ruby
# encoding: utf-8

puts "Café".match(/[\u0080-\uFFFF]/)

The output of the demo program is é.
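match returns only the first hit. As a small sketch with a made-up sample string, scan collects every character in the same range:

```ruby
# encoding: utf-8
# Sample text written with \u escapes so the code points are unambiguous.
text = "Caf\u00E9 na\u00EFve"

# Collect every character from U+0080 through U+FFFF.
non_ascii = text.scan(/[\u0080-\uFFFF]/)

puts non_ascii.length
puts non_ascii.join(" ")
```

Note that the range stops at U+FFFF, so characters outside the Basic Multilingual Plane are not matched by this pattern.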

Rails 3, Heroku - PGError: ERROR: invalid byte sequence for encoding UTF8

From what I can gather, this is a problem where the string you're trying to insert into your PostgreSQL server isn't encoded in UTF-8. This is somewhat odd, because your Rails app should be configured to use UTF-8 by default.

There are a couple of ways you can try to fix this (in order of what I recommend):

  • Firstly, make sure that config.encoding is set to "utf-8" in config/application.rb.

  • If you're using Ruby 1.9, you can try to force the character encoding to UTF-8 prior to insertion with toutf8 (String#toutf8, provided by the kconv standard library).

  • You can figure out what your string is encoded with, and manually set SET CLIENT_ENCODING TO 'ISO-8859-1'; (or whatever the encoding is) on your PostgreSQL connection before inserting the string. Don't forget to do RESET CLIENT_ENCODING; after the statement to reset the encoding.

  • If you're using Ruby 1.8 (which is more likely), you can use the iconv library to convert the string to UTF-8. See the iconv documentation.

  • A more hackish solution is to override your getters and setters in the model (i.e. content and content=) to encode and decode your string with Base64. It'd look something like this:

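require 'base64'

class Comment
  # Decode the Base64 text stored in the column back into the original string.
  def content
    Base64.decode64(self[:content])
  end

  # Store the value Base64-encoded, so only ASCII characters reach the database.
  def content=(value)
    self[:content] = Base64.encode64(value)
  end
end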
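For the Ruby 1.9 conversion option in the list above, here is a rough sketch that uses the core String#encode method rather than toutf8. The to_utf8 helper and its ISO-8859-1 default are assumptions made for illustration; they are not part of Rails or the pg gem, and the right source encoding depends on where your data comes from.

```ruby
# Hypothetical helper: return a UTF-8 copy of value that is safe to insert.
# assumed_source is a guess about the original encoding, not something detected.
def to_utf8(value, assumed_source = "ISO-8859-1")
  utf8 = value.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?

  # The bytes are not valid UTF-8: reinterpret them in the assumed
  # source encoding and transcode the result to UTF-8.
  value.dup.force_encoding(assumed_source).encode("UTF-8")
end

latin1_bytes = "caf\xE9".dup.force_encoding("ASCII-8BIT")     # "café" in ISO-8859-1
utf8_bytes   = "caf\xC3\xA9".dup.force_encoding("ASCII-8BIT") # "café" in UTF-8

puts to_utf8(latin1_bytes).valid_encoding?
puts to_utf8(latin1_bytes) == to_utf8(utf8_bytes)
```

Checking valid_encoding? first avoids converting a string that is already valid UTF-8 a second time.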

Ruby 1.9.x and string encoding

Well, updating to Rails 3.1.3 and mysql2 0.3.10 seems to have solved my issue (I was running Rails 3.0.3 and mysql2 0.2.6). It seems weird to me though, as Ruby 1.9 is over three years old and Rails 3.0.3 was released well after that, so I don't see why Rails 3.0.x wouldn't play nicely with Ruby 1.9's new string encodings. If anyone can add to this I would be grateful.

Is Ruby 1.9.2's new regex engine (Oniguruma) very slow?

You can drop the .* from your regex completely. All it does is match the entire string and then backtrack until your search string is found. Remove it and see if it's still as slow.
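A quick way to check this on your own data is to time both variants. The original regex is not shown here, so the pattern and the input string below are hypothetical stand-ins; timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical input: a long run of filler followed by the search string.
haystack = ("x" * 10_000) + "needle"

with_prefix    = /.*needle/  # leading .* consumes the line, then backtracks
without_prefix = /needle/    # searches for the literal directly

# Both patterns find the search string; only the reported start differs.
puts haystack =~ with_prefix     # match starts at the beginning of the string
puts haystack =~ without_prefix  # match starts where "needle" begins

puts Benchmark.realtime { 1_000.times { haystack =~ with_prefix } }
puts Benchmark.realtime { 1_000.times { haystack =~ without_prefix } }
```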


