Can't Enter Umlauts in Ruby 1.9.3 Irb

Victor Moroz didn't quite give the definitive answer but his link led me to a solution (thx!):

I forgot to mention:

  • I'm running Homebrew
  • I built Ruby using ruby-build and this recipe (1.9.3-p125-perf, with falcon patches)

To solve the problem in my case, I recompiled Ruby, this time pointing it at a more recent version of readline (6.2.2 in my case) that I had installed with Homebrew.

The steps it took were:

$ brew install readline
$ export CPPFLAGS=-I/usr/local/Cellar/readline/6.2.2/include
$ export LDFLAGS=-L/usr/local/Cellar/readline/6.2.2/lib/
$ curl https://raw.github.com/gist/1688857/rbenv.sh | sh && rbenv global 1.9.3-p125-perf

Ruby 1.9.3 unexpected File behavior under msys

On Windows under msys (but not under Linux) there is ENV['BIN'], which contains the path to the bin directory, in this case "C:\msysgit\bin". This is the same directory that /bin and /usr/bin point to in msys, so it's exactly what I need.
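A minimal sketch of how that variable might be used portably. The helper name bin_dir, the '/bin' fallback for systems without ENV['BIN'], and the backslash normalization are assumptions made for illustration, not part of the original answer:

```ruby
# Hypothetical helper: returns the bin directory from ENV['BIN'] when it is
# set (as under msys), otherwise falls back to '/bin' (an assumed default).
def bin_dir(env = ENV)
  dir = env['BIN']
  return '/bin' if dir.nil? || dir.empty?

  # Normalize Windows backslashes so File.join produces a consistent path.
  dir.gsub('\\', '/')
end

puts File.join(bin_dir, 'git')
```

Passing the environment in as an argument keeps the helper easy to try out with a plain Hash instead of the real ENV.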

My IRB is not producing any output; it goes blank after pressing ENTER

The regexp syntax is wrong; it should be:

m = /(.)(.)(\d+)(\d)/.match("THX1138.")

or

m = %r/(.)(.)(\d+)(\d)/.match("THX1138.")

The docs are definitely not correct. Ruby regexp syntax is /regexp/ or %r followed by an opening delimiter, the regexp, and the matching closing delimiter, for example /test/ or %r{test} or %r|test|. Usually // is used, but when the regexp itself contains the '/' character, the %r form can be useful because the slashes no longer need escaping.
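As a quick illustration of the two literal forms, here is a small sketch. The variable names and the example URL are made up for the demonstration:

```ruby
text = "THX1138."

# Both literal forms build the same regexp.
slash_match = /(.)(.)(\d+)(\d)/.match(text)
brace_match = %r{(.)(.)(\d+)(\d)}.match(text)

p slash_match[0]        # the whole match
p slash_match.captures  # the four capture groups

# %r is handy when the pattern contains '/', since no escaping is needed.
host_re = %r{\Ahttps?://([^/]+)/}
host = host_re.match("https://example.com/path")[1]
p host
```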

Ruby IRB output is messed up in the console on Windows 7

Those are escape codes used to set colors in a terminal program; probably most popularly to colour a prompt in an xterm or compatible terminal. My bash prompt environment variable, for example, looks like this:

PS1="\[\033]2;\w\007\]\[\033[0;31m\]\u@\h \[\033[0;32m\]\!\[\033[0;31m\]> \[\033[0m\]"

It looks like some string like that one is getting into your console and confusing it (since it's not bash and/or in an xterm-friendly terminal emulator, I guess).
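To see what such codes look like from Ruby, and one way to remove them before printing to a console that does not understand them, here is a rough sketch. The constant name ANSI_SGR and the sample string are assumptions for illustration, and the pattern only covers color (SGR) sequences of the form ESC [ ... m, not title-setting sequences like the \033]2;...\007 part of the prompt above:

```ruby
# Matches color/style sequences such as "\e[0;31m" and "\e[0m".
ANSI_SGR = /\e\[[0-9;]*m/

colored = "\e[0;31mred\e[0m plain"

# A terminal that understands the codes shows "red" in red; one that does
# not prints the raw codes. Stripping them gives plain text either way.
stripped = colored.gsub(ANSI_SGR, '')
puts stripped
```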

Ruby 1.9.3: Why does "\x03".force_encoding("UTF-8") give "\u0003", but "\x03".force_encoding("UTF-16") gives "\x03"?

Because the single byte "\x03" is not a valid sequence in UTF-16, but it is a valid one in UTF-8 (ASCII 03, ETX, end of text). UTF-16 needs at least two bytes to represent a Unicode code point.

That's why "\x03" can be treated as Unicode \u0003 in UTF-8 but not in UTF-16.

To represent "\u0003" in UTF-16, you have to use two bytes, either 00 03 or 03 00, depending on the byte order. That's why the byte order has to be specified in UTF-16, here with a leading byte order mark. For the big-endian version, the byte sequence should be

FE FF 00 03

For the little-endian, the byte sequence should be

FF FE 03 00

The byte order mark should appear at the beginning of a string, or at the beginning of a file.

Starting from Ruby 1.9, a String is just a byte sequence with a specific encoding attached as a tag. force_encoding only changes the encoding tag; it does not affect the byte sequence. You can verify that by inspecting "\x03".force_encoding("UTF-8").bytes.
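The following sketch shows the tag-versus-bytes distinction. It uses UTF-16BE rather than plain UTF-16 so the byte order is explicit (that substitution, and the variable names, are choices made for this example):

```ruby
raw = "\x03"

# force_encoding mutates its receiver, so work on copies.
as_utf8  = raw.dup.force_encoding("UTF-8")
as_utf16 = raw.dup.force_encoding("UTF-16BE")

# The bytes are identical; only the tag differs.
p as_utf8.bytes.to_a
p as_utf16.bytes.to_a

# One byte is a complete UTF-8 character but an incomplete UTF-16 one.
p as_utf8.valid_encoding?
p as_utf16.valid_encoding?

# With both bytes present, the UTF-16BE string is the same code point.
two_bytes = "\x00\x03".dup.force_encoding("UTF-16BE")
p two_bytes.encode("UTF-8") == "\u0003"
```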

If you see "\u0003", that doesn't mean you have a String made of the two bytes 00 03. It means the String holds whatever byte(s) represent the Unicode code point 0003 under the encoding that String is tagged with. It may be:

03                      // tagged as UTF-8
FE FF 00 03             // tagged as UTF-16
FF FE 03 00             // tagged as UTF-16
03                      // tagged as GBK
03                      // tagged as ASCII
00 00 FE FF 00 00 00 03 // tagged as UTF-32
FF FE 00 00 03 00 00 00 // tagged as UTF-32
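A rough way to reproduce such a list from Ruby is to transcode the same code point and dump the bytes. The helper name hex_bytes is made up for this sketch, and it uses the explicit BE/LE encodings, which do not emit the byte order mark shown in the UTF-16 and UTF-32 rows above:

```ruby
# Hypothetical helper: transcode str to enc and show its bytes in hex.
def hex_bytes(str, enc)
  str.encode(enc).bytes.map { |b| format('%02X', b) }.join(' ')
end

etx = "\u0003"

%w[UTF-8 US-ASCII GBK UTF-16BE UTF-16LE UTF-32BE UTF-32LE].each do |enc|
  puts "#{hex_bytes(etx, enc).ljust(12)} // #{enc}"
end
```

Unlike force_encoding, encode actually rewrites the bytes so that they represent the same characters in the target encoding.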

how to ensure irb accepts emoji input instead of escaping it?

Make sure Ruby is compiled with GNU Readline. When rvm compiles Ruby it automatically checks if Readline is installed, and if it is it will be included automatically.

You can check your Readline version in irb. Example:

ruby 2.1.3p242 (2014-09-19 revision 47630) [x86_64-linux]
irb(main):001:0> Readline::VERSION
=> "6.3"

So, copying your solution for installing the latest Readline with Homebrew, and then recompiling Ruby:

brew update; brew uninstall readline; brew install readline; rbenv install
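After rebuilding, one way to check that non-ASCII input arrives intact is to inspect the string you typed. This is only a sketch: the helper name describe is hypothetical, and the literal "ü" stands in for whatever you enter at the prompt:

```ruby
# encoding: UTF-8

# Hypothetical helper: summarizes how Ruby sees a string.
def describe(str)
  {
    :encoding => str.encoding.name,
    :valid    => str.valid_encoding?,
    :bytes    => str.bytes.to_a
  }
end

info = describe("ü")
p info

# The encoding irb assumes for terminal input usually follows this setting.
p Encoding.default_external
```

If the input shows up as escaped bytes, or valid is false, the problem is more likely in the Readline or locale setup than in the string itself.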

Ruby 1.9.x replace sets of characters with specific cleaned up characters in a string

I'll make it easy for you to implement

#encoding: UTF-8
t = 'ŠšÐŽžÀÁÂÃÄAÆAÇÈÉÊËÌÎÑNÒOÓOÔOÕOÖOØOUÚUUÜUÝYÞBßSàaáaâäaaæaçcèéêëìîðñòóôõöùûýýþÿƒ'
fallback = {
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f'
}

p t.encode('us-ascii', :fallback => fallback)
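One thing to watch with this approach: a character that has no entry in the fallback hash makes encode raise Encoding::UndefinedConversionError. A possible variation, sketched here with a deliberately tiny mapping and a made-up constant name, is to give the hash a default value so unknown characters become '?':

```ruby
# encoding: UTF-8

# Small illustrative mapping; any character without an entry becomes '?'.
SAFE_FALLBACK = Hash.new('?').merge('é' => 'e', 'ü' => 'u', 'ß' => 'ss')

cleaned = "Grüße, café ☃".encode('US-ASCII', :fallback => SAFE_FALLBACK)
p cleaned

# Without a default, the unmapped snowman raises instead.
begin
  "☃".encode('US-ASCII', :fallback => { 'é' => 'e' })
rescue Encoding::UndefinedConversionError => e
  puts "unmapped character: #{e.message}"
end
```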

How do I replace accented Latin characters in Ruby?

Rails already has a built-in for normalizing. Normalize your string to form KD, which splits each accented letter into a base letter plus separate accent marks, and then remove the non-ASCII characters (i.e. the accent marks), like this:

>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"
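Outside Rails, Ruby 2.2 and later ship String#unicode_normalize, which allows a similar approach in plain Ruby. The sketch below is an alternative to the Rails one-liner, not the method from the answer; the helper name strip_accents is made up, and it removes combining marks (\p{Mn}) rather than all non-ASCII characters:

```ruby
# encoding: UTF-8

# Hypothetical helper: decompose to NFKD, then drop the combining marks.
def strip_accents(str)
  str.unicode_normalize(:nfkd).gsub(/\p{Mn}/, '')
end

p strip_accents("àáâãäå")
p strip_accents("Crème Brûlée")
```

Letters that have no decomposition, such as ø, æ or ß, pass through unchanged with this approach, so they still need an explicit mapping if plain ASCII is required.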