Can't enter Umlauts in Ruby 1.9.3 IRB
Victor Moroz didn't quite give the definitive answer but his link led me to a solution (thx!):
I forgot to mention:
- I'm running Homebrew
- I built Ruby using ruby-build and this recipe (1.9.3-p125-perf, with falcon patches)
What I then did to solve this problem in my case was to recompile, this time pointing ruby to a more recent version of readline (6.2.2 in my case) that I installed with homebrew.
The steps it took were:
$ brew install readline
$ export CPPFLAGS=-I/usr/local/Cellar/readline/6.2.2/include
$ export LDFLAGS=-L/usr/local/Cellar/readline/6.2.2/lib/
$ curl https://raw.github.com/gist/1688857/rbenv.sh | sh && rbenv global 1.9.3-p125-perf
Ruby 1.9.3 unexpected File behavior under msys
On Windows under msys (but not under Linux) there is ENV['BIN'], which contains the path to the bin directory, in this case "C:\msysgit\bin". This is actually the same directory pointed at by /bin and /usr/bin in msys, so it's exactly what I need.
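A minimal sketch of how that lookup might be wrapped, assuming ENV['BIN'] is set under msys as described. The helper name msys_bin_dir and the '/usr/bin' fallback are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper: returns the msys bin directory when 'BIN' is set,
# otherwise a fallback path (an assumption for non-msys systems).
def msys_bin_dir(env = ENV, fallback = '/usr/bin')
  bin = env['BIN']
  return fallback if bin.nil? || bin.empty?

  # Normalise backslashes so File.join produces consistent paths.
  bin.gsub('\\', '/')
end

puts File.join(msys_bin_dir, 'sh')
```

Passing the environment in as an argument keeps the helper easy to try out with a plain Hash instead of the real ENV.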
My IRB is not producing the output,got blanked after clicking on the ENTER
The regexp syntax is wrong; it should be:
m = /(.)(.)(\d+)(\d)/.match("THX1138.")
or
m = %r/(.)(.)(\d+)(\d)/.match("THX1138.")
The docs are definitely not correct. Ruby regexp syntax is /regexp/ or %r followed by an opening delimiter, the regexp, and the matching closing delimiter, for example /test/, %r{test} or %r|test|. The // form is the usual one, but when the regexp itself contains the '/' character, the %r form can be useful.
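A short sketch of the two literal forms side by side. The path pattern and the variable names are made up for illustration; only the regexp from the answer above comes from the original:

```ruby
# The regexp from the answer, written with the usual // delimiters.
m = /(.)(.)(\d+)(\d)/.match("THX1138.")
puts m[0]
puts m.captures.inspect

# A pattern that contains slashes: the // form needs each one escaped,
# while the %r{} form does not.
escaped   = /\/usr\/local\/(\w+)/
delimited = %r{/usr/local/(\w+)}

path = "/usr/local/bin"
puts escaped.match(path)[1]
puts delimited.match(path)[1]
```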
Ruby IRB output is messed up in the console on Windows 7
Those are escape codes used to set colors in a terminal program; probably most popularly to colour a prompt in an xterm or compatible terminal. My bash prompt environment variable, for example, looks like this:
PS1="\[\033]2;\w\007\]\[\033[0;31m\]\u@\h \[\033[0;32m\]\!\[\033[0;31m\]> \[\033[0m\]"
It looks like some string like that one is getting into your console and confusing it (since it's not bash and/or in an xterm-friendly terminal emulator, I guess).
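If the goal is simply to see the text without the colour codes, one option is to strip them. This is a rough sketch that only handles colour sequences of the form ESC [ ... m. The helper name strip_ansi_colors is hypothetical, and it does not address where the codes come from in the first place:

```ruby
# Hypothetical helper: removes ANSI colour sequences such as "\e[0;31m"
# and "\e[0m" from a string. Other escape sequences are left untouched.
def strip_ansi_colors(text)
  text.gsub(/\e\[[0-9;]*m/, '')
end

colored = "\e[0;31muser@host\e[0m"
puts colored.inspect
puts strip_ansi_colors(colored)
```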
Ruby 1.9.3: Why does "\x03".force_encoding("UTF-8") give "\u0003", but "\x03".force_encoding("UTF-16") gives "\x03"?
Because the single byte "\x03" is not a valid byte sequence in UTF-16, but it is a valid one in UTF-8 (ASCII 03, ETX, end of text). UTF-16 needs at least two bytes to represent a Unicode code point. That's why "\x03" can be treated as Unicode \u0003 in UTF-8 but not in UTF-16.
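This can be checked with String#valid_encoding?. The sketch below uses UTF-16BE as the explicit-byte-order variant; the variable names are illustrative only:

```ruby
one_byte = "\x03".b # the single byte 03, tagged as binary

as_utf8  = one_byte.dup.force_encoding("UTF-8")
as_utf16 = one_byte.dup.force_encoding("UTF-16BE")

puts as_utf8.valid_encoding?  # one byte is a complete UTF-8 character
puts as_utf16.valid_encoding? # one byte is an incomplete UTF-16 unit
puts as_utf8 == "\u0003"
```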
To represent "\u0003" in UTF-16, you have to use two bytes, either 00 03 or 03 00, depending on the byte order. That's why the byte order has to be specified in UTF-16. For the big-endian version, with a leading byte order mark, the byte sequence should be
FE FF 00 03
For the little-endian, the byte sequence should be
FF FE 03 00
The byte order mark should appear at the beginning of a string, or at the beginning of a file.
Starting from Ruby 1.9, a String is just a byte sequence with a specific encoding as a tag. force_encoding is a method that changes the encoding tag; it doesn't affect the byte sequence. You can verify that by inspecting "\x03".force_encoding("UTF-8").bytes.
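A small sketch of the difference between retagging and transcoding, using "é" (U+00E9) and ISO-8859-1 as an arbitrary example pair that is not taken from the original question:

```ruby
s = "\u00e9" # "é", stored as two bytes in UTF-8

retagged   = s.dup.force_encoding("ISO-8859-1") # same bytes, new tag
transcoded = s.encode("ISO-8859-1")             # new bytes, same character

puts s.bytes.inspect
puts retagged.bytes.inspect
puts retagged.length
puts transcoded.bytes.inspect
```

After force_encoding the two original bytes are simply read as two ISO-8859-1 characters, whereas encode produces the single byte that ISO-8859-1 uses for the same character.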
If you see "\u0003", that doesn't mean you have a String represented by the two bytes 00 03, but some byte(s) that represent the Unicode code point 0003 under the encoding carried by that String. It may be:
03 // tagged as UTF-8
FE FF 00 03 // tagged as UTF-16
FF FE 03 00 // tagged as UTF-16
03 // tagged as GBK
03 // tagged as ASCII
00 00 FE FF 00 00 00 03 // tagged as UTF-32
FF FE 00 00 03 00 00 00 // tagged as UTF-32
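The byte sequences listed above can be reproduced with String#encode. This sketch uses the explicit big-endian and little-endian encodings and prepends the byte order mark U+FEFF by hand, so the result does not depend on how a given Ruby build writes the BOM for the generic "UTF-16" and "UTF-32" names. The helper hex_bytes is hypothetical:

```ruby
# Hypothetical helper: formats a string's bytes as upper-case hex pairs.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }.join(" ")
end

etx = "\u0003"
bom = "\uFEFF" # byte order mark

# Single-byte representations.
%w[UTF-8 GBK US-ASCII].each do |name|
  puts "#{name}: #{hex_bytes(etx.encode(name))}"
end

# Multi-byte representations, with an explicit BOM in front.
%w[UTF-16BE UTF-16LE UTF-32BE UTF-32LE].each do |name|
  puts "#{name}: #{hex_bytes((bom + etx).encode(name))}"
end
```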
how to ensure irb accepts emoji input instead of escaping it?
Make sure Ruby is compiled with GNU Readline. When rvm compiles Ruby it automatically checks if Readline is installed, and if it is it will be included automatically.
You can check your Readline version in irb. Example:
ruby 2.1.3p242 (2014-09-19 revision 47630) [x86_64-linux]
irb(main):001:0> Readline::VERSION
=> "6.3"
So, copying your solution for installing the latest Readline with Homebrew, and then recompiling Ruby:
brew update; brew uninstall readline; brew install readline; rbenv install
Ruby 1.9.x replace sets of characters with specific cleaned up characters in a string
I'll make it easy for you to implement
#encoding: UTF-8
t = 'ŠšÐŽžÀÁÂÃÄAÆAÇÈÉÊËÌÎÑNÒOÓOÔOÕOÖOØOUÚUUÜUÝYÞBßSàaáaâäaaæaçcèéêëìîðñòóôõöùûýýþÿƒ'
fallback = {
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f'
}
p t.encode('us-ascii', :fallback => fallback)
How do I replace accented Latin characters in Ruby?
Rails already has a built-in for normalizing: normalize your string to form KD and then remove the other characters (i.e. the accent marks), like this:
>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"
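Outside Rails, a similar result can probably be had with plain Ruby, assuming Ruby 2.2 or later, where String#unicode_normalize is available. This is a swapped-in technique rather than the ActiveSupport call shown above, and the helper name strip_accents is hypothetical:

```ruby
# Hypothetical helper: decomposes characters (NFKD) and then removes the
# combining marks (Unicode category Mn), leaving the base letters.
def strip_accents(str)
  str.unicode_normalize(:nfkd).gsub(/\p{Mn}/, '')
end

puts strip_accents("\u00e0\u00e1\u00e2\u00e3\u00e4\u00e5") # "àáâãäå"
puts strip_accents("\u0160") # "Š"
```

Unlike the gsub in the Rails example, this only drops combining marks, so characters with no decomposition (such as "ß") are left as they are rather than removed.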