How can I avoid putting the magic encoding comment on top of every UTF-8 file in Ruby 1.9?
Explicit is better than implicit. Writing out the name of the encoding is good for your text editor, your interpreter, and anyone else who wants to look at the file. Different platforms have different defaults -- UTF-8, Windows-1252, Windows-1251, etc. -- and you will either hamper portability or platform integration if you automatically pick one over the other. Requiring more explicit encodings is a Good Thing.
It might be a good idea to integrate your Rails app with GetText. Then all of your UTF-8 strings will be isolated to a small number of translation files, and your Ruby modules will be clean ASCII.
How does the magic comment ( # Encoding: utf-8 ) in ruby work?
This is an instruction to the Ruby interpreter placed at the top of the source file, known as a magic comment. Before processing your source code, the interpreter reads this line and sets the source encoding accordingly. I believe this is quite common for interpreted languages; Python, at least, uses the same approach.
You can specify encoding in a number of different ways (some of them are recognized by editors):
# encoding: UTF-8
# coding: UTF-8
# -*- coding: UTF-8 -*-
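To check that the three forms behave the same way, here is a minimal sketch. The helper name source_encoding_for and the global variable are illustrative assumptions, not part of Ruby itself. It writes a tiny script that starts with the given magic comment, loads it, and reports __ENCODING__, which holds the source encoding of the file being parsed. ISO-8859-1 is used rather than UTF-8 because Ruby 2.0 and later already default to UTF-8, so a UTF-8 comment would not visibly change anything there.

```ruby
require "tempfile"

# Hypothetical helper: writes a small script beginning with the given magic
# comment, loads it, and returns the source encoding Ruby assigned to it.
def source_encoding_for(magic_comment)
  file = Tempfile.new(["magic_comment", ".rb"])
  begin
    file.write("#{magic_comment}\n$detected_source_encoding = __ENCODING__\n")
    file.close
    load file.path # the magic comment applies only to the loaded file
  ensure
    file.unlink
  end
  $detected_source_encoding
end

[
  "# encoding: ISO-8859-1",
  "# coding: ISO-8859-1",
  "# -*- coding: ISO-8859-1 -*-"
].each do |comment|
  puts "#{comment} => #{source_encoding_for(comment)}"
end
```

Each of the three lines should report the same encoding, since Ruby only looks for the "coding: NAME" part of the comment.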
You can read some interesting stuff about source encoding in this article.
The only thing I'm aware of that has a similar construction is the shebang, but it relates to Unix shells in general and is not Ruby-specific.
magic_comments defined in ruby/ruby
Make Ruby 1.9 regard all source files to be UTF-8 encoded. (Even if recompiling the interpreter is necessary)
I found a workaround: set the RUBYOPT environment variable, for example by executing
export RUBYOPT=-Ku
in your shell.
This sets -Ku as the default option whenever ruby is called. You can now call any other tool that invokes ruby without worrying about parameters: rails server or rake works and treats all files as UTF-8. No BOM or magic comments necessary!
utf-8 encoding in gemspec, does it apply to the source files?
The file encoding header specifies the encoding for that file. It doesn't specify the encoding of other files. How could it?
Reading ASCII-encoded files with Ruby 1.9 in a UTF-8 environment
What's your locale set to in the shell? On Linux-based systems you can check this by running the locale command and change it with, e.g.,
$ export LANG=en_US
My guess is that your locale settings specify a UTF-8 encoding, and this causes Ruby to assume that the text files were encoded as UTF-8. You can see this by trying
$ LANG=en_GB ruby -e 'warn "foo".encoding.name'
US-ASCII
$ LANG=en_GB.UTF-8 ruby -e 'warn "foo".encoding.name'
UTF-8
For a more general treatment of how string encoding has changed in Ruby 1.9 I thoroughly recommend
http://blog.grayproductions.net/articles/ruby_19s_string
(code examples assume bash or similar shell - C-shell derivatives are different)
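If changing the locale is not an option, another approach is to state the encoding when reading the file, so the result no longer depends on the environment. The following is a minimal sketch; the helper name read_with_encoding and the sample file are illustrative assumptions.

```ruby
require "tempfile"

# Hypothetical helper: read a file and tag the result with an explicit
# encoding instead of relying on the locale-derived default.
def read_with_encoding(path, encoding_name)
  File.read(path, encoding: encoding_name)
end

file = Tempfile.new("ascii_sample")
begin
  file.write("plain ASCII text\n")
  file.close

  # The first two lines depend on your environment; the third does not.
  puts "default external: #{Encoding.default_external}"
  puts "implicit read:    #{File.read(file.path).encoding}"
  puts "explicit read:    #{read_with_encoding(file.path, 'US-ASCII').encoding}"
ensure
  file.unlink
end
```

The same option can be given to File.open as part of the mode string, for example "r:US-ASCII".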
How do i prevent emacs from adding coding information in the first line?
It looks like this is part of the ruby-mode in emacs.
I found a link to an article that shows how to edit the ruby-mode.el file. Not sure if it works, but there is also a comment on that article that may work better:
(setq ruby-insert-encoding-magic-comment nil)
If you are using enh-ruby-mode instead of ruby-mode, you should set this variable:
(setq enh-ruby-add-encoding-comment-on-save nil)
Links:
Fix: Emacs/Aquamacs keeps adding encoding comments to my files
Also, semi-related question but pertinent answer by Michael Kohl: How can I avoid putting the magic encoding comment on top of every UTF-8 file in Ruby 1.9?
Enh-ruby-mode comment encoding line
Set global default encoding for ruby 1.9
You can either:
- set your RUBYOPT environment variable to "-E utf-8"
- or use https://github.com/m-ryan/magic_encoding
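As far as I know, -E sets the default external (and optionally internal) encoding used for I/O, which is a separate setting from the source encoding that a magic comment controls, so it is worth checking what the option actually changes on your Ruby version. The sketch below starts a child interpreter with a given RUBYOPT value and prints its Encoding.default_external. The helper name default_external_under is an illustrative assumption, and the option is written in its attached form (-Eutf-8) to avoid depending on how RUBYOPT is split on whitespace.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: start a child Ruby with the given RUBYOPT value and
# return the name of its Encoding.default_external.
def default_external_under(rubyopt)
  env = { "RUBYOPT" => rubyopt }
  script = "print Encoding.default_external.name"
  output, status = Open3.capture2(env, RbConfig.ruby, "-e", script)
  raise "child ruby exited with #{status.exitstatus}" unless status.success?
  output
end

puts default_external_under("-Eutf-8")
puts default_external_under("-Eiso-8859-1")
```

If the two lines differ, the environment variable is reaching the interpreter, and every tool that launches ruby from that shell will pick up the same default.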
Batch convert to UTF8 using Ruby
Unfortunately that's not how it is done - the file is still in ANSI. At least that's what my Notepad++ says.
UTF-8 was designed to be a superset of ASCII, which means that every ASCII character is encoded identically in UTF-8. For this reason it's not possible to distinguish between an ASCII file and a UTF-8 file unless it contains "special" (non-ASCII) characters. These special characters are represented using multiple bytes in UTF-8.
It's quite possible that your conversion is actually working, but you can double-check by trying your program with special characters.
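The point about indistinguishable bytes can be checked directly. This is a small sketch that assumes "ANSI" in Notepad++ means Windows-1252, which is typical on Western European Windows systems but not guaranteed; the constant and helper names are illustrative.

```ruby
# Assumption: "ANSI" is taken to mean Windows-1252 in this sketch.
LEGACY_ENCODING = "Windows-1252"

# True when the text has the same byte sequence in UTF-8 and the legacy encoding.
def same_bytes_in_both?(text)
  text.encode("UTF-8").bytes.to_a == text.encode(LEGACY_ENCODING).bytes.to_a
end

plain    = "cafe"
accented = "caf\u00E9" # e with acute accent, escaped so the source stays ASCII

puts same_bytes_in_both?(plain)               # true
puts same_bytes_in_both?(accented)            # false
p accented.encode("UTF-8").bytes.to_a         # [99, 97, 102, 195, 169]
p accented.encode(LEGACY_ENCODING).bytes.to_a # [99, 97, 102, 233]
```

A file containing only characters like those in the first string gives an editor nothing to tell the two encodings apart, so it may keep reporting "ANSI" even after a correct conversion.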
Also, one of the best utilities for converting between encodings is iconv, which also has Ruby bindings.
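Note that the Iconv bindings were deprecated in Ruby 1.9.3 and removed from the standard library in 2.0, so on newer versions String#encode is the usual route. Here is a minimal sketch of an in-place conversion. The helper name convert_file_to_utf8 is hypothetical, and the default source encoding of Windows-1252 is an assumption that you should replace with whatever your files really use.

```ruby
# Hypothetical helper: rewrite a file in place as UTF-8.
# Assumption: the file's current encoding is known (default: Windows-1252).
def convert_file_to_utf8(path, source_encoding = "Windows-1252")
  raw  = File.binread(path)                  # raw bytes, tagged ASCII-8BIT
  text = raw.force_encoding(source_encoding) # declare what the bytes mean
  File.binwrite(path, text.encode("UTF-8"))  # transcode and write back
end

# Example batch usage (not executed here; the glob pattern is illustrative):
#   Dir.glob("*.txt").each { |path| convert_file_to_utf8(path) }
```

Run it only once per file, since converting a file that is already UTF-8 would reinterpret its multi-byte sequences as legacy characters and double-encode them. Bytes that are undefined in the source encoding make encode raise Encoding::UndefinedConversionError.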