Ruby 1.9.3 Invalid byte sequence in UTF-8 explanation needed
I had the same issue with 64-bit Cygwin, Ruby 2.0.0 and gem 2.4.1. gem install ..., gem update, everything ended with "ERROR: While executing gem ... (ArgumentError) invalid byte sequence in UTF-8".
All my locale variables were set to "en_US.UTF-8".
I read somewhere that setting LANG to an empty string or "C.BINARY" should help. It didn't, but it was a good hint to start experimenting.
I finally solved it by setting both LANG and LC_ALL to an empty string. With that, all other locale environment variables (LC_CTYPE etc.) were automatically set to "C.UTF-8", while LANG and LC_ALL remained empty.
Now gem is finally working.
UPDATE
It seems that LC_CTYPE specifically causes the issue when it is set to UTF-8, so setting it to C.BINARY should help. The other locale environment variables can be set to UTF-8 without affecting it.
export LC_CTYPE=C.BINARY
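To see what Ruby actually picked up from these variables, a minimal sketch like the one below may help. The helper name locale_report is hypothetical; Encoding.default_external and Encoding.find("locale") are standard Ruby. Running it before and after changing LC_CTYPE should show whether the derived encoding changed.

```ruby
# Hypothetical helper: collect the locale variables and the encodings
# Ruby derived from them, so the effect of changing LC_CTYPE can be compared.
def locale_report(env = ENV)
  {
    "LANG"             => env["LANG"],
    "LC_ALL"           => env["LC_ALL"],
    "LC_CTYPE"         => env["LC_CTYPE"],
    "default_external" => Encoding.default_external.name,
    "locale_charmap"   => Encoding.find("locale").name
  }
end

locale_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```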
Invalid Byte Sequence In UTF-8 Ruby
As Arie already answered, this error is caused by the invalid byte sequence \xC3.
If you are using Ruby 2.1+, you can also use String#scrub to replace invalid bytes with a given replacement character. Here:
a = "abce\xC3"
# => "abce\xC3"
a.scrub
# => "abce�"
a.scrub.sub("a","A")
# => "Abce�"
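Building on the scrub example above, here is a small sketch that scrubs only when needed and shows the block form of String#scrub. The helper name safe_utf8 and the "?" replacement are assumptions chosen for illustration.

```ruby
# Hypothetical helper: return the string unchanged when it is valid,
# otherwise replace each invalid byte sequence (requires Ruby 2.1+).
def safe_utf8(str, replacement = "?")
  str.valid_encoding? ? str : str.scrub(replacement)
end

broken = "abce\xC3"
puts broken.valid_encoding?   # false
puts safe_utf8(broken)        # abce?

# With a block, scrub passes in the invalid bytes; here they are shown as hex.
puts broken.scrub { |bytes| "<#{bytes.unpack('H*').first}>" }   # abce<c3>
```

Checking valid_encoding? first avoids allocating a new string for input that is already clean.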
ArgumentError (invalid byte sequence in UTF-8): Ruby 1.9.3 render view
To fix this, I only had to use gem 'mysql2', change the adapter in my database.yml, and change the encoding:
staging:
  adapter: mysql2
  database: data_basename
  username: root
  encoding: utf8
Ruby Invalid Byte Sequence in UTF-8
The combination of using @file = IO.read(file).force_encoding("ISO-8859-1").encode("utf-8", replace: nil) and the magic comment # encoding: UTF-8 at the top of the source file solved the issue.
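As a rough illustration of why this works, the sketch below relabels raw bytes as ISO-8859-1 and then transcodes them to UTF-8. The helper name latin1_to_utf8 and the sample bytes are made up for the example. It only gives correct text if the data really is ISO-8859-1; for any other source encoding it would produce wrong characters without raising.

```ruby
# Hypothetical helper: treat raw bytes as ISO-8859-1 and transcode to UTF-8.
# Every byte maps to some ISO-8859-1 character, so the conversion itself
# does not raise "invalid byte sequence".
def latin1_to_utf8(raw)
  raw.b.force_encoding("ISO-8859-1").encode("UTF-8")
end

raw  = "caf\xE9"              # 0xE9 is "é" in ISO-8859-1, but invalid as UTF-8
utf8 = latin1_to_utf8(raw)
puts utf8                     # café
puts utf8.valid_encoding?     # true
```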
rake invalid byte sequence in UTF-8
Try saving the offending file (it could be any file that rake is trying to load) in UTF-8 WITH BOM.
Is there a way in ruby 1.9 to remove invalid byte sequences from strings?
"€foo\xA0".chars.select(&:valid_encoding?).join
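An alternative that is often used on Ruby 1.9 is to transcode through another Unicode encoding and drop the invalid bytes on the way. The sketch below assumes UTF-8 input; the helper name strip_invalid_bytes is hypothetical. For this input it gives the same result as the chars/select approach above.

```ruby
# Hypothetical helper: drop invalid bytes by transcoding to UTF-16LE
# (replacing invalid sequences with nothing) and back to UTF-8.
def strip_invalid_bytes(str)
  str.encode("UTF-16LE", invalid: :replace, replace: "").encode("UTF-8")
end

dirty = "€foo\xA0"
puts strip_invalid_bytes(dirty)                    # €foo
puts dirty.chars.select(&:valid_encoding?).join    # €foo
```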
Ruby `CSV.read` error invalid byte sequence in UTF-8 (ArgumentError)
First of all, your encoding doesn't look right:
'社員番号'.force_encoding("Shift_JIS").encode!
#=> "\x{E7A4}\xBE\x{E593}\xA1\x{E795}\xAA\x{E58F}\xB7"
force_encoding takes the bytes from str1 and interprets them as Shift JIS, whereas you probably want to convert the string to Shift JIS:
'社員番号'.encode('Shift_JIS')
#=> "\x{8ED0}\x{88F5}\x{94D4}\x{8D86}"
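The difference is easiest to see by comparing byte counts. This is a small sketch using the same header string; the variable names are only for illustration.

```ruby
text = '社員番号'

relabeled = text.dup.force_encoding('Shift_JIS')   # same bytes, different label
converted = text.encode('Shift_JIS')               # new bytes for the same characters

puts text.bytesize        # 12 (UTF-8 uses 3 bytes per character here)
puts relabeled.bytesize   # 12 (bytes untouched)
puts converted.bytesize   # 8  (Shift JIS uses 2 bytes per character here)
puts converted.encode('UTF-8') == text   # true
```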
Next, you can pass a filename to CSV.read, so instead of:
file = File.open(filename)
CSV.read(file)
You can just write:
CSV.read(filename)
That said, you could either work with Shift JIS encoded strings:
require 'csv'
str1 = '社員番号'.encode("Shift_JIS")
str2 = 'メールアドレス'.encode("Shift_JIS")
csv = CSV.read('SyainInfo.csv', encoding: 'Shift_JIS', headers: true)
csv[str1]
csv[str2]
Or – and that's what I would do – you could work with UTF-8 strings by specifying a second encoding:
require 'csv'
str1 = '社員番号'
str2 = 'メールアドレス'
csv = CSV.read('SyainInfo.csv', encoding: 'Shift_JIS:UTF-8', headers: true)
csv[str1]
csv[str2]
encoding: 'Shift_JIS:UTF-8' instructs CSV to read Shift JIS data and transcode it to UTF-8. It's equivalent to passing 'r:Shift_JIS:UTF-8' to File.open.
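To try the second approach without the original file, the following sketch writes a tiny Shift JIS CSV to a temporary file and reads it back as UTF-8. The helper names read_sjis_csv and sjis_csv_demo and the sample row are made up for the example; only the CSV.read options come from the answer above.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: read a Shift JIS CSV and get UTF-8 strings back.
def read_sjis_csv(path)
  CSV.read(path, encoding: 'Shift_JIS:UTF-8', headers: true)
end

# Build a small Shift JIS sample file (made-up data) and read it back.
def sjis_csv_demo
  Tempfile.create(['syain', '.csv']) do |file|
    file.binmode   # write the Shift JIS bytes as-is
    file.write("社員番号,メールアドレス\n1,a@example.com\n".encode('Shift_JIS'))
    file.flush
    read_sjis_csv(file.path)
  end
end

table = sjis_csv_demo
puts table['社員番号'].inspect
puts table['メールアドレス'].inspect
```

Because the data is transcoded on read, the header lookups can use ordinary UTF-8 string literals.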