Invalid Byte Sequence In UTF-8 Ruby
As Arie already answered, this error occurs because the string contains the invalid byte sequence \xC3.
If you are using Ruby 2.1+, you can also use String#scrub to replace invalid bytes with a given replacement character. For example:
a = "abce\xC3"
# => "abce\xC3"
a.scrub
# => "abce�"
a.scrub.sub("a","A")
# => "Abce�"
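String#scrub also accepts an explicit replacement string or a block, which helps when the default replacement character is not wanted. A minimal sketch, reusing the sample string from above (the variable names are only for illustration):

```ruby
a = "abce\xC3"

valid    = a.valid_encoding?
#=> false

question = a.scrub("?")   # replace each invalid sequence with "?"
#=> "abce?"

dropped  = a.scrub("")    # drop invalid bytes entirely
#=> "abce"

# The block receives the invalid bytes, so they can be shown as hex.
tagged   = a.scrub { |bytes| "<#{bytes.unpack('H*').first}>" }
#=> "abce<c3>"
```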
ArgumentError invalid byte sequence in UTF-8
You get these errors because the Zip gem assumes the filenames to be encoded in UTF-8 but they are actually in a different encoding.
To fix the error, you first have to find the correct encoding. Let's re-create the string from its bytes:
bytes = [111, 117, 116, 112, 117, 116, 50, 48, 50, 48, 49,
50, 48, 55, 95, 49, 52, 49, 54, 48, 50, 47, 87,
78, 83, 95, 85, 80, 151, 112, 131, 102, 129, 91,
131, 94, 46, 116, 120, 116]
string = bytes.pack('C*')
#=> "output20201207_141602/WNS_UP\x97p\x83f\x81[\x83^.txt"
We can now traverse the Encoding.list and select those that return the expected result:
Encoding.list.select do |enc|
  s = string.encode('UTF-8', enc) rescue next
  s.end_with?('WNS_UP用データ.txt')
end
#=> [
# #<Encoding:Windows-31J>,
# #<Encoding:Shift_JIS>,
# #<Encoding:SJIS-DoCoMo>,
# #<Encoding:SJIS-KDDI>,
# #<Encoding:SJIS-SoftBank>
# ]
All of the above encodings result in the correct output.
Back to your code, you could use:
path = entry.name.encode('UTF-8', 'Windows-31J')
#=> "output20201207_141602/WNS_UP用データ.txt"
ext = File.extname(path)
#=> ".txt"
file_name = File.basename(path)
#=> "WNS_UP用データ.txt"
The Zip gem also has an option to set an explicit encoding for non-ASCII file names. You might want to give it a try by setting Zip.force_entry_names_encoding = 'Windows-31J' (I haven't tried it).
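If the archive may contain both UTF-8 and Windows-31J names, the conversion could be wrapped in a small helper. This is only a sketch: utf8_entry_name is a hypothetical name, not part of the Zip gem, and Windows-31J as the fallback is an assumption taken from the example above. It also cannot detect legacy bytes that happen to form valid UTF-8.

```ruby
# Hypothetical helper: returns the entry name as valid UTF-8.
# Assumes names that are not valid UTF-8 are encoded in source_encoding.
def utf8_entry_name(raw, source_encoding = 'Windows-31J')
  name = raw.dup.force_encoding(Encoding::UTF_8)
  return name if name.valid_encoding?

  # Reinterpret the raw bytes as source_encoding and transcode to UTF-8.
  raw.encode(Encoding::UTF_8, source_encoding)
end

# The non-ASCII tail of the file name from the example above.
raw = [151, 112, 131, 102, 129, 91, 131, 94, 46, 116, 120, 116].pack('C*')
utf8_entry_name(raw)
#=> "用データ.txt"
```

In the loop over the archive, this would be called as utf8_entry_name(entry.name) before passing the result to File.extname or File.basename.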
Invalid byte sequence in UTF-8 (ArgumentError)
Your string probably contains bytes that are not valid UTF-8, so replace them before matching:
s = file_content
if !s.valid_encoding?
  s = s.encode("UTF-16be", :invalid=>:replace, :replace=>"?").encode('UTF-8')
end
s.gsub(/dr/i,'med')
See "Ruby 2.0.0 String#Match ArgumentError: invalid byte sequence in UTF-8".
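For reference, here is the same round trip as a runnable sketch on a made-up sample string. The name sanitize_utf8 and the sample content are hypothetical; the point is that the substitution runs on the cleaned string whether or not the input was valid.

```ruby
# Hypothetical helper: returns text unchanged if it is valid,
# otherwise replaces invalid bytes with "?" via a UTF-16BE round trip.
def sanitize_utf8(text)
  return text if text.valid_encoding?

  text.encode('UTF-16BE', invalid: :replace, replace: '?').encode('UTF-8')
end

file_content = "Dr Smith, dr Jones\xC3"   # made-up content ending in an invalid byte

result = sanitize_utf8(file_content).gsub(/dr/i, 'med')
#=> "med Smith, med Jones?"
```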