Ruby: Create a String from Bytes

Ruby: create a String from bytes

There is a much simpler approach: Array#pack.

>> [65,66,67,68,69].pack('c*')
=> "ABCDE"

I believe pack is implemented in C in MRI (Matz's Ruby), so it should also be considerably faster than a Ruby-level loop with very large arrays.

Also, pack can build a UTF-8 string from Unicode code points using the 'U*' template. Note that 'U*' expects code points, not raw UTF-8 bytes.
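As a small illustration of the 'U*' template, here is a minimal sketch. The helper name string_from_codepoints is made up for this example; the only real API used is Array#pack. It assumes the input integers are Unicode code points (233 is U+00E9), not UTF-8 bytes.

```ruby
# Hypothetical helper (name invented for illustration):
# build a UTF-8 string from an array of Unicode code points.
def string_from_codepoints(codepoints)
  codepoints.pack('U*')
end

s = string_from_codepoints([67, 97, 102, 233]) # 233 is the code point U+00E9
puts s.encoding  # UTF-8
p s.bytes        # [67, 97, 102, 195, 169]
p s.length       # 4
```

The single code point 233 becomes the two bytes 195 and 169, which is why the byte count and the character count differ.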

Convert byte array to hex string

This works:

[1, 1, 65, -50].map { |n| '%02X' % (n & 0xFF) }.join

The %02X format specifier makes a 2-character-wide hex number, padded with 0 digits. The & 0xFF is necessary to convert your negative numbers into the standard 0 through 255 range that people usually use when talking about byte values.
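For comparison, the same result can probably be reached by packing the bytes and unpacking them as hex with the 'H*' directive. This is only a sketch; the method names hex_via_format and hex_via_unpack are invented for the example, and the input is assumed to be an array of integers that may include negative (signed) byte values.

```ruby
# Hypothetical helper: the format-string approach from the answer above.
def hex_via_format(bytes)
  bytes.map { |n| '%02X' % (n & 0xFF) }.join
end

# Hypothetical helper: mask each value to 0..255, pack as unsigned
# bytes, then unpack the whole string as hex digits (high nibble first).
def hex_via_unpack(bytes)
  bytes.map { |n| n & 0xFF }.pack('C*').unpack('H*').first.upcase
end

p hex_via_format([1, 1, 65, -50])  # "010141CE"
p hex_via_unpack([1, 1, 65, -50])  # "010141CE"
```

'H*' produces lowercase digits, hence the upcase call to match the %02X output.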

Ruby 1.9: Convert byte array to string with multibyte UTF-8 characters

This has to do with how pack interprets its input data. The U* directive in your example treats each integer as a Unicode code point and encodes it as UTF-8, so the bytes 195 and 169 (which are already the UTF-8 encoding of é) are each encoded again as separate characters, thus the double encoding. Instead, just pack the raw bytes with C* and interpret the result as UTF-8:

irb(main):010:0> [67, 97, 102, 195, 169].pack('C*').force_encoding('utf-8')
=> "Café"
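To make the difference visible, the following sketch compares the bytes produced by the two templates for the same input. The helper name bytes_to_utf8 is hypothetical; it simply wraps the pack and force_encoding calls shown above.

```ruby
# Hypothetical helper: treat the integers as raw bytes and
# label the resulting string as UTF-8.
def bytes_to_utf8(bytes)
  bytes.pack('C*').force_encoding('UTF-8')
end

utf8_bytes = [67, 97, 102, 195, 169]

# 'U*' treats 195 and 169 as code points and encodes each one again.
double_encoded = utf8_bytes.pack('U*')
p double_encoded.bytes  # [67, 97, 102, 195, 131, 194, 169]
p double_encoded.length # 5

# 'C*' keeps the bytes unchanged.
correct = bytes_to_utf8(utf8_bytes)
p correct.bytes           # [67, 97, 102, 195, 169]
p correct.length          # 4
p correct.valid_encoding? # true
```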

How to convert string to bytes in Ruby?

Ruby already has a String#each_byte method, and String#bytes gives you the same bytes (in Ruby 1.9 it was an alias for each_byte; since Ruby 2.0 it returns an Array).

Prior to Ruby 1.9, strings were equivalent to byte arrays, i.e. a character was assumed to be a single byte. That's fine for ASCII text and various single-byte encodings like Windows-1252 and ISO-8859-1, but fails badly with Unicode, which we see more and more often on the web. Ruby 1.9+ is encoding-aware, and strings are no longer considered to be made up of bytes, but instead consist of characters, which can be multiple bytes long.

So, if you are trying to manipulate text as single bytes, you'll need to ensure your input is ASCII, or at least a single-byte-based character set. If you might have multi-byte characters, you should use String#each_char or String#split(//) to get characters, or String#unpack with the U* directive to get Unicode code points.
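A quick sketch of how the byte view, the character view and the code point view differ for a string with one multi-byte character. The sample text is an assumption chosen for illustration; it is written with a \u escape so the example does not depend on the source file encoding.

```ruby
text = "Caf\u00E9" # "Café", where é is one character but two UTF-8 bytes

p text.bytesize       # 5
p text.length         # 4
p text.bytes          # [67, 97, 102, 195, 169]
p text.chars.length   # 4 (one-character strings)
p text.unpack('U*')   # [67, 97, 102, 233]
p text.codepoints     # [67, 97, 102, 233]
```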


What does // mean in String.split(//)

// is an empty regular expression, and it behaves the same as using ''. Either tells split to return the individual characters. You can also usually use String#chars.
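A minimal check of that equivalence, using a sample word chosen only for illustration:

```ruby
word = "Caf\u00E9"

p word.split(//) == word.split('') # true
p word.split('') == word.chars     # true
p word.split(//).length            # 4
```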

How to encode a sequence of bytes into a Ruby string with characters

You can try something like:

"string\xaa".each_byte.map {|b| "%c(%x)" % [ b, b ] }.join( ' ' )
# => "s(73) t(74) r(72) i(69) n(6e) g(67) ª(aa)"

How Does Ruby handle bytes/binary?

To make a string that has an arbitrary sequence of bytes, do something like this:

binary_string = "\xE5\xA5\xBD"

The "\x" is a special escape to encode an arbitrary byte from hex, so "\xE5" means byte 0xE5.

Then try sending that string on the socket.
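Before sending, it can help to inspect what the string actually contains. The sketch below only uses String#bytes, String#bytesize, String#b (available since Ruby 2.0) and Array#pack; the variable names are illustrative, and no socket code is shown because the target of the send is not given here.

```ruby
binary_string = "\xE5\xA5\xBD"

p binary_string.bytes    # [229, 165, 189]
p binary_string.bytesize # 3

# String#b returns a copy tagged as binary (ASCII-8BIT), which avoids
# any encoding interpretation of the bytes.
raw = binary_string.b
p raw.encoding == Encoding::BINARY # true
p raw.bytes == binary_string.bytes # true

# The same binary string can also be built from integers.
from_array = [0xE5, 0xA5, 0xBD].pack('C*')
p from_array == raw                # true
```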


