Byte Array to Short Array and Back Again in Java

Converting a byte[] to short[] in Java

Every byte value fits in a short, so this is a widening conversion that never loses information (the cast below is optional):

// Grab size of the byte array, create an array of shorts of the same size
int size = byteArray.length;
short[] shortArray = new short[size];

for (int index = 0; index < size; index++) {
    shortArray[index] = (short) byteArray[index];
}

And then use shortArray.

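Note: byte is a signed type in Java, so the conversion sign-extends. Byte 0xff (that is, -1) becomes short 0xffff (still -1), not 0x00ff. Endianness plays no part here; it only matters when several bytes are combined into one value. To treat the bytes as unsigned values from 0 to 255, use (short) (byteArray[index] & 0xFF) instead.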
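To make the signed versus unsigned difference concrete, here is a minimal sketch. The class and method names (ByteToShortDemo, toShortsSigned, toShortsUnsigned) are made up for illustration and are not from the original answer.

```java
import java.util.Arrays;

class ByteToShortDemo {
    // Widening conversion: keeps the numeric value, so negative bytes stay negative.
    static short[] toShortsSigned(byte[] bytes) {
        short[] out = new short[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i]; // implicit widening, sign-extended
        }
        return out;
    }

    // Treats each byte as an unsigned value (0..255) by masking off the sign extension.
    static short[] toShortsUnsigned(byte[] bytes) {
        short[] out = new short[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = (short) (bytes[i] & 0xFF);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] input = {(byte) 0xFF, 0x7F, 0x00};
        System.out.println(Arrays.toString(toShortsSigned(input)));   // [-1, 127, 0]
        System.out.println(Arrays.toString(toShortsUnsigned(input))); // [255, 127, 0]
    }
}
```

Which variant is right depends on what the bytes mean: use the signed one for numeric values, and the masked one for raw data such as pixel or sample bytes.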

How to convert a short array to a byte array

Java short is a 16-bit type, and byte is an 8-bit type. You have a loop that tries to insert N shorts into a buffer that's N-bytes long; it needs to be 2*N bytes long to fit all your data.

ByteBuffer byteBuf = ByteBuffer.allocate(2 * N);
// Write each of the N shorts in turn (big-endian by default)
for (int i = 0; i < N; i++) {
    byteBuf.putShort(buffer[i]);
}
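For a complete round trip, something along these lines should work. It is only a sketch: the class and method names are hypothetical, and it assumes big-endian byte order, which is ByteBuffer's default.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ShortBytesRoundTrip {
    // Packs each short into two bytes, most significant byte first.
    static byte[] toBytes(short[] shorts) {
        ByteBuffer buf = ByteBuffer.allocate(shorts.length * Short.BYTES);
        buf.order(ByteOrder.BIG_ENDIAN); // explicit, although it is the default
        buf.asShortBuffer().put(shorts); // bulk write through a short view
        return buf.array();
    }

    // Reads pairs of bytes back into shorts; a trailing odd byte is ignored.
    static short[] toShorts(byte[] bytes) {
        short[] out = new short[bytes.length / Short.BYTES];
        ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).asShortBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        short[] original = {0x0102, -1};
        byte[] packed = toBytes(original);
        System.out.println(Arrays.toString(packed));           // [1, 2, -1, -1]
        System.out.println(Arrays.toString(toShorts(packed))); // [258, -1]
    }
}
```

If the data comes from a little-endian source, swap in ByteOrder.LITTLE_ENDIAN on both sides.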

String to Byte Conversion and Back Again Not Returning Same Result (ASCII)

ã is not an ASCII character, so the result depends on how the charset's encoder handles unmappable characters. From the documentation of String.getBytes(Charset):

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#getBytes-java.nio.charset.Charset-

This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement byte array.

For US-ASCII the replacement is the single byte '?' (0x3F), so the original character is lost and cannot be recovered when decoding.
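A small sketch of the effect, assuming the charset in question is US-ASCII. The class and method names are invented for the example, and the character is written as a Unicode escape so the source file's encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AsciiReplacementDemo {
    // Encodes the text with the given charset and decodes it again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "S\u00e3o"; // "São"
        // US-ASCII cannot represent U+00E3, so it is replaced by '?'
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII)); // S?o
        // UTF-8 can represent it, so the round trip is lossless
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```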

How can I treat an underlying byte array as an array of shorts? Or ints? Or longs?

Do you actually need a ShortBuffer? Why not just use the various get/put methods on the ByteBuffer to read/write short, int, and other values from/to the underlying byte array?
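As a rough illustration of that suggestion, ByteBuffer.wrap gives a view over the existing array, so absolute get/put calls read and write the array directly. The helper names below are hypothetical, and the default big-endian order is assumed.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ByteBufferAccessDemo {
    // Reads a big-endian short starting at the given byte offset.
    static short readShortAt(byte[] data, int offset) {
        return ByteBuffer.wrap(data).getShort(offset);
    }

    // Writes a big-endian int into the array itself, starting at the given byte offset.
    static void writeIntAt(byte[] data, int offset, int value) {
        ByteBuffer.wrap(data).putInt(offset, value);
    }

    public static void main(String[] args) {
        byte[] data = new byte[8];
        writeIntAt(data, 0, 0x01020304);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 0, 0, 0, 0]
        System.out.println(readShortAt(data, 2));  // 772 (0x0304)
    }
}
```

No copy is made: the wrapped buffer and the byte array share the same storage.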

Problems converting byte array to string and back to byte array

It is not a good idea to store encrypted data in Strings because they are for human-readable text, not for arbitrary binary data. For binary data it's best to use byte[].

However, if you must do it you should use an encoding that has a 1-to-1 mapping between bytes and characters, that is, where every byte sequence can be mapped to a unique sequence of characters, and back. One such encoding is ISO-8859-1, that is:

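// Requires: import java.nio.charset.StandardCharsets;
// ISO-8859-1 maps each byte 0x00-0xFF to exactly one char, so nothing is lost.
String decoded = new String(encryptedByteArray, StandardCharsets.ISO_8859_1);
System.out.println("decoded:" + decoded);

byte[] encoded = decoded.getBytes(StandardCharsets.ISO_8859_1);
System.out.println("encoded:" + java.util.Arrays.toString(encoded));

String decryptedText = encrypter.decrypt(encoded);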

Other common encodings that don't lose data are hexadecimal and base64. Older Java versions needed a helper library for them, but the standard API now covers both: java.util.Base64 since Java 8 and java.util.HexFormat since Java 17.
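If base64 is an option, a sketch using java.util.Base64 (available on Java 8 and later) could look like this. The class and method names are placeholders, and the sample bytes simply stand in for encrypted data.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {
    // Turns arbitrary bytes into printable ASCII text.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recovers the exact original bytes from the text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = {0, -1, (byte) 0x80, 0x7F}; // stand-in for encrypted bytes
        String text = encode(data);
        System.out.println(text);
        System.out.println(Arrays.equals(data, decode(text))); // true
    }
}
```

Unlike the ISO-8859-1 trick, the resulting string contains only printable characters, so it survives logging, JSON, and copy and paste.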

With UTF-16 the program would fail for two reasons:

  1. String.getBytes("UTF-16") prepends a byte order mark (BOM) to the output to identify the order of the bytes. Use UTF-16LE or UTF-16BE to avoid this.
  2. Not all sequences of bytes can be mapped to characters in UTF-16. First, text encoded in UTF-16 must have an even number of bytes. Second, UTF-16 encodes Unicode characters beyond U+FFFF as surrogate pairs: two 16-bit code units (4 bytes) that together map to a single character. The code units reserved for surrogates (0xD800 to 0xDFFF) don't encode a character on their own, so an unpaired one is malformed and gets replaced during decoding.
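Both problems are easy to reproduce. The following is a sketch with made-up class and method names; the byte values are chosen only to trigger each case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16PitfallsDemo {
    // True if decoding the bytes to a String and encoding them again gives the same bytes.
    static boolean roundTrips(byte[] data, Charset charset) {
        return Arrays.equals(data, new String(data, charset).getBytes(charset));
    }

    public static void main(String[] args) {
        // 1. "UTF-16" prepends a big-endian byte order mark (0xFE 0xFF) when encoding.
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16)));   // [-2, -1, 0, 65]
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16BE))); // [0, 65]

        // 2. An unpaired low surrogate (0xDC00) is malformed and is replaced on decoding.
        byte[] malformed = {(byte) 0xDC, 0x00, 0x00, 0x41};
        System.out.println(roundTrips(malformed, StandardCharsets.UTF_16BE));   // false
        System.out.println(roundTrips(malformed, StandardCharsets.ISO_8859_1)); // true
    }
}
```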

How to convert String (byte array as string) to short

Since the string value that you're using is a hexadecimal value, to convert it into short, you need to remove the 0x using a substring and pass the radix as below:

Short.parseShort(yourHexString.substring(2), 16)

Here 16 is the radix. See the Javadoc for Short.parseShort(String, int) for more details.

Update

Since the OP asked for some more clarification, adding the below info.

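The short datatype can only hold values between -32,768 and 32,767. It stores a number, not its textual form: 0x3eb and 1003 are two ways of writing the same value. That's why, when you parse the string into a short variable and print it, it shows 1003, which is the default decimal representation of 0x3eb. Note that Short.parseShort throws a NumberFormatException for hex strings above 7fff, because they fall outside that range.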
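Here is a short sketch of the parsing, including one possible way to handle values above 0x7fff by reinterpreting the 16-bit pattern. The class and method names are hypothetical, and the "0x" prefix handling is an assumption about the input format.

```java
class HexShortDemo {
    // Strips an optional "0x"/"0X" prefix.
    static String stripPrefix(String hex) {
        return (hex.startsWith("0x") || hex.startsWith("0X")) ? hex.substring(2) : hex;
    }

    // Strict parse: throws NumberFormatException if the value exceeds 0x7fff.
    static short parseHexShort(String hex) {
        return Short.parseShort(stripPrefix(hex), 16);
    }

    // Lenient parse: accepts up to ffff and keeps the low 16 bits, so "0xffff" becomes -1.
    static short parseHexShortBits(String hex) {
        return (short) Integer.parseInt(stripPrefix(hex), 16);
    }

    public static void main(String[] args) {
        short value = parseHexShort("0x3eb");
        System.out.println(value);                       // 1003
        System.out.println(Integer.toHexString(value));  // 3eb
        System.out.println(parseHexShortBits("0xffff")); // -1
    }
}
```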

Rebuild byte array with bigInteger and other method

Don't use either of these approaches. Either convert into hex directly (not using BigInteger) or use base64. BigInteger will faithfully reproduce numbers, but it's not meant to be a general purpose binary-to-hex converter. In particular, it will lose leading zeroes, because they're insignificant when treating the data as an integer. (If you know the expected length you could always format it to that, but why bother? Just treat the data as arbitrary data instead of as a number.)

Definitely don't try to "decode" the byte array as if it's UTF-8-encoded text - it isn't.

There are plenty of questions on Stack Overflow about converting byte arrays to hex or base64; search for them to find more examples.
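To see the leading-zero problem and one direct alternative, consider the following sketch. The toHex and fromHex helpers are hand-written for illustration and are not part of any library.

```java
import java.math.BigInteger;
import java.util.Arrays;

class HexVsBigInteger {
    // Converts every byte to exactly two hex digits, so leading zero bytes survive.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Inverse of toHex; assumes an even-length string of valid hex digits.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {0, 0, 1, 2};
        // BigInteger treats the bytes as a number, so the two zero bytes vanish.
        System.out.println(new BigInteger(1, data).toString(16));    // 102
        System.out.println(toHex(data));                             // 00000102
        System.out.println(Arrays.equals(data, fromHex(toHex(data)))); // true
    }
}
```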


