Convert a String to a Byte Array and Then Back to the Original String

Convert a String to a byte array and then back to the original String

I would suggest using the members of string, but with an explicit encoding:

byte[] bytes = text.getBytes("UTF-8");
String text = new String(bytes, "UTF-8");

By using an explicit encoding (and one which supports all of Unicode) you avoid the problems of just calling text.getBytes() etc:

  • You're explicitly using a specific encoding, so you know which encoding to use later, rather than relying on the platform default.
  • You know it will support all of Unicode (as opposed to, say, ISO-Latin-1).

EDIT: Even though UTF-8 is the default encoding on Android, I'd definitely be explicit about this. For example, this question only says "in Java or Android" - so it's entirely possible that the code will end up being used on other platforms.

Basically given that the normal Java platform can have different default encodings, I think it's best to be absolutely explicit. I've seen way too many people using the default encoding and losing data to take that risk.

EDIT: In my haste I forgot to mention that you don't have to use the encoding's name - you can use a Charset instead. Using Guava I'd really use:

byte[] bytes = text.getBytes(Charsets.UTF_8);
String text = new String(bytes, Charsets.UTF_8);

Convert string of byte array back to original string in vb.net

You can call the following function with your hex string and get "Hello" returned to you. Note that the function doesn't validate the input, you would need to add validation unless you can be sure the input is valid.

Private Function HexToString(ByVal hex As String) As String
Dim result As String = ""
For i As integer = 0 To hex.Length - 1 Step 2
Dim num As Integer = Convert.ToInt32(hex.Substring(i, 2), 16)
result &= Chr(num)
Next
Return result
End Function

James Thorpe points out in his comment that it would be more appropriate to use Encoding.UTF8.GetString to convert back to a string as that is the reverse of the method used to create the hex string in the first place. I agree, but as my original answer was already accepted, I hesitate to change it, so I am adding an alternative version. The note about validation of input being skipped still applies.

Private Function HexToString(ByVal hex As String) As String
Dim bytes(hex.Length \ 2 - 1) As Byte
For i As Integer = 0 To hex.Length - 1 Step 2
bytes(i \ 2) = Byte.Parse(hex.Substring(i, 2), System.Globalization.NumberStyles.HexNumber)
Next
Return System.Text.Encoding.UTF8.GetString(bytes)
End Function

String to Byte Conversion and Back Again Not Returning Same Result (ASCII)

ã is not an ASCII character, so how it is handled is given by the implementation

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#getBytes-java.nio.charset.Charset-

This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement byte array.

For this charset it comes out as '?'

How to convert byte array to string and vice versa?

Your byte array must have some encoding. The encoding cannot be ASCII if you've got negative values. Once you figure that out, you can convert a set of bytes to a String using:

byte[] bytes = {...}
String str = new String(bytes, StandardCharsets.UTF_8); // for UTF-8 encoding

There are a bunch of encodings you can use, look at the supported encodings in the Oracle javadocs.

How to convert Java String into byte[]?

The object your method decompressGZIP() needs is a byte[].

So the basic, technical answer to the question you have asked is:

byte[] b = string.getBytes();
byte[] b = string.getBytes(Charset.forName("UTF-8"));
byte[] b = string.getBytes(StandardCharsets.UTF_8); // Java 7+ only

However the problem you appear to be wrestling with is that this doesn't display very well. Calling toString() will just give you the default Object.toString() which is the class name + memory address. In your result [B@38ee9f13, the [B means byte[] and 38ee9f13 is the memory address, separated by an @.

For display purposes you can use:

Arrays.toString(bytes);

But this will just display as a sequence of comma-separated integers, which may or may not be what you want.

To get a readable String back from a byte[], use:

String string = new String(byte[] bytes, Charset charset);

The reason the Charset version is favoured, is that all String objects in Java are stored internally as UTF-16. When converting to a byte[] you will get a different breakdown of bytes for the given glyphs of that String, depending upon the chosen charset.



Related Topics



Leave a reply



Submit