Why Does Guid.ToByteArray() Order the Bytes the Way It Does

Why does Guid.ToByteArray() order the bytes the way it does?

If you read the Examples section from the GUID constructor, you'll find your answer:

Guid(1,2,3,new byte[]{0,1,2,3,4,5,6,7}) creates a Guid that corresponds to "00000001-0002-0003-0001-020304050607".

a is a 32-bit integer, b is a 16-bit integer, c is a 16-bit integer, and d is simply 8 bytes.

Because a, b, and c are integer types rather than raw bytes, they are subject to endian ordering when choosing how to display them. The RFC for GUIDs (RFC 4122) states that they should be presented in big-endian format.

C# - why does System.Guid flip the bytes in a byte array?

Found on Wikipedia regarding UUID.

Other systems, notably Microsoft's marshalling of UUIDs in their COM/OLE libraries, use a mixed-endian format, whereby the first three components of the UUID are little-endian, and the last two are big-endian.

For example, 00112233-4455-6677-8899-aabbccddeeff is encoded as the bytes 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff
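That mixed-endian layout can be reproduced with a short sketch. This is an illustration, not Microsoft's actual marshalling code; the helper name and field names are mine, chosen to mirror the RFC field layout:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MixedEndianDemo {
    // Encode the five UUID fields the way COM/OLE marshals them:
    // the first three fields little-endian, the last two passed
    // through byte-for-byte (i.e. big-endian).
    static byte[] encodeMixedEndian(int timeLow, short timeMid, short timeHi,
                                    byte[] last8) {
        return ByteBuffer.allocate(16)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(timeLow)
                .putShort(timeMid)
                .putShort(timeHi)
                .put(last8) // raw bytes; the order flag is irrelevant here
                .array();
    }

    public static void main(String[] args) {
        byte[] tail = {(byte) 0x88, (byte) 0x99, (byte) 0xaa, (byte) 0xbb,
                       (byte) 0xcc, (byte) 0xdd, (byte) 0xee, (byte) 0xff};
        byte[] out = encodeMixedEndian(0x00112233, (short) 0x4455, (short) 0x6677, tail);
        StringBuilder sb = new StringBuilder();
        for (byte b : out) sb.append(String.format("%02x ", b & 0xff));
        System.out.println(sb.toString().trim());
        // prints: 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff
    }
}
```

Running it reproduces exactly the byte sequence quoted from Wikipedia above.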

How to implement GUID toByteArray?

A Guid is essentially just a 128-bit number. Internally it is represented as one 32-bit integer, two 16-bit integers, and eight bytes.

So conversion to a byte array is essentially just creating an array, and using shifting to select the correct byte in the 16 & 32-bit ints.
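A sketch of that shifting in Java, assuming the mixed-endian layout described above (the parameter names a, b, c, d mirror the Guid constructor; this is not the actual .NET source):

```java
public class ShiftingDemo {
    // Build the 16-byte array by shifting out each byte of the int and
    // short fields, least significant byte first, then appending the
    // eight raw bytes unchanged.
    static byte[] toByteArray(int a, short b, short c, byte[] d) {
        byte[] out = new byte[16];
        out[0] = (byte) a;            // low byte of the 32-bit field first
        out[1] = (byte) (a >>> 8);
        out[2] = (byte) (a >>> 16);
        out[3] = (byte) (a >>> 24);
        out[4] = (byte) b;            // low byte of each 16-bit field first
        out[5] = (byte) (b >>> 8);
        out[6] = (byte) c;
        out[7] = (byte) (c >>> 8);
        System.arraycopy(d, 0, out, 8, 8); // last 8 bytes pass through as-is
        return out;
    }

    public static void main(String[] args) {
        byte[] d = {0, 1, 2, 3, 4, 5, 6, 7};
        byte[] out = toByteArray(1, (short) 2, (short) 3, d);
        // corresponds to "00000001-0002-0003-0001-020304050607"
        System.out.println(out[0] + " " + out[4] + " " + out[6]); // prints: 1 2 3
    }
}
```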

Is Guid.ToByteArray() cross-platform?

For your question:

I would like to know if this method will return the same result on
every platform (x86 hardware, x64 hardware, Linux, Windows, etc.)

Yes, it will be the same on all platforms.

However, this method converts the value in a strange way:

The order returned from ToByteArray would be different from string representation.

See: Guid.ToByteArray Method

Note that the order of bytes in the returned byte array is different from the string representation of a Guid value. The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same.
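In code, that description amounts to reversing the first three groups (4 + 2 + 2 bytes) in place. A minimal sketch (the helper names are mine; this is one common way to convert, not an official API):

```java
public class GuidByteOrder {
    // Swap the first three groups in place, converting between RFC 4122
    // (string-display) byte order and .NET's Guid.ToByteArray order.
    // The transformation is its own inverse.
    static void swapGuidGroups(byte[] b) {
        reverse(b, 0, 4);
        reverse(b, 4, 2);
        reverse(b, 6, 2);
    }

    private static void reverse(byte[] b, int off, int len) {
        for (int i = 0, j = len - 1; i < j; i++, j--) {
            byte t = b[off + i];
            b[off + i] = b[off + j];
            b[off + j] = t;
        }
    }

    public static void main(String[] args) {
        // Guid.ToByteArray order for "00112233-4455-6677-8899-aabbccddeeff"
        byte[] net = {(byte) 0x33, (byte) 0x22, (byte) 0x11, (byte) 0x00,
                      (byte) 0x55, (byte) 0x44, (byte) 0x77, (byte) 0x66,
                      (byte) 0x88, (byte) 0x99, (byte) 0xaa, (byte) 0xbb,
                      (byte) 0xcc, (byte) 0xdd, (byte) 0xee, (byte) 0xff};
        swapGuidGroups(net);
        StringBuilder sb = new StringBuilder();
        for (byte b : net) sb.append(String.format("%02x", b & 0xff));
        System.out.println(sb); // prints: 00112233445566778899aabbccddeeff
    }
}
```

Applying the swap twice returns the original array, so the same helper works in both directions.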

Guid Byte Order in .NET

It appears that MS are storing the five parts in a structure.
The first four parts are either 2 or 4 bytes long and are therefore probably stored as native types (i.e. WORD and DWORD) in little-endian format. The last part is 6 bytes long and is therefore handled differently (probably as an array).

Does the spec state that the GUID is stored in big-endian order, or that the storage of the parts is in that order but the individual parts may be implementation specific?

EDIT:

From the UUID spec, section 4.1.2. Layout and Byte Order (emphasis mine):

To minimize confusion about bit assignments within octets, the UUID record definition is defined only in terms of fields that are integral numbers of octets. The fields are presented with the most significant one first.

...

In the absence of explicit application or presentation protocol specification to the contrary, a UUID is encoded as a 128-bit object, as follows:

The fields are encoded as 16 octets, with the sizes and order of the fields defined above, and with each field encoded with the Most Significant Byte first (known as network byte order).

It might be that MS have stored the bytes in the correct order, but have not bothered to convert the WORD and DWORD parts from host to network byte order for presentation (which appears to be acceptable according to the spec, at least by my unskilled reading of it).

Why are these two strings not equal?

As per the Microsoft documentation:

Note that the order of bytes in the returned byte array is different from the string representation of a Guid value. The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same. The example provides an illustration.

using System;

public class Example
{
    public static void Main()
    {
        Guid guid = Guid.NewGuid();
        Console.WriteLine("Guid: {0}", guid);
        Byte[] bytes = guid.ToByteArray();
        foreach (var byt in bytes)
            Console.Write("{0:X2} ", byt);

        Console.WriteLine();
        Guid guid2 = new Guid(bytes);
        Console.WriteLine("Guid: {0} (Same as First Guid: {1})", guid2, guid2.Equals(guid));
    }
}
// The example displays the following output:
// Guid: 35918bc9-196d-40ea-9779-889d79b753f0
// C9 8B 91 35 6D 19 EA 40 97 79 88 9D 79 B7 53 F0
// Guid: 35918bc9-196d-40ea-9779-889d79b753f0 (Same as First Guid: True)

Copying a `System.Guid` to `byte[]` without allocating

The solution I settled on came from some help from the Jil project by Kevin Montrose. I didn't go with that exact solution, but it inspired me to come up with something that I think is fairly elegant.

Note: The following code uses Fixed Size Buffers and requires that your project be built with the /unsafe switch (and in all likelihood requires Full Trust to run).

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
unsafe struct GuidBuffer
{
    [FieldOffset(0)]
    fixed long buffer[2];

    [FieldOffset(0)]
    public Guid Guid;

    public GuidBuffer(Guid guid)
        : this()
    {
        Guid = guid;
    }

    public void CopyTo(byte[] dest, int offset)
    {
        if (dest.Length - offset < 16)
            throw new ArgumentException("Destination buffer is too small");

        fixed (byte* bDestRoot = dest)
        fixed (long* bSrc = buffer)
        {
            byte* bDestOffset = bDestRoot + offset;
            long* bDest = (long*)bDestOffset;

            bDest[0] = bSrc[0];
            bDest[1] = bSrc[1];
        }
    }
}

Usage is simple:

var myGuid = Guid.NewGuid(); // however you get it
var guidBuffer = new GuidBuffer(myGuid);

var buffer = new byte[16];
guidBuffer.CopyTo(buffer, 0);

Timing this yielded an average duration of 1-2 ticks for the copy. That should be fast enough for almost any application.

However, if you want to eke out the absolute best performance, one possibility (suggested by Kevin) is to ensure that the offset parameter is long-aligned (on an 8-byte boundary). My particular use case favors memory over speed, but if speed is the most important thing that would be a good way to go about it.

What is the equivalent to .Net Guid.ToByteArray() in Java

The near-equivalent class in Java is java.util.UUID. However, as you've noticed, the two do not give the same byte arrays. But if you execute the following and look at the array given by Java versus the array given by .NET:

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

public class Main {

    // expected from your question
    private static final int[] EXPECTED_BYTES = {
        185, 242, 54, 152, 140, 186, 166, 66, 184, 132, 46, 158, 237, 159, 185, 90
    };

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("9836f2b9-ba8c-42a6-b884-2e9eed9fb95a");

        byte[] array = toByteArray(uuid);

        System.out.println("EXPECTED: " + Arrays.toString(EXPECTED_BYTES));
        System.out.println("ACTUAL  : " + Arrays.toString(toUnsignedInts(array)));
    }

    private static byte[] toByteArray(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // for visual purposes only
    private static int[] toUnsignedInts(byte[] array) {
        int[] result = new int[array.length];
        for (int i = 0; i < array.length; i++) {
            result[i] = Byte.toUnsignedInt(array[i]);
        }
        return result;
    }
}

And the output:

EXPECTED: [185, 242, 54, 152, 140, 186, 166, 66, 184, 132, 46, 158, 237, 159, 185, 90]
ACTUAL : [152, 54, 242, 185, 186, 140, 66, 166, 184, 132, 46, 158, 237, 159, 185, 90]

You'll see the arrays are almost equal; it's just that the order of some bytes doesn't match. The last eight bytes (i.e. the least significant bits) all match, but the first four bytes are reversed, the next two bytes are reversed, and so are the two bytes after that. To see it visually:

EXPECTED: [185, 242, 54, 152, 140, 186, 166, 66, 184, 132, 46, 158, 237, 159, 185, 90]
ACTUAL  : [152, 54, 242, 185, 186, 140, 66, 166, 184, 132, 46, 158, 237, 159, 185, 90]
           |---------------|  |------|  |-----|

I don't know enough to explain why this difference exists, but this comment on an answer to a question you linked to says:

See also [Universally unique identifier - Wikipedia] "Many systems encode the UUID entirely in a big-endian format." "Other systems, notably Microsoft's marshalling of UUIDs in their COM/OLE libraries, use a mixed-endian format, whereby the first three components of the UUID are little-endian, and the last two are big-endian." – Denis Dec 20 '19 at 13:06

The answer that comment is on gives a solution to your problem, which you've included in your question. That solution simply swaps bytes around to get the desired effect. Here's another solution that doesn't involve creating a copy array:

private static byte[] toByteArray(UUID uuid) {
    // also requires: import java.nio.ByteOrder;
    long mostSigBits = uuid.getMostSignificantBits();
    return ByteBuffer.allocate(16)
            .order(ByteOrder.LITTLE_ENDIAN)
            .putInt((int) (mostSigBits >> 32))
            .putShort((short) (((int) mostSigBits) >> 16))
            .putShort((short) mostSigBits)
            .order(ByteOrder.BIG_ENDIAN)
            .putLong(uuid.getLeastSignificantBits())
            .array();
}

Note: I'm not very comfortable with bit-shifting, so there may be a more succinct way of accomplishing the above that I couldn't think of.

Which gives the following output:

EXPECTED: [185, 242, 54, 152, 140, 186, 166, 66, 184, 132, 46, 158, 237, 159, 185, 90]
ACTUAL : [185, 242, 54, 152, 140, 186, 166, 66, 184, 132, 46, 158, 237, 159, 185, 90]

Warning: Unfortunately, I'm not sure you can rely on either workaround giving the correct bytes 100% of the time.

  • How to read a .NET Guid into a Java UUID
  • Is there any difference between a GUID and a UUID?

Incorrect Guid order

Since the Original Poster asked for my comments (which are just links) to be posted as an answer, here it is:

SO: Guid Byte Order in .NET

MSDN: System.Guid .ToByteArray swapping first 4 bytes

SO: C#: Why isn't Guid.ToString(“n”) the same as a hex string generated from a byte array of the same guid?

It seems that the endianness of the Guid's individual components when converting to and from Byte[] is not clearly documented.
