Using a Byte Array as Map Key

Using a byte array as Map key

The problem is that byte[] uses object identity for equals and hashCode, so that

byte[] b1 = {1, 2, 3}
byte[] b2 = {1, 2, 3}

will not match in a HashMap. I see three options:

  1. Wrapping in a String, but then you have to be careful about encoding issues (you need to make certain that the byte -> String -> byte gives you the same bytes).
  2. Use List<Byte> (can be expensive in memory).
  3. Do your own wrapping class, writing hashCode and equals to use the contents of the byte array.

HashMap with byte array key and String value - containsKey() function doesn't work

A byte[] (or any array) can't work properly as a key in a HashMap, since arrays don't override equals, so two arrays will be considered equal only if they refer to the same object.

You'll have to wrap your byte[] in some custom class that overrides hashCode and equals, and use that custom class as the key to your HashMap.

Java use byte[] as key in a Map

Ok, introduction material to Java here.

There are two things at play:

  1. Arrays are identity-based, not value-based (see this answer for a bit more details; I couldn't find any formal specification for this, though).

Hence, the following returns false:

byte[] a = {1};
byte[] b = {1};
System.out.println(a.equals(b));

  1. Arrays do not implement Comparable (it would make no sense for them). In order to put anything into a TreeMap, you must either provide a Comparator for it (using this constructor), or it must implement Comparable.

EDIT: If you indeed have some binary data which you want to use as a key (which is a rather strange case), you can:

  1. Use Java NIO's ByteBuffer and its ByteBuffer.wrap(byte[]) method (drawback: this class is mutable but so was byte[] in your original version).

  2. Create a custom (preferably immutable) wrapper over byte[], and use this class as a key for your HashMap/TreeMap (though, if you do not need sorting, do not use TreeMap for performance reasons).

Stub of a sample custom wrapper:

final class ByteWrapper {

private final byte[] bytes;

public ByteWrapper(byte[] bytes) {
this.bytes = bytes;
}

@Override
public boolean equals(Object o) {
// type checks here
ByteWrapper that = (ByteWrapper) o;
return Arrays.equals(bytes, that.bytes);
}

@Override
public int hashCode() {
return Arrays.hashCode(bytes);
}
}

Get value from a HashMap using a byte[] key

The problem is that two byte[] use the Object.hashCode, which tests for the instance of the object. Two array instances as created by new byte[...] will yield two different hash codes, keys, and hence almost always null is returned.

Furthermore equals does not work too, hence byte[] as key is no option.

You could use the string itself, as you are actual doing "key".getBytes(StandardCharsets.UTF_8).

Or create wrapper class:

public class ByteArray {
public final byte[] bytes;
public ByteArray(byte[] bytes) {
this.bytes = bytes;
}
@Override
public boolean equals(Object rhs) {
return rhs != null && rhs instanceof ByteArray
&& Arrays.equals(bytes, ((ByteArray)rhs).bytes);
}
@Override
public int hashCode() {
return Arrays.hashCode(bytes);
}
}

Using String(byte[]) as key to map

You should probably use Biginteger. It has a BigInteger(byte[]) constructor and has an effective comparison system.

How to use byte arrays as keys in MapDB

MapDB author here.

It is possible to use byte[] without wrappers. There is Hasher which handles hashCode and equals methods for HTreeMap:

    Map map = db.createHashMap("map")
.hasher(Hasher.BYTE_ARRAY)
.keySerializer(Serializer.BYTE_ARRAY)
.makeOrGet();

Java hashmap with single byte[] array for all keys

Since most personal computers have 16GB these days and servers often 32 - 128 GB or more, is there a real problem with there being a degree of bookkeeping overhead?

If we consider the alternative: the byte data concatenated into a single large array -- we should think about what individual values would have to look like, to reference a slice of a larger array.

Most generally you would start with:

public class ByteSlice {
protected byte[] array;
protected int offset;
protected int len;
}

However, that's 8 bytes + the size of a pointer (perhaps just 4 bytes?) + the JVM object header (12 bytes on a 64-bit JVM). So perhaps total 24 bytes.

If we try and make this single-purpose & minimalist, we're still going to need 4 bytes for the offset.

public class DedicatedByteSlice {
protected int offset;
protected byte len;

protected static byte[] getArray() {/*somebody else knows about the array*/}
}

This is still 5 bytes (probably padded to 8) + the JVM object header. Probably still total 20 bytes.

It seems that the cost of dereferencing with an offset & length, and having an object to track that, are not substantially less than the cost of storing the small array directly.

One further theoretical possibility -- de-structuring the Map Key so it is not an object

It is possible to conceive of destructuring the "length & offset" data such that it is no longer in an object. It then is passed as a set of scalar parameters eg (length, offset) and -- in a hashmap implementation -- would be stored by means of arrays of the separate components (eg. instead of a single Object[] keyArray).

However I expect it is extremely unlikely that any library offers an existing hashmap implementation for your (very particular) usecase.

If you were talking about the values it would probably be pointless, since Java does not offer multiple returns or method OUT parameters; which makes communication impractical without "boxing" destructured data back to an object. Since here you are asking about Map Keys specifically, and these are passed as parameters but need not be returned, such an approach may be theoretically possible to consider.

[Extended]
Even given this it becomes tricky -- for your usecase the map API probably has to become asymmetrical for population vs lookup, as population has to be by (offset, len) to define keys; whereas practical lookup is likely still by concrete byte[] arrays.

OTOH: Even quite old laptops have 16GB now. And your time to write this (times 4-10 to maintain) should be worth far more than the small cost of extra RAM.

Slice as a key in map

No, slices cannot be used as map keys as they have no equality defined.



Related Topics



Leave a reply



Submit