How is the Arrays.hashCode implemented?
From the book Effective Java:
The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
Does hashcode implementation of Java Arrays.hashcode() uniformly distribute
Why use the prime number 31?
This can be split in two parts?
- Why a prime number?
Here we need to understand that our goal is to get a unique HashCode for an object which will help us to find that object in O(1) time.
The key word here, is unique.
Primes
Primes are unique numbers. They are unique in that, the product of a
prime with any other number has the best chance of being unique (not
as unique as the prime itself of-course) due to the fact that a prime
is used to compose it. This property is used in hashing functions.
.
Why number 31?
From Effective Java
- Because it's an odd prime, and it's "traditional" to use primes.
It's also one less than a power of two, which permits for bitwise
optimizationHere's the full quote,
from Item 9: Always override
hashCode when you override equals:The value 31 was chosen because it's an odd prime. If it were even and
multiplication overflowed, information would be lost, as
multiplication by 2 is equivalent to shifting. The advantage of using
a prime is less clear, but it is traditional.A nice property of 31 is that the multiplication can be replaced by a
shift (§15.19) and subtraction for better performance:31 * i == (i << 5) - i Modern VMs do this sort of optimization
automatically.While the recipe in this item yields reasonably good hash functions,
it does not yield state-of-the-art hash functions, nor do Java
platform libraries provide such hash functions as of release 1.6.
Writing such hash functions is a research topic, best left to
mathematicians and theoretical computer scientists.Perhaps a later release of the platform will provide state-of-the-art
hash functions for its classes and utility methods to allow average
programmers to construct such hash functions. In the meantime, the
techniques described in this item should be adequate for most
applications.
This is a very Good source.
Do arrays have inherent hashCode()?
Arrays do have hashCode
, per JLS 10.7 (emphasis added):
The members of an array type are all of the following:
The public final field length, which contains the number of components of the array. length may be positive or zero.
The public method clone, which overrides the method of the same name in class Object and throws no checked exceptions. The return type of the clone method of an array type T[] is T[].
A clone of a multidimensional array is shallow, which is to say that it creates only a single new array. Subarrays are shared.
All the members inherited from class Object; the only method of Object that is not inherited is its clone method.
This means that hashCode
is inherited from Object
, and thus is identity-based, and not dependent upon the values in the array.
This might be what you want, but I suggest it might not be. If you want a hash code based on values in an array, you would need to wrap the array in some class which implements a sensible equals and hash code.
hashCode() called on array instance in Java
Java arrays are directly derived from java.lang.Object
and thus inherit the methods equals
and hashCode
from it.
The default equals
method compares object identity (i.e. whether two references refer to the same object), and the default hashCode
method will most probably return different values for different objects.
In most cases, this is a behavior that you do not want. You most probably want to compare arrays based on the content. Note, that your equals
method seems to do that, which in turn means that your hashCode
method is broken. Look at the method's documentation for the reason.
Java provides a class with helper methods for exactly that reason: java.util.Arrays
. It provides the methods equals
and hashCode
that are both based on the content of the array.
So you should write it like that:
public class SomeObject {
private String a;
private char[] b;
@Override
public boolean equals(Object anotherObject) {
if (!(anotherObject instanceof SomeObject)) {
return false;
}
SomeObject object = (SomeObject) anotherObject;
return (this.a.equals(object.a) && Arrays.equals(this.b, object.b));
}
@Override
public int hashCode() {
return (43 * this.a.hashCode() + 11 * Arrays.hashCode(this.b));
}
}
Best implementation for hashCode method for a collection
The best implementation? That is a hard question because it depends on the usage pattern.
A for nearly all cases reasonable good implementation was proposed in Josh Bloch's Effective Java in Item 8 (second edition). The best thing is to look it up there because the author explains there why the approach is good.
A short version
Create a
int result
and assign a non-zero value.For every field
f
tested in theequals()
method, calculate a hash codec
by:- If the field f is a
boolean
:
calculate(f ? 0 : 1)
; - If the field f is a
byte
,char
,short
orint
: calculate(int)f
; - If the field f is a
long
: calculate(int)(f ^ (f >>> 32))
; - If the field f is a
float
: calculateFloat.floatToIntBits(f)
; - If the field f is a
double
: calculateDouble.doubleToLongBits(f)
and handle the return value like every long value; - If the field f is an object: Use the result of the
hashCode()
method or 0 iff == null
; - If the field f is an array: see every field as separate element and calculate the hash value in a recursive fashion and combine the values as described next.
- If the field f is a
Combine the hash value
c
withresult
:result = 37 * result + c
Return
result
This should result in a proper distribution of hash values for most use situations.
hashCode computation with array fields in Java
Possible solution is use List<Byte>
instead of byte[]
and then that's correct way:
Objects.hashCode(field1, field2);
Overriding the hashCode( ) in java for a class that contains an array
To calculate a hashcode for the int array you can use java.util.Arrays.hashcode(int[])
.
If you look at its implementation:
public static int hashCode(int a[]) {
if (a == null)
return 0;
int result = 1;
for (int element : a)
result = 31 * result + element;
return result;
}
you can derive an idea how to calculate a hash code for your class which should be based on the values of your two integers and the integer array:
public class MyClass {
private int a, b;
private int[] array;
public int hashCode() {
return (31 * (31 * Arrays.hashCode(array) + a)) + b;
}
To equals implementation can look like:
public int equals(Object o) {
if (o instanceof of MyClass) {
MyClass m = (MyClass)o;
return m.a == a && m.b == b && Arrays.equals(m.array, array);
}
else
return false;
}
Collision strength of Java's Arrays.hashCode
Arrays.hashCode(double[])
is specified to return the equivalent value of a List
containing Double
values representing the same numeric value.
List.hashCode
in turn is specified with a fairly simple algorithm:
int hashCode = 1;
for (E e : list)
hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
In general the multiplication with a prime number is a good practice for general-purpose hash functions, but it's far from a cryptographically strong hash function.
This means that while collisions are unlikely in the general (effectively random) case, they can usually be constructed quite easily if you can influence (or select) the hashCode
of the items in the List.
As a constructed example consider these two statements:
System.out.println(Arrays.hashCode(new double[] {4.753E-321d}));
System.out.println(Arrays.hashCode(new double[] {4.9E-324d, 4.9E-324d}));
Both of these will output 993, despite being clearly different arrays.
Related Topics
How to Asynchronously Call a Method in Java
Spark Read File from S3 Using Sc.Textfile ("S3N://...)
How to Customize Parameter Names When Binding Spring MVC Command Objects
Rest - Http Post Multipart with JSON
How to Create an Instance of an Interface in Java
Java Conditional Compilation: How to Prevent Code Chunks from Being Compiled
Why Does My Sorting Loop Seem to Append an Element Where It Shouldn'T
Using == Operator in Java to Compare Wrapper Objects
Has Been Compiled by a More Recent Version of the Java Runtime (Class File Version 57.0)
Spring Data JPA Update @Query Not Updating
Configuring Spring Security 3.X to Have Multiple Entry Points
Spring Boot Rest API - Request Timeout
How to Check That a String Is Parseable to a Double
Varargs and the '...' Argument
"Always on Top" Windows with Java