Implementing a Hash Combiner in Swift

Implementing a hash combiner in Swift

You cannot define a parameter of type P if P
is a protocol which has Self or associated type requirements.
In this case it is the Equatable protocol from which Hashable
inherits, which has a Self requirement:

public static func ==(lhs: Self, rhs: Self) -> Bool

What you can do is to define a generic method instead:

extension Hashable {
func combineHash<T: Hashable>(with hashableOther: T) -> Int {
let ownHash = self.hashValue
let otherHash = hashableOther.hashValue
return (ownHash << 5) &+ ownHash &+ otherHash
}
}

Combining hash of Date and String

For NSObject subclasses you must override the hash property and the isEqual: method, compare NSObject subclass in Swift: hash vs hashValue, isEqual vs ==.

For the implementation of the hash property you can use the Hasher class introduced in Swift 4.2, and its combine() method:

Adds the given value to this hasher, mixing its essential parts into the hasher state.

I would also suggest to make the properties constant, since mutating them after an object is inserted into a hashable collection (set, dictionary) causes undefined behaviour, compare Swift mutable set: Duplicate element found.

class Model: NSObject { /* ... */ }

class Something : Model {
let name: String
let time: Date

init(name: String, time: Date) {
self.name = name
self.time = time
}

override var hash : Int {
var hasher = Hasher()
hasher.combine(name)
hasher.combine(time)
return hasher.finalize()
}

static func ==(lhs: Something, rhs: Something) -> Bool {
return lhs.name == rhs.name && lhs.time == rhs.time
}
}

How to Implement hash(into:) from hashValue in Swift?

You can simply use hasher.combine and call it with the values you want to use for hashing:

func hash(into hasher: inout Hasher) {
hasher.combine(index)
hasher.combine(title)
}

will hasher.combine(self) cause trouble while using collections?

I'm farily certain that hasher.combine(self) would either not compile or result in an infinite loop.

When hasher.combine() sees the type that it is given, it is going to look for that objects hash(into:) function, which will call hasher.combine() with the same type, and so on and so forth.

What you should do is

func hash(into hasher: inout Hasher) {
hasher.combine(property1)
hasher.combine(prop2)
//...
//...
//...
//...
//...
//...
//until you have written a line to combine every property you want to hash
}

Let me know if you have any questions.

How to natively hash a struct without randomization?

You should make your own protocol, something like:

protocol Signable {
/// A signature that's stable across app launches
var signature: Signature { get }
}

Equatable/Hashable are standard protocols with very specific semantics (Equality implies substitutability, and Hashability is a heuristic, as a performance optimization of Equatable). They're not catch-all bags of syntax for anything vaguely similar to equality checking or hashing.

Also beware: the randomly seeded behavior of Hasher is intentional. It protects you against hash-flooding DoS attacks. You should be very careful circumventing this, because you'll be exposing yourself to this vector.

Using signature to compare instances persisted across app launches seems fine. But I would not recommend using that to implement any kind of hashed data structures, which could be attacked to degenerate into linear-time-lookups.

How to implement the Hashable Protocol in Swift for an Int array (a custom string struct)

Update

Martin R writes:

As of Swift 4.1, the compiler can synthesize Equatable and Hashable
for types conformance automatically, if all members conform to
Equatable/Hashable (SE0185). And as of Swift 4.2, a high-quality hash
combiner is built-in into the Swift standard library (SE-0206).

Therefore there is no need anymore to define your own hashing
function, it suffices to declare the conformance:

struct ScalarString: Hashable, ... {

private var scalarArray: [UInt32] = []

// ... }

Thus, the answer below needs to be rewritten (yet again). Until that happens refer to Martin R's answer from the link above.


Old Answer:

This answer has been completely rewritten after submitting my original answer to code review.

How to implement to Hashable protocol

The Hashable protocol allows you to use your custom class or struct as a dictionary key. In order to implement this protocol you need to

  1. Implement the Equatable protocol (Hashable inherits from Equatable)
  2. Return a computed hashValue

These points follow from the axiom given in the documentation:

x == y implies x.hashValue == y.hashValue

where x and y are values of some Type.

Implement the Equatable protocol

In order to implement the Equatable protocol, you define how your type uses the == (equivalence) operator. In your example, equivalence can be determined like this:

func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.scalarArray == right.scalarArray
}

The == function is global so it goes outside of your class or struct.

Return a computed hashValue

Your custom class or struct must also have a computed hashValue variable. A good hash algorithm will provide a wide range of hash values. However, it should be noted that you do not need to guarantee that the hash values are all unique. When two different values have identical hash values, this is called a hash collision. It requires some extra work when there is a collision (which is why a good distribution is desirable), but some collisions are to be expected. As I understand it, the == function does that extra work. (Update: It looks like == may do all the work.)

There are a number of ways to calculate the hash value. For example, you could do something as simple as returning the number of elements in the array.

var hashValue: Int {
return self.scalarArray.count
}

This would give a hash collision every time two arrays had the same number of elements but different values. NSArray apparently uses this approach.

DJB Hash Function

A common hash function that works with strings is the DJB hash function. This is the one I will be using, but check out some others here.

A Swift implementation provided by @MartinR follows:

var hashValue: Int {
return self.scalarArray.reduce(5381) {
($0 << 5) &+ $0 &+ Int($1)
}
}

This is an improved version of my original implementation, but let me also include the older expanded form, which may be more readable for people not familiar with reduce. This is equivalent, I believe:

var hashValue: Int {

// DJB Hash Function
var hash = 5381

for(var i = 0; i < self.scalarArray.count; i++)
{
hash = ((hash << 5) &+ hash) &+ Int(self.scalarArray[i])
}

return hash
}

The &+ operator allows Int to overflow and start over again for long strings.

Big Picture

We have looked at the pieces, but let me now show the whole example code as it relates to the Hashable protocol. ScalarString is the custom type from the question. This will be different for different people, of course.

// Include the Hashable keyword after the class/struct name
struct ScalarString: Hashable {

private var scalarArray: [UInt32] = []

// required var for the Hashable protocol
var hashValue: Int {
// DJB hash function
return self.scalarArray.reduce(5381) {
($0 << 5) &+ $0 &+ Int($1)
}
}
}

// required function for the Equatable protocol, which Hashable inheirits from
func ==(left: ScalarString, right: ScalarString) -> Bool {
return left.scalarArray == right.scalarArray
}

Other helpful reading

  • Which hashing algorithm is best for uniqueness and speed?
  • Overflow Operators
  • Why are 5381 and 33 so important in the djb2 algorithm?
  • How are hash collisions handled?

Credits

A big thanks to Martin R over in Code Review. My rewrite is largely based on his answer. If you found this helpful, then please give him an upvote.

Update

Swift is open source now so it is possible to see how hashValue is implemented for String from the source code. It appears to be more complex than the answer I have given here, and I have not taken the time to analyze it fully. Feel free to do so yourself.

why do i receive this warning to implement 'hash(into:)' on a type that conforms to a protocol with at default implementation

This code has two default implementations for hash(into:). One from Int, and one from SettingSelectable.

I'm not certain if this is defined behavior. My expectation is that the Int implementation is used and the SettingsSelectable extension is ignored. In any case, the diagnostic is not very good. I suggest opening a defect about that.

You can fix this error by removing the Int, or by explicitly implementing hash(into:) so it's clear which one you mean. Or you could create another layer of protocols:

protocol SettingsSelectableBase: Hashable, Settingsable {
var display: String { get }
}

protocol SettingsSelectable: SettingsSelectableBase {}

// Only give the default to things that ask for it, not to Base conformers
extension SettingsSelectable {
func hash(into hasher: inout Hasher) {
hasher.combine(display)
}
}

extension SettingsSelectableBase {
func resetSettings() {
print("These have been reset")
}
}

enum PlaybackSpeed: Int, SettingsSelectableBase { ... }

Swift, how to implement Hashable protocol based on object reference?

If you are working with classes and not structs, you can use the ObjectIdentifier struct. Note that you also have to define == for your class in order to conform to Equatable (Hashable requires it). It would look something like this:

class MyClass: Hashable { }

func ==(lhs: MyClass, rhs: MyClass) -> Bool {
return ObjectIdentifier(lhs) == ObjectIdentifier(rhs)
}

class MyClass: Hashable {
var hashValue: Int {
return ObjectIdentifier(self).hashValue
}
}


Related Topics



Leave a reply



Submit