What's the Right Way to Implement Equality in Ruby

What's the right way to implement equality in Ruby?

To simplify comparison operators for objects with more than one state variable, create a method that returns all of the object's state as an array. Then just compare the two states:

class Thing

  def initialize(a, b, c)
    @a = a
    @b = b
    @c = c
  end

  def ==(o)
    o.class == self.class && o.state == state
  end

  protected

  def state
    [@a, @b, @c]
  end

end

p Thing.new(1, 2, 3) == Thing.new(1, 2, 3)    # => true
p Thing.new(1, 2, 3) == Thing.new(1, 2, 4)    # => false

Also, if you want instances of your class to be usable as a hash key, then add:

  alias_method :eql?, :==

  def hash
    state.hash
  end

These need to be public.
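
Putting the two pieces together, a minimal sketch of the full class might look like the following. It reuses the Thing class from above; the `lookup` hash and the `found`/`missing` variables are only there for illustration.

```ruby
class Thing
  def initialize(a, b, c)
    @a = a
    @b = b
    @c = c
  end

  def ==(o)
    o.class == self.class && o.state == state
  end

  # Hash looks keys up with #hash and #eql?, so both must agree with ==.
  alias_method :eql?, :==

  def hash
    state.hash
  end

  protected

  # Protected so that another Thing can read it, but outside code cannot.
  def state
    [@a, @b, @c]
  end
end

lookup = { Thing.new(1, 2, 3) => "found" }
found = lookup[Thing.new(1, 2, 3)]    # => "found"
missing = lookup[Thing.new(1, 2, 4)]  # => nil
```

Note that eql?, == and hash sit above the protected keyword, so they stay public.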

What's the difference between equal?, eql?, ===, and ==?

I'm going to heavily quote the Object documentation here, because I think it has some great explanations. I encourage you to read it, and also the documentation for these methods as they're overridden in other classes, like String.

Side note: if you want to try these out for yourself on different objects, use something like this:

class Object
  def all_equals(o)
    ops = [:==, :===, :eql?, :equal?]
    Hash[ops.map(&:to_s).zip(ops.map {|s| send(s, o) })]
  end
end

"a".all_equals "a" # => {"=="=>true, "==="=>true, "eql?"=>true, "equal?"=>false}

`==` — generic "equality"

At the Object level, == returns true only if obj and other are the same object. Typically, this method is overridden in descendant classes to provide class-specific meaning.

This is the most common comparison, and thus the most fundamental place where you (as the author of a class) get to decide if two objects are "equal" or not.

`===` — case equality

For class Object, effectively the same as calling #==, but typically overridden by descendants to provide meaningful semantics in case statements.

This is incredibly useful. Examples of things which have interesting === implementations:

Range
Regexp
Proc (since Ruby 1.9)

So you can do things like:

case some_object
when /a regex/
  # The regex matches
when 2..4
  # some_object is in the range 2..4
when lambda {|x| some_crazy_custom_predicate }
  # the lambda returned true
end

Combining case with Regexp can make code a lot cleaner. And of course, by providing your own === implementation, you can get custom case semantics.
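
As a rough sketch of custom case semantics, here is a made-up matcher class (MultipleOf is not part of Ruby, and the `fizzbuzz` method is just a demo) whose === decides whether a value fits:

```ruby
# Hypothetical matcher: MultipleOf.new(n) === x is true when x is an
# Integer divisible by n.
class MultipleOf
  def initialize(n)
    @n = n
  end

  def ===(other)
    other.is_a?(Integer) && (other % @n).zero?
  end
end

def fizzbuzz(value)
  case value
  when MultipleOf.new(15) then "FizzBuzz"
  when MultipleOf.new(5)  then "Buzz"
  when MultipleOf.new(3)  then "Fizz"
  else value.to_s
  end
end

fizzbuzz(30)  # => "FizzBuzz"
fizzbuzz(9)   # => "Fizz"
fizzbuzz(7)   # => "7"
```

The case statement never calls == here; each when clause asks its own object whether the value belongs to it.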

`eql?` — `Hash` equality

The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so:
1 == 1.0     #=> true
1.eql? 1.0   #=> false

So you're free to override this for your own uses, or you can override == and use alias :eql? :== so the two methods behave the same way.
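
To see why the distinction matters for Hash, here is a small sketch (the `names` hash is just an illustration). Because 1.eql?(1.0) is false, the Float does not find the Integer key even though the two are == to each other:

```ruby
names = { 1 => "integer one" }

by_integer = names[1]    # => "integer one"
by_float   = names[1.0]  # => nil, since 1.eql?(1.0) is false

loosely_equal  = (1 == 1.0)    # => true
strictly_equal = 1.eql?(1.0)   # => false
```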

`equal?` — identity comparison

Unlike ==, the equal? method should never be overridden by subclasses: it is used to determine object identity (that is, a.equal?(b) iff a is the same object as b).

This is effectively pointer comparison.
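
A quick sketch of the difference between identity and value equality (the variable names are arbitrary):

```ruby
original = String.new("ruby")
copy     = original.dup   # same content, different object
same_ref = original       # second name for the same object

value_equal   = (original == copy)          # => true
copy_is_same  = original.equal?(copy)       # => false
alias_is_same = original.equal?(same_ref)   # => true
```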

What does the Ruby `uniq` method use for equality checking?

It compares values using their hash and eql? methods for efficiency.

https://ruby-doc.org/core-2.5.0/Array.html#method-i-uniq-3F

So you should override eql? (not just ==) and hash.

UPDATE:

I cannot fully explain why, but overriding hash and == doesn't work. I guess it's caused by the way uniq is implemented in C:

From: array.c (C Method):
Owner: Array
Visibility: public
Number of lines: 20

static VALUE
rb_ary_uniq(VALUE ary)
{
    VALUE hash, uniq;

    if (RARRAY_LEN(ary) <= 1)
        return rb_ary_dup(ary);
    if (rb_block_given_p()) {
        hash = ary_make_hash_by(ary);
        uniq = rb_hash_values(hash);
    }
    else {
        hash = ary_make_hash(ary);
        uniq = rb_hash_values(hash);
    }
    RBASIC_SET_CLASS(uniq, rb_obj_class(ary));
    ary_recycle_hash(hash);

    return uniq;
}

You can bypass that by using a block version of uniq:

> [Foo.new(1,2), Foo.new(1,2), Foo.new(2,3)].uniq{|f| [f.a, f.b]}
=> [#<Foo:0x0000562e48937cc8 @a=1, @b=2>, #<Foo:0x0000562e48937c78 @a=2, @b=3>]

Or use Struct instead:

F = Struct.new(:a, :b)
[F.new(1,2), F.new(1,2), F.new(2,3)].uniq
# => [#<struct F a=1, b=2>, #<struct F a=2, b=3>]

UPDATE2:

Actually, in terms of overriding, it's not the same whether you override == or eql?. When I overrode eql?, it worked as intended:

class Foo
  attr_accessor :a, :b

  def initialize(a, b)
    @a = a
    @b = b
  end 

  def eql?(other)
    (@a == other.a && @b == other.b)
  end

  def hash
    [a, b].hash
  end

  def to_s
    "#{@a}: #{@b}"  
  end

end 

a = [
  Foo.new(1, 1),
  Foo.new(1, 2),
  Foo.new(2, 1),
  Foo.new(2, 2),
  Foo.new(2, 2)
]
a.uniq
#=> [#<Foo:0x0000562e483bff70 @a=1, @b=1>,
#<Foo:0x0000562e483bff48 @a=1, @b=2>,
#<Foo:0x0000562e483bff20 @a=2, @b=1>,
#<Foo:0x0000562e483bfef8 @a=2, @b=2>]
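
To make the difference concrete, here is a small side-by-side sketch. The two class names are made up for this comparison; the only difference between them is whether eql? is aliased to ==.

```ruby
# Hypothetical class: overrides == and hash, but leaves eql? as the
# default identity check inherited from Object.
class WithDoubleEquals
  attr_reader :a

  def initialize(a)
    @a = a
  end

  def ==(other)
    other.is_a?(WithDoubleEquals) && a == other.a
  end

  def hash
    a.hash
  end
end

# Same thing, but eql? now follows ==.
class WithEql < WithDoubleEquals
  alias_method :eql?, :==
end

without_eql = [WithDoubleEquals.new(1), WithDoubleEquals.new(1)].uniq.size  # => 2
with_eql    = [WithEql.new(1), WithEql.new(1)].uniq.size                    # => 1
```

Both classes report equal objects through ==, but uniq only collapses the duplicates once eql? agrees as well.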

Is == in Ruby always value equality?

In Ruby, == can be overloaded, so it could do anything the designer of the class you're comparing wants it to do. In that respect, it's very similar to Java's equals() method.

The convention is for == to do value comparison, and most classes follow that convention, String included. So you're right, using == for comparing strings will do the expected thing.

The convention is for equal? to do reference comparison, so your test a.object_id == b.object_id could also be written a.equal?(b). (The equal? method could be defined to do something nonstandard, but then again, so can object_id!)

(Side note: when you find yourself comparing strings in Ruby, you often should have been using symbols instead.)
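
A small sketch of why symbols are handy for this: two strings with the same content are usually separate objects, while a given symbol is always the same object. String.new is used here only so the example does not depend on frozen string literal settings.

```ruby
first  = String.new("status")
second = String.new("status")

strings_value_equal = (first == second)       # => true
strings_same_object = first.equal?(second)    # => false

symbols_same_object  = :status.equal?(:status)         # => true
converted_is_same    = "status".to_sym.equal?(:status) # => true
```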

Ruby. How is the case equality method implemented in the class Class?

We can further simplify your question I think.
I believe what you are asking is why

String == String # true

But

String === String # false

I think Ruby is semi-consistent here. The === operator asks whether the right side is a member of the left side.

Class === String

This is true since String is an instance (a member) of Class. And indeed String is not an instance of String.

What I do find weird though is that

5 === 5 # returns true

Imo it should return false to be consistent with String === String returning false, but Integer#=== behaves like ==, probably so that integers work well in case statements.
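
The cases discussed above can be checked in one go. This is just a sketch that collects the results in local variables:

```ruby
string_eq_string     = (String == String)    # => true, same class object
string_case_string   = (String === String)   # => false, String is not a String instance
string_case_instance = (String === "abc")    # => true, "abc" is a String instance
class_case_string    = (Class === String)    # => true, String is a Class instance
five_case_five       = (5 === 5)             # => true, Integer#=== acts like ==
```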

What method gets called when I try to compare a custom object with a string?

You need to override your class' == method.

class MyClass
  def ==(other)
    # custom equal comparison logic here
    # if you just need string comparison
    to_s == other
  end
end
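
One thing to keep in mind is that this only covers the case where your object is on the left. A minimal sketch, using a made-up Label class in place of MyClass:

```ruby
# Hypothetical class for illustration: compares its text to a string.
class Label
  def initialize(text)
    @text = text
  end

  def to_s
    @text
  end

  def ==(other)
    to_s == other
  end
end

label = Label.new("ruby")

label_first  = (label == "ruby")   # => true, calls Label#==
string_first = ("ruby" == label)   # => false, calls String#==
```

With the string on the left, String#== is the method that runs, and it does not consider a Label equal to a string. As far as I know, String#== only hands the comparison back to the other object if that object responds to to_str, which Label deliberately does not define here.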

=== vs. == in Ruby

The two really have nothing to do with each other. In particular, #== is the equality operator and #=== has absolutely nothing to do with equality. Personally, I find it rather unfortunate that #=== looks so similar to #==, uses the equals sign and is often called the case equality operator, triple equals operator or threequals operator when it really has nothing to do with equality.

I call #=== the case subsumption operator (it's the best I could come up with, I'm open to suggestions, especially from native English speakers).

The best way to describe a === b is "if I have a drawer labeled a, does it make sense to put b in it?"

So, for example, Module#=== tests whether b.is_a?(a). If you have Integer === 2, does it make sense to put 2 in a box labeled Integer? Yes, it does. What about Integer === 'hello'? Obviously not.

Another example is Regexp#===. It tests for a match. Does it make sense to put 'hello' in a box labeled /el+/? Yes, it does.

For collections such as ranges, Range#=== is defined as a membership test: it makes sense to put an element in a box labeled with a collection if that element is in the collection.

So, that's what #=== does: it tests whether the argument can be subsumed under the receiver.

What does that have to do with case expressions? Simple:

case foo
when bar
  baz
end

is the same as

if bar === foo
  baz
end
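
If you want to convince yourself of that, here is a small sketch with a made-up Tracer class that records whatever it is asked to match. It shows that case calls === on the when value and passes the case subject as the argument:

```ruby
# Hypothetical helper that remembers every value passed to ===.
class Tracer
  attr_reader :seen

  def initialize
    @seen = []
  end

  def ===(other)
    @seen << other
    true
  end
end

tracer = Tracer.new

result =
  case :foo
  when tracer then "matched"
  else "no match"
  end

result       # => "matched"
tracer.seen  # => [:foo]
```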

What does the === operator do in Ruby?

Just like with every other method in Ruby (or actually pretty much any object-oriented language),

a === b

means whatever the author of a's class wants it to mean.

However, if you don't want to confuse the heck out of your colleagues, the convention is that === is the case subsumption operator. Basically, it's a boolean operator which asks the question "If I have a drawer labelled a would it make sense to put b in that drawer?"

An alternative formulation is "If a described a set, would b be a member of that set?"

For example:

 (1..5) === 3           # => true
 (1..5) === 6           # => false

Integer === 42          # => true
Integer === 'fortytwo'  # => false

  /ell/ === 'Hello'     # => true
  /ell/ === 'Foobar'    # => false

The main usage for the === operator is in case expressions, since

case foo
when bar
  baz
when quux
  flurb
else
  blarf
end

gets translated to something (roughly) like

_temp = foo

if bar === _temp
  baz
elsif quux === _temp
  flurb
else
  blarf
end

Note that if you want to search for this operator, it is usually called the triple equals operator or threequals operator or case equality operator. I really dislike those names, because this operator has absolutely nothing whatsoever to do with equality.

In particular, one would expect equality to be symmetric: if a is equal to b, then b had better also be equal to a. Also, one would expect equality to be transitive: if a == b and b == c, then a == c. While there is no way to actually guarantee that in a single-dispatch language like Ruby, you should at least make an effort to preserve this property (for example, by following the coerce protocol).

However, for === there is no expectation of either symmetry or transitivity. In fact, it is very much by design not symmetric. That's why I don't like calling it anything that even remotely resembles equality. It's also why I think it should have been called something else, like ~~~ or whatever.
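
A short sketch of that asymmetry, next to == for contrast (the variable names are arbitrary):

```ruby
# === depends on which side is the receiver.
range_forward   = ((1..5) === 3)        # => true
range_backward  = (3 === (1..5))        # => false
regexp_forward  = (/ell/ === "Hello")   # => true
regexp_backward = ("Hello" === /ell/)   # => false

# == between numbers gives the same answer both ways.
eq_forward  = (1 == 1.0)   # => true
eq_backward = (1.0 == 1)   # => true
```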

Is the order of the equality operator important in Ruby?

Yes, there is a difference.

my_pw == hashed_pw calls the == method on the my_pw string and passes hashed_pw as an argument. That means you are using the String#== method. From the docs of String#==:

string == object → true or false
Returns true if object has the same length and content as self; false otherwise

Whereas hashed_pw == my_pw calls the == method on an instance of BCrypt::Password and passes my_pw as an argument. From the docs of BCrypt::Password#==:

#==(secret) ⇒ Object
Compares a potential secret against the hash. Returns true if the secret is the original secret, false otherwise.
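
To show the effect without installing the bcrypt gem, here is a rough stand-in. DigestedSecret is a made-up class, not BCrypt::Password, and plain SHA-256 is only used to keep the sketch inside the standard library (it is not a suitable way to store real passwords). The assumption it mimics is a String subclass whose == hashes its argument before comparing:

```ruby
require "digest"

# Hypothetical stand-in for a password hash object: a String holding a
# digest, with == redefined to digest the argument before comparing.
class DigestedSecret < String
  def self.create(secret)
    new(Digest::SHA256.hexdigest(secret))
  end

  def ==(secret)
    super(Digest::SHA256.hexdigest(secret))
  end
end

hashed_pw = DigestedSecret.create("s3cret")
my_pw     = "s3cret"

hash_on_left   = (hashed_pw == my_pw)   # => true, DigestedSecret#== runs
string_on_left = (my_pw == hashed_pw)   # => false, String#== compares raw content
```

The receiver decides which == runs, so only the first form ever hashes the plain-text value.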
