Ruby Method Lookup Path For an Object

Ruby method lookup path for an object

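That other post makes it seem confusing, but it really isn't. If you are interested in such things, you should read "Metaprogramming Ruby". Until then, the basic rule is one step to the right and then up:

1) Ignoring singleton classes and modules, the lookup goes from obj to its class and then up the superclass chain: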

          Object (superclass)
              ^
              |
          Parent class A(superclass)
              ^
              |
          Parent class B(superclass)
              ^
              |
obj  ->   object's class

2) Singleton classes are inserted between the obj and the object's class:

          Object
              ^
              |
          Parent class A(superclass)
              ^
              |
          Parent class B(superclass)
              ^
              |
          object's class(superclass)
              ^
              |
obj  ->   obj's singleton_class

3) Included modules are inserted immediately above the class that does the including:

          Object
              ^
              |
          Parent class A
              ^
              |
          Module included by Parent class B
              ^
              |
          Parent class B
              ^
              |
          object's class
              ^
              |
obj  ->   obj's singleton_class
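
You can check these diagrams against a running interpreter by asking the object's singleton class for its ancestors. The following is only an illustrative sketch: ParentA, ParentB, Helper and Thing are made-up names standing in for the boxes in diagram 3), and it assumes Ruby 2.1 or later, where the ancestors list of a singleton class starts with the singleton class itself.

```ruby
# Hypothetical names mirroring diagram 3); none of them come from the original post.
module Helper; end

class ParentA; end

class ParentB < ParentA
  include Helper          # Helper is inserted directly above ParentB
end

class Thing < ParentB; end

obj = Thing.new

# Walk upward from obj's singleton class and stop before Object.
chain = obj.singleton_class.ancestors
p chain.take_while { |mod| mod != Object }
```

The printed list should start with obj's singleton class, followed by Thing, ParentB, Helper and ParentA, which is the order drawn in 3).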

Edit:

Please point out any flaws

p method_lookup_chain(Class)

--output:--
[#<Class:Class>, #<Class:Module>, #<Class:Object>, #<Class:BasicObject>]

But...

class Object
  def greet
    puts "Hi from an Object instance method"
  end
end

Class.greet

--output:--
Hi from an Object instance method

And..

class Class
  def greet
    puts "Hi from a Class instance method"
  end
end

Class.greet

--output:--
Hi from a Class instance method

The lookup path for a method called on a class actually continues past BasicObject's singleton class (#<Class:BasicObject>):

class BasicObject
  class <<self
    puts superclass
  end
end

--output:--
Class

The full lookup path for a method called on Class looks like this:

                  BasicObject
                      ^
                      |
                    Object
                      ^
                      |
                    Module
                      ^
                      |
                    Class
                      ^
                      |
BasicObject    BasicObject's singleton class                
  |                   ^
  |                   |
Object         Object's singleton class
  |                   ^
  |                   |
Module         Module's singleton class
  |                   ^
  |                   |
Class  --->    Class's singleton class

The lookup starts in Class's singleton class and then goes up the hierarchy on the right. "Metaprogramming Ruby" claims there is a unified lookup theory for all objects, but the lookup for methods called on a class does not fit the diagram in 3).
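
The right-hand column of that diagram can be reproduced without any helper method. A minimal sketch, again assuming Ruby 2.1 or later so that singleton classes show up in the ancestors list:

```ruby
chain = Class.singleton_class.ancestors

# The superclass of BasicObject's singleton class is Class itself.
p BasicObject.singleton_class.superclass   # Class

# The singleton classes come first, then Class, Module, Object, Kernel, BasicObject.
p chain
```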

You have the same problem here:

class A 
end

class B < A
end

p method_lookup_chain(B)

--output:--
[#<Class:B>, #<Class:A>, #<Class:Object>, #<Class:BasicObject>]

It should be this:

                  BasicObject
                      ^
                      |
                    Object
                      ^
                      |
                    Module
                      ^
                      |
                    Class
                      ^
                      |
BasicObject    BasicObject's singleton class
  |                   ^
  |                   |
Object         Object's singleton class
  |                   ^
  |                   |
  A            A's singleton class
  |                   ^
  |                   |
  B.greet -->  B's singleton class

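One thing you need to keep in mind: the lookup path of any method called on a class has to include Class somewhere, because every class is an instance of Class. Classes do not inherit from Class; it is their chain of singleton classes that ends up there.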
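
The same check works for the A and B example. The sketch below redefines A and B so that it is self-contained. Strictly speaking, Class appears in the singleton class chain of B (because B is an instance of Class), not in B's own superclass chain:

```ruby
class A; end
class B < A; end

singleton_chain = B.singleton_class.ancestors

p singleton_chain.include?(Class)  # true: methods called on B can be found in Class
p B.ancestors.include?(Class)      # false: B's own superclass chain never reaches Class
p B.instance_of?(Class)            # true
```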

Method-lookup path for Ruby

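You can use ancestor reflection. Judging by the output, the snippet assumes two modules, M and N, that each define report and are included in C (M first, then N):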

class C
  def report
    my_ancestors = self.class.ancestors
    puts "my ancestors are: #{my_ancestors}"
    method = my_ancestors[2].instance_method(:report)
    method.bind(self).call
  end
end

C.new.report
=> my ancestors are: [C, N, M, Object, PP::ObjectMixin, Kernel, BasicObject]
=> 'report' method in module M
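
For a version you can paste and run as-is, here is a sketch with the two modules spelled out. The bodies of M and N are assumptions reconstructed from the output above, and the methods return strings instead of printing so the result is easy to inspect:

```ruby
# M and N are assumed definitions; the snippet above uses them without showing them.
module M
  def report
    "'report' method in module M"
  end
end

module N
  def report
    "'report' method in module N"
  end
end

class C
  include M
  include N   # included last, so N ends up directly above C

  def report
    # Fetch M's version by naming its owner instead of relying on a fixed
    # index into the ancestors list. Plain `super` would reach N first.
    M.instance_method(:report).bind(self).call
  end
end

puts C.new.report
```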

Method lookup for class singleton methods in Ruby

When subclassing, not only is Bar.superclass set to Foo, but the same holds true for the singleton classes:

Bar.singleton_class.superclass == Foo.singleton_class  # => true

So you're not really confused. The actual lookup is:

1. Start with obj's singleton class.
2. Look for instance methods down the ancestor list:
   - prepended modules (Ruby 2.0+)
   - the class itself
   - included modules
3. Repeat #2 with superclass.
4. Repeat #1 but this time looking for method_missing.
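
The ordering in step 2 and the singleton class relationship can be seen in a small experiment. This is only a sketch; Pre, Inc, hello and build are hypothetical names, while Foo and Bar follow the names used above:

```ruby
module Pre
  def hello; "Pre -> " + super; end
end

module Inc
  def hello; "Inc"; end
end

class Foo
  def self.build; :built_by_foo; end   # lives in Foo's singleton class
end

class Bar < Foo
  prepend Pre
  include Inc
  def hello; "Bar -> " + super; end
end

p Bar.ancestors.first(4)   # [Pre, Bar, Inc, Foo]
p Bar.new.hello            # "Pre -> Bar -> Inc"
p Bar.build                # :built_by_foo, found via Foo's singleton class
```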

Ruby Method Lookup (comparison with JavaScript)

Does obj1 and obj2 in the Ruby code each own a copy of the some_method method? Or is it similar to JavaScript where both objects have access to some_method via another object (in this case, via the Child class)?

You don't know. The Ruby Language Specification simply says "if you do this, that happens". It does not, however, prescribe a particular way of making that happen. Every Ruby implementation is free to implement it in the way it sees fit. As long as the results match those of the spec, the spec doesn't care how those results were obtained.

You can't tell. If the implementation maintains proper abstraction, it will be impossible for you to tell how they do it. That is just the nature of abstraction. (It is, in fact, pretty much the definition of abstraction.)

Similarly, when inheritance is taken into account in Ruby, does each Ruby object have a copy of all of the class and superclass methods of the same name?

Same as above.

There are a lot of Ruby implementations currently, and there have been even more in the past, in various stages of (in)completeness. Some of those implement(ed) their own object models (e.g. MRI, YARV, Rubinius, MRuby, Topaz, tinyrb, RubyGoLightly), some sit on top of an existing object model into which they are trying to fit (e.g. XRuby and JRuby on Java, Ruby.NET and IronRuby on the CLI, SmallRuby, smalltalk.rb, Alumina, and MagLev on Smalltalk, MacRuby and RubyMotion on Objective-C/Cocoa, Cardinal on Parrot, Red Sun on ActionScript/Flash, BlueRuby on SAP/ABAP, HotRuby and Opal.rb on ECMAScript).

Who is to say that all of those implementations work exactly the same?

My gut tells me that Ruby objects DO NOT have separate copies of the methods inherited from their class, mixed-in modules, and superclasses. Instead, my gut is that Ruby handles method lookup similarly to JavaScript, where objects check if the object itself has the method and if not, it looks up the method in the object's class, mixed-in modules, and superclasses until the lookup reaches BasicObject.

Despite what I wrote above, that is a reasonable assumption, and it is, in fact, how the implementations that I know about (MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, Topaz) work.

Just think about what it would mean if it weren't so. Every instance of the String class would need to have its own copy of all of String's 116 methods. Think about how many Strings there are in a typical Ruby program!

ruby -e 'p ObjectSpace.each_object(String).count'
# => 10013

Even in this most trivial of programs, which doesn't require any libraries, and only creates one single string itself (for printing the number to the screen), there are already more than 10000 strings. Every single one of those would have its own copies of the over 100 String methods. That would be a huge waste of memory.

It would also be a synchronization nightmare! Ruby allows you to monkeypatch methods at any time. What if I redefine a method in the String class? Ruby would now have to update every single copy of that method, even across different threads.
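
You can observe the sharing from the outside, at least indirectly. In this sketch (shout is a made-up method used only for illustration) a string created before the monkeypatch picks up the new method immediately, and the method reports String as its owner rather than the individual object:

```ruby
existing = "already here"   # created before String is reopened

class String
  # `shout` is a hypothetical method, not part of Ruby.
  def shout
    upcase + "!"
  end
end

p existing.shout                  # "ALREADY HERE!"
p existing.method(:shout).owner   # String
```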

And I actually only counted public methods defined directly in String. Taking into account private methods, the number of methods is even bigger. And of course, there is inheritance: strings wouldn't just need a copy of every method in String, but also a copy of every method in Comparable, Object, Kernel, and BasicObject. Can you imagine every object in the system having a copy of require?

No, the way it works in most Ruby implementations is like this. An object has an identity, instance variables, and a class (in statically typed pseudo-Ruby):

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
end

A module has a method dictionary, a constant dictionary, and a class variable dictionary:

struct Module
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
end

A class is like a module, but it also has a superclass:

struct Class
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  superclass: *Class
end

When you call a method on an object, Ruby will look up the object's class pointer and try to find the method there. If it doesn't find it, it will look at the class's superclass pointer and so on, until it reaches a class which has no superclass. At that point it will not actually give up, but try to call the method_missing method on the original object, passing the name of the method you tried to call as an argument. That is just a normal method call, too, so it follows all the same rules (except that if a call to method_missing reaches the top of the hierarchy, Ruby will not try to call it again, since that would result in an infinite loop).
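
The method_missing fallback is easy to see in action. A minimal sketch, where Ghost and the ask_ prefix are hypothetical names chosen for the example:

```ruby
class Ghost
  # Only reached after the normal lookup has walked the whole chain and failed.
  def method_missing(name, *args)
    return "no method #{name}, handled by method_missing" if name.to_s.start_with?("ask_")
    super   # ends up in BasicObject#method_missing, which raises NoMethodError
  end

  # Keeps respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("ask_") || super
  end
end

g = Ghost.new
p g.ask_anything
```

Calling any other undefined method on g, such as g.other_thing, still raises NoMethodError, because the call to super reaches the top of the hierarchy.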

Oh, but we ignored one thing: singleton methods! Every object needs to have its own method dictionary as well. Actually, rather, every object has its own private singleton class in addition to its class:

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
  singleton_class: Class
end

So, method lookup starts first in the singleton class, and only then goes to the class.
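
From Ruby code, the per-object method dictionary is visible through singleton_class. A small sketch (greet is just an example name):

```ruby
a = Object.new
b = Object.new

# Defines greet in a's singleton class only.
def a.greet
  "hi from a's singleton class"
end

p a.singleton_methods                        # [:greet]
p a.singleton_class.instance_methods(false)  # [:greet]
p b.respond_to?(:greet)                      # false
```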

And what about mixins? Oh, right, every module and class also needs a list of its included mixins:

struct Module
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  mixins: List<*Module>
end

struct Class
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  superclass: *Class
  mixins: List<*Module>
end

Now, the algorithm goes: look first in the singleton class, then the class and then the superclass(es), where however, "look" also means "after you look at the method dictionary, also look at all the method dictionaries of the included mixins (and the included mixins of the included mixins, and so forth, recursively) before going up to the superclass".
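
The recursive part (mixins of mixins) shows up in the ancestors list. A sketch with hypothetical names Inner, Outer and Widget:

```ruby
module Inner
  def origin; "Inner"; end
end

module Outer
  include Inner   # Inner is a mixin of a mixin
end

class Widget
  include Outer
end

p Widget.ancestors.first(3)   # [Widget, Outer, Inner]
p Widget.new.origin           # "Inner"
```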

Does that sound complicated? It is! And that's not good. Method lookup is the single most often executed algorithm in an object-oriented system, so it needs to be simple and lightning fast. So what some Ruby implementations (e.g. MRI, YARV) do is decouple the interpreter's internal notion of what "class" and "superclass" mean from the programmer's view of those same concepts.

An object no longer has both a singleton class and a class, it just has a class:

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
end

A class no longer has a list of included mixins, just a superclass. It may, however, be hidden. Note also that the Dictionaries become pointers, you'll see why in a moment:

struct Class
  methods: *Dictionary<Symbol, *Method>
  constants: *Dictionary<Symbol, *Object>
  cvars: *Dictionary<Symbol, *Object>
  superclass: *Class
  visible?: Bool
end

Now, the object's class pointer will always point to the singleton class, and the singleton class's superclass pointer will always point to the object's actual class. If you include a mixin M into a class C, Ruby will create a new invisible class M′ which shares its method, constant and cvar dictionaries with the mixin. This mixin class will become the superclass of C, and the old superclass of C will become the superclass of the mixin class:

M′ = Class.new(
  methods = M->methods
  constants = M->constants
  cvars = M->cvars
  superclass = C->superclass
  visible? = false
)

C->superclass = *M′

Actually, it's a little bit more involved, since it also has to create classes for the mixins that are included in M (and recursively), but in the end, what we end up with is a nice linear method lookup path with no side-stepping into singleton classes and included mixins.
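
The hidden classes themselves are not reachable from Ruby code, but their effect is. In this sketch (Mixin, Base and Derived are made-up names) the mixin sits between Derived and Base in the lookup order, yet superclass and class skip over the invisible entries:

```ruby
module Mixin; end
class Base; end

class Derived < Base
  include Mixin
end

obj = Derived.new

p Derived.ancestors.first(3)        # [Derived, Mixin, Base]
p Derived.superclass                # Base: the hidden mixin class is skipped
p obj.class                         # Derived: the singleton class is skipped
p obj.singleton_class.superclass    # Derived
```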

Now, the method lookup algorithm is just this:

def lookup(meth, obj)
  c = obj->class

  until res = c->methods[meth]
    c = c->superclass
    raise MethodNotFound, meth if c.nil?
  end

  res
end

Nice and clean and lean and fast.
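
The pseudo-code above cannot be run, but the same linear walk can be imitated in plain Ruby. This is a sketch, not how any interpreter actually implements it: lookup_owner is a hypothetical helper, and the flattened chain is taken from singleton_class.ancestors (so it will not work for objects that cannot have a singleton class, such as Integers and Symbols).

```ruby
# Returns the first class or module in obj's lookup chain that defines meth,
# or nil if none does.
def lookup_owner(obj, meth)
  obj.singleton_class.ancestors.find do |mod|
    mod.instance_methods(false).include?(meth) ||
      mod.private_instance_methods(false).include?(meth)
  end
end

p lookup_owner("text", :upcase)   # String
p lookup_owner("text", :puts)     # Kernel
p lookup_owner("text", :nope)     # nil
```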

As a trade-off, finding out the class of an object or the superclass of a class is slightly more difficult, because you can't simply return the class or superclass pointer, you have to walk the chain until you find a class that is not hidden. But how often do you call Object#class or Class#superclass? Do you even call it at all, outside of debugging?

Unfortunately, Module#prepend doesn't fit cleanly into this picture. And Refinements really mess things up, which is why many Ruby implementations don't even implement them.

Nested singleton class method lookup

Much of this explanation is based on How Ruby Method Dispatch Works by James Coglan, a little of the Ruby Hacking Guide, and just a smidge of source.

To begin with a summary, the ancestry looks like this:

                                                           +----------------+
                                                           |                |
+--------------------------- Module ~~~~~~~~~~~~~~> #<Class:Module>         |
|                              ^                           ^                |
|                              |                           |                |
|                            Class ~~~~~~~~~~~~~~~> #<Class:Class>          |
|                              ^                           ^                |
|                              |                           |                |
| BasicObject ~~~~~> #<Class:BasicObject> ~~> #<Class:#<Class:BasicObject>> |
|     ^                        ^                           ^                |
|     |        Kernel          |                           |                |
|     |          ^             |                           |                |
|     |          |             |   +-----------------------|----------------+
|     +-----+----+             |   |                       |
|           |                  |   v                       |
+-------> Object ~~~~~~> #<Class:Object> ~~~~~~~~> #<Class:#<Class:Object>>
            ^                  ^                           ^
            |                  |                           |
           Foo ~~~~~~~~> #<Class:Foo> ~~~~~~~~~~> #<Class:#<Class:Foo>>

---> Parent
~~~> Singleton class

Let's start from the beginning and build out. BasicObject is the root of everything - if you check BasicObject.superclass, you get nil. BasicObject is also an instance of Class. Yes, that gets circular, and there's a special case in the code to deal with it. When A is an instance of B, A.singleton_class is a child of B, so we get this:

                           Class
                             ^
                             |
BasicObject ~~~~~> #<Class:BasicObject>

Object inherits from BasicObject. When A inherits from B, A is a child of B and A.singleton_class is a child of B.singleton_class. Object also includes Kernel. When A includes B, B is inserted as the first ancestor of A (after A itself, but before A.superclass).

                           Class
                             ^
                             |
BasicObject ~~~~~> #<Class:BasicObject>
    ^                        ^
    |        Kernel          |
    |          ^             |
    |          |             |
    +-----+----+             |
          |                  |
        Object ~~~~~~> #<Class:Object>
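
Each of the three rules used so far can be checked in isolation. A small sketch using only core classes:

```ruby
# Rule 1: BasicObject is an instance of Class, so its singleton class is a child of Class.
p(BasicObject.singleton_class.superclass == Class)                    # true

# Rule 2: Object inherits from BasicObject, so the singleton classes line up the same way.
p(Object.singleton_class.superclass == BasicObject.singleton_class)   # true

# Rule 3: Kernel is included by Object, so it sits after Object but before BasicObject.
chain = Object.ancestors
p(chain.index(Object) < chain.index(Kernel) &&
  chain.index(Kernel) < chain.index(BasicObject))                     # true
```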

Kernel is an instance of Module. It's the only instance of Module we'll see, and its singleton class doesn't appear in any ancestry chains, so I won't draw beyond it.

Now we get down to Foo, which inherits from Object (though you don't need to write < Object). We can already figure out what Foo and its singleton class are children of.

                           Class
                             ^
                             |
BasicObject ~~~~~> #<Class:BasicObject>
    ^                        ^
    |        Kernel          |
    |          ^             |
    |          |             |
    +-----+----+             |
          |                  |
        Object ~~~~~~> #<Class:Object>
          ^                  ^
          |                  |
         Foo ~~~~~~~~> #<Class:Foo>

Now Class inherits from Module, and Module inherits from Object, so add Module and the appropriate singleton classes. Because Module < Object and Object < BasicObject and BasicObject.instance_of?(Class), this is where the drawing gets a little funky. Remember you just stop traversing upwards whenever you hit BasicObject.

                                                           +----------------+
                                                           |                |
+--------------------------- Module ~~~~~~~~~~~~~~> #<Class:Module>         |
|                              ^                           ^                |
|                              |                           |                |
|                            Class ~~~~~~~~~~~~~~~> #<Class:Class>          |
|                              ^                                            |
|                              |                                            |
| BasicObject ~~~~~> #<Class:BasicObject>                                   |
|     ^                        ^                                            |
|     |        Kernel          |                                            |
|     |          ^             |                                            |
|     |          |             |   +----------------------------------------+
|     +-----+----+             |   |
|           |                  |   v
+-------> Object ~~~~~~> #<Class:Object>
            ^                  ^
            |                  |
           Foo ~~~~~~~~> #<Class:Foo>

Last step. Every instance of Class has a singleton_class (though it won't be instantiated until it's needed, or else you'd need more RAM). All of our singleton classes are instances of Class, so they have singleton classes. Watch out for this sentence: A class's singleton class's parent is the class's parent's singleton class. I don't know if there's a succinct way to state that as far as type systems go, and the Ruby source pretty much says it's just doing it for consistency in any case. So, when you ask for Foo.singleton_class.singleton_class, the language happily obliges you and propagates the necessary parents upward, leading finally to:

                                                           +----------------+
                                                           |                |
+--------------------------- Module ~~~~~~~~~~~~~~> #<Class:Module>         |
|                              ^                           ^                |
|                              |                           |                |
|                            Class ~~~~~~~~~~~~~~~> #<Class:Class>          |
|                              ^                           ^                |
|                              |                           |                |
| BasicObject ~~~~~> #<Class:BasicObject> ~~> #<Class:#<Class:BasicObject>> |
|     ^                        ^                           ^                |
|     |        Kernel          |                           |                |
|     |          ^             |                           |                |
|     |          |             |   +-----------------------|----------------+
|     +-----+----+             |   |                       |
|           |                  |   v                       |
+-------> Object ~~~~~~> #<Class:Object> ~~~~~~~~> #<Class:#<Class:Object>>
            ^                  ^                           ^
            |                  |                           |
           Foo ~~~~~~~~> #<Class:Foo> ~~~~~~~~~~> #<Class:#<Class:Foo>>

If you start from any node in this graph and traverse depth-first, right to left (and stop at BasicObject), you get the node's ancestor chain, just like we wanted. And, we've built it up from some basic axioms, so we might just be able to trust it. Lacking trust, there are a couple of interesting ways to verify the structure further.

Try looking at node.singleton_class.ancestors - node.ancestors for any node in the graph. This gives us the ancestors of the singleton class that are not the ancestors of the node itself, which eliminates some of the confusing redundancy in the list.

> Foo.singleton_class.singleton_class.ancestors - Foo.singleton_class.ancestors
 => [#<Class:#<Class:Foo>>, #<Class:#<Class:Object>>, #<Class:#<Class:BasicObject>>,
     #<Class:Class>, #<Class:Module>]

You can also verify any one parent with node.superclass.

> Foo.singleton_class.singleton_class.superclass
 => #<Class:#<Class:Object>>

And you can even verify that the object identity is all consistent, so there aren't anonymous classes popping up all over the place with no particular relationship to each other.

> def ancestor_ids(ancestors)
>   ancestors.map(&:object_id).zip(ancestors).map{|pair| pair.join("\t")}
> end

> puts ancestor_ids(Foo.ancestors)
70165241815140  Foo
70165216040500  Object
70165216040340  Kernel
70165216040540  BasicObject

> puts ancestor_ids(Foo.singleton_class.ancestors)
70165241815120  #<Class:Foo>
70165216039400  #<Class:Object>
70165216039380  #<Class:BasicObject>
70165216040420  Class
70165216040460  Module
70165216040500  Object # Same as Foo from here down
70165216040340  Kernel
70165216040540  BasicObject

> puts ancestor_ids(Foo.singleton_class.singleton_class.ancestors)
70165241980080  #<Class:#<Class:Foo>>
70165215986060  #<Class:#<Class:Object>>
70165215986040  #<Class:#<Class:BasicObject>>
70165216039440  #<Class:Class>
70165216039420  #<Class:Module>
70165216039400  #<Class:Object> # Same as Foo.singleton_class from here down
70165216039380  #<Class:BasicObject>
70165216040420  Class
70165216040460  Module
70165216040500  Object
70165216040340  Kernel
70165216040540  BasicObject

And that, in a nutshell, is how you snipe a nerd.

Ruby - determining method origins?

Object#method returns a Method object giving metadata about a given method. For example (the exact inspect output differs between Ruby versions):

> [].method(:length).inspect
=> "#<Method: Array#length>"
> [].method(:max).inspect
=> "#<Method: Array(Enumerable)#max>"

In Ruby 1.8.7 and later, you can use Method#owner to determine the class or module that defined the method.

To get a list of all the methods with the name of the class or module where they are defined you could do something like the following:

obj.methods.collect {|m| "#{m} defined by #{obj.method(m).owner}"}
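
If you would rather see the methods grouped by where they come from, something along these lines should work. It is a sketch that uses an empty array as the example object; by_owner is just a local variable name:

```ruby
obj = []

# Group the public method names by the class or module that defines them.
by_owner = obj.methods.group_by { |name| obj.method(name).owner }

by_owner.each do |owner, names|
  puts "#{owner}: #{names.size} methods, e.g. #{names.first(3).join(', ')}"
end
```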
