Method lookup for class singleton methods in Ruby
When subclassing, not only is Bar.superclass
set to Foo
, but the same holds true for the singleton classes:
Bar.singleton_class.superclass == Foo.singleton_class # => true
So you're not really confused. The actual lookup is:
- Start with
obj
's singleton class. - Look for instance methods down the ancestor list:
- prepended modules (Ruby 2.0)
- the class itself
- included modules
- Repeat #2 with superclass.
- Repeat #1 but this time looking for
method_missing
Ruby method lookup path for an object
That other post makes it seem confusing, but it really isn't. If you are interested in such things, you should read "Metaprogramming Ruby". Until then, the basic rule is one step to the right and up
:
Object (superclass)
^
|
Parent class A(superclass)
^
|
Parent class B(superclass)
^
|
obj -> object's class
2) Singleton classes are inserted between the obj and the object's class:
Object
^
|
Parent class A(superclass)
^
|
Parent class B(superclass)
^
|
object's class(superclass)
^
|
obj -> obj's singleton_class
3) Included modules are inserted immediately above the class that does the including:
Object
^
|
Parent class A
^
|
Module included by Parent Class B
^
|
Parent class B
^
|
object's class
^
|
obj -> obj's singleton_class
Edit:
Please point out any flaws
p method_lookup_chain(Class)
--output:--
[#<Class:Class>, #<Class:Module>, #<Class:Object>, #<Class:BasicObject>]
But...
class Object
def greet
puts "Hi from an Object instance method"
end
end
Class.greet
--output:--
Hi from an Object instance method
And..
class Class
def greet
puts "Hi from a Class instance method"
end
end
Class.greet
--output:--
Hi from a Class instance method
The lookup path for a method called on a class actually continues past BasicObject's singleton class(#<Class:BasicObject>
):
class BasicObject
class <<self
puts superclass
end
end
--output:--
Class
The full lookup path for a method called on Class looks like this:
Basic Object
^
|
Object
^
|
Module
^
|
Class
^
|
BasicObject BasicObject's singleton class
| ^
| |
Object Object's singleton class
| ^
| |
Module Module's singleton class
| ^
| |
Class ---> Class's singleton class
The lookup starts in Class's singleton class and then goes up the hierarchy on the right. "Metaprogramming Ruby" claims there is a unified lookup theory for all objects, but the lookup for methods called on a class does not fit the diagram in 3).
You have the same problem here:
class A
end
class B < A
end
p method_lookup_chain(B)
--output:--
[#<Class:B>, #<Class:A>, #<Class:Object>, #<Class:BasicObject>]
It should be this:
Basic Object
^
|
Object
^
|
Module
^
|
Class
^
|
BasicObject BasicObject's singleton class
| ^
| |
Object Object's singleton class
| ^
| |
A A's singleton class
| ^
| |
B.greet --> B's singleton class
One thing you need to keep in mind: the lookup path of any method called on a class has to include Class somewhere because ALL classes inherit from Class.
Ruby method lookup in regards to globals/scopes
Ruby's method lookup algorithm is actually really simple:
- retrieve the
class
pointer of the receiver - if the method is there, invoke it
- otherwise retrieve the
superclass
pointer, and repeat
That's it.
If the algorithm comes to a point where there is no more superclass, but it still hasn't found the method yet, it will restart the whole process again, with method_missing
as the message and the name of the original message prepended to the arguments. But that's it. That's the whole algorithm. It is very small and very simple, and it has to be very small and very simple because method lookup is the single most often executed operation in an object-oriented language.
Note: I am completely ignoring Module#prepend
/ Module#prepend_features
since I just don't know enough about how it works. I only know what it does, and that's good enough for me.
Also note: I am ignoring performance optimizations such as caching the result of a method lookup in something like a Polymorphic Inline Cache.
Okay, but here's the trick: where exactly do those class
and superclass
pointers point to? Well, they do not point to what the Object#class
and Class#superclass
methods return. So, let's step back a little.
Every object has a class
pointer that points to the class of the object. And every class has a superclass
pointer that points to its superclass.
Let's start a running example:
class Foo; end
Now, we have class Foo
, and its superclass
pointer points to Object
.
foo = Foo.new
And our object foo
's class
pointer points to Foo
.
def foo.bar; end
Now things start to get interesting. We have created a singleton method. Well, actually, there is no such thing as a singleton method, it's really just a normal method in the singleton class. So, how does this work? Well, now the class
pointer points to foo
's singleton class and foo
's singleton class's superclass
pointer points to Foo
! In other words, the singleton class was inserted in between foo
and its "real" class Foo
.
However, when we ask foo
about its class, it still responds Foo
:
foo.class #=> Foo
The Object#class
method knows about singleton classes, and simply skips over them, following the superclass
pointer until it finds a "normal" class, and returns that.
Next complication:
module Bar; end
class Foo
include Bar
end
What happens here? Ruby creates a new class (let's call it Barʹ
), called an include class. This class's method table pointer, class variable table pointer, and constant table pointer point to Bar
's method table, class variable table, and constant table. Then, Ruby makes Barʹ
's superclass
pointer point to Foo
's current superclass, and then makes Foo
's superclass pointer point to Barʹ
. In other words, including a module creates a new class that gets inserted as the superclass of the class the module is included into.
There's a slight complication here: you can also include
modules into modules. How does that work? Well, Ruby simply keeps track of the modules that were included into a module. And then, when the module is included into a class, it will recursively repeat the steps above for every included module.
And that's all you need to know about the Ruby method lookup:
- find the class
- follow the superclass
- singleton classes insert above objects
- include classes insert above classes
Now let's look at some of your questions:
When calling methods explicitly on a class, there are lots of illustrations on the order in which the classes, and modules included by them are searched (and thus exactly what
super
calls in each case). But when not explicitly calling a method, e.g. a plainfunc args
rather thanself.func args
what is the search order?
The same. self
is the implicit receiver, if you don't specify a receiver, the receiver is self
. And parentheses are optional. In other words:
func args
is exactly the same as
self.func(args)
Why does in my example below, the member method calling
func
find the member method before the global, butfunc2
finds the global withoutmethod_missing
being called?
There is no such thing as a "global method" in Ruby. There is also no such thing as a "member method". Every method is an instance method. Period. There are no global, static, class, singleton, member methods, procedures, functions, or subroutines.
A method defined at the top-level becomes a private
instance method of class Object
. Test
inherits from Object
. Run the steps I outlined above, and you will find exactly what is going on:
- Retrieve
x
's class pointer:Test
- Does
Test
have a method calledfunc
: Yes, so invoke it.
Now again:
- Retrieve
x
's class pointer:Test
- Does
Test
have a method calledfunc2
: No! - Retrieve
Test
's superclass pointer:Object
- Does
Object
have a method calledfunc2
: Yes, so invoke it.
And when the global is instead an module/class/type, why is the member with the same name not found at all?
Again, there is no global here, there are no members here. This also doesn't have anything to do with modules or classes. And Ruby doesn't have (static) types.
Math
is a reference to a constant. If you want to call a method with the same name, you have to ensure that Ruby can tell that it's a method. There are two things that only methods can have: a receiver and arguments. So, you can either add a receiver:
self.Math
or arguments:
Math()
and now Ruby knows that you mean the method Math
and not the constant Math
.
The same applies to local variables, by the way. And to setters. If you want to call a setter instead of assigning a local variable, you need to say
self.func = 'setter method'
Ruby Method Lookup (comparison with JavaScript)
- Does
obj1
andobj2
in the Ruby code each own a copy of thesome_method
method? Or is it similar to JavaScript where both objects have access tosome_method
via another object (in this case, via the Child class)?
You don't know. The Ruby Language Specification simply says "if you do this, that happens". It does, however, not prescribe a particular way of making that happen. Every Ruby implementation is free to implement it in the way it sees fit, as long as the results match those of the spec, the spec doesn't care how those results were obtained.
You can't tell. If the implementation maintains proper abstraction, it will be impossible for you to tell how they do it. That is just the nature of abstraction. (It is, in fact, pretty much the definition of abstraction.)
- Similarly, when inheritance is taken into account in Ruby, does each Ruby object have a copy of all of the class and superclass methods of the same name?
Same as above.
There are a lot of Ruby implementations currently, and there have been even more in the past, in various stages of (in)completeness. Some of those implement(ed) their own object models (e.g. MRI, YARV, Rubinius, MRuby, Topaz, tinyrb, RubyGoLightly), some sit on top of an existing object model into which they are trying to fit (e.g. XRuby and JRuby on Java, Ruby.NET and IronRuby on the CLI, SmallRuby, smalltalk.rb, Alumina, and MagLev on Smalltalk, MacRuby and RubyMotion on Objective-C/Cocoa, Cardinal on Parrot, Red Sun on ActionScript/Flash, BlueRuby on SAP/ABAP, HotRuby and Opal.rb on ECMAScript)
Who is to say that all of those implementations work exactly the same?
My gut tells me that Ruby objects DO NOT have separate copies of the methods inherited from their class, mixed-in modules, and superclasses. Instead, my gut is that Ruby handles method lookup similarly to JavaScript, where objects check if the object itself has the method and if not, it looks up the method in the object's class, mixed-in modules, and superclasses until the lookup reaches
BasicObject
.
Despite what I wrote above, that is a reasonable assumption, and is in fact, how the implementations that I know about (MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, Topaz) work.
Just think about what it would mean if it weren't so. Every instance of the String
class would need to have its own copy of all of String
's 116 methods. Think about how many String
s there are in a typical Ruby program!
ruby -e 'p ObjectSpace.each_object(String).count'
# => 10013
Even in this most trivial of programs, which doesn't require
any libraries, and only creates one single string itself (for printing the number to the screen), there are already more than 10000 strings. Every single one of those would have its own copies of the over 100 String
methods. That would be huge waste of memory.
It would also be a synchronization nightmare! Ruby allows you to monkeypatch methods at any time. What if I redefine a method in the String
class? Ruby would now have to update every single copy of that method, even across different threads.
And I actually only counted public methods defined directly in String
. Taking into account private methods, the number of methods is even bigger. And of course, there is inheritance: strings wouldn't just need a copy of every method in String
, but also a copy of every method in Comparable
, Object
, Kernel
, and BasicObject
. Can you imagine every object in the system having a copy of require
?
No, the way it works in most Ruby implementations is like this. An object has an identity, instance variables, and a class (in statically typed pseudo-Ruby):
struct Object
object_id: Id
ivars: Dictionary<Symbol, *Object>
class: *Class
end
A module has a method dictionary, a constant dictionary, and a class variable dictionary:
struct Module
methods: Dictionary<Symbol, *Method>
constants: Dictionary<Symbol, *Object>
cvars: Dictionary<Symbol, *Object>
end
A class is like a module, but it also has a superclass:
struct Class
methods: Dictionary<Symbol, *Method>
constants: Dictionary<Symbol, *Object>
cvars: Dictionary<Symbol, *Object>
superclass: *Class
end
When you call a method on an object, Ruby will look up the object's class
pointer and try to find the method there. If it doesn't, it will look at the class's superclass
pointer and so on, until it reaches a class which has no superclass. At that point it will actually not give up, but try to call the method_missing
method on the original object, passing the name of the method you tried to call as an argument, but that's just a normal method call, too, so it follows all the same rules (except that if a call to method_missing
reaches the top of the hierarchy, it will not try to call it again, that would result in an infinite loop).
Oh, but we ignored one thing: singleton methods! Every object needs to have its own method dictionary as well. Actually, rather, every object has its own private singleton class in addition to its class:
struct Object
object_id: Id
ivars: Dictionary<Symbol, *Object>
class: *Class
singleton_class: Class
end
So, method lookup starts first in the singleton class, and only then goes to the class.
And what about mixins? Oh, right, every module and class also needs a list of its included mixins:
struct Module
methods: Dictionary<Symbol, *Method>
constants: Dictionary<Symbol, *Object>
cvars: Dictionary<Symbol, *Object>
mixins: List<*Module>
end
struct Class
methods: Dictionary<Symbol, *Method>
constants: Dictionary<Symbol, *Object>
cvars: Dictionary<Symbol, *Object>
superclass: *Class
mixins: List<*Module>
end
Now, the algorithm goes: look first in the singleton class, then the class and then the superclass(es), where however, "look" also means "after you look at the method dictionary, also look at all the method dictionaries of the included mixins (and the included mixins of the included mixins, and so forth, recursively) before going up to the superclass".
Does that sound complicated? It is! And that's not good. Method lookup is the single most often executed algorithm in an object-oriented system, it needs to be simple and lightning fast. So, what some Ruby implementations (e.g. MRI, YARV) do, is to decouple the interpreter's internal notion of what "class" and "superclass" mean from the programmer's view of those same concepts.
An object no longer has both a singleton class and a class, it just has a class:
struct Object
object_id: Id
ivars: Dictionary<Symbol, *Object>
class: *Class
singleton_class: Class
end
A class no longer has a list of included mixins, just a superclass. It may, however, be hidden. Note also that the Dictionaries become pointers, you'll see why in a moment:
struct Class
methods: *Dictionary<Symbol, *Method>
constants: *Dictionary<Symbol, *Object>
cvars: *Dictionary<Symbol, *Object>
superclass: *Class
visible?: Bool
end
Now, the object's class pointer will always point to the singleton class, and the singleton class's superclass pointer will always point to the object's actual class. If you include a mixin M
into a class C
, Ruby will create a new invisible class M′
which shares its method, constant and cvar dictionaries with the mixin. This mixin class will become the superclass of C
, and the old superclass of C
will become the superclass of the mixin class:
M′ = Class.new(
methods = M->methods
constants = M->constants
cvars = M->cvars
superclass = C->superclass
visible? = false
)
C->superclass = *M'
Actually, it's little bit more involved, since it also has to create classes for the mixins that are included in M
(and recursively), but in the end, what we end up with is a nice linear method lookup path with no side-stepping into singleton classes and included mixins.
Now, the method lookup algorithm is just this:
def lookup(meth, obj)
c = obj->class
until res = c->methods[meth]
c = c->superclass
raise MethodNotFound, meth if c.nil?
end
res
end
Nice and clean and lean and fast.
As a trade-off, finding out the class of an object or the superclass of a class is slightly more difficult, because you can't simply return the class or superclass pointer, you have to walk the chain until you find a class that is not hidden. But how often do you call Object#class
or Class#superclass
? Do you even call it at all, outside of debugging?
Unfortunately, Module#prepend
doesn't fit cleanly into this picture. And Refinements really mess things up, which is why many Ruby implementations don't even implement them.
How does Ruby solve the method lookup in 'diamond' mixin?
My understanding is that:
include
andprepend
establishes a partial relation (ancestor relation) among the relevant modules/classes at each step, and- An ancestor relation cannot be contradicted in a later step.
Let us start with:
module M1; end
module M2; end
module M3; end
class Base; end
and follow each step. The first three steps should be trivial:
Base.ancestors
#=> [Base, Object, Kernel, BasicObject]
M1.prepend M3
M1.ancestors
# => [M3, M1]
M2.prepend M3
M2.ancestors
#=> [M3, M2]
Now, your crucial first step is Base.include M1
. This inserts the ancestors of M1
(the whole [M3, M1]
chunk) just to the right of Base
before Object
:
Base.include M1
Base.ancestors
#=> [Base, M3, M1, Object, Kernel, BasicObject]
The next step is Base.prepend M2
. This attempts to insert the ancestors of M2
(the whole [M3, M2]
chunk) just to the left of Base
. But notice that that would cause a contradictory relation between Base
and M3
.
Base.prepend M2
Base.ancestors
#=> Cannot be [M3, M2, Base, M3, M1, Object, Kernel, BasicObject]
Since it has already been established that M3
appears on the right side of Base
, the best it can do to place [M3, M2]
is to place it on the right side of Base
:
Base.prepend M2
Base.ancestors
#=> [Base, M3, M2, M1, Object, Kernel, BasicObject]
It may appear that placing M2
on the right side of Base
contradicts the intention of Base.prepend M2
. But that can be canceled/modified to fit on the spot, whereas an already established relation between Base
and M3
cannot be canceled in a later spot.
In fact, when there is no way to satisfy the relations already established, then an error is raised:
module M4; end
module M5; end
M4.include M5
M5.include M4 #>> ArgumentError: cyclic include detected
objects' method lookup path (modules & method override)
I'm not 100% I understand your question but I try my best...
Ruby actually remembers which modules are included into a class and merges these modules into the lookup path for methods. You can ask a class for its included methods my using A.included_modules
Methods from included modules are put on top of the modules defined in the current class. Please see this example:
class X
def foo
puts 'Hi from X'
end
end
module AnotherFoo
def foo
puts "Hi from AnotherFoo"
super
end
end
class Y < X
include AnotherFoo
end
y = Y.new
y.foo
# Hi from another foo
# Hi from X
class Y < X
include AnotherFoo
def foo
puts "Hi from Y"
super
end
end
y.foo
# Hi from Y
# Hi from another foo
# Hi from X
You can see the method resolution order: the child class -> the included modules -> the parent class. You can also see that modules are always included only once. So if a class already includes a module, it will not be re-included again.
Nested singleton class method lookup
Much of this explanation is based on How Ruby Method Dispatch Works by James Coglan, a little of the Ruby Hacking Guide, and just a smidge of source.
To begin with a summary, the ancestry looks like this:
+----------------+
| |
+--------------------------- Module ~~~~~~~~~~~~~~> #<Class:Module> |
| ^ ^ |
| | | |
| Class ~~~~~~~~~~~~~~~> #<Class:Class> |
| ^ ^ |
| | | |
| BasicObject ~~~~~> #<Class:BasicObject> ~~> #<Class:#<Class:BasicObject>> |
| ^ ^ ^ |
| | Kernel | | |
| | ^ | | |
| | | | +-----------------------|----------------+
| +-----+----+ | | |
| | | v |
+-------> Object ~~~~~~> #<Class:Object> ~~~~~~~~> #<Class:#<Class:Object>>
^ ^ ^
| | |
Foo ~~~~~~~~> #<Class:Foo> ~~~~~~~~~~> #<Class:#<Class:Foo>>
---> Parent
~~~> Singleton class
Let's start from the beginning and build out. BasicObject
is the root of everything - if you check BasicObject.superclass
, you get nil
. BasicObject
is also an instance of Class
. Yes, that gets circular, and there's a special case in the code to deal with it. When A
is an instance of B
, A.singleton_class
is a child of B
, so we get this:
Class
^
|
BasicObject ~~~~~> #<Class:BasicObject>
Object
inherits from BasicObject
. When A
inherits from B
, A
is a child of B
and A.singleton_class
is a child of B.singleton_class
. Object
also includes Kernel
. When A
includes B
, B
is inserted as the first ancestor of A
(after A
itself, but before A.superclass
).
Class
^
|
BasicObject ~~~~~> #<Class:BasicObject
^ ^
| Kernel |
| ^ |
| | |
+-----+----+ |
| |
Object ~~~~~~> #<Class:Object>
Kernel
is an instance of Module
. It's the only instance of Module
we'll see, and its singleton class doesn't appear in any ancestry chains, so I won't draw beyond it.
Now we get down to Foo
, which inherits from Object
(though you don't need to write < Object
). We can already figure out what Foo
and its singleton class are children of.
Class
^
|
BasicObject ~~~~~> #<Class:BasicObject>
^ ^
| Kernel |
| ^ |
| | |
+-----+----+ |
| |
Object ~~~~~~> #<Class:Object>
^ ^
| |
Foo ~~~~~~~~> #<Class:Foo>
Now Class
inherits from Module
, and Module
inherits from Object
, so add Module
and the appropriate singleton classes. Because Module < Object
and Object < BasicObject
and BasicObject.instance_of?(Class)
, this is where the drawing gets a little funky. Remember you just stop traversing upwards whenever you hit BasicObject
.
+----------------+
| |
+--------------------------- Module ~~~~~~~~~~~~~~> #<Class:Module> |
| ^ ^ |
| | | |
| Class ~~~~~~~~~~~~~~~> #<Class:Class> |
| ^ |
| | |
| BasicObject ~~~~~> #<Class:BasicObject> |
| ^ ^ |
| | Kernel | |
| | ^ | |
| | | | +----------------------------------------+
| +-----+----+ | |
| | | v
+-------> Object ~~~~~~> #<Class:Object>
^ ^
| |
Foo ~~~~~~~~> #<Class:Foo>
Related Topics
Bundle Install Doesn't Work from Capistrano
How to Use One Line Regular Expression to Get Matched Content
Rbenv Can't Change Global Ruby Version
Rendering Partials/View in a Rake Task/Background Job/Model in Rails 4
Restarting Unicorn with Usr2 Doesn't Seem to Reload Production.Rb Settings
Failed to Build JSON (1.6.3) with Native Extension After I Installed Xcode 4.2
Puts Doesn't Print Stuff to Console
Creating an Md5 Hash of a Number, String, Array, or Hash in Ruby
No Database Connection in Rails Console
Simple_Form: Remove Outer Label for an Inline Checkbox with Label
Ruby Class with Static Method Calling a Private Method
Generate a Migration File from Schema.Rb
How to Make Rails 3 Assets Precompile Faster