Good Explanation of Ruby Object Model -- Mainly, 'Classes Are Objects'

What comes first in Ruby's object model?

Here is a complete Ruby Class Diagram (for Ruby 1.8): http://banisterfiend.wordpress.com/2008/11/25/a-complete-ruby-class-diagram/

To help you understand the strange seemingly impossible self-reflexive nature of the class diagram it is useful to know that Class pointers and Super class pointers can be assigned at any time in the C API. That is, you can create an object (in the C API) and after-the-fact decide what the Class and Super class pointers point to.

Also, to understand the order of definition, look at Init_Object() in object.c (in Ruby 1.9)

rb_cBasicObject = boot_defclass("BasicObject", 0);
rb_cObject = boot_defclass("Object", rb_cBasicObject);
rb_cModule = boot_defclass("Module", rb_cObject);
rb_cClass = boot_defclass("Class", rb_cModule);

metaclass = rb_make_metaclass(rb_cBasicObject, rb_cClass);
metaclass = rb_make_metaclass(rb_cObject, metaclass);
metaclass = rb_make_metaclass(rb_cModule, metaclass);
metaclass = rb_make_metaclass(rb_cClass, metaclass);
boot_defmetametaclass(rb_cModule, metaclass);
boot_defmetametaclass(rb_cObject, metaclass);
boot_defmetametaclass(rb_cBasicObject, metaclass);

Where rb_cBasicObject is BasicObject in Ruby, rb_cObject is Object in Ruby, and so on.

Ruby Object Model - ancestors of a class

The class A is an instance of Class, and you can see that via A.class

The class of an instance of A is A, and you access that via a = A.new; a.class

The method ancestors is showing the class hierarchy that an object of that class has (or would have) as its inheritance.

There are two parallel class hierarchy models going on in your example, and they only impinge on each other because Ruby represents its classes as objects for you to inspect and modify.

There is no fundamental reason to need A.class.ancestors and A.ancestors to intersect at all - except Ruby also has a deep class model with simple roots, so in practice that is what you'll see.

In fact I couldn't find any counter-example, even nil does this:

NilClass.ancestors
=> [NilClass, Object, Kernel, BasicObject]

NilClass.class.ancestors
=> [Class, Module, Object, Kernel, BasicObject]

This one is more enlightening though:

BasicObject.ancestors
=> [BasicObject]

BasicObject.class.ancestors
=> [Class, Module, Object, Kernel, BasicObject]

Ruby's Classes and Objects

You have several things wrong.

  • (2.) The reason Class, Module, Object, and Basic Object are instances of Class is not because Class < Module < Object < BasicObject. It has nothing to do with it.
  • (3.) (Object.new).instance_of? Class returns false not because Object.new is an instance of Object. It is because it is not an instance of Class.
  • Class.is_a? Object is true not because [the mentioned] Class is a subclass of Object. It is because (the mentioned) Class is an instance of Class (which is not mentioned), which is a subclass of Object.

The answer to the question is: Object.is_a? Class returns true because Object is an instance of Class.

If you want to know the class of an instance, use instance_of? or class methods.

3.is_a?(Object) # => false
3.is_a?(Fixnum) # => true
3.class # => Fixnum

Ruby craziness: Class vs Object?

Basically the key thing to understand is that every class is an instance of the Class class and every class is a subclass of Object (in 1.8 - in 1.9 every class is a subclass of BasicObject). So every class is an object in the sense that it is an instance of a subclass of Object, i.e. Class.

Of course this means that Class is an instance of itself. If that makes your brain hurt, just don't think about it too deeply.

Object and Class are is_a? Object

x.is_a? y returns true if x.class == y or x.class < y, i.e. if x's class is y or x's class inherits from y. Since every class inherits from object x.is_a? Object returns true no matter what x is. (In 1.8 anyway, in 1.9 there's also BasicObject which is now the most basic class in the inheritance hierarchy).

They are also is_a? Class

Both Object and Class are indeed classes, so that should not be surprising.

They are also instance_of? Class, but not instance_of? Object.

Unlike is_a?, x.instance_of? y only returns true if x.class == y, not if x.class is a subclass of y. So since both x and y are instance_of? Class, they're not instance_of? Object.

right, nothing can be instance of object.

That's not true. Object.new.instance_of? Object is true.

kind_of?

kind_of? is an alias for is_a?, so see above.

So both are exactly same, then why do we have both these.?

It should be pointed out that everything up to now is true for all classes. E.g. String.is_a? Object, String.is_a? Class and String.instance_of? Class are true and String.instance_of? Object is false for the same reasons as above. (Also String.is_a? String and String.instance_of? String are both false for the same reasons - String is a class, not a string).

You can not conclude from this that all the classes are the same. They're just all instances of the same class.

Comparing methods

Since both Object and Class are classes, they both have all the instance methods defined by Class. Class additionally has the singleton method nesting. nesting tells you which module you're currently nested in, it has nothing to do with inheritance.

For any given class TheClass.methods will return the instance methods defined by Class (e.g. superclass, which returns the class which TheClass inherits from, and new which creates a new instance of TheClass) plus the singleton methods defined by that class.

Anyway methods only tells you which methods can be called directly on a given object. It does not tell you which methods can be called on an instance of a class. For that you can use instance_methods, which returns significantly different results for Object and Class.

Confused with objects in Ruby

Here's the cool thing, 100 is an object too. So a pointed to the constant object 100 and when you assigned b = a you assigned b to the pointer to the object 100 not to the pointer for a or even a value of 100 held by a.

To see this is true try this:

irb> puts 100.object_id
=> 201

Cool huh?

Here's a little explanation of the Ruby object model I wrote up a while back. It's not specific to your question but adds a little more knowledge on how the Ruby object model works:

When you create an instance of an object what you have created is a new object with a set of instance variables and a pointer to the class of the object (and a few other things like an object ID and a pointer to the superclass) but the methods themselves are not in the instance of the object. The class definition contains the list of methods and their code (and a pointer to its own class, a pointer to its superclass, and an object ID).

When you call a method on an instance Ruby looks up the class of the instance and looks in that class's method list for the method you called. If it doesn't find it then it looks in the class' superclass. If it doesn't find it there it looks in that class' superclass until it runs out of superclasses. Then it goes back to the first class and looks for a method_missing method. If it doesn't find one it goes to the superclass and so on till it gets to the root object where it's designed to raise an error.

Let's say for instance you have a class Person and you make an instance of the class with the variable bubba like this:

class Person
attr_accessor :dob, :name
def age
years = Time.now.year - @dob.year
puts "You are #{years} year#{"s" if years != 1} old"
end
def feed
puts "nom, nom, nom"
end
end
bubba = Person.new
bubba.name = "Bubba"
bubba.dob = Time.new(1983,9,26)

The class diagram would look something like this:
alt text

So what's happening when you create a static method, a class/module method? Well, remember that almost everything is an object in Ruby and a module definition is an instance of the class Class. Yep, that code you type out is actually an instance too, it's live code. When you create a class method by using def self.method_name you are creating a method in the instance of the object that is the class/module definition.

Great, so where's that class method being defined at you ask? It's being defined in an anonymous class (aka singleton, eigen, ghost class) that is created for exactly this reason.

Going back to our Person class what if we add a class method on the instance bubba like so:

def bubba.drive_pickup
puts "Yee-haw!"
end

That method gets put into a special singleton class created just for that instance and the singleton's superclass is now the Person class. This makes our method calling chain look like this:
alt text

Any other methods defined on the instance object bubba will also be put into that singleton class. There's never more than one singleton class per instance object.

So, to wrap it all up the reason why it doesn't work is the static methods in the modules are being defined in the singleton class for the instance of the module definition. When you include or extend from the module you are adding a pointer to the method table of the module but not the method table of the instance object of the singleton class for the module.

Think of it this way: If you create an instance x of type Z and an instance y of type Z should x know about y? No, not unless specifically told about it. So too your module that mixes in another module should not know about some other object that just happens to have that first module as its superclass.

For a much better explanation of the Ruby object model watch this awesome free video by the amazingly erudite Dave Thomas (no, not the guy from Wendy's):

http://scotland-on-rails.s3.amazonaws.com/2A04_DaveThomas-SOR.mp4

After watching that video I bought Dave Thomas's whole series on the Ruby object model from Pragmatic and it was well worth it.

P.S. Anyone please feel free to correct me on anything I forgot; like what's specifically in an object.

Difference between a class and a module

The first answer is good and gives some structural answers, but another approach is to think about what you're doing. Modules are about providing methods that you can use across multiple classes - think about them as "libraries" (as you would see in a Rails app). Classes are about objects; modules are about functions.

For example, authentication and authorization systems are good examples of modules. Authentication systems work across multiple app-level classes (users are authenticated, sessions manage authentication, lots of other classes will act differently based on the auth state), so authentication systems act as shared APIs.

You might also use a module when you have shared methods across multiple apps (again, the library model is good here).

In Ruby or Python can the very concept of Class be rewritten?

I agree with Samir that it just sounds like duck typing. You don't need to care what 'type' an object really 'is' you only need bother with what an object can 'do'. This is true in both Ruby and Python.

However if you really are checking the types of classes and you really do need to have a Poodle object optionally also be a FurCoat at runtime, then the way to do this in Ruby is to mixin a FurCoat module into the Poodle object, as follows:

class Poodle; end
module FurCoat; def wear; end; end

my_poodle = Poodle.new
my_poodle.is_a?(Poodle) #=> true
my_poodle.is_a?(FurCoat) #=> false
my_poodle.wear #=> NoMethodError

# now we mix in the FurCoat module
my_poodle.extend(FurCoat)

# my_poodle is now also a FurCoat
my_poodle.is_a?(Poodle) #=> true (still)
my_poodle.is_a?(FurCoat) #=> true
my_poodle.wear #=> the wear method now works

EDIT (due to your updated question):

You still do not need to rewrite Class to achieve what you want, you just need to monkey-patch the kind_of? and is_a? (and potentially instance_of?) methods on Ruby's Kernel module. Since Ruby has open classes this is easily done:

class Module
def obj_implements_interface?(obj)
false
end
end

module Kernel
alias_method :orig_is_a?, :is_a?

def is_a?(klass)
orig_is_a?(klass) || klass.obj_implements_interface?(self)
end
end

And then define for each class (or module) what it means for an object to implement its interface:

class Dog
def self.obj_implements_interface?(obj)
obj.respond_to?(:bark) && obj.respond_to?(:num_legs) && obj.num_legs == 4
end
end

module FurCoat
def self.obj_implements_interface?(obj)
obj.respond_to?(:wear)
end
end

Now test it:

my_poodle = Poodle.new
my_poodle.is_a?(FurCoat) #=> false

# now define a wear method on my_poodle
def my_poodle.wear; end
my_poodle.is_a?(FurCoat) #=> true

What happens when a method is used on an object created from a built in class?

Hence, when calling a method on an object you are actually "talking" to that object, inspecting and using its attributes that are stored in its instance variables. All good for now.

No, that is very much not what you are doing in an Object-Oriented Program. (Or really any well-designed program.)

What you are describing is a break of encapsulation, abstraction, and information hiding. You should never inspect and/or use another object's instance variables or any of its other private implementation details.

In Object-Orientation, all computation is performed by sending messages between objects. The only thing you can do is sending messages to objects and the only thing you can observe about an object is the responses to those messages.

Only the object itself can inspect and use its attributes and instance variables. No other object can, not even objects of the same type.

If you send an object a message and you get a response, the only thing you know is what is in that response. You don't know how the object created that response: did the object compute the answer on the fly? Was the answer already stored in an instance variable and the object just responded with that? Did the object delegate the problem to a different object? Did it print out the request, fax it to a temp agency in the Philippines, and have a worker compute the answer by hand with pen and paper? You don't know. You can't know. You mustn't know. That is at the very heart of Object-Orientation.

This is, BTW, exactly how messaging works in real-life. If you send someone a message asking "what is π²" and they answer with "9.8696044011", then you have no idea whether they computed this by hand, used a calculator, used their smart phone, looked it up, asked a friend, or hired someone to answer the question for them.

You can imagine objects as being little computers themselves: they have internal storage, RAM, HDD, SSD, etc. (instance variables), they have code running on them, the OS, the basic system libraries, etc. (methods), but one computer cannot read another computer's RAM (access its instance variables) or run its code (execute its methods). It can only send it a request over the network and look at the response.

So, in some sense, your question is meaningless: from the point of view of Object-Oriented Abstraction, is should be impossible to answer your question, because it should be impossible to know how an object is implemented internally.

It could use instance variables, or it could not. It could be implemented in Ruby, or it could be implemented in another programming language. It could be implemented as a standard Ruby object, or it could be implemented as some secret internal private part of the Ruby implementation.

In fact, it could even not exist at all! (For example, in many Ruby implementations small integers do not actually exist as objects at all. The Ruby implementation will just make it look like they do.)

My question is, when creating an object from a built in class (String, Array, Integer...), are we actually storing some information on some instance variables for that object during its creation?

[…] [W]hat happens when we call something like string.upcase, how does the #upcase method "work" on string? I guess that in order to return the string in uppercase, the string object previously declared has some instance variables attached to, and the instances methods work on those variables?

There is nothing in the Ruby Language Specification that says how the String#upcase method is implemented. The Ruby Language Specification only says what the result is, but it doesn't say anything about how the result is computed.

Note that this is not specific to Ruby. This is how pretty much every programming language works. The Specification says what the results should be, but the details of how to compute those results is left to the implementor. By leaving the decision about the internal implementation details up to the implementor, this frees the implementor to choose the most efficient, most performant implementation that makes sense for their particular implementation.

For example, in the Java platform, there are existing methods available for converting a string to upper case. Therefore, in an implementation like TruffleRuby, JRuby, or XRuby, which sits on top of the Java platform, it makes sense to just call the existing Java methods for converting strings to upper case. Why waste time implementing an algorithm for converting strings to upper case when somebody else has already done that for you? Likewise, in an implementation like IronRuby or Ruby.NET, which sit on top of the .NET platform, you can just use .NET's builtin methods for converting strings to upper case. In an implementation like Opal, you can just use ECMAScript's methods for converting strings to upper case. And so on.

Unfortunately, unlike many other programming languages, the Ruby Language Specification does not exist as a single document in a single place. Ruby does not have a single formal specification that defines what certain language constructs mean.

There are several resources, the sum of which can be considered kind of a specification for the Ruby programming language.

Some of these resources are:

  • The ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification – Note that the ISO Ruby Specification was written around 2009–2010 with the specific goal that all existing Ruby implementations at the time would easily be compliant. Since YARV only implements Ruby 1.9+ and MRI only implements Ruby 1.8 and lower, this means that the ISO Ruby Specification only contains features that are common to both Ruby 1.8 and Ruby 1.9. Also, the ISO Ruby Specification was specifically intended to be minimal and only contain the features that are absolutely required for writing Ruby programs. Because of that, it does for example only specify Strings very broadly (since they have changed significantly between Ruby 1.8 and Ruby 1.9). It obviously also does not specify features which were added after the ISO Ruby Specification was written, such as Ractors or Pattern Matching.
  • The Ruby Spec Suite aka ruby/spec – Note that the ruby/spec is unfortunately far from complete. However, I quite like it because it is written in Ruby instead of "ISO-standardese", which is much easier to read for a Rubyist, and it doubles as an executable conformance test suite.
  • The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto – This book was written by David Flanagan together with Ruby's creator matz to serve as a Language Reference for Ruby.
  • Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler – This book was the first English book about Ruby and served as the standard introduction and description of Ruby for a long time. This book also first documented the Ruby core library and standard library, and the authors donated that documentation back to the community.
  • The Ruby Issue Tracking System, specifically, the Feature sub-tracker – However, please note that unfortunately, the community is really, really bad at distinguishing between Tickets about the Ruby Programming Language and Tickets about the YARV Ruby Implementation: they both get intermingled in the tracker.
  • The Meeting Logs of the Ruby Developer Meetings.
  • New features are often discussed on the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists.
  • The Ruby documentation – Again, be aware that this documentation is generated from the source code of YARV and does not distinguish between features of Ruby and features of YARV.
  • In the past, there were a couple of attempts of formalizing changes to the Ruby Specification, such as the Ruby Change Request (RCR) and Ruby Enhancement Proposal (REP) processes, both of which were unsuccessful.
  • If all else fails, you need to check the source code of the popular Ruby implementations to see what they actually do.

For example, this is what the ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification has to say about String#upcase:

15.2.10.5.42 String#upcase

upcase

  • Visibility: public
  • Behavior: The method returns a new direct instance of the class String which contains all the characters of the receiver, with all the lower-case characters replaced with the corresponding upper-case characters.

As you can see, there is no mention of instances variables or really any details at all about how the method is implemented. It only specifies the result.

If a Ruby implementor wants to use instance variables, they are allowed to use instances variables, if a Ruby implementor doesn't want to use instance variables, they are allowed to do that, too.

If you check the Ruby Spec Suite for String#upcase, you will find specifications like these (this is just an example, there are quite a few more):

describe "String#upcase" do
it "returns a copy of self with all lowercase letters upcased" do
"Hello".upcase.should == "HELLO"
"hello".upcase.should == "HELLO"
end

describe "full Unicode case mapping" do
it "works for all of Unicode with no option" do
"äöü".upcase.should == "ÄÖÜ"
end

it "updates string metadata" do
upcased = "aßet".upcase

upcased.should == "ASSET"
upcased.size.should == 5
upcased.bytesize.should == 5
upcased.ascii_only?.should be_true
end
end
end

Again, as you can see, the Spec only describes results but not mechanisms. And this is very much intentional.

The same is true for the Ruby-Doc documentation of String#upcase:

upcase(*options)string

Returns a string containing the upcased characters in self:

s = 'Hello World!' # => "Hello World!"
s.upcase # => "HELLO WORLD!"

The casing may be affected by the given options; see Case Mapping.

There is no mention of any particular mechanism here, nor in the linked documentation about Unicode Case Mapping.

All of this only tells us how String#upcase is specified and documented, though. But how is it actually implemented? Well, lucky for us, the majority of Ruby implementations are Free and Open Source Software, or at the very least make their source code available for study.

In Rubinius, you can find the implementation of String#upcase in core/string.rb lines 819–822 and it looks like this:

def upcase
str = dup
str.upcase! || str
end

It just delegates the work to String#upcase!, so let's look at that next, it is implemented right next to String#upcase in core/string.rb lines 824–843 and looks something like this (simplified and abridged):

def upcase!
return if @num_bytes == 0

ctype = Rubinius::CType

i = 0
while i < @num_bytes
c = @data[i]
if ctype.islower(c)
@data[i] = ctype.toupper!(c)
end
i += 1
end
end

So, as you can see, this is indeed just standard Ruby code using instance variables like @num_bytes which holds the length of the String in platform bytes and @data which is an Array of platform bytes holding the actual content of the String. It uses two helper methods from the Rubinius::CType library (a library for manipulating individual characters as byte-sized integers). The "actual" conversion to upper case is done by Rubinius::CType::toupper!, which is implemented in core/ctype.rb and is extremely simple (to the point of being simplistic):

def self.toupper!(num)
num - 32
end

Another very simple example is the implementation of String#upcase in Opal, which you can find in opal/corelib/string.rb and looks like this:

def upcase
`self.toUpperCase()`
end

Opal is an implementation of Ruby for the ECMAScript platform. Opal cleverly overloads the Kernel#` method, which is normally used to spawn a sub shell (which doesn't exist in ECMAScript) and execute commands in the platform's native command language (which on the ECMAScript platform arguably is ECMAScript). In Opal, Kernel#` is instead used to inject arbitrary ECMAScript code into Ruby.

So, all that `self.toUpperCase()` does, is call the String.prototype.toUpperCase method on self, which does work because of how the String class is defined in Opal:

class ::String < `String`

In other words, Opal implements Ruby's String class by simply inheriting from ECMAScript's String "class" (really the String Constructor function) and is therefore able to very easily and elegantly reuse all the work that has been done implementing Strings in ECMAScript.

Another very simple example is TruffleRuby. Its implementation of String#upcase can be found in src/main/ruby/truffleruby/core/string.rb and looks like this:

def upcase(*options)
s = Primitive.dup_as_string_instance(self)
s.upcase!(*options)
s
end

Similar to Rubinius, String#upcase just delegates to String#upcase!, which is not surprising since TruffleRuby's core library was originally forked from Rubinius's. This is what String#upcase! looks like:

def upcase!(*options)
mapped_options = Truffle::StringOperations.validate_case_mapping_options(options, false)
Primitive.string_upcase! self, mapped_options
end

The Truffle::StringOperations::valdiate_case_mapping_options helper method is not terribly interesting, it is just used to implement the rather complex rules for what the Case Mapping Options that you can pass to the various String methods are allowed to look like. The actual "meat" of TruffleRuby's implementation of String#upcase! is just this: Primitive.string_upcase! self, mapped_options.

The syntax Primitive.some_name was agreed upon between the developers of multiple Ruby implementations as "magic" syntax within the core of the implementation itself to be able to call out from Ruby code into "primitives" or "intrinsics" that are provided by the runtime system, but are not necessarily implemented in Ruby.

In other words, all that Primitive.string_upcase! self, mapped_options tells us is "there is a magic function called string_upcase! defined somewhere deep in the bowels of TruffleRuby itself, which knows how to convert a string to upper case, but we are not supposed to know how it works".

If you are really curious, you can find the implementation of Primitive.string_upcase! in src/main/java/org/truffleruby/core/string/StringNodes.java.



Related Topics



Leave a reply



Submit