Ruby Variable as Same Object (Pointers)

ruby variable as same object (pointers?)

class Ref
  def initialize val
    @val = val
  end

  attr_accessor :val

  def to_s
    @val.to_s
  end
end

a = Ref.new(4)
b = a

puts a   #=> 4
puts b   #=> 4

a.val = 5

puts a   #=> 5
puts b   #=> 5

When you do b = a, b points to the same object as a (they have the same object_id).

When you do a = some_other_thing, a will point to another object, while b remains unchanged.

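For Fixnum (part of Integer since Ruby 2.4), nil, true and false, you cannot change the value without changing the object_id: these objects are immutable, so a different value is always a different object. Other objects (strings, arrays, hashes, etc.) can be changed in place without changing their object_id, as long as you call a mutating method rather than use assignment (=).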
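The difference between mutating and assigning can be checked directly with equal? and object_id. This is a minimal sketch; the variable names are only illustrative, and String.new is used so that the string is unfrozen regardless of any frozen_string_literal setting.

```ruby
a = String.new('abcd')   # an unfrozen string object
b = a                    # b refers to the same object as a

same_before = a.equal?(b)            # equal? compares identity, not content
id_before   = a.object_id

a << 'e'                             # mutation: same object, new content
id_after_mutation = a.object_id

a = a + 'f'                          # assignment: a now refers to a new object
same_after = a.equal?(b)

puts same_before                     #=> true
puts id_before == id_after_mutation  #=> true
puts same_after                      #=> false
puts b                               #=> abcde

# Integers, nil, true and false are always frozen, so they cannot be mutated
puts 1.frozen?                       #=> true
puts nil.frozen?                     #=> true
```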

Example with strings:

a = 'abcd'
b = a

puts a  #=> abcd
puts b  #=> abcd

a.upcase!          # changing a

puts a  #=> ABCD
puts b  #=> ABCD

a = a.downcase     # assigning a

puts a  #=> abcd
puts b  #=> ABCD

Example with arrays:

a = [1]
b = a

p a  #=> [1]
p b  #=> [1]

a << 2            # changing a

p a  #=> [1, 2]
p b  #=> [1, 2]

a += [3]          # assigning a

p a  #=> [1, 2, 3]
p b  #=> [1, 2]

How to assign a pointer to a variable in Ruby

In Ruby, every variable is just a pointer/reference to an instance; there isn't anything else.

a = { foo: "Hello" }
b = a[:foo]
b.gsub!('e', 'a')
puts a.inspect
# => { foo: "Hallo" }

So in your example b is not assigned a copy of "Hello", it really is a reference to the same instance stored in the hash.

When you assign to a variable though, you are replacing the reference/pointer stored in the variable with something else.

What you seem to be trying to do is go one level deeper: you want a pointer to a reference (kind of a P** in C, I guess). In Ruby you can solve this with any kind of additional object that can store your value. How about an array?

a = { foo: [1] }
b = a[:foo]
b[0] = 2
puts a.inspect
# => { foo: [2] }

Object assignment and pointers

There are a lot of questions in this question. The main thing to know is assignment never makes a copy in ruby, but methods often return new objects instead of modifying existing objects. For immutable objects like Fixnums, you can ignore this, but for objects like arrays or Foo instances, to make a copy you must do bar.dup.

As for the array example, foo += is not concatenating onto the array stored in foo; to do that you would call foo.concat(['a']). Instead, it makes a new array and assigns it to foo. The documentation for the Array class mentions which methods mutate the array in place and which return a new array.
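To make the difference concrete, here is a small sketch that contrasts concat, += and dup. The variable names are placeholders, and note that dup is a shallow copy: the new array still refers to the same element objects.

```ruby
foo  = ['x']
bar  = foo         # bar and foo refer to the same array
copy = foo.dup     # shallow copy: a new array with the same elements

foo.concat(['a'])  # mutates in place, so bar sees the change
p bar              #=> ["x", "a"]
p copy             #=> ["x"]

foo += ['b']       # builds a new array and rebinds foo; bar is untouched
p foo              #=> ["x", "a", "b"]
p bar              #=> ["x", "a"]
p foo.equal?(bar)  #=> false
```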

Is it possible to use pointers in Ruby?

In Ruby, (almost) every variable is in fact a reference/pointer to an object, e.g.

a = [0, 1, 23]
b = a
a << 42
p b

will give [0, 1, 23, 42] because a and b are pointing to the same object.

So in fact, you are using pointers all the time.

If you want to do pointer arithmetic as in C, this is not possible with Ruby.

Ruby object prints out as pointer

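When you call new, you get a reference to the newly created object. puts calls to_s on that object, and the default to_s prints the class name plus an encoding of the object's identity (something like #<Adder:0x...>), which is why the output looks like a pointer. If you want information about the state of your object, you can define a getter method: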

class Adder
  def initialize(my_num)
    @my_num = my_num
  end
  def my_num
    @my_num
  end
end
y = Adder.new(12)
puts y.my_num  # => 12

Or you can use attr_accessor, which defines a getter and a setter method behind the scenes (attr_reader would define only the getter):

class Adder
  attr_accessor :my_num

  def initialize(my_num)
    @my_num = my_num
  end
end
y = Adder.new(12)
puts y.my_num  # => 12
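Another option, in the spirit of a getter, is to define to_s so that puts itself prints something readable. The class names PlainAdder and LabeledAdder below are made up for this sketch, and the exact hex digits printed for the plain object will vary.

```ruby
class PlainAdder
  def initialize(my_num)
    @my_num = my_num
  end
end

class LabeledAdder
  attr_reader :my_num

  def initialize(my_num)
    @my_num = my_num
  end

  # puts calls to_s on its argument, so this controls the printed text
  def to_s
    "Adder(#{@my_num})"
  end
end

puts PlainAdder.new(12)     # prints something like #<PlainAdder:0x000...>
puts LabeledAdder.new(12)   #=> Adder(12)
```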

How does Ruby differentiate VALUE with value and pointer?

There are many different implementations of Ruby. The Ruby Language Specification doesn't prescribe any particular internal representation for objects – why should it? It's an internal representation, after all!

For example, JRuby doesn't represent objects as C pointers at all, it represents them as Java objects. IronRuby represents them as .NET objects. Opal represents them as ECMAScript objects. MagLev represents them as Smalltalk objects.

However, there are indeed some implementations that use the strategy you describe. The now abandoned MRI did it that way, YARV and Rubinius also do it.

This is actually a very old trick, dating back to at least the 1960s. It's called a tagged pointer representation, and like the name suggests, you need to tag the pointer with some additional metadata in order to know whether or not it is actually a pointer to an object or an encoding of some other datatype.

Some CPUs have special tag bits specifically for that purpose. (For example, on the AS/400, the CPU doesn't even have pointers, it has 128bit object references, even though the original CPU was only 48bit wide, and the newer POWER-based CPUs 64 bit; the extra bits are used to encode all sorts of metadata like type, owner, access restrictions, etc.) Some CPUs have tag bits for other purposes that can be "abused" for this purpose. However, most modern mainstream CPUs don't have tag bits.

But, you can use a trick! On many modern CPUs, unaligned memory accesses (accessing an address that does not start at a word boundary) are really slow (on some, they aren't even possible at all), which means that on a 32bit CPU, all pointers that are realistically being used, end with two 00 bits and on 64 bit CPUs with three 000 bits. You can use these bits as tag bits: pointers that end with 00 are indeed pointers, pointers that end with 01, 10, or 11 are an encoding of some other data type.

In MRI, the pointers ending in 1 were used to encode 31/63 bit Fixnums. In YARV, they are used to encode 31/63 bit Fixnums, i.e. integers that are encoded as actual machine integers according to the formula 2n+1 (arithmetically speaking) or (n << 1) | 1 (as a bit pattern). On 64 bit platforms, YARV also uses pointers that end in 10 to encode 62 bit flonums using a similar scheme. (If you ever wondered why the object_id of a Fixnum in YARV is 2n+1, now you know: YARV uses the memory address for the object ID, and 2n+1 is the "memory address" of n.)
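The 2n+1 formula can be imitated on ordinary Integers. The helpers below (encode_fixnum, fixnum_tag?, decode_fixnum) are hypothetical names for this sketch and only model the arithmetic; they do not touch the VM's real representation.

```ruby
# Hypothetical helpers that mimic the (n << 1) | 1 tagging on plain Integers.
def encode_fixnum(n)
  (n << 1) | 1
end

# A tagged Fixnum is recognisable by its lowest bit being 1.
def fixnum_tag?(bits)
  (bits & 1) == 1
end

# Arithmetic right shift drops the tag bit; negative numbers survive the round trip.
def decode_fixnum(bits)
  bits >> 1
end

[0, 1, 42, -7].each do |n|
  bits = encode_fixnum(n)
  puts "#{n} -> #{bits} (tagged: #{fixnum_tag?(bits)}) -> #{decode_fixnum(bits)}"
end
# 0 -> 1 (tagged: true) -> 0
# 1 -> 3 (tagged: true) -> 1
# 42 -> 85 (tagged: true) -> 42
# -7 -> -13 (tagged: true) -> -7
```

On YARV you would expect 42.object_id to match encode_fixnum(42), but that is implementation-specific and other Ruby implementations are free to differ.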

Now, what about nil, false and true? Well, there is no space for them in our current scheme. However, the very low memory addresses are usually reserved for the operating system kernel, which means that a pointer like 0 or 2 or 4 cannot realistically occur in a program. YARV uses that space to encode nil, false and true: false is encoded as 0 (which is convenient because that's also the encoding of false in C), nil is encoded as 0b1000 and true is encoded as 0b10100 (in older versions, before the introduction of flonums, false was 0, true was 0b10 and nil was 0b100).
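Putting the pieces together, the decision the VM has to make for each bit pattern can be sketched as a small classifier. This is a simplified model of the 64-bit, flonum-era layout described above; classify and SPECIAL_CONSTANTS are made-up names, and encodings not mentioned in the text are ignored.

```ruby
# Hypothetical classifier for the bit patterns described above.
SPECIAL_CONSTANTS = {
  0b00000 => :false,
  0b01000 => :nil,
  0b10100 => :true
}.freeze

def classify(bits)
  # The reserved low "addresses" are checked first, since they end in 00 like real pointers.
  return SPECIAL_CONSTANTS[bits] if SPECIAL_CONSTANTS.key?(bits)
  return :fixnum if (bits & 0b1) == 0b1     # lowest bit set: tagged integer
  return :flonum if (bits & 0b11) == 0b10   # low bits 10: tagged float
  :pointer                                  # low bits 00: an aligned heap address
end

p classify(0)            #=> :false
p classify(0b1000)       #=> :nil
p classify(0b10100)      #=> :true
p classify(85)           #=> :fixnum
p classify(0b110)        #=> :flonum
p classify(0x7f00_1000)  #=> :pointer
```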

Theoretically, there is a lot of space there to encode other objects as well, but YARV doesn't do that. Some Smalltalk or Lisp VMs, for example, encode ASCII or BMP Unicode character objects there, or some often used objects such as the empty list, empty array, or empty string.

There is still some piece missing, though: without an object header, with just the bare bit pattern, how can the VM access the class, the methods, the instance variables, etc.? Well, it can't. Those have to be special-cased and hardcoded into the VM. The VM simply has to know that a pointer ending in 1 is an encoded Fixnum and has to know that the class is Fixnum and the methods can be found there. And as for instance variables? Well, you could store them separately from the objects in a dictionary on the side. Or you go the Ruby route and simply disallow them altogether.

Ruby and pointers compared to other languages?

In general, Ruby variables hold references to objects rather than the values themselves. When you assign:

a = [3]
b = a

b references the same array as a. It's the same with strings.

Worth reading:
https://robertheaton.com/2014/07/22/is-ruby-pass-by-reference-or-pass-by-value/

EDIT:

a = [1]
b = a
b
# => [1]
a = [2]
a
# => [2]
b
# => [1]

Pointer in Ruby

All ruby variable references are essentially pointers (but not pointers-to-pointers), in C parlance.

You can mutate an object (assuming it's not immutable), and all variables that reference it will thus be pointing at the same (now mutated) object. But the only way to change which object a variable is referring to is with direct assignment to that variable -- and each variable is a separate reference; you can't alias a single reference with two names.
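Method arguments show this clearly: a method can mutate the object it was handed, but assigning to its parameter does not affect the caller's variable. The method names mutate and rebind are invented for this sketch.

```ruby
def mutate(list)
  list << :mutated      # changes the object the caller also refers to
end

def rebind(list)
  list = [:replaced]    # only changes the local variable inside the method
  list
end

items = [:original]
mutate(items)
p items                 #=> [:original, :mutated]

result = rebind(items)
p result                #=> [:replaced]
p items                 #=> [:original, :mutated]
```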
