Ruby Symbol#To_Proc Leaks References in 1.9.2-P180

Ruby Symbol#to_proc leaks references in 1.9.2-p180?

Since a.map(&:foo) should be exactly equivalent to a.map { |x| x.foo }, it looks like you have hit a genuine bug in Ruby here. It cannot hurt to file a bug report at http://redmine.ruby-lang.org/; the worst that can happen is that it gets ignored. You can reduce the chances of that by providing a patch for the issue.
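A minimal sketch of the equivalence I mean, as far as return values go (the Item class is made up for illustration, and this says nothing about the leak itself):

```ruby
# Hypothetical class, only here to have a method to call.
class Item
  def foo
    "foo"
  end
end

items = 3.times.map { Item.new }

# Symbol#to_proc shorthand ...
with_symbol = items.map(&:foo)
# ... and the explicit block it stands for.
with_block = items.map { |x| x.foo }

puts with_symbol == with_block
```

Both forms should produce the same array; the difference discussed here is only in what the interpreter keeps alive afterwards.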

EDIT: I fired up IRB and tried your code. I can reproduce the issue you describe on ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]. However, explicitly calling to_proc on the symbol does not suffer from the same problem:

irb(main):001:0> class C; def foo; end; end
=> nil
irb(main):002:0> a = 10.times.map { C.new }
=> [...]
irb(main):004:0> b = a.map(&:foo.to_proc)
=> [nil, nil, nil, nil, nil, nil, nil, nil, nil, nil]
irb(main):005:0> ObjectSpace.each_object(C){}
=> 10
irb(main):006:0> a = b = nil
=> nil
irb(main):007:0> GC.start
=> nil
irb(main):008:0> ObjectSpace.each_object(C){}
=> 0

It seems we are facing an issue with the implicit Symbol -> Proc conversion here. Maybe I will try to dive a bit into the Ruby source later. If so, I will keep you updated.

EDIT 2:

Simple workaround for the problem:

class Symbol
  def to_proc
    lambda { |x| x.send(self) }
  end
end

class C
  def foo; "foo"; end
end

a = 10.times.map { C.new }
b = a.map(&:foo)
p b
a = b = nil
GC.start
p ObjectSpace.each_object(C) {}

prints 0.

Explicitly yielding n values to Symbol#to_proc

Here's how you can do it:

class Enumerator
  def explicitly
    each { |e| yield(*e) }
  end
end

I executed your tests against this and the proper results are returned.

UPDATE: Changed to not capture the block explicitly.
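For illustration, here is how I would expect it to be used. This is only a sketch: the pairs array is made up and is not the test data from the question.

```ruby
class Enumerator
  # Splat each element so the block receives n separate arguments.
  def explicitly
    each { |e| yield(*e) }
  end
end

# Made-up sample data: each element is a two-item array.
pairs = [[1, 2], [3, 4]]

# pairs.map(&:+) would call [1, 2].+() with no argument and fail;
# going through #explicitly calls 1.+(2) and 3.+(4) instead.
sums = pairs.map.explicitly(&:+)
p sums
```

Calling map without a block returns an Enumerator, so explicitly ends up running map with a block that splats each pair before handing it to the symbol's proc.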

Chaining methods using Symbol#to_proc shorthand in Ruby?

No, there's no shorthand for that. You could define a method:

def really_empty?(x)
  x.strip.empty?
end

and use method:

array.reject(&method(:really_empty?))

or use a lambda:

really_empty = ->(x) { x.strip.empty? }
array.reject(&really_empty)

but I wouldn't call either of those better unless you have a use for really_empty? in enough places that splitting up the logic makes sense.

However, since you're using Rails, you could just use blank? instead of .strip.empty?:

array.reject(&:blank?)

Note that nil.blank? is true whereas nil.strip.empty? just hands you an exception so they're not quite equivalent; however, you probably want to reject nils as well so using blank? might be better anyway. blank? also returns true for false, {}, and [] but you probably don't have those in your array of strings.
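If you want to see the difference without loading Rails, here is a plain-Ruby sketch. The blank_like? helper is hypothetical and only approximates blank? for strings and nil; it is not how ActiveSupport implements it.

```ruby
# Hypothetical stand-in for blank?, covering only nil and strings.
def blank_like?(value)
  value.nil? || value.strip.empty?
end

strings = ["foo", "  ", "", nil, "bar "]

# nil is rejected along with empty and whitespace-only strings.
kept = strings.reject { |s| blank_like?(s) }
p kept

# The .strip.empty? version raises as soon as it meets nil.
begin
  strings.reject { |s| s.strip.empty? }
  raised = false
rescue NoMethodError
  raised = true
end
puts raised
```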

Finding the cause of a memory leak in Ruby

It looks like you are entering The Lost World here. I don’t think the problem is with the C bindings in racc either.

Ruby memory management is both elegant and cumbersome. It stores objects (named RVALUEs) in so-called heaps of approximately 16KB each. At a low level, an RVALUE is a C struct containing a union of the different standard Ruby object representations.

So, heaps store RVALUE objects, each of which is no larger than 40 bytes. For objects such as String, Array, Hash, etc., this means that small objects fit in the heap, but as soon as they exceed a threshold, extra memory is allocated outside of the Ruby heaps.

This extra memory is flexible; it will be freed as soon as the object is GC’ed. That’s why your test case with big_string shows the memory going up and then back down:

def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
          .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil 
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB

But the heaps themselves (see GC.stat[:heap_length]) are not released back to the OS once acquired. Look, I’ll make a humdrum change to your test case:

- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)

And, voilà:

# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB

The memory is no longer released back to the OS, because each element of the array I introduced fits within the RVALUE size and is stored in the Ruby heap.

If you examine the output of GC.stat after the GC has run, you’ll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for reuse.
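A rough way to watch this yourself, assuming MRI: the key names in GC.stat differ between Ruby versions (:heap_used may be called something else on yours), so this sketch just prints every key that mentions "heap". The heap_stats helper is my own name, not part of Ruby.

```ruby
# Hypothetical helper: pick out the heap-related counters from GC.stat.
# Key names vary between Ruby versions, so match on the name loosely.
def heap_stats
  GC.stat.select { |key, _value| key.to_s.include?("heap") }
end

before = heap_stats

# Allocate many small objects that live in the Ruby heap, then drop them.
strings = 100_000.times.map(&:to_s)
strings = nil
GC.start

after = heap_stats

# Print each counter before and after, for side-by-side comparison.
after.each do |key, value|
  puts "#{key}: #{before[key]} -> #{value}"
end
```

Which counters move, and by how much, depends on the Ruby version and the GC settings, so treat the output as a starting point rather than proof.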

To sum up: I don’t think the C code leaks. I think the problem lies in the base64 representation of a huge image in your CSS. I have no clue what’s happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.

Hope it helps.

How can I call a Proc that takes a block in a different context?

To solve this, you need to re-bind the Proc to the new class.

Here's your solution, leveraging some good code from Rails core_ext:

require 'rspec'

# Same as original post

class SomeClass
  def instance_method(x)
    "Hello #{x}"
  end
end

# Same as original post

class AnotherClass
  def instance_method(x)
    "Goodbye #{x}"
  end

  def make_proc
    Proc.new do |x, &block|
      instance_method(block.call(x))
    end
  end
end

### SOLUTION ###

# From activesupport lib/active_support/core_ext/kernel/singleton_class.rb

module Kernel
  # Returns the object's singleton class.
  def singleton_class
    class << self
      self
    end
  end unless respond_to?(:singleton_class) # exists in 1.9.2

  # class_eval on an object acts like singleton_class.class_eval.
  def class_eval(*args, &block)
    singleton_class.class_eval(*args, &block)
  end
end

# From activesupport lib/active_support/core_ext/proc.rb 

class Proc #:nodoc:
  def bind(object)
    block, time = self, Time.now
    object.class_eval do
      method_name = "__bind_#{time.to_i}_#{time.usec}"
      define_method(method_name, &block)
      method = instance_method(method_name)
      remove_method(method_name)
      method
    end.bind(object)
  end
end

# Here's the method you requested

def change_scope_of_proc(new_self, proc)
  return proc.bind(new_self)
end

# Same as original post

describe "change_scope_of_proc" do
  it "should change the instance method that is called" do
    some_class = SomeClass.new
    another_class = AnotherClass.new
    proc = another_class.make_proc
    fixed_proc = change_scope_of_proc(some_class, proc)
    result = fixed_proc.call("Wor") do |x|
      "#{x}ld"
    end
    result.should == "Hello World"
  end
end
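As a side note, if the Proc does not itself take a block, plain instance_exec is enough to change self. This is only a sketch with made-up class names (Greeter, Parter); it does not cover the block-taking case from the question, because instance_exec uses its block slot for the Proc being evaluated.

```ruby
# Hypothetical classes, each with its own greet method.
class Greeter
  def greet(x)
    "Hello #{x}"
  end
end

class Parter
  def greet(x)
    "Goodbye #{x}"
  end

  # The Proc captures self, so greet resolves on Parter by default.
  def make_proc
    Proc.new { |x| greet(x) }
  end
end

pr = Parter.new.make_proc

original = pr.call("World")
# instance_exec runs the same Proc body with self set to the Greeter.
rebound = Greeter.new.instance_exec("World", &pr)

puts original
puts rebound
```

That is why the define_method/bind approach above is needed as soon as the Proc has a &block parameter of its own.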

Ruby Memory Management

Don't do this:

def method(x)
  x.split(",")  # the arguments don't matter
end

or this:

def method(x)
  x.gsub(",", ";")  # the arguments don't matter
end

Both will permanently leak memory in Ruby 1.8.5 and 1.8.6 (I'm not sure about 1.8.7 as I haven't tried it, but I really hope it's fixed). The workaround is stupid and involves creating a local variable. You don't have to use the local, just create one...

Things like this are why I have lots of love for the Ruby language, but no respect for MRI.

Why use procs instead of methods?

A Proc is a callable piece of code. You can store it in a variable, pass it as an argument, and otherwise treat it as a first-class value.

Why not just use a method?

Depends on what you mean by "method" here.

class Foo
  def bar
    puts "hello"
  end
end

f = Foo.new

In this code snippet, the usage of the method bar is pretty limited. You can call it, and that's it. However, if you want to store a reference to it (to pass it somewhere else and call it there), you can do this:

f = Foo.new
bar_method = f.method(:bar)

Here bar_method is very similar to a lambda (which is itself a kind of Proc). bar_method is a first-class citizen; f.bar is not.
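To make that concrete, here is a small sketch. The call_it helper is made up for illustration, and bar returns its string here (instead of printing it) so the results are easy to compare.

```ruby
class Foo
  # Returns the greeting instead of printing it, to keep results comparable.
  def bar
    "hello"
  end
end

# Hypothetical helper: accepts anything that responds to #call.
def call_it(callable)
  callable.call
end

f = Foo.new
bar_method = f.method(:bar)  # a Method object, bound to f
bar_lambda = -> { f.bar }    # a lambda wrapping the same call

from_method = call_it(bar_method)
from_lambda = call_it(bar_lambda)

# A Method can also be converted to a Proc and passed with &.
from_block = [1].map { |_| bar_method.to_proc.call }.first

puts from_method, from_lambda, from_block
```

All three go through the same interface (call), which is the sense in which a Method object and a lambda are interchangeable.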

For more information, read the article mentioned by @minitech.
