Why Do Ruby Procs/Blocks with Splat Arguments Behave Differently Than Methods and Lambdas

Why do Ruby procs/blocks with splat arguments behave differently than methods and lambdas?

There are two types of Proc objects: lambdas, which handle the argument list the same way a normal method does, and plain procs, which use "tricks" (Proc#lambda? tells you which kind you have). A proc will splat an array if it's the only argument, ignore extra arguments, and assign nil to missing ones. You can partially mimic proc behavior with a lambda by using destructuring:

->((x, y)) { [x, y] }[1]         #=> [1, nil]
->((x, y)) { [x, y] }[[1, 2]]    #=> [1, 2]
->((x, y)) { [x, y] }[[1, 2, 3]] #=> [1, 2]
->((x, y)) { [x, y] }[1, 2]      #=> ArgumentError
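
For comparison, here is a small sketch of the plain proc behaviour described above, next to a lambda with the same parameters. The names pair_proc and pair_lambda are only illustrative.

```ruby
# A non-lambda proc and a lambda with the same two parameters.
pair_proc   = proc   { |x, y| [x, y] }
pair_lambda = lambda { |x, y| [x, y] }

pair_proc.call(1)          #=> [1, nil]  missing argument becomes nil
pair_proc.call(1, 2, 3)    #=> [1, 2]    extra argument is dropped
pair_proc.call([1, 2])     #=> [1, 2]    a lone array is splatted

pair_lambda.call(1, 2)     #=> [1, 2]

# The lambda is strict: one array is one argument, not two.
begin
  pair_lambda.call([1, 2])
rescue ArgumentError => e
  lambda_error = e.class   #=> ArgumentError
end
```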

Why does yielding to lambda splat array arguments in Ruby?

I'm answering my own question here, because this is a known bug:

https://bugs.ruby-lang.org/issues/12705

And it was fixed in Ruby 2.4.1 (thanks @ndn)

Differences between Proc and Lambda

There are two main differences between lambdas and non-lambda Procs:

Just like methods, lambdas return from themselves, whereas non-lambda Procs return from the enclosing method, just like blocks.
Just like methods, lambdas have strict argument checking, whereas non-lambda Procs have loose argument checking, just like blocks.

Or, in short: lambdas behave like methods, non-lambda Procs behave like blocks.

What you are seeing there is an instance of #2. Try it with a block and a method in addition to a non-lambda Proc and a lambda, and you'll see. (Without this behavior, Hash#each would be a real PITA to use, since it yields a two-element array, but you pretty much always want to treat it as two arguments.)
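
A rough illustration of the Hash#each point, with made-up data (ages) and a made-up method (describe): the block spreads each pair over its two parameters, while a method given the same pair as one array rejects it.

```ruby
ages = { "alice" => 30, "bob" => 25 }

# Block: each [key, value] pair yielded by Hash#each is spread
# over the two block parameters.
from_block = []
ages.each { |name, age| from_block << "#{name} is #{age}" }

# Method: the same pair passed as one array does not fit two parameters.
def describe(name, age)
  "#{name} is #{age}"
end

begin
  describe(["alice", 30])
rescue ArgumentError => e
  method_error = e.class               #=> ArgumentError
end

# An explicit splat is needed to call the method with the pair.
from_method = describe(*["alice", 30]) #=> "alice is 30"
```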

Splat parameters behave differently for attribute writers compared to regular method

It has nothing to do with the splat. It's the assignment operator. In Ruby, an assignment expression evaluates to the value assigned. The return value from the setter method is ignored.

So a=1 returns 1, not [1].

But, as mentioned by @mudasobwa, you're not even calling the method here. But if you were, that's what would happen (ignoring the return value).

class Foo

  def a=(*params)
    params
  end

end

f = Foo.new

f.a = 1 # => 1
f.a = 1,2 # => [1, 2]

To get the method's return value instead, call the setter without using the assignment operator.

f.send 'a=', 1 # => [1]

Return statements inside procs, lambdas, and blocks

As one answer in the linked question shows:

The return keyword always returns from the method or lambda in the current context. In blocks, it will return from the method in which the closure was defined. It cannot be made to return from the calling method or lambda.

Your first example was successful because you defined victor in the same function you wanted to return from, so a return was legal in that context. In your second example, victor was defined in the top-level. The effect of that return, then, would not be to return from batman_yield (the calling method), but [if it were valid] to return from the top-level itself (where the Proc was defined).

Clarification: while you can access the return value of a block (i.e. "The value of the last expression evaluated in the block is passed back to the method as the value of the yield" - as per your comment), you can't use the return keyword, for the reason stated above. Example:

def batman_yield
    value = yield
    return value
    "Iron man will win!"
end

victor = Proc.new { return "Batman will win!" }
victor2 = Proc.new { "Batman will win!" }

#batman_yield(&victor) === This code throws an error.
puts batman_yield(&victor2) # This code works fine.
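
As a self-contained sketch of the same rule, the hypothetical helper build_returning_proc below creates a proc whose return belongs to a method that has already finished, so calling the proc raises LocalJumpError. This is a slightly different setup from the top-level proc in the question, but it shows the same restriction.

```ruby
# Hypothetical helper: the proc's `return` is tied to this method.
def build_returning_proc
  Proc.new { return "Batman will win!" }
end

orphan = build_returning_proc   # the defining method has already returned

begin
  orphan.call                   # nothing left to return from
rescue LocalJumpError => e
  jump_error = e.class          #=> LocalJumpError
end

# Without `return`, the last expression is simply the proc's value.
plain = Proc.new { "Batman will win!" }
plain_value = plain.call        #=> "Batman will win!"
```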

what are procs and lambdas? practical examples please

Try Robert Sosinski's Tutorial or Learning to Program by Chris Pine.

For a deeper foundation, I suggest you read Why’s (poignant) Guide to Ruby. This guide got many of today's Ruby pros started. Make sure to take a look!

Explanation by Joey deVilla

Another important but subtle difference is in the way procs created with lambda and procs created with Proc.new handle the return statement:

In a lambda-created proc, the return statement returns only from the proc itself
In a Proc.new-created proc, the return statement is a little more surprising: it returns control not just from the proc, but also from the method enclosing the proc!

Here's lambda-created proc's return in action. It behaves in a way that you probably expect:

def whowouldwin

  mylambda = lambda {return "Freddy"}
  mylambda.call

  # mylambda gets called and returns "Freddy", and execution
  # continues on the next line

  return "Jason"

end

whowouldwin
=> "Jason"

Now here's a Proc.new-created proc's return doing the same thing. You're about to see one of those cases where Ruby breaks the much-vaunted Principle of Least Surprise:

def whowouldwin2

  myproc = Proc.new {return "Freddy"}
  myproc.call

  # myproc gets called and returns "Freddy",
  # but also returns control from whowouldwin2!
  # The line below *never* gets executed.

  return "Jason"

end

whowouldwin2         
=> "Freddy"

Thanks to this surprising behaviour (as well as less typing), I tend to favour using lambda over Proc.new when making procs.

Why do while modifiers behave differently with blocks?

This seems to be just a quirk of the language. ... while condition acts as a statement modifier, except when it comes after a begin ... end, where it behaves like a do ... while loop from other languages: the body runs once before the condition is checked.
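
A minimal sketch of the two forms side by side. The counter names are only for illustration.

```ruby
modifier_runs = 0
modifier_runs += 1 while false   # plain modifier: condition is checked first

begin_end_runs = 0
begin
  begin_end_runs += 1
end while false                  # begin ... end: body runs once, then the check

modifier_runs    #=> 0
begin_end_runs   #=> 1
```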

Yukihiro Matsumoto (Matz, the creator of Ruby) has said he regrets this, and would like to remove this behaviour in the future if possible.

There’s a bit more info in this blog post from New Relic, where I found the link to the mail list post: https://blog.newrelic.com/2014/11/13/weird-ruby-begin-end/

Reusable Code Chunks - Ruby Blocks, Procs, Lambdas

What you are asking for is a block and yield.
It works like this:

def add_with_extra(a, b)
  c = a + b
  d = yield(c)
  c + d
end

# > add_with_extra(3, 5) { |c| c * 2 }
# => 24
# > add_with_extra(3, 5) { |c| c / 2 }
# => 12

But in your case it will look like this:

case key
  when :morals
    ids.each_with_index do |id, index|
      do_processing(ids, basic_relation, key_ids, key, id, index) do |relation, ids, index, id|
        relation[:values] = S(@ids[:values][index])
      end
    end
  when :values
    ids.flatten.uniq.each do |id|
      do_processing(ids, basic_relation, key_ids, key, id, index) do |relation, ids, index, id|
        ids.each_with_index { |array, index| relation[:morals] = S(A(relation[:morals]) - A(@ids[:morals][index])) unless array.include? id }
      end
    end
  else
    ids.each do |id|
      do_processing(ids, basic_relation, key_ids, key, id, index)
    end
end

This is not very readable or easy to understand. Instead, I suggest some refactoring:

def prepare_relation(basic_relation, key, id)
  relation = basic_relation.dup
  relation[:for] = key
  relation[key] = id
  relation
end

def add_to_stack(relation, key_ids, key, id)
  @stack << relation
  key_ids << id unless @stack.find{ |relation| relation.class != Hash && relation[:for] == key.to_s && relation[key] == id }
end

basic_relation = {for: nil, state: nil}
@objects.each_key { |key| basic_relation[key] = nil unless key == :flows }
basic_relation.merge!(ids_string)

@ids.each do |key, ids|
  next if key == :flows
  key_ids = []
  lookup_ids = key == :values ? ids.flatten.uniq : ids

  lookup_ids.each_with_index do |id, index|
    relation = prepare_relation(basic_relation, key, id)
    relation[:values] = S(@ids[:values][index]) if key == :morals
    if key == :values
      ids.each_with_index do |array, index|
        relation[:morals] = S(A(relation[:morals]) - A(@ids[:morals][index])) unless array.include? id
      end
    end
    add_to_stack(relation, key_ids, key, id)
  end

  unless key_ids.empty?
    TaleRelation.where(for: key , key: key_ids).each do |activerecord|
      activerecord[:state] = nil
      @stack << activerecord
    end
  end
end

Here I generalized the main difference between the branches of your case statement:

when :morals
    ids.each_with_index do |id, index|
...
when :values
    ids.flatten.uniq.each do |id|
...
else
    ids.each do |id|

The only real difference is in the :values case, because each_with_index is suitable for the last case too; we just won't use the index.
What was not common then turns into two simple ifs:

relation[:values] = S(@ids[:values][index]) if key == :morals
if key == :values
  ids.each_with_index do |array, index|
    relation[:morals] = S(A(relation[:morals]) - A(@ids[:morals][index])) unless array.include? id
  end
end

P.S. You shouldn't name methods A or S. By convention, method names are lowercase and should be meaningful.

Why is `to_ary` called from a double-splatted parameter in a code block?

I don't have full answers to your questions, but I'll share what I've found out.

Short version

Procs can be called with a number of arguments different from what their signature defines. If the argument list doesn't match the definition, #to_ary is called to perform an implicit conversion. Lambdas and methods require the number of arguments to match their signature. No conversion is performed, and that's why #to_ary is not called.

Long version

What you describe is a difference between handling params by lambdas (and methods) and procs (and blocks). Take a look at this example:

obj = Object.new
def obj.to_ary; "baz" end
lambda{|**foo| print foo}.call(obj)   
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
proc{|**foo| print foo}.call(obj)
# >> TypeError: can't convert Object to Array (Object#to_ary gives String)

Proc doesn't require the same number of args as it defines, and #to_ary is called (as you probably know):

For procs created using lambda or ->(), an error is generated if wrong number of parameters are passed to the proc. For procs created using Proc.new or Kernel.proc, extra parameters are silently discarded and missing parameters are set to nil. (Docs)

What is more, Proc adjusts passed arguments to fit the signature:

proc{|head, *tail| print head; print tail}.call([1,2,3])
# >> 1[2, 3]=> nil

Sources: makandra, SO question.

#to_ary is used for this adjustment (and it's reasonable, as #to_ary is for implicit conversions):

obj2 = Class.new{def to_ary; [1,2,3]; end}.new
proc{|head, *tail| print head; print tail}.call(obj2)
# >> 1[2, 3]=> nil

It's described in detail in a ruby tracker.

You can see that [1,2,3] was split to head=1 and tail=[2,3]. It's the same behaviour as in multi assignment:

head, *tail = [1, 2, 3]
# => [1, 2, 3]
tail
# => [2, 3]

As you have noticed, #to_ary is also called when a proc has double-splatted keyword args:

proc{|head, **tail| print head; print tail}.call(obj2)
# >> 1{}=> nil
proc{|**tail| print tail}.call(obj2)
# >> {}=> nil

In the first case, the array [1, 2, 3] returned by obj2.to_ary was split into head=1 and an empty tail, as **tail wasn't able to match the array [2, 3].

Lambdas and methods don't have this behaviour. They require strict number of params. There is no implicit conversion, so #to_ary is not called.
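
One way to observe the difference is to count the #to_ary calls. The Pair class and its counter below are hypothetical, written only for this sketch.

```ruby
# Hypothetical class that records how often #to_ary is invoked.
class Pair
  attr_reader :to_ary_calls

  def initialize
    @to_ary_calls = 0
  end

  def to_ary
    @to_ary_calls += 1
    [1, 2, 3]
  end
end

pair = Pair.new

# Proc: the single argument is implicitly converted and spread out.
proc_result = proc { |head, *tail| [head, tail] }.call(pair)
calls_after_proc = pair.to_ary_calls

# Lambda: the object is bound to head as-is, with no conversion.
lambda_result = lambda { |head, *tail| [head, tail] }.call(pair)
calls_after_lambda = pair.to_ary_calls
```

Here proc_result is [1, [2, 3]], while lambda_result holds the Pair object itself as head and an empty tail, and the counter does not move during the lambda call.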

I think that this difference is implemented in these two lines of the Ruby source:

    opt_pc = vm_yield_setup_args(ec, iseq, argc, sp, passed_block_handler,
                                 (is_lambda ? arg_setup_method : arg_setup_block));

and in this function. I guess #to_ary is called somewhere in vm_callee_setup_block_arg_arg0_splat, most probably in RARRAY_AREF. I would love to read a commentary of this code to understand what happens inside.

Is Ruby's code block same as C#'s lambda expression?

Ruby actually has 4 constructs that are all extremely similar

The Block

The idea behind blocks is that they are a way to implement really lightweight strategy patterns. A block defines a coroutine on the function, which the function can delegate control to with the yield keyword. We use blocks for just about everything in Ruby, including pretty much all the looping constructs and anywhere you would use using in C#. Anything outside the block is in scope for the block; the inverse is not true, with the exception that return inside the block will return from the outer scope. They look like this

def foo
  yield 'called foo'
end

#usage
foo {|msg| puts msg} #idiomatic for one liners

foo do |msg| #idiomatic for multiline blocks
  puts msg
end

Proc

A proc is basically taking a block and passing it around as a parameter. One extremely interesting use of this is that you can pass a proc in as a replacement for a block in another method. Ruby has a special character for proc coercion which is &, and a special rule that if the last param in a method signature starts with an &, it will be a proc representation of the block for the method call. Finally, there is a builtin method called block_given?, which will return true if the current method has a block defined. It looks like this

def foo(&block)
  return block
end

b = foo {puts 'hi'}
b.call # hi
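
A short sketch of block_given? and of passing an existing proc in place of a block. The greet method and the shout proc are made up for illustration.

```ruby
def greet
  # Fall back to a fixed value when the caller passed no block.
  return "no block given" unless block_given?
  yield "hello"
end

greet                         #=> "no block given"
greet { |word| word.upcase }  #=> "HELLO"

# An existing proc can stand in for the block via &.
shout = proc { |word| "#{word}!" }
greet(&shout)                 #=> "hello!"
```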

To go a little deeper with this, there is a really neat trick that rails added to Symbol (and got merged into core ruby in 1.9). Basically, that & coercion does its magic by calling to_proc on whatever it is next to. So the rails guys added a Symbol#to_proc that would call itself on whatever is passed in. That lets you write some really terse code for any aggregation style function that is just calling a method on every object in a list

class Foo
  def bar
    'this is from bar'
  end
end

list = [Foo.new, Foo.new, Foo.new]

list.map {|foo| foo.bar} # returns ['this is from bar', 'this is from bar', 'this is from bar']
list.map &:bar # returns _exactly_ the same thing

More advanced stuff, but imo that really illustrates the sort of magic you can do with procs

Lambdas

The purpose of a lambda is pretty much the same in Ruby as it is in C#: a way to create an inline function to either pass around or use internally. Like blocks and procs, lambdas are closures, but unlike the first two they enforce arity, and return from a lambda exits the lambda, not the containing scope. You create one by passing a block to the lambda method, or with the -> literal in Ruby 1.9

l = lambda {|msg| puts msg} #ruby 1.8
l = ->(msg) {puts msg} #ruby 1.9

l.call('foo') # => foo

Methods

Only serious ruby geeks really understand this one :) A method is a way to turn an existing function into something you can put in a variable. You get a method by calling the method function, and passing in a symbol as the method name. You can rebind a method, or you can coerce it into a proc if you want to show off. A way to rewrite the previous lambda would be

l = lambda &method(:puts)
l.call('foo')

What is happening here is that you are creating a method for puts, coercing it into a proc, passing that in as a replacement for a block for the lambda method, which in turn returns you the lambda
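
For a sketch that doesn't print anything, the same ideas (getting a Method object, rebinding it, coercing it into a proc) can be shown with String#upcase. The variable names are illustrative only.

```ruby
# Wraps String#upcase, bound to this particular string.
upcase = "ruby".method(:upcase)
upcase.call                      #=> "RUBY"

# Unbind it, then bind it to another String to rebind the method.
rebound = upcase.unbind.bind("proc")
rebound.call                     #=> "PROC"

# Coerce it into a proc; procs made from methods are lambdas.
as_proc = upcase.to_proc
as_proc.lambda?                  #=> true

# & does the same coercion wherever a block is expected.
prefixed = %w[a b].map(&"x".method(:+))   #=> ["xa", "xb"]
```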

Feel free to ask about anything that isn't clear (writing this really late on a weeknight without an irb, hopefully it isn't pure gibberish)

EDIT: To address questions in the comments

list.map &:bar Can I use this syntax
with a code block that takes more than
one argument? Say I have hash = { 0 =>
"hello", 1 => "world" }, and I want to
select the elements that has 0 as the
key. Maybe not a good example. – Bryan
Shen

Gonna go kind of deep here, but to really understand how it works you need to understand how ruby method calls work.

Basically, ruby doesn't have a concept of invoking a method, what happens is that objects pass messages to each other. The obj.method arg syntax you use is really just sugar around the more explicit form, which is obj.send :method, arg, and is functionally equivalent to the first syntax. This is a fundamental concept in the language, and is why things like method_missing and respond_to? make sense, in the first case you are just handling an unrecognized message, the second you are checking to see if it is listening for that message.
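
To make the message-passing idea concrete, here is a rough sketch. The Ghost class and its say_ prefix are invented for this example; send, method_missing, respond_to? and respond_to_missing? are the standard hooks.

```ruby
class Ghost
  # Handle any message starting with "say_"; everything else falls through.
  def method_missing(name, *args)
    return super unless name.to_s.start_with?("say_")
    "#{name.to_s.sub('say_', '')}: #{args.join(' ')}"
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("say_") || super
  end
end

# The usual call syntax and an explicit send deliver the same message.
"ruby".upcase                  #=> "RUBY"
"ruby".send(:upcase)           #=> "RUBY"

ghost = Ghost.new
ghost.say_boo("hi", "there")   #=> "boo: hi there"
ghost.respond_to?(:say_boo)    #=> true
ghost.respond_to?(:walk)       #=> false
```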

The other thing to know is the rather esoteric "splat" operator, *. Depending on where it's used, it actually does very different things.

def foo(bar, *baz)

In a method definition, if it is the last parameter, splat will make that parameter glob up all additional parameters passed in to the function (sort of like params in C#)

obj.foo(bar, *[biz, baz])

When in a method call (or anything else that takes argument lists), it will turn an array into a bare argument list. The snippet below is equivalent to the snippet above.

obj.foo(bar, biz, baz)
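
Both uses of * can be tried with a small runnable method. The collect method is a made-up stand-in for foo.

```ruby
# In a definition, *rest globs up any additional arguments.
def collect(first, *rest)
  [first, rest]
end

collect(1)             #=> [1, []]
collect(1, 2, 3)       #=> [1, [2, 3]]

# In a call, * turns an array into a bare argument list.
extras = [2, 3]
collect(1, *extras)    #=> [1, [2, 3]]  same as collect(1, 2, 3)
collect(1, extras)     #=> [1, [[2, 3]]]  without the splat
```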

Now, with send and * in mind, Symbol#to_proc is basically implemented like this

class Symbol
  def to_proc
    Proc.new { |obj, *args| obj.send(self, *args) }
  end
end

So, &:sym is going to make a new proc, that calls .send :sym on the first argument passed to it. If any additional args are passed, they are globbed up into an array called args, and then splatted into the send method call.
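
The built-in Symbol#to_proc can be called directly to see the first-argument-as-receiver behaviour, including the extra arguments. This sketch uses only core methods; the variable names are illustrative.

```ruby
upcase = :upcase.to_proc
upcase.call("foo")               #=> "FOO"  same as "foo".send(:upcase)

plus = :+.to_proc
plus.call(1, 2)                  #=> 3      same as 1.send(:+, 2)

# With &, the proc replaces the block.
shouted = %w[a b].map(&:upcase)  #=> ["A", "B"]
total   = [1, 2, 3].inject(&:+)  #=> 6
```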

I notice that & is used in three
places: def foo(&block), list.map
&:bar, and l = lambda &method(:puts).
Do they share the same meaning? –
Bryan Shen

Yes, they do. An & will call to_proc on whatever it is beside. In a method definition it has a special meaning on the last parameter: there you are pulling in the coroutine defined as a block and turning it into a proc. Method definitions are actually one of the most complex parts of the language; there are a huge number of tricks and special meanings that can be in the parameters and in their placement.

b = {0 => "df", 1 => "kl"} p b.select
{|key, value| key.zero? } I tried to
transform this to p b.select &:zero?,
but it failed. I guess that's because
the number of parameters for the code
block is two, but &:zero? can only
take one param. Is there any way I can
do that? – Bryan Shen

I should have addressed this earlier; unfortunately, you can't do it with this trick.
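
Two possible workarounds, sketched with the hash from the comment: keep the explicit two-parameter block, or apply the &:zero? shortcut to the keys alone, which gives the matching keys rather than a hash.

```ruby
b = { 0 => "df", 1 => "kl" }

# Explicit block: ignore the value, test the key.
zero_keyed = b.select { |key, _value| key.zero? }

# The shortcut works on a plain list of keys.
zero_keys = b.keys.select(&:zero?)   #=> [0]
```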

"A method is a way to turn an existing
function into something you can put in
a variable." why is l = method(:puts)
not sufficient? What does lambda &
mean in this context? – Bryan Shen

That example was exceptionally contrived, I just wanted to show equivalent code to the example before it, where I was passing a proc to the lambda method. I will take some time later and re-write that bit, but you are correct, method(:puts) is totally sufficient. What I was trying to show is that you can use &method(:puts) anywhere that would take a block. A better example would be this

['hello', 'world'].each &method(:puts) # => hello\nworld

l = -> {|msg| puts msg} #ruby 1.9:
this doesn't work for me. After I
checked Jörg's answer, I think it
should be l = -> (msg) {puts msg}. Or
maybe i'm using an incorrect version
of Ruby? Mine is ruby 1.9.1p738 –
Bryan Shen

Like I said in the post, I didn't have an irb available when I was writing the answer, and you are right, I goofed that (I spend the vast majority of my time in 1.8.7, so I am not used to the new syntax yet).

There is no space between the stabby bit and the parens. Try l = ->(msg) {puts msg}. There was actually a lot of resistance to this syntax, since it is so different from everything else in the language.
