Ruby Equivalent of C#'s 'Yield' Keyword, Or, Creating Sequences Without Preallocating Memory

Ruby equivalent of C#'s 'yield' keyword, or, creating sequences without preallocating memory

It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.

Cliche example:

fib = Enumerator.new do |y|
  y.yield i = 0
  y.yield j = 1
  while true
    k = i + j
    y.yield k
    i = j
    j = k
  end
end

100.times { puts fib.next() }

Is 'yield' keyword a syntactic sugar ? What is its Implementation

The generated code depends on the original, but generally speaking a state machine gets generated which keeps track of the current state of the collection.

See yield statement implementation, this answer by Eric Lippert and this blog post by Jon Skeet.

Converting Ruby to C#

I don't know C# at all, so anything I say about C# should be taken with a grain of salt. However, I will try to explain what goes on in that piece of Ruby code.

class << Cache

Ruby has something called singleton methods. These have nothing to do with the Singleton Software Design Pattern, they are just methods that are defined for one and only one object. So, you can have two instances of the same class, and add methods to one of those two objects.

There are two different syntaxes for singleton methods. One is to just prefix the name of the method with the object, so def foo.bar(baz) would define a method bar only for object foo. The other method is called opening up the singleton class and it looks syntactically similar to defining a class, because that's also what happens semantically: singleton methods actually live in an invisible class that gets inserted between the object and its actual class in the class hierarchy.

This syntax looks like this: class << foo. This opens up the singleton class of object foo and every method defined inside of that class body becomes a singleton method of object foo.

Why is this used here? Well, Ruby is a pure object-oriented language, which means that everything, including classes is an object. Now, if methods can be added to individual objects, and classes are objects, this means that methods can be added to individual classes. In other words, Ruby has no need for the artificial distinction between regular methods and static methods (which are a fraud, anyway: they aren't really methods, just glorified procedures). What is a static method in C#, is just a regular method on a class object's singleton class.

All of this is just a longwinded way of explaining that everything defined between class << Cache and its corresponding end becomes static.

  STALE_REFRESH = 1
  STALE_CREATED = 2

In Ruby, every variable that starts with a capital letter, is actually a constant. However, in this case we won't translate these as static const fields, but rather an enum, because that's how they are used.

  # Caches data received from a block
  #
  # The difference between this method and usual Cache.get
  # is following: this method caches data and allows user
  # to re-generate data when it is expired w/o running
  # data generation code more than once so dog-pile effect
  # won't bring our servers down
  #
  def smart_get(key, ttl = nil, generation_time = 30.seconds)

This method has three parameters (four actually, we will see exactly why later), two of them are optional (ttl and generation_time). Both of them have a default value, however, in the case of ttl the default value isn't really used, it serves more as a marker to find out whether the argument was passed in or not.

30.seconds is an extension that the ActiveSupport library adds to the Integer class. It doesn't actually do anything, it just returns self. It is used in this case just to make the method definition more readable. (There are other methods which do something more useful, e.g. Integer#minutes, which returns self * 60 and Integer#hours and so on.) We will use this as an indication, that the type of the parameter should not be int but rather System.TimeSpan.

    # Fallback to default caching approach if no ttl given
    return get(key) { yield } unless ttl

This contains several complex Ruby constructs. Let's start with the easiest one: trailing conditional modifiers. If a conditional body contains only one expression, then the conditional can be appended to the end of the expression. So, instead of saying if a > b then foo end you can also say foo if a > b. So, the above is equivalent to unless ttl then return get(key) { yield } end.

The next one is also easy: unless is just syntactic sugar for if not. So, we are now at if not ttl then return get(key) { yield } end

Third is Ruby's truth system. In Ruby, truth is pretty simple. Actually, falseness is pretty simple, and truth falls out naturally: the special keyword false is false, and the special keyword nil is false, everything else is true. So, in this case the conditional will only be true, if ttl is either false or nil. false isn't a terrible sensible value for a timespan, so the only interesting one is nil. The snippet would have been more clearly written like this: if ttl.nil? then return get(key) { yield } end. Since the default value for the ttl parameter is nil, this conditional is true, if no argument was passed in for ttl. So, the conditional is used to figure out with how many arguments the method was called, which means that we are not going to translate it as a conditional but rather as a method overload.

Now, on to the yield. In Ruby, every method can accept an implicit code block as an argument. That's why I wrote above that the method actually takes four arguments, not three. A code block is just an anonymous piece of code that can be passed around, stored in a variable, and invoked later on. Ruby inherits blocks from Smalltalk, but the concept dates all the way back to 1958, to Lisp's lambda expressions. At the mention of anonymous code blocks, but at the very least now, at the mention of lambda expressions, you should know how to represent this implicit fourth method parameter: a delegate type, more specifically, a Func.

So, what's yield do? It transfers control to the block. It's basically just a very convenient way of invoking a block, without having to explicitly store it in a variable and then calling it.

    # Create window for data refresh
    real_ttl = ttl + generation_time * 2
    stale_key = "#{key}.stale"

This #{foo} syntax is called string interpolation. It means "replace the token inside the string with whatever the result of evaluating the expression between the braces". It's just a very concise version of String.Format(), which is exactly what we are going to translate it to.

    # Try to get data from memcache
    value = get(key)
    stale = get(stale_key)

    # If stale key has expired, it is time to re-generate our data
    unless stale
      put(stale_key, STALE_REFRESH, generation_time) # lock
      value = nil # force data re-generation
    end

    # If no data retrieved or data re-generation forced, re-generate data and reset stale key
    unless value
      value = yield
      put(key, value, real_ttl)
      put(stale_key, STALE_CREATED, ttl) # unlock
    end

    return value
  end
end

This is my feeble attempt at translating the Ruby version to C#:

public class Cache<Tkey, Tvalue> {
    enum Stale { Refresh, Created }

    /* Caches data received from a delegate
     *
     * The difference between this method and usual Cache.get
     * is following: this method caches data and allows user
     * to re-generate data when it is expired w/o running
     * data generation code more than once so dog-pile effect
     * won't bring our servers down
    */
    public static Tvalue SmartGet(Tkey key, TimeSpan ttl, TimeSpan generationTime, Func<Tvalue> strategy)
    {
        // Create window for data refresh
        var realTtl = ttl + generationTime * 2;
        var staleKey = String.Format("{0}stale", key);

        // Try to get data from memcache
        var value = Get(key);
        var stale = Get(staleKey);

        // If stale key has expired, it is time to re-generate our data
        if (stale == null)
        {
            Put(staleKey, Stale.Refresh, generationTime); // lock
            value = null; // force data re-generation
        }

        // If no data retrieved or data re-generation forced, re-generate data and reset stale key
        if (value == null)
        {
            value = strategy();
            Put(key, value, realTtl);
            Put(staleKey, Stale.Created, ttl) // unlock
        }

        return value;
    }

    // Fallback to default caching approach if no ttl given
    public static Tvalue SmartGet(Tkey key, Func<Tvalue> strategy) => 
        Get(key, strategy);

    // Simulate default argument for generationTime
    // C# 4.0 has default arguments, so this wouldn't be needed.
    public static Tvalue SmartGet(Tkey key, TimeSpan ttl, Func<Tvalue> strategy) => 
        SmartGet(key, ttl, new TimeSpan(0, 0, 30), strategy);

    // Convenience overloads to allow calling it the same way as 
    // in Ruby, by just passing in the timespans as integers in 
    // seconds.
    public static Tvalue SmartGet(Tkey key, int ttl, int generationTime, Func<Tvalue> strategy) => 
        SmartGet(key, new TimeSpan(0, 0, ttl), new TimeSpan(0, 0, generationTime), strategy);

    public static Tvalue SmartGet(Tkey key, int ttl, Func<Tvalue> strategy) => 
        SmartGet(key, new TimeSpan(0, 0, ttl), strategy);
}

Please note that I do not know C#, I do not know .NET, I have not tested this, I don't even know if it is syntactically valid. Hope it helps anyway.

How to modify and presist member / embedded array in dynamic object (the one declared with dynamic keyword)

Execute this adjustment after decoding:

dObj.tags = new List<dynamic>( dObj.tags );

Despite of this hiccup, the serialization and deserialization of dynamic object with System.Web.Helpers.Json is bit faster than JSON.NET.

What does the yield keyword do in Ruby?

This is an example fleshing out your sample code:

class MyClass
  attr_accessor :items

  def initialize(ary=[])
    @items = ary
  end

  def each
    @items.each do |item| 
      yield item
    end
  end
end

my_class = MyClass.new(%w[a b c d])
my_class.each do |y|
  puts y
end
# >> a
# >> b
# >> c
# >> d

each loops over a collection. In this case it's looping over each item in the @items array, initialized/created when I did the new(%w[a b c d]) statement.

yield item in the MyClass.each method passes item to the block attached to my_class.each. The item being yielded is assigned to the local y.

Does that help?

Now, here's a bit more about how each works. Using the same class definition, here's some code:

my_class = MyClass.new(%w[a b c d])

# This points to the `each` Enumerator/method of the @items array in your instance via
#  the accessor you defined, not the method "each" you've defined.
my_class_iterator = my_class.items.each # => #<Enumerator: ["a", "b", "c", "d"]:each>

# get the next item on the array
my_class_iterator.next # => "a"

# get the next item on the array
my_class_iterator.next # => "b"

# get the next item on the array
my_class_iterator.next # => "c"

# get the next item on the array
my_class_iterator.next # => "d"

# get the next item on the array
my_class_iterator.next # => 
# ~> -:21:in `next': iteration reached an end (StopIteration)
# ~>    from -:21:in `<main>'

Notice that on the last next the iterator fell off the end of the array. This is the potential pitfall for NOT using a block because if you don't know how many elements are in the array you can ask for too many items and get an exception.

Using each with a block will iterate over the @items receiver and stop when it reaches the last item, avoiding the error, and keeping things nice and clean.

What is the yield keyword used for in C#?

The yield contextual keyword actually does quite a lot here.

The function returns an object that implements the IEnumerable<object> interface. If a calling function starts foreaching over this object, the function is called again until it "yields". This is syntactic sugar introduced in C# 2.0. In earlier versions you had to create your own IEnumerable and IEnumerator objects to do stuff like this.

The easiest way understand code like this is to type-in an example, set some breakpoints and see what happens. Try stepping through this example:

public void Consumer()
{
    foreach(int i in Integers())
    {
        Console.WriteLine(i.ToString());
    }
}

public IEnumerable<int> Integers()
{
    yield return 1;
    yield return 2;
    yield return 4;
    yield return 8;
    yield return 16;
    yield return 16777216;
}

When you step through the example, you'll find the first call to Integers() returns 1. The second call returns 2 and the line yield return 1 is not executed again.

Here is a real-life example:

public IEnumerable<T> Read<T>(string sql, Func<IDataReader, T> make, params object[] parms)
{
    using (var connection = CreateConnection())
    {
        using (var command = CreateCommand(CommandType.Text, sql, connection, parms))
        {
            command.CommandTimeout = dataBaseSettings.ReadCommandTimeout;
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return make(reader);
                }
            }
        }
    }
}

yield keyword to digest code block like Ruby

I don't know Ruby, but it looks like basically you want to pass "some code to execute" - which is typically done in C# with a delegate (or an interface, for a specific meaning of course). As far as I can see, yield in Ruby is entirely different to yield return in C#. If you don't have any parameters or a return value, the Action delegate is suitable. You can then call the method using any approach to create a delegate, such as:

An existing method, either through a method group conversion or an explicit delegate creation expression
A lambda expression
An anonymous method

For example:

Foo(() => Console.WriteLine("Called!"));

...

static void Foo(Action action)
{
    Console.WriteLine("In Foo");
    action();
    Console.WriteLine("In Foo again");
    action();
}

There's a lot to learn about delegates - I don't have time to go into them in detail right now, unfortunately, but I suggest you find a good book about C# and learn about them from that. (In particular, the fact that they can accept parameters and return values is crucial in various situations, including LINQ.)

How would you write this C# code (that uses the yield keyword) succinctly in Ruby?

The common idiom in ruby to implement such sequences, is to define a method that executes a block for each element in the sequence if one is given or return an enumerator if it is not. That would look like this:

def fibs
  return enum_for(:fibs) unless block_given?
  a = 0
  b = 1
  while true
    yield b
    b += a
    a = b - a
  end
end

fibs
#=> #<Enumerable::Enumerator:0x7f030eb37988>
fibs.first(20)
#=> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
fibs.take_while {|x| x < 1000}.select {|x| x%2 == 0}
#=> [2, 8, 34, 144, 610]
fibs.take_while {|x| x < 1000}.select {|x| x%2 == 0}.inject(:+)
=> 798

Ruby yield example explanation?

Let's build this up step-by-step. We will simplify things a bit by taking it out of the class context.

For this example it is intuitive to think of an iterator as being a more-powerful replacement for a traditional for-loop.

So first here's a for-loop version:

seq1 = (0..2)
seq2 = (0..2)
for x in seq1
  for y in seq2
    p [x,y] # shorthand for puts [x, y].inspect
  end
end

Now let's replace that with more Ruby-idiomatic iterator style, explicitly supplying blocks to be executed (i.e., the do...end blocks):

seq1.each do |x|
  seq2.each do |y|
    p [x,y]
  end
end

So far, so good, you've printed out your cartesian product. Now your assignment asks you to use yield as well. The point of yield is to "yield execution", i.e., pass control to another block of code temporarily (optionally passing one or more arguments).

So, although it's not really necessary for this toy example, instead of directly printing the value like above, you can yield the value, and let the caller supply a block that accepts that value and prints it instead.

That could look like this:

 def prod(seq1, seq2)
    seq1.each do |x|
      seq2.each do |y|
        yield [x,y]
      end
    end
  end

Callable like this:

prod (1..2), (1..2) do |prod| p prod end

The yield supplies the product for each run of the inner loop, and the yielded value is printed by the block supplied by the caller.

Ruby Equivalent of C#'s 'Yield' Keyword, Or, Creating Sequences Without Preallocating Memory