What's the Easiest Way to Print Output from Parallel Operations in Ruby Without Jumbling Up the Output

puts has a race condition: it may write the newline in a separate call from the text of the line, so another thread's output can land between the two. In a multi-threaded application, puts can produce noise like this:

thread 0thread 1
thread 0thread 2
thread 1
thread 0thread 3
thread 2
thread 1

Instead, use print or printf, and build the whole line, newline included, as a single string so that it is written in one call:

print "thread #{i}" + "\n"
print "thread #{i}\n"
printf "thread %d\n", i

Or, since you want to write to STDERR:

$stderr.print "thread #{i}\n"
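
A minimal sketch of the single-write approach in a threaded program. The helper name print_lines_from_threads and the thread count are made up for illustration; the only point is that each thread passes one complete string, newline included, to a single print call:

```ruby
# Hypothetical helper (not from the answer above): every thread builds its
# full line, newline included, and writes it with one print call.
def print_lines_from_threads(io, count)
  threads = count.times.map do |i|
    Thread.new { io.print "thread #{i}\n" }
  end
  # Wait for all threads so nothing is still pending when we return.
  threads.each(&:join)
end

print_lines_from_threads($stdout, 4)
```

The lines may still appear in any order, since the threads are not ordered with respect to each other; the aim is only that each line stays in one piece.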

Is it a bug in Ruby? Not if the documentation comments are taken as the specification: they do not promise that a line and its newline are written together. Here's the documentation comment for IO#puts from MRI 1.8.7 through 2.2.2:

/*
 *  call-seq:
 *     ios.puts(obj, ...)    => nil
 *
 *  Writes the given objects to <em>ios</em> as with
 *  <code>IO#print</code>. Writes a record separator (typically a
 *  newline) after any that do not already end with a newline sequence.
 *  If called with an array argument, writes each element on a new line.
 *  If called without arguments, outputs a single record separator.
 *
 *     $stdout.puts("this", "is", "a", "test")
 *
 *  <em>produces:</em>
 *
 *     this
 *     is
 *     a
 *     test
 */
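
The behaviour that comment does describe can be checked with a small sketch. The helper name puts_demo is made up for illustration, and a StringIO stands in for a real stream so the written bytes can be inspected:

```ruby
require "stringio"

# Hypothetical helper: collect what puts writes for a few kinds of argument.
def puts_demo
  out = StringIO.new
  out.puts "this"         # no trailing newline, so one is appended
  out.puts "is\n"         # already ends with a newline, nothing is added
  out.puts ["a", "test"]  # array argument: each element on its own line
  out.puts                # no arguments: a single record separator
  out.string
end

p puts_demo  # => "this\nis\na\ntest\n\n"
```

Nothing here says whether the text and its separator reach the stream in one write, which is the part that matters for threads.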

Is Ruby Thread-Safe by default?

I finally found a way to show that the result is not always 100000, even in irb.

Running the following code gave me the idea:

100.times do
  i = 0
  1000.times do
    Thread.start { 100.times { i += 1 } }
  end
  puts i
end

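I see a different value on most runs, usually somewhere between about 91,000 and 100000. Note that the threads are never joined, so puts i can also run before every thread has finished; part of the shortfall may simply be increments that had not happened yet.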
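
If an exact total is the goal, one common approach (not part of the experiment above) is to join every thread and guard the increment with a Mutex. A minimal sketch, with a made-up helper name synchronized_count:

```ruby
# Hypothetical helper: start thread_count threads that each add 1 to a shared
# counter per_thread times, with the increment protected by a Mutex.
def synchronized_count(thread_count, per_thread)
  counter = 0
  lock = Mutex.new
  threads = thread_count.times.map do
    Thread.new do
      per_thread.times { lock.synchronize { counter += 1 } }
    end
  end
  # Join before reading, so every increment has finished.
  threads.each(&:join)
  counter
end

puts synchronized_count(1000, 100)  # prints 100000
```

Because only one thread at a time can be inside lock.synchronize, no update is lost, and the join guarantees the counter is read after all threads are done.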

How to use shell variables with GNU parallel?

The idea is to use a single function to do everything.

#!/bin/bash

# Key database in the shell script
# REEED,r-key
# YEEELLOW,y-key
# WHITEEE,w-key

doit() {
    # Get the CSV's 2nd element and make it look like the one in the script.
    color=$(echo "$3" | cut -d, -f2 | sed 's/e/eee/g' | ./capitalize.sh)
    # Extract this element's value from the script's comments.
    key=$(grep -i "$color" "$1" | cut -d, -f2)
    echo "color: $color"
    echo "key: $key"
}
export -f doit

# Note that I would use parallel's `-C,` here instead of `cut`.
parallel -C, doit $0 < list.csv

Why do we need monads?

  1. We want to program using only functions (it is "functional programming (FP)", after all).
  2. Then, we have a first big problem. This is a program:

    f(x) = 2 * x

    g(x,y) = x / y

    How can we say what is to be executed first? How can we form an ordered sequence of functions (i.e. a program) using no more than functions?

    Solution: compose functions. If you want first g and then f, just write f(g(x,y)). This way, "the program" is a function as well: main = f(g(x,y)). OK, but ...

  3. More problems: some functions might fail (e.g. g(2,0), a division by 0). We have no "exceptions" in FP (an exception is not a function). How do we solve it?

    Solution: let's allow functions to return two kinds of things: instead of having g : Real,Real -> Real (a function from two reals to a real), let's allow g : Real,Real -> Real | Nothing (a function from two reals to a real or nothing).

  4. But functions should (to be simpler) return only one thing.

    Solution: let's create a new type of data to be returned, a "boxing type" that either encloses a real or is simply nothing. Hence, we can have g : Real,Real -> Maybe Real. OK, but ...

  5. What happens now to f(g(x,y))? f is not ready to consume a Maybe Real, and we don't want to change every function we could connect with g so that it consumes a Maybe Real.

    Solution: let's have a special function to "connect"/"compose"/"link" functions. That way, we can, behind the scenes, adapt the output of one function to feed the following one.

    In our case: g >>= f (connect/compose g to f). We want >>= to take g's output, inspect it and, if it is Nothing, skip the call to f and return Nothing; otherwise, extract the boxed Real and feed it to f. (This algorithm is just the implementation of >>= for the Maybe type.) Also note that >>= has to be written only once per "boxing type" (different box, different adapting algorithm).

  6. Many other problems can be solved using this same pattern: (a) use a "box" to encode/store different meanings/values, and have functions like g that return those "boxed values"; (b) have a composer/linker g >>= f to help connect g's output to f's input, so we don't have to change any f at all.

  7. Notable problems that can be solved using this technique are:

    • Having a global state that every function in the sequence of functions ("the program") can share. Solution: the State monad.

    • We don't like "impure functions": functions that yield different output for the same input. Therefore, let's mark those functions by making them return a tagged/boxed value: the IO monad.

Total happiness!
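
The Maybe case from steps 3 to 5 can be sketched in Ruby, keeping to the language used in the examples above. Everything here is an illustration and not a standard library API: Just, Nothing and bind are made-up names, with bind standing in for >>=. As a simplification, f is written to return a boxed value, which is what a bind-style connector hands back in this sketch.

```ruby
# Hypothetical "boxing type": Just wraps a value, Nothing marks a failure.
Just = Struct.new(:value) do
  # bind plays the role of >>=: unwrap the value and feed it to the next function.
  def bind
    yield value
  end
end

class NothingClass
  # bind on Nothing skips the next function and stays Nothing.
  def bind
    self
  end
end
Nothing = NothingClass.new

# g(x, y) = x / y, returning Nothing instead of failing on division by zero.
def g(x, y)
  y.zero? ? Nothing : Just.new(x / y)
end

# f(x) = 2 * x, boxed so that it can be chained with bind.
def f(x)
  Just.new(2 * x)
end

p g(6.0, 3.0).bind { |r| f(r) }  # a Just holding 4.0
p g(2.0, 0.0).bind { |r| f(r) }  # Nothing; f is never called
```

The inspection of g's result lives only inside bind, so it is written once per box type and not repeated at every call site.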


