What Task Is Best Done in a Functional Programming Style

What task is best done in a functional programming style?

I've just recently discovered the functional programming style [...]
Well, recently I was given a chance to give a talk on how to reduce
software development efforts, and I
wanted to introduce the concept of
functional programming.

If you've only just discovered functional programming, I do not recommend trying to speak authoritatively on the subject. I know for the first 6 months while I was learnig F#, all of my code was just C# with a little more awkward syntax. However, after that period of time, I was able to write consistently good code in an idiomatic, functional style.

I recommend that you do the same: wait for 6 months or so until functional programming style comes more naturally, then give your presentation.

I'm trying to
illustrate the benefits of functional
programming, and I had the idea of
showing people 2 set of code that does
the same thing, one coded in a very
imperative way, and the other in a
very functional way, to show that
functional programming can made code
way shorter, easier to understand and
thus maintain. Is there such example,
beside the famous sum of squares
example by Luca Bolognese?

I gave an F# presentation to the .NET users group in my area, and many people in my group were impressed by F#'s pattern matching. Specifically, I showed how to traverse an abstract syntax tree in C# and F#:

using System;

namespace ConsoleApplication1
{
public interface IExprVisitor<t>
{
t Visit(TrueExpr expr);
t Visit(And expr);
t Visit(Nand expr);
t Visit(Or expr);
t Visit(Xor expr);
t Visit(Not expr);

}

public abstract class Expr
{
public abstract t Accept<t>(IExprVisitor<t> visitor);
}

public abstract class UnaryOp : Expr
{
public Expr First { get; private set; }
public UnaryOp(Expr first)
{
this.First = first;
}
}

public abstract class BinExpr : Expr
{
public Expr First { get; private set; }
public Expr Second { get; private set; }

public BinExpr(Expr first, Expr second)
{
this.First = first;
this.Second = second;
}
}

public class TrueExpr : Expr
{
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class And : BinExpr
{
public And(Expr first, Expr second) : base(first, second) { }
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class Nand : BinExpr
{
public Nand(Expr first, Expr second) : base(first, second) { }
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class Or : BinExpr
{
public Or(Expr first, Expr second) : base(first, second) { }
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class Xor : BinExpr
{
public Xor(Expr first, Expr second) : base(first, second) { }
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class Not : UnaryOp
{
public Not(Expr first) : base(first) { }
public override t Accept<t>(IExprVisitor<t> visitor)
{
return visitor.Visit(this);
}
}

public class EvalVisitor : IExprVisitor<bool>
{
public bool Visit(TrueExpr expr)
{
return true;
}

public bool Visit(And expr)
{
return Eval(expr.First) && Eval(expr.Second);
}

public bool Visit(Nand expr)
{
return !(Eval(expr.First) && Eval(expr.Second));
}

public bool Visit(Or expr)
{
return Eval(expr.First) || Eval(expr.Second);
}

public bool Visit(Xor expr)
{
return Eval(expr.First) ^ Eval(expr.Second);
}

public bool Visit(Not expr)
{
return !Eval(expr.First);
}

public bool Eval(Expr expr)
{
return expr.Accept(this);
}
}

public class PrettyPrintVisitor : IExprVisitor<string>
{
public string Visit(TrueExpr expr)
{
return "True";
}

public string Visit(And expr)
{
return string.Format("({0}) AND ({1})", expr.First.Accept(this), expr.Second.Accept(this));
}

public string Visit(Nand expr)
{
return string.Format("({0}) NAND ({1})", expr.First.Accept(this), expr.Second.Accept(this));
}

public string Visit(Or expr)
{
return string.Format("({0}) OR ({1})", expr.First.Accept(this), expr.Second.Accept(this));
}

public string Visit(Xor expr)
{
return string.Format("({0}) XOR ({1})", expr.First.Accept(this), expr.Second.Accept(this));
}

public string Visit(Not expr)
{
return string.Format("Not ({0})", expr.First.Accept(this));
}

public string Pretty(Expr expr)
{
return expr.Accept(this).Replace("(True)", "True");
}
}

class Program
{
static void TestLogicalEquivalence(Expr first, Expr second)
{
var prettyPrinter = new PrettyPrintVisitor();
var eval = new EvalVisitor();
var evalFirst = eval.Eval(first);
var evalSecond = eval.Eval(second);

Console.WriteLine("Testing expressions:");
Console.WriteLine(" First = {0}", prettyPrinter.Pretty(first));
Console.WriteLine(" Eval(First): {0}", evalFirst);
Console.WriteLine(" Second = {0}", prettyPrinter.Pretty(second));
Console.WriteLine(" Eval(Second): {0}", evalSecond);;
Console.WriteLine(" Equivalent? {0}", evalFirst == evalSecond);
Console.WriteLine();
}

static void Main(string[] args)
{
var P = new TrueExpr();
var Q = new Not(new TrueExpr());

TestLogicalEquivalence(P, Q);

TestLogicalEquivalence(
new Not(P),
new Nand(P, P));

TestLogicalEquivalence(
new And(P, Q),
new Nand(new Nand(P, Q), new Nand(P, Q)));

TestLogicalEquivalence(
new Or(P, Q),
new Nand(new Nand(P, P), new Nand(Q, Q)));

TestLogicalEquivalence(
new Xor(P, Q),
new Nand(
new Nand(P, new Nand(P, Q)),
new Nand(Q, new Nand(P, Q)))
);

Console.ReadKey(true);
}
}
}

The code above is written in an idiomatic C# style. It uses the visitor pattern rather than type-testing to guarantee type safety. This is about 218 LOC.

Here's the F# version:

#light
open System

type expr =
| True
| And of expr * expr
| Nand of expr * expr
| Or of expr * expr
| Xor of expr * expr
| Not of expr

let (^^) p q = not(p && q) && (p || q) // makeshift xor operator

let rec eval = function
| True -> true
| And(e1, e2) -> eval(e1) && eval(e2)
| Nand(e1, e2) -> not(eval(e1) && eval(e2))
| Or(e1, e2) -> eval(e1) || eval(e2)
| Xor(e1, e2) -> eval(e1) ^^ eval(e2)
| Not(e1) -> not(eval(e1))

let rec prettyPrint e =
let rec loop = function
| True -> "True"
| And(e1, e2) -> sprintf "(%s) AND (%s)" (loop e1) (loop e2)
| Nand(e1, e2) -> sprintf "(%s) NAND (%s)" (loop e1) (loop e2)
| Or(e1, e2) -> sprintf "(%s) OR (%s)" (loop e1) (loop e2)
| Xor(e1, e2) -> sprintf "(%s) XOR (%s)" (loop e1) (loop e2)
| Not(e1) -> sprintf "NOT (%s)" (loop e1)
(loop e).Replace("(True)", "True")

let testLogicalEquivalence e1 e2 =
let eval1, eval2 = eval e1, eval e2
printfn "Testing expressions:"
printfn " First = %s" (prettyPrint e1)
printfn " eval(e1): %b" eval1
printfn " Second = %s" (prettyPrint e2)
printfn " eval(e2): %b" eval2
printfn " Equilalent? %b" (eval1 = eval2)
printfn ""

let p, q = True, Not True
let tests =
[
p, q;

Not(p), Nand(p, p);

And(p, q),
Nand(Nand(p, q), Nand(p, q));

Or(p, q),
Nand(Nand(p, p), Nand(q, q));

Xor(p, q),
Nand(
Nand(p, Nand(p, q)),
Nand(q, Nand(p, q))
)
]
tests |> Seq.iter (fun (e1, e2) -> testLogicalEquivalence e1 e2)

Console.WriteLine("(press any key)")
Console.ReadKey(true) |> ignore

This is 65 LOC. Since it uses pattern matching rather than the visitor pattern, we don't lose any type-safety, and the code is very easy to read.

Any kind of symbolic processing is orders of magnitude easier to write in F# than C#.

[Edit to add:] Oh, and pattern matching isn't just a replacement for the visitor pattern, it also allows you to match against the shape of data. For example, here's a function which converts Nand's to their equivalents:

let rec simplify = function
| Nand(p, q) when p = q -> Not(simplify p)
| Nand(Nand(p1, q1), Nand(p2, q2))
when equivalent [p1; p2] && equivalent [q1; q2]
-> And(simplify p1, simplify q1)
| Nand(Nand(p1, p2), Nand(q1, q2))
when equivalent [p1; p2] && equivalent [q1; q2]
-> Or(simplify p1, simplify q1)
| Nand(Nand(p1, Nand(p2, q1)), Nand(q2, Nand(p3, q3)))
when equivalent [p1; p2; p3] && equivalent [q1; q2; q3]
-> Xor(simplify p1, simplify q1)
| Nand(p, q) -> Nand(simplify p, simplify q)
| True -> True
| And(p, q) -> And(simplify p, simplify q)
| Or(p, q) -> Or(simplify p, simplify q)
| Xor(p, q) -> Xor(simplify p, simplify q)
| Not(Not p) -> simplify p
| Not(p) -> Not(simplify p)

Its not possible to write this code concisely at all in C#.

How can you do anything useful without mutable state?

Or if you play a video game, there are
tons of state variables, beginning
with the positions of all the
characters, who tend to move around
constantly. How can you possibly do
anything useful without keeping track
of changing values?

If you're interested, here's a series of articles which describe game programming with Erlang.

You probably won't like this answer, but you won't get functional program until you use it. I can post code samples and say "Here, don't you see" -- but if you don't understand the syntax and underlying principles, then your eyes just glaze over. From your point of view, it looks as if I'm doing the same thing as an imperative language, but just setting up all kinds of boundaries to purposefully make programming more difficult. My point of view, you're just experiencing the Blub paradox.

I was skeptical at first, but I jumped on the functional programming train a few years ago and fell in love with it. The trick with functional programming is being able to recognize patterns, particular variable assignments, and move the imperative state to the stack. A for-loop, for example, becomes recursion:

// Imperative
let printTo x =
for a in 1 .. x do
printfn "%i" a

// Recursive
let printTo x =
let rec loop a = if a <= x then printfn "%i" a; loop (a + 1)
loop 1

Its not very pretty, but we got the same effect with no mutation. Of course, wherever possible, we like avoid looping altogether and just abstract it away:

// Preferred
let printTo x = seq { 1 .. x } |> Seq.iter (fun a -> printfn "%i" a)

The Seq.iter method will enumerate through the collection and invoke the anonymous function for each item. Very handy :)

I know, printing numbers isn't exactly impressive. However, we can use the same approach with games: hold all state in the stack and create a new object with our changes in the recursive call. In this way, each frame is a stateless snapshot of the game, where each frame simply creates a brand new object with the desired changes of whatever stateless objects needs updating. The pseudocode for this might be:

// imperative version
pacman = new pacman(0, 0)
while true
if key = UP then pacman.y++
elif key = DOWN then pacman.y--
elif key = LEFT then pacman.x--
elif key = UP then pacman.x++
render(pacman)

// functional version
let rec loop pacman =
render(pacman)
let x, y = switch(key)
case LEFT: pacman.x - 1, pacman.y
case RIGHT: pacman.x + 1, pacman.y
case UP: pacman.x, pacman.y - 1
case DOWN: pacman.x, pacman.y + 1
loop(new pacman(x, y))

The imperative and functional versions are identical, but the functional version clearly uses no mutable state. The functional code keeps all state is held on the stack -- the nice thing about this approach is that, if something goes wrong, debugging is easy, all you need is a stack trace.

This scales up to any number of objects in the game, because all objects (or collections of related objects) can be rendered in their own thread.

Just about every user application I
can think of involves state as a core
concept.

In functional languages, rather than mutating the state of objects, we simply return a new object with the changes we want. Its more efficient than it sounds. Data structures, for example, are very easy to represent as immutable data structures. Stacks, for example, are notoriously easy to implement:

using System;

namespace ConsoleApplication1
{
static class Stack
{
public static Stack<T> Cons<T>(T hd, Stack<T> tl) { return new Stack<T>(hd, tl); }
public static Stack<T> Append<T>(Stack<T> x, Stack<T> y)
{
return x == null ? y : Cons(x.Head, Append(x.Tail, y));
}
public static void Iter<T>(Stack<T> x, Action<T> f) { if (x != null) { f(x.Head); Iter(x.Tail, f); } }
}

class Stack<T>
{
public readonly T Head;
public readonly Stack<T> Tail;
public Stack(T hd, Stack<T> tl)
{
this.Head = hd;
this.Tail = tl;
}
}

class Program
{
static void Main(string[] args)
{
Stack<int> x = Stack.Cons(1, Stack.Cons(2, Stack.Cons(3, Stack.Cons(4, null))));
Stack<int> y = Stack.Cons(5, Stack.Cons(6, Stack.Cons(7, Stack.Cons(8, null))));
Stack<int> z = Stack.Append(x, y);
Stack.Iter(z, a => Console.WriteLine(a));
Console.ReadKey(true);
}
}
}

The code above constructs two immutable lists, appends them together to make a new list, and appends the results. No mutable state is used anywhere in the application. It looks a little bulky, but that's only because C# is a verbose language. Here's the equivalent program in F#:

type 'a stack =
| Cons of 'a * 'a stack
| Nil

let rec append x y =
match x with
| Cons(hd, tl) -> Cons(hd, append tl y)
| Nil -> y

let rec iter f = function
| Cons(hd, tl) -> f(hd); iter f tl
| Nil -> ()

let x = Cons(1, Cons(2, Cons(3, Cons(4, Nil))))
let y = Cons(5, Cons(6, Cons(7, Cons(8, Nil))))
let z = append x y
iter (fun a -> printfn "%i" a) z

No mutable necessary to create and manipulate lists. Nearly all data structures can be easily converted into their functional equivalents. I wrote a page here which provides immutable implementations of stacks, queues, leftist heaps, red-black trees, lazy lists. Not a single snippet of code contains any mutable state. To "mutate" a tree, I create a brand new one with new node I want -- this is very efficient because I don't need to make a copy of every node in the tree, I can reuse the old ones in my new tree.

Using a more significant example, I also wrote this SQL parser which is totally stateless (or at least my code is stateless, I don't know whether the underlying lexing library is stateless).

Stateless programming is just as expressive and powerful as stateful programming, it just requires a little practice to train yourself to start thinking statelessly. Of course, "stateless programming when possible, stateful programming where necessary" seems to be the motto of most impure functional languages. There's no harm in falling back on mutables when the functional approach just isn't as clean or efficient.

When to use a functional programming language?

Functional languages are, in my opinion, good for mainly two things: Game AIs and mathematical computations. It's good in game AIs because of its nice list manipulations (at least in Lisp and Scheme), and for mathematical computations because of its syntax. Scheme, Lisp, and Haskell have a syntax that makes mathematical computations easy to read. The last thing I have to add is that functional languages are really fun languages. My Scheme course was one of the courses I had the most fun with.

Functional Programming for Basic Algorithms

How good is 'pure' functional
programming for basic routine
implementations, e.g. list sorting,
string matching etc.?

Very. I'll do your problems in Haskell, and I'll be slightly verbose about it. My aim is not to convince you that the problem can be done in 5 characters (it probably can in J!), but rather to give you an idea of the constructs.

import Data.List -- for `sort`
stdlistsorter :: (Ord a) => [a] -> [a]
stdlistsorter list = sort list

Sorting a list using the sort function from Data.List

import Data.List -- for `delete`
selectionsort :: (Ord a) => [a] -> [a]
selectionsort [] = []
selectionsort list = minimum list : (selectionsort . delete (minimum list) $ list)

Selection sort implementation.

quicksort :: (Ord a) => [a] -> [a]  
quicksort [] = []
quicksort (x:xs) =
let smallerSorted = quicksort [a | a <- xs, a <= x]
biggerSorted = quicksort [a | a <- xs, a > x]
in smallerSorted ++ [x] ++ biggerSorted

Quick sort implementation.

import Data.List -- for `isInfixOf`
stdstringmatch :: (Eq a) => [a] -> [a] -> Bool
stdstringmatch list1 list2 = list1 `isInfixOf` list2

String matching using isInfixOf function from Data.list

It's common to implement such basic
functions within the base interpreter
of any functional language, which
means that they will be written in an
imperative language (c/c++). Although
there are many exceptions..

Depends. Some functions are more naturally expressed imperatively. However, I hope I have convinced you that some algorithms are also expressed naturally in a functional way.

At least, I wish to ask: How difficult
is it to emulate imperative style
while coding in 'pure' functional
language?

It depends on how hard you find Monads in Haskell. Personally, I find it quite difficult to grasp.

What are the benefits of functional programming?

The style of functional programming is to describe what you want, rather than how to get it. ie: instead of creating a for-loop with an iterator variable and marching through an array doing something to each cell, you'd say the equivalent of "this label refers to a version of this array where this function has been done on all the elements."

Functional programming moves more basic programming ideas into the compiler, ideas such as list comprehensions and caching.

The biggest benefit of Functional programming is brevity, because code can be more concise. A functional program doesn't create an iterator variable to be the center of a loop, so this and other kinds of overhead are eliminated from your code.

The other major benefit is concurrency, which is easier to do with functional programming because the compiler is taking care of most of the operations which used to require manually setting up state variables (like the iterator in a loop).

Some performance benefits can be seen in the context of a single-processor as well, depending on the way the program is written, because most functional languages and extensions support lazy evaluation. In Haskell you can say "this label represents an array containing all the even numbers". Such an array is infinitely large, but you can ask for the 100,000th element of that array at any moment without having to know--at array initialization time--just what the largest value is you're going to need. The value will be calculated only when you need it, and no further.

Do functional languages cope well with complexity?

Does program flow become difficult to follow more quickly than if a >non-functional language is used?

"Program flow" is probably the wrong concept to analyze a large functional program. Control flow can become baroque because there are higher-order functions, but these are generally easy to understand because there is rarely any shared mutable state to worry about, so you can just think about arguments and results. Certainly my experience is that I find it much easier to follow an aggressively functional program than an aggressively object-oriented program where parts of the implementation are smeared out over many classes. And I find it easier to follow a program written with higher-order functions than with dynamic dispatch. I also observe that my students, who are more representative of programmers as a whole, have difficulties with both inheritance and dynamic dispatch. They do not have comparable difficulties with higher-order functions.

Are there other issues or things to consider when writing a large
software project using a functional language?

The critical thing is a good module system. Here is some commentary.

  • The most powerful module system I know of the unit system of PLT Scheme designed by Matthew Flatt and Matthias Felleisen. This very powerful system unfortunately lacks static types, which I find a great aid to programming.

  • The next most powerful system is the Standard ML module system. Unfortunately Standard ML, while very expressive, also permits a great many questionable constructs, so it is easy for an amateur to make a real mess. Also, many programmers find it difficult to use Standard ML modules effectively.

    The Objective Caml module system is very similar, but there are some differences which tend to mitigate the worst excesses of Standard ML. The languages are actually very similar, but the styles and idioms of Objective Caml make it significantly less likely that beginners will write insane programs.

  • The least powerful/expressive module system for a functional langauge is the Haskell module system. This system has a grave defect that there are no explicit interfaces, so most of the cognitive benefit of having modules is lost. Another sad outcome is that while the Haskell module system gives users a hierarchical name space, use of this name space (import qualified, in case you're an insider) is often deprecated, and many Haskell programmers write code as if everything were in one big, flat namespace. This practice amounts to abandoning another of the big benefits of modules.

If I had to write a big system in a functional language and had to be sure that other people understood it, I'd probably pick Standard ML, and I'd establish very stringent programming conventions for use of the module system. (E.g., explicit signatures everywhere, opague ascription with :>, and no use of open anywhere, ever.) For me the simplicity of the Standard ML core language (as compared with OCaml) and the more functional nature of the Standard ML Basis Library (as compared with OCaml) are more valuable than the superior aspects of the OCaml module system.

I've worked on just one really big Haskell program, and while I found (and continue to find) working in Haskell very enjoyable, I really missed not having explicit signatures.

Do functional languages cope well with complexity?

Some do. I've found ML modules and module types (both the Standard ML and Objective Caml) flavors invaluable tools for managing complexity, understanding complexity, and placing unbreachable firewalls between different parts of large programs. I have had less good experiences with Haskell


Final note: these aren't really new issues. Decomposing systems into modules with separate interfaces checked by the compiler has been an issue in Ada, C, C++, CLU, Modula-3, and I'm sure many other languages. The main benefit of a system like Standard ML or Caml is the that you get explicit signatures and modular type checking (something that the C++ community is currently struggling with around templates and concepts). I suspect that these issues are timeless and are going to be important for any large system, no matter the language of implementation.

Functional Programming Performance

Relying so heavily on indices is arguably not really idiomatic functional style, and combining indexing and lists is a recipe for less-than-ideal performance.

Here's an index-free implementation:

import scala.util.control.TailCalls._

def solve(xs: Vector[Int]): Int = {
def go(xs: Vector[Int], previous: Int): TailRec[Int] = {
val min = xs.min

splitOn(xs, min).foldLeft(done(min - previous)) {
case (acc, part) => for {
total <- acc
cost <- go(part, min)
} yield total + cost
}.map(math.min(xs.size, _))
}

go(xs, 0).result
}

That's not quite the whole story, though—I've factored out the splitting part into a method named splitOn that takes a sequence and a delimiter. Because this is a very simple and generic operation, it's a good candidate for optimization. The following is a quick attempt:

def splitOn[A](xs: Vector[A], delim: A): Vector[Vector[A]] = {
val builder = Vector.newBuilder[Vector[A]]
var i = 0
var start = 0

while (i < xs.size) {
if (xs(i) == delim) {
if (i != start) {
builder += xs.slice(start, i)
}
start = i + 1
}
i += 1
}

if (i != start) builder += xs.slice(start, i)

builder.result
}

While this implementation is imperative, from the outside the method is perfectly functional—it doesn't have side effects, etc.

This is often a good way to improve the performance of functional code: we've separated our program into a generic piece (splitting a list on a delimiter) and our problem-specific logic. Because the former is so simple, we can ask treat it (and test it) as a black box, while keeping the code we're using to solve the problem clean and functional.

In this case the performance still isn't great—this implementation is about twice as fast as yours on my machine—but I don't think you're going to get much better than that while trampolining with TailCalls.

Proper commenting for functional programming

The common style for Lisp comments is

  • Four semicolons for commentary on a whole subsection of a file.
  • Three semicolons for introducing a single procedure.
  • Two semicolons for a description of the expression/procedure definition on the following line.
  • One semicolon for an endline comment.

Procedure overview comments should probably follow the style of RnRS documens, so to just add comments to your procedure as-is, would look something like


;;; Procedure: display-n NUM ...
;; Output each argument to the screen in the order they are provided.
(define
display-n (lambda nums
(letrec ((display-n-inner (lambda (nums)
(display (car nums))
(if (not (equal? (cdr nums) '()))
(display-n-inner (cdr nums))))))
(display-n-inner nums))))

N.B. I don't use three semicolons for the whole procedure description, since it screws up fill-paragraph in Emacs.


Now about the code, I would ditch the whole define-variable-as-a-lambda thing. Yes, I get that this is the "purest" way to define a function, and it makes for a nice consistency with defining procedures are the results of LETs and other procedures, but there's a reason for syntactic sugar, and it's to make things more readable. Same for the LETREC—just use an internal DEFINE, which is the same thing but more readable.

It's not a huge deal that DISPLAY-N-INNER's parameter is called NUMS, since the procedure's so short and DISPLAY-N just hands its NUMS straight to it anyways. "DISPLAY-N-INNER" is sort of a lame name, though. You would give it something with more semantic meaning, or give it a simple name like "ITER" or "LOOP".

Now about the logic of the procedure. First, (equal? (cdr nums) '()) is silly, and is better as (null? (cdr nums)). Actually, when you are operating over an entire list, it's best to make the base case a test of whether the list itself, and not its CDR, is empty. This way the procedure won't error if you pass it no arguments (unless you want it to do that, but I think it makes more sense for DISPLAY-N to do nothing if it gets nothing). Furthermore, you should test whether to stop the procedure, not whether to continue.



Related Topics



Leave a reply



Submit