C#/F# Performance Comparison

Why is F# so much slower than C#? (prime number benchmark)

The F# program is slower because your programs are not equivalent. Your C# code has a break statement in the inner for loop, but your F# program does not. Thus, for every even number, the C# code will stop after checking for divisibility by 2, while the F# program will check every number from 2 to i. With such a large difference in work done, it's actually surprising that the F# code is only three times slower!

Now, F# deliberately does not have a break statement, so you can't quite translate the C# code directly to F#. But you can use functions that include short-circuiting logic. For example, in the comments, Aaron M. Eshbach suggested the following:

{2 .. 100000}
|> Seq.filter (fun i -> {2 .. i-1} |> Seq.forall (fun j -> i % j <> 0))
|> Seq.iter (printfn "%i")

This uses Seq.forall, which does do short-circuiting: it will check each input in the sequence against the condition, and if the condition ever returns false, it will stop and make no more checks. (Because functions in the Seq module are lazy and will do no more work than absolutely required to get their output). This is like having a break in your C# code.

I'll go through this step by step so you can see how it works:

{2 .. 100000}

This creates a lazy sequence of ints that starts at 2 and goes up to (and including) 100000.

|> Seq.filter (fun i -> (some expression involving i))

I've broken the next line into two sections: the outer Seq.filter part, and the inner expression involving i. The Seq.filter part takes the sequence and filters it: for every item in the sequence, call it i and evaluate the expression. If that expression evaluates to true, then keep the item and pass it through to the next step in the chain. If the expression is false, then throw that item away.

Now, the expression involving i is:

{2 .. i-1} |> Seq.forall (fun j -> i % j <> 0)

This first constructs a lazy sequence that starts at 2 and goes up to i minus one, inclusive. (Or you could think of it as starting at 2 and going up to i, but not including i). It then checks whether every item of that sequence fulfills a certain condition (that's the Seq.forall function). And, as an implementation detail of Seq.forall, because it's lazy and does no more work than it absolutely has to, the minute it finds a false result it will stop and not go through the input sequence any longer. (Because once you find a single counter-example, it is no longer possible for the forall function to return true, so it stops as soon as its result is known.) And what is the expression being checked in Seq.forall? It's fun j -> i % j <> 0. So j is the inner loop variable, i is the outer variable (the one assigned in the Seq.filter part), and the logic is just the same as your C# loops.

Now, remember that we're inside a Seq.filter here. So if the Seq.forall returns true, then Seq.filter will keep the value of i. But if Seq.forall returns false, then Seq.filter will discard this value of i from passing through to the next step.

Finally, we have this line as the next (and final) step:

|> Seq.iter (printfn "%i")

What this does is pretty much exactly the same as:

for number in inputSoFar do
    printfn "%i" number

The (printfn "%i") call might look unusual to you if you're new to F#. This is currying, and it's a very useful concept and one that it's important to get used to. So spend some time thinking about this: in F#, the following two lines of code are completely equivalent:

(fun y -> someFunctionCall x y)
(someFunctionCall x)

So fun item -> printfn "%i" item can always be replaced by printfn "%i. And Seq.iter is the equivalent of a for loop:

inputSoFar |> Seq.iter (someFunctionCall x)

is exactly equivalent to:

for item in inputSoFar do
    someFunctionCall x item

So there you have it: why your F# program is slower, and also how to write an F# program that will follow the same logic as the C# one, but will have the equivalent of a break statement in it.

Aggregation function - f# vs c# performance

An immediate way to speed it up would be to combine these:

[0 .. (Array.length data) - 1]
    |> List.map (fun i -> (data.[i], data2.[i])) 
    |> Seq.filter (fun (date, _) ->

into a single list comprehension, and also as the other matthew said, do a single string comparison:

let aggrF (data:float[]) (data2:float[]) std edd pob sac =
    let isValidTime = match pob with
                        | "Peak" -> (fun x -> ispeak x)
                        | "Offpeak" -> (fun x -> not(ispeak x))
                        | _ -> (fun _ -> true)

    let data = [ for i in 0 .. (Array.length data) - 1 do 
                  let (date, value) = (data.[i], data2.[i])
                  if isbetween date std edd && isValidTime date then
                      yield (date, value)
                  else
                      () ]

    match sac with 
    | 0 -> data |> Seq.averageBy (fun (_, value) -> value)
    | 2 -> data.Length
    | _ -> data |> Seq.sumBy (fun (_, value) -> value)

Or use a tail recursive function:

let aggrF (data:float[]) (data2:float[]) std edd pob sac =
    let isValidTime = match pob with
                        | "Peak" -> (fun x -> ispeak x)
                        | "Offpeak" -> (fun x -> not(ispeak x))
                        | _ -> (fun _ -> true)

    let endDate = edd + 1.0

    let rec aggr i sum count =
        if i >= (Array.length data) || data.[i] >= endDate then
            match sac with 
            | 0 -> sum / float(count)
            | 2 -> float(count)
            | _ -> float(sum)
        else if data.[i] >= std && isValidTime data.[i] then
            aggr (i + 1) (sum + data2.[i]) (count + 1)
        else
            aggr (i + 1) sum count

    aggr 0 0.0 0

Why is this simple F# code 36 times slower than C#/C++ versions?

This is very definitely happening directly as a consequence of using the expression:

for i in seq{1..N} do

On my machine, this gives the result:

100000000
00:00:09.1500924

If I change the loop to:

for i in 1..N do

The result changes dramatically:

100000000
00:00:00.1001864

Why?

The IL generated by these two approaches is quite different. The second case, using the 1..N syntax simply gets compiled the same way as a C# for(int i=1; i<N+1; ++i) loop.

The first case is quite different, this version produces a full sequence which is then enumerated by a foreach loop.

The C# and F# versions making use of IEnumerables differ in that they use different range functions to generate them.

The C# version uses System.Linq.Enumerable.RangeIterator to generate the value range, while the F# version uses Microsoft.FSharp.Core.Operators.OperatorIntrinsics.RangeInt32. I think it's safe to assume that the performance difference we're seeing between the C# and F# versions in this particular case are a result of the performance characteristics of these two functions.

svick is correct to point out in his comment that the + operator is actually being passed as an argument to the integralRangeStep function.

For the non-trivial case where n <> m this results in the F# compiler using a ProperIntegralRangeEnumerator with the implementation found here: https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/FSharp.Core/prim-types.fs#L6463

let inline integralRangeStepEnumerator (zero,add,n,step,m,f) : IEnumerator<_> =
    // Generates sequence z_i where z_i = f (n + i.step) while n + i.step is in region (n,m)
    if n = m then
        new SingletonEnumerator<_> (f n) |> enumerator 
    else
        let up = (n < m)
        let canStart = not (if up then step < zero else step > zero) // check for interval increasing, step decreasing 
        // generate proper increasing sequence
        { new ProperIntegralRangeEnumerator<_,_>(n,m) with 
                member x.CanStart = canStart
                member x.Before a b = if up then (a < b) else (a > b)
                member x.Equal a b = (a = b)
                member x.Step a = add a step
                member x.Result a = f a } |> enumerator

We can see that stepping through the Enumerator results in calls to the supplied add function rather than a more straightforward, direct addition.

Note: All timings run in Release mode (Tail Calls: On, Optimisation: On).

Bad F# code performance on simple loop compared to C# - Why?

Converting the F# code from

{ 1I..(bigint (Int32.MaxValue / 100)) } |> Seq.sum;;
Real: 00:00:14.014, CPU: 00:00:14.196, GC gen0: 1743, gen1: 0

let mutable t = 1I
let mutable res = 0I
let max = bigint (Int32.MaxValue / 100)
while t < max do             
    res <- res + t
    t <- t + 1I;;
Real: 00:00:05.379, CPU: 00:00:05.450, GC gen0: 748, gen1: 0

Approxiamately triples the speed, and is also closer to the original C# code.

The most likely reason is that both {...} and for i in ... create a dummy seq. By removing this, you avoid the seq overhead.

EDIT

For some reason F# is generating a ridiculous amount of IL for this code, and is using a really weird comparison.

If we explicitly force the comparison, the speed doubles (which is a little ridiculous)

This code gets pretty much exactly the same speed as the C# for me (on mono).

let mutable t = 1I
let mutable res = 0I
let max = (bigint (Int32.MaxValue / 100));;
while System.Numerics.BigInteger.op_GreaterThan(max,t) do             
     res <- res + t;t<-System.Numerics.BigInteger.op_Increment(t) 
printfn "%A" res

but is unnecersarrily verbose.

I will probably file a compiler bug about this.

F# vs. C# performance Signatures with sample code

I think that the difference may arise from different Lists in F# and C#. F# uses singly linked lists (see http://msdn.microsoft.com/en-us/library/dd233224.aspx) whereas in C# System.Collections.Generic.List ist used, which is based on arrays.

Concatenation is much faster for singly linked lists, especially when you're parsing big files (you need to allocate/copy the whole array list from time to time).

Try using a LinkedList in the C# code, I'm curious about the results :) ...

PS: Also, this would be a good example on when to use a profiler. You could easily find the "hot spot" of the C# code...

EDIT

So, I tried this for myself: I used two identical files in order to prevent caching effects. The files were 3.000.000 lines with 10 times 'abcdef', separated by comma.

The main program looks like this:

static void Main(string[] args) {
   var dt = DateTime.Now;
   CSVprocess.ImportDataC("test.csv"); // C# implementation
   System.Console.WriteLine("Time {0}", DateTime.Now - dt);
   dt = DateTime.Now;
   CSVImport.ImportData("test1.csv"); // F# implementation
   System.Console.WriteLine("Time {0}", DateTime.Now - dt);
}

(I also tried it with first executing the F# implementation and then the C#...)

The result is:

C#: 3.7 seconds
F#: 7.6 seconds

Running the C# solution after the F# solution gives the same performance for the F# version but 4.7 seconds for C# (I assume due to heavy memory allocation by the F# solution). Running each solution alone doesn't change the above results.

Using a file with 6.000.000 lines gives ~ 7 seconds for the C# solution, the F# solution produces an OutOfMemoryException (I'm running this on a maching with 12GB Ram ...)

So for me it seems that the conventional 'wisdom' is true and C# using a simple loop is faster for this kind of tasks ...

F# seems slower than other languages... what can I do to speed it up?

Unless you can give a reasonably sized code sample, it's difficult to tell. Anyway, the imperative F# version should be as efficient as the imperative C# version. I think one approach is to benchmark the two to see what is causing the difference (then someone can help with making that bit faster).

I briefly looked at your code and here are some assorted (untested) suggestions.

You can replace discriminated union Cell with an enum (this means you'll use value types and integer comparison instead of reference types and runtime type tests):
```
type Cell =    
  | Orange = 1
  | Yellow = 2
  | Barren = 3
```

You can mark some trivial functions as inline. For example:

let inline myincr (arr:int array) idx =
  arr.[idx] <- arr.[idx] + 1

Don't use exceptions for control-flow. This is often done in OCaml, but .NET exceptions are slow and should be only used for exceptions. You can replace the for loop in your sample with a while loop and a mutable flag or with a tail-recursive function (a tail-recursive function is compiled into a loop, so it will be efficient, even in imperative solution).

Why is this F# code slower than the C# equivalent?

You can use List.forall instead of converting to a seq and then doing Seq.length:

let divisors = [3..20] |> List.rev
let isDivOneToTwenty n = divisors |> List.forall (fun d -> n % d = 0)

Seq.length will need to traverse the entire sequence to determine the number of elements, while forall can return as soon as an element fails the predicate.

you can also write findNum as:

let rec findNum n = if isDivOneToTwenty n then n else findNum (n + 2)