Why Is Seq(X) So Much Slower Than 1:Length(X)

Why is seq(x) so much slower than 1:length(x)?

seq is a generic S3 method, so probably some time is lost dispatching.
seq.default is almost 100 lines long!

You're probably already aware of seq_along, which calls a .Primitive directly and is bit better than 1:length(x) and the best method I have found for long loops:

f3 <- function(){
x <- 1:5; y <- numeric(length(x))
for(i in seq_along(x)) y[i] <- x[i]^2
y
}
> microbenchmark(f1(), f3())
Unit: microseconds
expr min lq median uq max neval
f1() 27.095 27.916 28.327 29.148 89.495 100
f3() 26.684 27.505 27.916 28.327 36.538 100

Is seq_len(x) always faster than 1:x in R? Is one more preferred than the other?

First note that seq_len(x) and : are somewhat more limited than seq():

  • seq_len(b) generates an integer sequence from 1 to b
  • a:b generates a numeric sequence from a to b with spacing 1 (a and b need not be integer (and so the result need not be an integer sequence)
  • seq(a, b, c) generates a numeric sequence from a to b with step length c. a, b, c can be numeric or integer or mixed.

If seq_len and : are appropriate for you, they should be favored over seq() since they are/call primitives (fast C functions).

As proposed in the comments to your question this can be seen from a benchmark:

bench::mark(
":" = 1:1e5,
"seq" = seq(1, 1e5),
"seq_len" = seq_len(1e5)
)

# A tibble: 3 × 4
expression min median `itr/sec`
<bch:expr> <bch:tm> <bch:tm> <dbl>
1 : 132ns 227ns 2677132.
2 seq 4.15µs 4.71µs 197204.
3 seq_len 113ns 170ns 2236432.

It seems that seq_len() is somewhat faster than :.

N.B.

Some of these functions have some quirks to be aware of:

E.g.

x <- numeric(0)
y <- length(x)

you get

1:y
[1] 1 0

but

seq_len(y)
integer(0)

so the former can be an issue when dealing with arrays.

Why are F# list ranges so much slower than for loops?

Using ILSpy, listIter looks like this:

public static void listIter(int n)
{
ListModule.Iterate<int>(
new listIter@17(n),
SeqModule.ToList<int>(
Operators.CreateSequence<int>(
Operators.OperatorIntrinsics.RangeInt32(1, 1, 10000000)
)
)
);
}

Here are the basic steps involved:

  1. RangeInt32 creates an IEnumerable (which is inexplicably wrapped by CreateSequence)
  2. SeqModule.ToList builds a list from that sequence
  3. An instance of listIter@17 (your lambda) is new'd up
  4. ListModule.Iterate traverses the list calling the lambda for each element

vs forLoop, which doesn't look much different from what you've written:

public static void forLoop(int n)
{
for (int x = 1; x < 10000001; x++)
{
int num = x + n;
double num2 = Math.Pow((double)num, 0.1);
}
}

...no IEnumerable, lambda (it's automatically inlined), or list creation. There's a potentially significant difference in the amount of work being done.

EDIT

For curiosity's sake, here are FSI timings for list, seq, and for loop versions:


listIter - Real: 00:00:03.889, CPU: 00:00:04.680, GC gen0: 57, gen1: 51, gen2: 6
seqIter - Real: 00:00:01.340, CPU: 00:00:01.341, GC gen0: 0, gen1: 0, gen2: 0
forLoop - Real: 00:00:00.565, CPU: 00:00:00.561, GC gen0: 0, gen1: 0, gen2: 0

and the seq version for reference:

let seqIter n =
{1..length} |> Seq.iter (fun x -> doSomething (x+n))

Why is apply() method slower than a for loop in R?

As Chase said: Use the power of vectorization. You're comparing two bad solutions here.

To clarify why your apply solution is slower:

Within the for loop, you actually use the vectorized indices of the matrix, meaning there is no conversion of type going on. I'm going a bit rough over it here, but basically the internal calculation kind of ignores the dimensions. They're just kept as an attribute and returned with the vector representing the matrix. To illustrate :

> x <- 1:10
> attr(x,"dim") <- c(5,2)
> y <- matrix(1:10,ncol=2)
> all.equal(x,y)
[1] TRUE

Now, when you use the apply, the matrix is split up internally in 100,000 row vectors, every row vector (i.e. a single number) is put through the function, and in the end the result is combined into an appropriate form. The apply function reckons a vector is best in this case, and thus has to concatenate the results of all rows. This takes time.

Also the sapply function first uses as.vector(unlist(...)) to convert anything to a vector, and in the end tries to simplify the answer into a suitable form. Also this takes time, hence also the sapply might be slower here. Yet, it's not on my machine.

IF apply would be a solution here (and it isn't), you could compare :

> system.time(loop_million <- mash(million))
user system elapsed
0.75 0.00 0.75
> system.time(sapply_million <- matrix(unlist(sapply(million,squish,simplify=F))))
user system elapsed
0.25 0.00 0.25
> system.time(sapply2_million <- matrix(sapply(million,squish)))
user system elapsed
0.34 0.00 0.34
> all.equal(loop_million,sapply_million)
[1] TRUE
> all.equal(loop_million,sapply2_million)
[1] TRUE

Why sequence expressions for arrays are so slow in F#?

In cases like this I always check the generated code using one of the many good decompilers in .NET.

let explicitArray () = 
let a = Array.zeroCreate count
for v in 1..count do
a.[v-1] <- v
a

This is compiled into the equivalent C#:

public static int[] explicitArray()
{
int[] a = ArrayModule.ZeroCreate<int>(10000000);
for (int v = 1; v < 10000001; v++)
{
a[v - 1] = v;
}
return a;
}

This is about as efficient it gets.

let arrayExpression () = 
[| for v in 1..count -> v |]

This on the other hand becomes:

public static int[] arrayExpression()
{
return SeqModule.ToArray<int>(new Program.arrayExpression@7(0, null, 0, 0));
}

This equivalent to:

let arrayExpression () = 
let e = seq { for v in 1..count -> v }
let a = List() // do not set capacity
for v in e do
a.Add(v)
a.ToArray()

When iterating over a seq (alias for IEnumerable) first MoveNext is called, then Current. These are virtual calls which the JIT:er seldomly can inline. Checking the JIT:ed assembly code we see this:

mov         rax,qword ptr [rbp+10h]  
cmp byte ptr [rax],0
mov rcx,qword ptr [rbp+10h]
lea r11,[7FFC07830030h]
# virtual call .MoveNext
call qword ptr [7FFC07830030h]
movzx ecx,al
# if .MoveNext returns false then exit
test ecx,ecx
je 00007FFC079408A0
mov rcx,qword ptr [rbp+10h]
lea r11,[7FFC07830038h]
# virtual call .Current
call qword ptr [7FFC07830038h]
mov edx,eax
mov rcx,rdi
# call .Add
call 00007FFC65C8B300
# loop
jmp 00007FFC07940863

If we compare this with the JIT:ed code for the code that used ResizeArray (List)

lea         edx,[rdi-1]  
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
mov edx,edi
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
lea edx,[rdi+1]
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
lea edx,[rdi+2]
mov rcx,rbx
# call .Add
call 00007FFC65C8B300
add edi,4
# loop
cmp edi,989682h
jl 00007FFC07910384

Here the JIT:er has unrolled the loop 4 times here and we have only non-virtual calls to List.Add.

This explains why F# array expressions are slower than your other two examples.

In order to fix this one would have to fix the optimizer in F# to recognize the shape of expressions such as this:

seq { for v in 1..count -> v } |> Seq.toArray

And optimize them into:

let a = Array.zeroCreate count
for v in 1..count do
a.[v-1] <- v
a

The challenge is to find an optimization that is general enough to be useful but also doesn't break the semantics of F#.

Why is this F# code slower than the C# equivalent?

You can use List.forall instead of converting to a seq and then doing Seq.length:

let divisors = [3..20] |> List.rev
let isDivOneToTwenty n = divisors |> List.forall (fun d -> n % d = 0)

Seq.length will need to traverse the entire sequence to determine the number of elements, while forall can return as soon as an element fails the predicate.

you can also write findNum as:

let rec findNum n = if isDivOneToTwenty n then n else findNum (n + 2)


Related Topics



Leave a reply



Submit