Does Swift Implement Tail Call Optimization? and in Mutual Recursion Case

Does Swift implement tail call optimization? and in mutual recursion case?

The best way to check is to examine the assembly language code generated by the compiler. I took the code above and compiled it with:

swift -O3 -S tco.swift >tco.asm

The relevant part of the output

.globl    __TF3tco3sumFTSiSi_Si
.align 4, 0x90
__TF3tco3sumFTSiSi_Si:
pushq %rbp
movq %rsp, %rbp
testq %rdi, %rdi
je LBB0_4
.align 4, 0x90
LBB0_1:
movq %rdi, %rax
decq %rax
jo LBB0_5
addq %rdi, %rsi
jo LBB0_5
testq %rax, %rax
movq %rax, %rdi
jne LBB0_1
LBB0_4:
movq %rsi, %rax
popq %rbp
retq
LBB0_5:
ud2

.globl __TF3tco5isOddFSiSb
.align 4, 0x90
__TF3tco5isOddFSiSb:
pushq %rbp
movq %rsp, %rbp
testq %rdi, %rdi
je LBB1_1
decq %rdi
jo LBB1_9
movb $1, %al
LBB1_5:
testq %rdi, %rdi
je LBB1_2
decq %rdi
jo LBB1_9
testq %rdi, %rdi
je LBB1_1
decq %rdi
jno LBB1_5
LBB1_9:
ud2
LBB1_1:
xorl %eax, %eax
LBB1_2:
popq %rbp
retq

.globl __TF3tco6isEvenFSiSb
.align 4, 0x90
__TF3tco6isEvenFSiSb:
pushq %rbp
movq %rsp, %rbp
movb $1, %al
LBB2_1:
testq %rdi, %rdi
je LBB2_5
decq %rdi
jo LBB2_7
testq %rdi, %rdi
je LBB2_4
decq %rdi
jno LBB2_1
LBB2_7:
ud2
LBB2_4:
xorl %eax, %eax
LBB2_5:
popq %rbp
retq

There are no call instructions in the generated code, only conditional jumps (je / jne / jo / jno). This clearly suggests that Swift does do tail call optimizations in both cases.

In addition, the isOdd/isEven functions are interesting in that the compiler not only seems to perform TCO but also inlines the other function in each case.

Compiling Tail-Call Optimization In Mutual Recursion Across C and Haskell

I believe cross-language C-Haskell tail calls are very, very hard to achieve.

I do not know the exact details, but the C runtime and the Haskell runtime are vastly different. The main factors for this difference, as far as I can see, are:

  • different paradigm: purely functional vs imperative
  • garbage collection vs manual memory management
  • lazy semantics vs strict one

The kinds of optimizations which are likely to survive across language boundaries given such differences are next to zero. Perhaps, in theory, one could invent an ad hoc C runtime together with a Haskell runtime so that some optimizations are feasible, but GHC and GCC were not designed in this way.

Just to show an example of the potential differences, assume we have the following Haskell code

p :: Int -> Bool
p x = x==42

main = if p 42
then putStrLn "A" -- A
else putStrLn "B" -- B

A possible implementation of the main could be the following:

  • push the address of A on the stack
  • push the address of B on the stack
  • push 42 on the stack
  • jump to p
  • A: print "A", jump to end
  • B: print "B", jump to end

while p is implemented as follows:

  • p: pop x from the stack
  • pop b from stack
  • pop a from stack
  • test x against 42
  • if equal, jump to a
  • jump to b

Note how p is invoked with two return addresses, one for each possible result. This is different from C, whose standard implementations use only one return address. When crossing boundaries the compiler must account for this difference and compensate.

Above I also did not account for the case when the argument of p is a thunk, to keep it simple. The GHC allocator can also trigger garbage collection.

Note that the above fictional implementation was actually used in the past by GHC (the so called "push/enter" STG machine). Even if now it is no longer in use, the "eval/apply" STG machine is only marginally closer to the C runtime. I'm not even sure about GHC using the regular C stack: I think it does not, using its own one.

You can check the GHC developer wiki to see the gory details.

Rebol Tail Call Optimization

Not without code, sorry. Rebol isn't compiled, so there's no way to know ahead of time exactly what constitutes a tail call. Even calls to the return function propagate back up the call stack, quickly but not by a goto.

IIRC the author of tail-func works on Rebol 3 now, and whether or not he does it should be easy to port over. Now that you mention it I'll take a look. Function generators and preprocessors are easy to do in Rebol.

Why does the JVM still not support tail-call optimization?

Diagnosing Java Code: Improving the Performance of Your Java Code (alt) explains why the JVM does not support tail-call optimization.

But although it is well known how to automatically transform a tail-recursive function into a simple loop, the Java specification doesn't require that this transformation be made. Presumably, one reason it is not a requirement is that, in general, the transformation can't be made statically in an object-oriented language. Instead, the transformation from tail-recursive function to simple loop must be done dynamically by a JIT compiler.

It then gives an example of Java code that won't transform.

So, as the example in Listing 3 shows, we cannot expect static compilers to perform transformation of tail recursion on Java code while preserving the semantics of the language. Instead, we must rely on dynamic compilation by the JIT. Depending on the JVM, the JIT may or may not do this.

Then it gives a test you can use to figure out if your JIT does this.

Naturally, since this is an IBM paper, it includes a plug:

I ran this program with a couple of
the Java SDKs, and the results were
surprising. Running on Sun's Hotspot
JVM for version 1.3 reveals that
Hotspot doesn't perform the
transformation. At default settings,
the stack space is exhausted in less
than a second on my machine. On the
other hand, IBM's JVM for version 1.3
purrs along without a problem,
indicating that it does transform the
code in this way.

Why is tail call optimization not occurring here?

VSadov provides the explicit reason for this in his response:

Generally JIT emits tail calls when it finds that profitable.

In addition, he goes on to state:

This is a part that is not expressible in C#. Unlike inlining, which
can be forced via attributes, tailcalling cannot be currently forced.
If one needs to write the code like emitted by EmitMethodCall, he
cannot use C#.

So the answer is that while tailcalls are definitely available and used, there is no way to either predict when they will be used or force them to be used in C#.



Related Topics



Leave a reply



Submit