What Is the Maximum Depth of the Java Call Stack

What is the maximum depth of the Java call stack?

It depends on the amount of virtual memory allocated to the stack.

http://www.odi.ch/weblog/posting.php?posting=411

You can tune this with the -Xss VM parameter or with the Thread(ThreadGroup, Runnable, String, long) constructor.
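
A minimal sketch of the constructor route, assuming a HotSpot-like JVM that honors the requested size (the Thread documentation describes the stackSize argument as a hint that some platforms may ignore). The class name StackSizeDemo, the thread name, and the two sizes are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class StackSizeDemo {
    // Recurses until the stack overflows and returns the depth reached.
    static int countDepth() {
        try {
            return 1 + countDepth();
        } catch (StackOverflowError e) {
            return 0;
        }
    }

    // Runs countDepth() on a new thread that requests the given stack size in bytes.
    static int depthWithStackSize(long stackSizeBytes) {
        AtomicInteger depth = new AtomicInteger();
        Thread t = new Thread(null, () -> depth.set(countDepth()),
                "deep-recursion", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth.get();
    }

    public static void main(String[] args) {
        System.out.println("512 KB: " + depthWithStackSize(512L * 1024));
        System.out.println("8 MB:   " + depthWithStackSize(8L * 1024 * 1024));
    }
}
```

On JVMs that honor the hint, the second number should be noticeably larger than the first. The exact values vary between runs and JVM versions.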

Maximum Depth of a Call Hierarchy

I often see junior developers complain that heavily factored code has a call stack too deep to keep track of. They complain mostly during debugging, and sometimes when first learning the code. I usually answer with a question:

If "printf" was implemented internally using 12 functions, would it matter? (I've seen implementations where this is sort of true; it was 4 functions, not 12, but the point holds.)

The truth is, if at any point in the code you need to dig through more than two levels to understand what's going on, then you haven't named your functions properly or your function prototype/API is confusing. Whichever it is, it's a sign of bad design.

Personally I don't see actual call depth per se as a problem. It's just that if it ever manifests itself as a problem, it's a symptom of badly named or badly designed code.

If you need to pass an argument unchanged through more than one layer of functions, then that argument should be a private variable of the class.
Sometimes a deep call stack within a single class indicates that the chain of functions is really a set of simple functions that have been prematurely coupled. It's often better to write simple functions that accept an argument and return something, and then explicitly call them like: C(B(A())). In other words, keep your code orthogonal.
If, when reading the code, you're forced to dig through layers of functions, then the functions have not been named properly.
If the functions are well named but you still need to dig through layers, it could indicate that you have another class hidden in your class. Refactor the code to extract the functionality of the deepest functions into its own class, because they seem to be doing things not directly related to what the class is supposed to do.
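
To make the C(B(A())) point concrete, here is a small sketch. The three function names (readRaw, parse, describe) are hypothetical and stand in for A, B and C. None of them calls the next one, so the chain is visible at the call site instead of being buried in the call stack.

```java
class OrthogonalCalls {
    // Each step takes a value and returns a value; none of them calls the next step.
    static String readRaw() {
        return "  42 ";
    }

    static int parse(String raw) {
        return Integer.parseInt(raw.trim());
    }

    static String describe(int value) {
        return "value=" + value;
    }

    public static void main(String[] args) {
        // The composition is explicit at the call site: C(B(A())).
        System.out.println(describe(parse(readRaw()))); // prints "value=42"
    }
}
```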

Is there a way to know maximally reached JVM call stack depth for a particular program run?

One can easily make a JVMTI agent that traces MethodEntry / MethodExit events and increases or decreases a stack depth counter accordingly. Here is an example of such an agent. When the program ends, it prints the maximum recorded Java stack depth.

#include <jvmti.h>
#include <stdint.h>
#include <stdio.h>

static volatile int max_depth = 0;

static int adjust_stack_depth(jvmtiEnv *jvmti, int delta) {
    intptr_t depth = 0;
    (*jvmti)->GetThreadLocalStorage(jvmti, NULL, (void**)&depth);
    (*jvmti)->SetThreadLocalStorage(jvmti, NULL, (const void*)(depth + delta));
    return (int)depth;
}

void JNICALL MethodEntry(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, jmethodID method) {
    adjust_stack_depth(jvmti, +1);
}

void JNICALL MethodExit(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, jmethodID method,
                        jboolean was_popped_by_exception, jvalue return_value) {
    int depth = adjust_stack_depth(jvmti, -1);
    if (depth > max_depth) {
        max_depth = depth;  // TODO: replace with atomic CAS to avoid race condition
    }
}

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *vm, char *options, void *reserved) {
    jvmtiEnv* jvmti;
    (*vm)->GetEnv(vm, (void**)&jvmti, JVMTI_VERSION_1_0);

    jvmtiCapabilities capabilities = {0};
    capabilities.can_generate_method_entry_events = 1;
    capabilities.can_generate_method_exit_events = 1;
    (*jvmti)->AddCapabilities(jvmti, &capabilities);

    jvmtiEventCallbacks callbacks = {0};
    callbacks.MethodEntry = MethodEntry;
    callbacks.MethodExit = MethodExit;
    (*jvmti)->SetEventCallbacks(jvmti, &callbacks, sizeof(callbacks));

    (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE, JVMTI_EVENT_METHOD_ENTRY, NULL);
    (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE, JVMTI_EVENT_METHOD_EXIT, NULL);

    return 0;
}

JNIEXPORT void JNICALL Agent_OnUnload(JavaVM *vm) {
    printf("Max stack depth = %d\n", max_depth);
}

Compile:

gcc -fPIC -shared -I $JAVA_HOME/include -I $JAVA_HOME/include/linux -o libmaxdepth.so maxdepth.c

Run:

java -agentpath:/path/to/libmaxdepth.so MyProgram

However, tracing each method entry and exit is very expensive. A less accurate, but much more efficient alternative would be a sampling profiler which periodically records a stack trace of a running thread, e.g. async-profiler or Java Flight Recorder.
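
If you only care about a few suspect methods, a rough in-process alternative is to sample the depth yourself at those points. The sketch below assumes Java 9 or newer (for StackWalker). The DepthProbe class and its sample() helper are hypothetical names, and the result only reflects the places where sample() is called, so it is a lower bound rather than the true maximum.

```java
import java.util.concurrent.atomic.AtomicLong;

class DepthProbe {
    private static final AtomicLong MAX_DEPTH = new AtomicLong();

    // Counts the frames on the current thread's stack and remembers the largest value seen.
    static void sample() {
        long depth = StackWalker.getInstance().walk(frames -> frames.count());
        MAX_DEPTH.accumulateAndGet(depth, Math::max);
    }

    static long maxDepth() {
        return MAX_DEPTH.get();
    }

    // Example recursive method instrumented with a sample() call.
    static int sum(int n) {
        sample();
        return n == 0 ? 0 : n + sum(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(sum(100));
        System.out.println("Max sampled depth = " + maxDepth());
    }
}
```

Walking the stack on every call is itself not free, so this is best kept to a handful of call sites.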

Why is the max recursion depth I can reach non-deterministic?

The observed behavior is affected by the HotSpot optimizer, but that is not the only cause. When I run the following code

public class X {
    public static void main(String[] argv) {
        System.out.println(System.getProperty("java.version"));
        System.out.println(countDepth());
        System.out.println(countDepth());
        System.out.println(countDepth());
        System.out.println(countDepth());
        System.out.println(countDepth());
        System.out.println(countDepth());
        System.out.println(countDepth());
    }

    // Recurses until the stack overflows; the return value is the depth reached.
    static int countDepth() {
        try { return 1+countDepth(); }
        catch(StackOverflowError err) { return 0; }
    }
}

with JIT enabled, I get results like:

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -cp build\classes X
1.8.0_40-ea
2097
4195
4195
4195
12587
12587
12587

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -cp build\classes X
1.8.0_40-ea
2095
4193
4193
4193
12579
12579
12579

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -cp build\classes X
1.8.0_40-ea
2087
4177
4177
12529
12529
12529
12529

Here, the effect of the JIT is clearly visible: the optimized code evidently needs less stack space per call. The two jumps also show that tiered compilation is enabled (indeed, using -XX:-TieredCompilation shows a single jump if the program runs long enough).

In contrast, with disabled JIT I get the following results:

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -Xint -cp build\classes X
1.8.0_40-ea
2104
2104
2104
2104
2104
2104
2104

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -Xint -cp build\classes X
1.8.0_40-ea
2076
2076
2076
2076
2076
2076
2076

> f:\Software\jdk1.8.0_40beta02\bin\java -Xss68k -server -Xint -cp build\classes X
1.8.0_40-ea
2105
2105
2105
2105
2105
2105
2105

The values still vary between runs, but not within a single JVM run, and by a smaller magnitude.

So, there is a (rather small) difference that becomes much larger if the optimizer can reduce the stack space required per method invocation, e.g. due to inlining.

What can cause such a difference? I don't know how this JVM does it, but one scenario could be that the way a stack limit is enforced requires a certain alignment of the stack end address (e.g. matching memory page sizes), while the memory allocation returns memory with a start address that has a weaker alignment guarantee. Combine such a scenario with ASLR and there might always be a difference, within the size of the alignment requirement.

How to predict the maximum call depth of a recursive method?

This is clearly JVM- and possibly also architecture-specific.

I've measured the following:

  static int i = 0;
  public static void rec0() {
      i++;
      rec0();
  }

  public static void main(String[] args) {
      ...
      try {
          i = 0; rec0();
      } catch (StackOverflowError e) {
          System.out.println(i);
      }
      ...
  }

using

Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)

running on x86.

With a 20MB Java stack (-Xss20m), the amortized cost fluctuated around 16-17 bytes per call. The lowest I've seen was 16.15 bytes/frame. I therefore conclude that the cost is 16 bytes and the rest is other (fixed) overhead.

A function that takes a single int has basically the same cost, 16 bytes/frame.

Interestingly, a function that takes ten ints requires 32 bytes/frame. I am not sure why the cost is so low.

The above results apply after the code's been JIT compiled. Prior to compilation the per-frame cost is much, much higher. I haven't yet figured out a way to estimate it reliably. However, this does mean that you have no hope of reliably predicting maximum recursion depth until you can reliably predict whether the recursive function has been JIT compiled.

All of this was tested with ulimit stack sizes of 128K and 8MB. The results were the same in both cases.
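
To repeat this kind of measurement on your own JVM, something along these lines could work. It is only a sketch: it assumes the JVM honors the stack size requested through the Thread constructor (the documentation calls it a hint), and dividing that size by the depth reached ignores fixed overhead, so the figure is an upper estimate. FrameCostEstimate and overflowDepth are names I made up.

```java
class FrameCostEstimate {
    static int depth;

    static void rec0() {
        depth++;
        rec0();
    }

    // Overflows the stack of a fresh thread and returns the depth reached.
    static int overflowDepth(long stackSizeBytes) {
        Thread t = new Thread(null, () -> {
            depth = 0;
            try {
                rec0();
            } catch (StackOverflowError e) {
                // Expected: depth now holds the number of calls made.
            }
        }, "probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the final value of depth visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth;
    }

    public static void main(String[] args) {
        long stackSize = 16L * 1024 * 1024;
        // Repeat the probe: later runs are more likely to hit JIT-compiled code.
        for (int run = 1; run <= 5; run++) {
            int d = overflowDepth(stackSize);
            System.out.printf("run %d: depth=%d, ~%.1f bytes/frame%n",
                    run, d, (double) stackSize / d);
        }
    }
}
```

Expect the bytes/frame figure to drop between early and late runs if rec0 gets compiled in between, which is the unpredictability described above.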

Set maximum recursion depth in Java

Create this class:

public class RecursionLimiter {
    public static int maxLevel = 10;

    public static void emerge() {
        if (maxLevel == 0)
            return;
        try {
            throw new IllegalStateException("Too deep, emerging");
        } catch (IllegalStateException e) {
            if (e.getStackTrace().length > maxLevel + 1)
                throw e;
        }
    }
}

Then statically import it and insert an emerge() call at the beginning of any method in your code that can be deeply recursive. You can adjust the maximum allowed recursion level via the maxLevel variable. The emerge() procedure will interrupt execution at a level greater than the value of that variable. You can switch off this behaviour by setting maxLevel to 0. This solution is thread-safe because it doesn't use any counter at all.
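
A usage sketch follows. To keep it self-contained it nests a copy of the class above. The descend method is a hypothetical recursive method with no base case, and the value 50 is arbitrary.

```java
class RecursionLimiterDemo {
    // Copy of the RecursionLimiter class above, nested so the example is self-contained.
    static class RecursionLimiter {
        static int maxLevel = 10;

        static void emerge() {
            if (maxLevel == 0)
                return;
            try {
                throw new IllegalStateException("Too deep, emerging");
            } catch (IllegalStateException e) {
                if (e.getStackTrace().length > maxLevel + 1)
                    throw e;
            }
        }
    }

    // Hypothetical recursive method without a base case, guarded by emerge().
    static int descend(int level) {
        RecursionLimiter.emerge();
        return descend(level + 1);
    }

    public static void main(String[] args) {
        RecursionLimiter.maxLevel = 50;
        try {
            descend(0);
        } catch (IllegalStateException e) {
            System.out.println("Stopped: " + e.getMessage()); // prints "Stopped: Too deep, emerging"
        }
    }
}
```

Note that the check looks at the length of the whole stack trace, so frames of callers above the recursive method (main, framework code and so on) count towards the limit too. Creating an exception on every call is also fairly expensive. As far as I know, HotSpot caps recorded stack traces (the -XX:MaxJavaStackTraceDepth option, 1024 frames by default), so very large maxLevel values may never trigger.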

Why does the JVM have a maximum inline depth?

Some significant searching uncovers this interesting little fragment (I actually got as far as page 4 of the Google search):

    if (inline_depth() > MaxInlineLevel) {
        return "inlining too deep";
    }
    if (method() == callee_method
            && inline_depth() > MaxRecursiveInlineLevel) {
        return "recursively inlining too deep";
    }

This suggests that MaxInlineLevel is, as expected, a hard limit on how deep you go before you stop inlining. It also suggests that MaxRecursiveInlineLevel refers only to direct recursive calls, not mutual recursive calls such as foo() calls bar() which calls foo().

So I think my guess in the comments was right: MaxInlineLevel is there to protect against mutual recursion, because detecting that would require keeping references to the full depth of the inlining call stack.

MaxRecursiveInlineLevel controls foo() calls foo() inlining.

Note that the referenced code may not come from a real JVM.

Comments by @apangin point to a more modern version of HotSpot from OpenJDK 8, which suggests that it is no longer quite as simple as that. It looks like the full stack is searched for recursive calls, so mutual recursion may now also be blocked from going past MaxRecursiveInlineLevel.
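
For experimenting with this, here is a small test program showing the two shapes being discussed. The class and method names are made up, and which limit applies to which shape depends on the HotSpot version, as noted above.

```java
class InlineShapes {
    // Direct recursion: foo() calls foo().
    static int foo(int n) {
        return n <= 0 ? 0 : 1 + foo(n - 1);
    }

    // Mutual recursion: ping() calls pong(), which calls ping().
    static int ping(int n) {
        return n <= 0 ? 0 : 1 + pong(n - 1);
    }

    static int pong(int n) {
        return n <= 0 ? 0 : 1 + ping(n - 1);
    }

    public static void main(String[] args) {
        long total = 0;
        // Call both often enough that the JIT is likely to compile them.
        for (int i = 0; i < 200_000; i++) {
            total += foo(20) + ping(20);
        }
        System.out.println(total);
    }
}
```

On HotSpot you can run it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining and vary -XX:MaxInlineLevel and -XX:MaxRecursiveInlineLevel to see which call sites the compiler refuses to inline. The output format differs between versions.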

Recursive call stack depth

This is my mistake again... the setting for the Java stack is -Xss (the -Xms setting is the starting heap size), sorry. So if you use the JVM Arguments section in the Debugger tab of the launcher, and set something like -Xss5m, you should get further.

In a simple experiment with a recursive function, the default stack allowed me a depth of 227 calls. Using -Xss5m gave me 4020 calls, and -Xss10m gave me 8050 calls. Note that these stack sizes are somewhat less than the GB sizes you were trying; 5 MB of stack is a lot of calls!

Determining the size of the Operand Stack for a Stack Frame

How does the JVM know how much space to allocate for the operand stack section?

"The maximum depth of the operand stack of a frame is determined at compile-time and is supplied along with the code for the method associated with the frame." JVMS §2.6.2, §4.7.3.

What happens if the operand stack fills up?

As cited above, the size of the operand stack for each individual frame is known beforehand. The computation cannot use more operand stack than specified in the class file for a particular method, otherwise class verification would fail.
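
As a small illustration, assuming a standard javac, take a trivial method like the one below. The class name is made up.

```java
class OperandStackDemo {
    // javac records max_stack for this method in its Code attribute.
    // The body compiles to: iload_0, iload_1, iadd, ireturn,
    // so at most two values sit on the operand stack at any time.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

Running javap -v OperandStackDemo after compiling should show stack=2, locals=2 in the Code section of add. That stack value is the max_stack item, which the JVM reads when it creates the frame.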

"At no point during execution can the operand stack grow to a depth greater than that implied by the max_stack item." JVMS §4.9.2.

How to measure the length of a call stack?

Actually, every function call ends up on the call stack.

Your example looks like C; in C, there is always a main function, and even the main function ends up on the call stack.

I don't think there is a portable way to examine the call stack in standard C, especially since the compiler is allowed to optimise away whatever it wants. For instance, it could optimise tail-recursion, and then the call stack would be smaller than you'd expect.

In Python the call stack is easy to examine: just crash the function wherever you want by raising an exception (for instance with assert(False)). The program will then print an error message containing the full "stack trace", which lists every function on the stack.

Here is an example of a stack trace in Python:

def fact1(n):
    assert(n != 1)
    return n * fact1(n-1)

def main():
    f = fact1(3)
    print(f)

main()

Output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in main
  File "<stdin>", line 3, in fact1
  File "<stdin>", line 3, in fact1
  File "<stdin>", line 2, in fact1
AssertionError

And another example just for fun:

def print_even(n):
    if (n <= 1):
        print('yes' if n == 0 else 'no')
        assert(False)
    else:
        print_odd(n-1)

def print_odd(n):
    if (n <= 1):
        print('yes' if n == 1 else 'no')
        assert(False)
    else:
        print_even(n-1)

def main():
    n = 5
    print('Is {} even?'.format(n))
    print_even(n)

main()

Output:

Is 5 even?
no
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in main
  File "<stdin>", line 6, in print_even
  File "<stdin>", line 6, in print_odd
  File "<stdin>", line 6, in print_even
  File "<stdin>", line 6, in print_odd
  File "<stdin>", line 4, in print_even
AssertionError
