Is There a Limit of Stack Size of a Process in Linux

Is there a limit on the stack size of a process in Linux?

The stack is normally limited by a resource limit. You can see what the default settings are on your installation using ulimit -a:

stack size              (kbytes, -s) 8192

(this shows that mine is 8MB, which is huge).

If you remove or increase that limit, you still won't be able to use all the RAM in the machine for the stack - the stack grows downward from a point near the top of your process's address space, and at some point it will run into your code, heap or loaded libraries.

How do I find the maximum stack size?

You can query the maximum process and stack sizes using getrlimit. Stack frames don't have a fixed size; it depends on how much local data (i.e., local variables) each frame needs.
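For example, a minimal C sketch (just an illustration; it assumes a POSIX system where <sys/resource.h> provides getrlimit) that prints the soft and hard stack limits for the current process:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* RLIMIT_STACK: maximum size of the process stack, in bytes. */
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }

    if (rl.rlim_cur == RLIM_INFINITY)
        printf("soft stack limit: unlimited\n");
    else
        printf("soft stack limit: %llu bytes\n", (unsigned long long)rl.rlim_cur);

    if (rl.rlim_max == RLIM_INFINITY)
        printf("hard stack limit: unlimited\n");
    else
        printf("hard stack limit: %llu bytes\n", (unsigned long long)rl.rlim_max);

    return 0;
}

With the default shown above, the soft limit prints as 8388608 bytes (8 MiB).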

To do this on the command line, you can use ulimit.

If you want to read these values for a running process, I don't know of any tool that does this, but it's easy enough to query the /proc filesystem:

cat /proc/<pid>/limits

Maximum stack size of a multi-threaded process

each thread of a process gets a stack, while there's typically only one heap for the process.

That's correct.

Is this limit applicable at the process level, or can each thread have its own 1MB/8MB stack?

Each thread gets its own stack; the stack-size limit is per-thread (i.e. it is not a shared limit for all threads in the process).

And what happens to the memory allotted to stack after thread exit?

The memory pages are released and become available for use by other code in the future.
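To make the per-thread nature concrete, here is a small sketch using POSIX threads (the 1 MiB stack size and the worker function are example values, not from the original question):

#include <pthread.h>
#include <stdio.h>

#define THREAD_STACK_SIZE (1024 * 1024)   /* example: 1 MiB per thread */

static void *worker(void *arg)
{
    (void)arg;
    /* This thread's local variables live on its own 1 MiB stack. */
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t threads[4];

    pthread_attr_init(&attr);
    /* The size set here applies to each thread created with this attr,
       not to the process as a whole. */
    pthread_attr_setstacksize(&attr, THREAD_STACK_SIZE);

    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], &attr, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    pthread_attr_destroy(&attr);
    return 0;
}

Compile with gcc -pthread. Threads created without an explicit attribute get a default stack size instead (on glibc, typically taken from ulimit -s).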

Linux process stack overrun by local variables (stack guarding)

_chkstk does stack probes to make sure each page is touched in order after a (potentially) large allocation, e.g. an alloca, because Windows will only grow the stack one page at a time up to the stack size limit.

Touching that "guard page" triggers stack growth. It doesn't guard against stack overflow; I think you're misinterpreting the meaning of "guard page" in this usage.

The function name is also potentially misleading. The _chkstk docs simply say it is "called by the compiler when you have more than one page of local variables in your function." It doesn't truly check anything; it just makes sure that the intervening pages have been touched before memory around esp/rsp gets used. That is, the only possible effects are: nothing (possibly including a valid soft page fault), or an invalid page fault on stack overflow (trying to touch a page that Windows refused to grow the stack to include). It ensures that the stack pages are allocated by unconditionally writing to them.

I guess you could look at this as checking for a stack clash by making sure you touch an unmappable page before continuing in the case of stack overflow.


Linux will grow the main-thread stack¹ by any number of pages (up to the stack size limit set by ulimit -s; default 8MiB) when you touch memory below the old stack pages, provided the access is above the current stack pointer.

If you touch memory outside the growth limit, or memory below the stack pointer without having moved the stack pointer down first, it will just segfault. Thus Linux doesn't need stack probes; code merely has to move the stack pointer by as many bytes as it wants to reserve. Compilers know this and emit code accordingly.

See also How is Stack memory allocated when using 'push' or 'sub' x86 instructions? for more low-level details on what the Linux kernel does, and what glibc pthreads on Linux does.
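Here is a rough demonstration sketch of that behaviour (assuming the default 8 MiB ulimit -s; the 4 MiB and 16 MiB sizes are arbitrary, and it should be compiled without optimization, e.g. gcc -O0, so the otherwise-unused buffers aren't optimized away):

#include <stdio.h>
#include <string.h>

/* About 4 MiB of locals: under the default 8 MiB limit, so the kernel
   grows the main thread's stack on demand when we touch the pages. */
static void use_some_stack(void)
{
    char buf[4 * 1024 * 1024];
    memset(buf, 0, sizeof buf);
    printf("4 MiB stack frame: ok\n");
}

/* About 16 MiB of locals: past the default limit, so touching this
   array gets SIGSEGV instead of more stack growth. */
static void use_too_much_stack(void)
{
    char buf[16 * 1024 * 1024];
    memset(buf, 0, sizeof buf);
    printf("16 MiB stack frame: ok (only if ulimit -s was raised)\n");
}

int main(void)
{
    use_some_stack();
    use_too_much_stack();   /* expect a segfault with the default limit */
    return 0;
}

Raising the limit first (ulimit -s 32768, or unlimited) lets the second call succeed as well.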

A sufficiently large alloca on Linux can move the stack all the way past the bottom of the stack growth region, beyond the guard pages below that, and into another mapping; this is a Stack Clash. https://blog.qualys.com/securitylabs/2017/06/19/the-stack-clash It of course requires that the program uses a potentially-huge size for alloca, dependent on user input. The mitigation for CVE-2017-1000364 is to leave a 1MiB guard region, requiring a much larger alloca than normal to get past the guard pages.

This 1MiB guard region is below the ulimit -s (8MiB) growth limit, not below the current stack pointer. It's separate from Linux's normal stack growth mechanism.


gcc -fstack-check

The effect of gcc -fstack-check is essentially the same as what's always needed on Windows (which MSVC does by calling _chkstk): touch stack pages in between previous and new stack pointer when moving it by a large or runtime-variable amount.

But the purpose / benefit of these probes is different on Linux; it's never needed for correctness in a bug-free program on GNU/Linux. It "only" defends against stack-clash bugs/exploits.

On x86-64 GNU/Linux, gcc -fstack-check will (for functions with a VLA or large fixed-size array) add a loop that does stack probes with or qword ptr [rsp], 0 along with sub rsp, 4096. For known fixed array sizes, it can be just a single probe. The code-gen doesn't look very efficient; this option is normally never used on this target. (Godbolt compiler explorer example that passes a stack array to a non-inline function.)
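Something in this spirit is enough to trigger those probes (a sketch only; consume() is a hypothetical external function standing in for any non-inline callee, and the exact code-gen depends on the GCC version):

/* Compile with: gcc -O2 -fstack-check -S vla.c   (x86-64 GNU/Linux) */
void consume(char *p, unsigned long n);   /* defined elsewhere */

void big_vla(unsigned long n)
{
    char buf[n];        /* runtime-sized stack allocation (VLA) */
    consume(buf, n);    /* escapes to a non-inline callee so it isn't optimized away */
}

In the generated assembly you should see the sub rsp, 4096 / or qword ptr [rsp], 0 probe loop described above.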

https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html describes some GCC internal parameters that control what -fstack-check does.

If you want absolute safety against stack-clash attacks, this should do it. It's not needed for normal operation, though, and the 1MiB guard region is enough for most people.


Note that -fstack-protector-strong is completely different, and guards against overwrite of the return address by buffer overruns on local arrays. Nothing to do with stack clashes, and the attack is against stuff already on the stack above a small local array, not against other regions of memory by moving the stack a lot.
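For contrast, the kind of bug that -fstack-protector-strong is aimed at looks like this (a made-up sketch; copy_name() and the 16-byte buffer are only for illustration):

#include <string.h>

/* With -fstack-protector-strong, GCC puts a canary between this array and
   the saved return address, and calls __stack_chk_fail on return if an
   overrun like the one below has clobbered it. */
void copy_name(const char *untrusted)
{
    char name[16];
    strcpy(name, untrusted);   /* overflows name if untrusted is 16+ chars */
}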


Footnote 1: Thread stacks on Linux (for threads other than the initial one) have to be fully allocated up front because the magic growth feature doesn't work. Only the initial aka main thread of a process can have that.

(There's an mmap(MAP_GROWSDOWN) feature but it's not safe because there's no limit, and because nothing stops other dynamic allocations from randomly picking a page close below the current stack, limiting future growth to a tiny size before a stack clash. Also because it only grows if you touch the guard page, so it would need stack probes. For these showstopper reasons, MAP_GROWSDOWN is not used for thread stacks. The internal mechanism for the main stack relies on different magic in the kernel which does prevent other allocations from stealing space.)
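You can see the fixed, pre-allocated stack of a non-main thread from inside the thread itself with the GNU extension pthread_getattr_np (a glibc/Linux-specific sketch; compile with -pthread):

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>

static void *show_stack(void *arg)
{
    pthread_attr_t attr;
    void *stack_addr;
    size_t stack_size;

    (void)arg;
    /* GNU extension: fetch the attributes of the running thread. */
    pthread_getattr_np(pthread_self(), &attr);
    pthread_attr_getstack(&attr, &stack_addr, &stack_size);
    printf("thread stack at %p, %zu bytes (fixed size, allocated up front)\n",
           stack_addr, stack_size);
    pthread_attr_destroy(&attr);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, show_stack, NULL);
    pthread_join(t, NULL);
    return 0;
}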

What does ulimit -s unlimited do?

When you call a function, a new stack frame (a "namespace" for its local variables) is allocated on the stack. That's how functions can have local variables. As functions call functions, which in turn call functions, we keep allocating more and more space on the stack to maintain this deep hierarchy of frames.

To curb programs using massive amounts of stack space, a limit is usually put in place via ulimit -s. If we remove that limit via ulimit -s unlimited, our programs will be able to keep gobbling up RAM for their ever-growing stack until eventually the system runs out of memory entirely.

// Infinite recursion: every call adds another frame to the stack.
int eat_stack_space(void) { return eat_stack_space(); }
// Compiled without optimization (so the tail call isn't eliminated) and run
// with ulimit -s unlimited, this keeps eating RAM and could crash the machine.

Usually, using a ton of stack space is accidental or a symptom of very deep recursion that probably should not be relying so much on the stack. Thus the stack limit.

Impact on performance is minor but does exist. Using the time command, I found that eliminating the stack limit increased performance by a few fractions of a second (at least on 64-bit Ubuntu).

Processes exceeding the thread stack size limit on Red Hat Enterprise Linux 6?

It turns out that glibc 2.11 (as shipped in RHEL 6) changed the malloc threading model: where possible, each thread gets its own memory pool (arena), so on a larger system you may see each one grab up to 64MB of virtual memory. On 64-bit, the maximum number of these pools allowed is greater.

The fix for this was to add

export LD_PRELOAD=/path/to/libtcmalloc.so 

in the script that starts the processes (so that tcmalloc is used instead of glibc 2.11's malloc).
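If you can modify the program itself, an alternative to the environment variable is to cap the arena count directly with glibc's mallopt (a sketch assuming a glibc new enough to honour M_ARENA_MAX; the value 2 is arbitrary):

#include <malloc.h>

int main(void)
{
    /* Equivalent in spirit to MALLOC_ARENA_MAX=2: limit glibc malloc to
       at most 2 arenas. Call this before spawning worker threads. */
    mallopt(M_ARENA_MAX, 2);

    /* ... start threads / do the real work here ... */
    return 0;
}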

Some more information on this is available from:

Linux glibc >= 2.10 (RHEL 6) malloc may show excessive virtual memory usage
https://www.ibm.com/developerworks/mydeveloperworks/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en

glibc bug malloc uses excessive memory for multi-threaded applications
http://sourceware.org/bugzilla/show_bug.cgi?id=11261

Apache hadoop have fixed the problem by setting MALLOC_ARENA_MAX
https://issues.apache.org/jira/browse/HADOOP-7154


