Avoiding stack overflows by allocating stack parts on the heap?

2018-06-04 07:27:52

Is there a language where we can enable a mechanism that allocates new stack space on the heap when the original stack space is exceeded?

I remember doing a lab in my university where we fiddled with inline assembly in C to implement a heap-based extensible stack, so I know it should be possible in principle.

I understand it may be useful to get a stack overflow error when developing an app because it terminates a crazy infinite recursion quickly without making your system take lots of memory and begin to swap.

However, when you have a finished well-tested application that you want to deploy and you want it to be as robust as possible (say it's a pretty critical program running on a desktop computer), it would be nice to know it won't miserably fail on some other systems where the stack is more limited, where some objects take more space, or if the program is faced with a very particular case requiring more stack memory than in your tests.

I think it's because of these pitfalls that recursion is usually avoided in production code. But if we had a mechanism for automatic stack expansion in production code, we'd be able to write more elegant programs using recursion knowing it won't unexpectedly segfault while the system has 16 gigabytes of heap memory ready to be used...

There is precedent for it.

The runtime for GHC, a Haskell compiler, uses the heap instead of the stack. The stack is only used when you call into foreign code.

Google's Go implementation uses segmented stacks for goroutines, which enlarge the stack as necessary.

Mozilla's Rust used to use segmented stacks, although it was decided that it caused more problems than it solved (see [rust-dev] Abandoning segmented stacks in Rust).

If memory serves, some Scheme implementations put stack frames on the heap, then garbage collected the frames just like other objects.

In traditional programming styles for imperative languages, most code will avoid calling itself recursively. Stack overflows are rarely seen in the wild, and they're usually triggered by either sloppy programming or by malicious input--especially to recursive descent parsers and the like, which is why some parsers reject code when the nesting exceeds a threshold.

The traditional advice for avoiding stack overflows in production code:

Don't write recursive code. (Example: rewrite a search algorithm to use an explicit stack.)

If you do write recursive code, prove that recursion is bounded. (Example: searching a balanced tree is bounded by the logarithm of the size of the tree.)

If you can't prove that it's unbounded, add a bound to it. (Example: add a limit to the amount of nesting that a parser supports.)

I don't believe you will find a language mandating this. But a particular implementation could offer such a mechanism, and depending on the operating system it can very well be that the runtime environment enlarges the stack automatically as needed.

According to gcc 's documentation, gcc can generate code which does this, if you compile with the -fsplit_stack option:

-fsplit-stack
     Generate code to automatically split the stack before it overflows.
     The resulting program has a discontiguous stack which can only
     overflow if the program is unable to allocate any more memory.
     This is most useful when running threaded programs, as it is no
     longer necessary to calculate a good stack size to use for each
     thread.  This is currently only implemented for the i386 and
     x86_64 backends running GNU/Linux.

     When code compiled with -fsplit-stack calls code compiled
     without -fsplit-stack, there may not be much stack space
     available for the latter code to run.  If compiling all code,
     including library code, with -fsplit-stack is not an option,
     then the linker can fix up these calls so that the code compiled
     without -fsplit-stack always has a large stack.  Support for
     this is implemented in the gold linker in GNU binutils release 2.21
     and later.

The llvm code generation framework provides support for segmented stacks, which are used in the go language and were originally used in Mozilla's rust (although they were removed from rust on the grounds that the execution overhead is too high for a "high-performance language". (See this mailing list thread)

Despite the rust -team's objections, segmented stacks are surprisingly fast, although the stack-thrashing problem can impact particular programs. (Some Go programs suffer from this issue, too.)

Another mechanism for heap-allocating stack segments in a relatively efficient way was proposed by Henry Baker in his 1994 paper Cheney on the MTA and became the basis of the run-time for Chicken Scheme, a compiled mostly R5RS-compatible scheme implementation.

链接地址: http://www.djcxy.com/p/14124.html

上一篇: 什么实际上导致堆栈溢出错误？

下一篇: 通过在堆上分配堆栈部分来避免堆栈溢出？