Johannes Schlatow
2016-08-30 16:33:42 UTC
Hi all,
I guess you are familiar with the problem of stack overflows in multi-threaded
components. I have already encountered a bunch of weird errors that were hard to
track down until I remembered that they could simply be caused by a stack that
was too small.
I know that Genode's policy is to use single-threaded components. Occasionally,
however, one needs additional threads, especially when using third-party
libraries.
Stack overflows are not only very annoying and time-consuming but can (imo) also be
mitigated rather easily. I therefore think it would be worth implementing
a protection or detection mechanism for this in Genode.
The "problem" here is actually the `Stack_allocator`, which places the stacks of
the component threads consecutively in memory. That is, if one thread exceeds its
allocated stack area, it likely corrupts the stack of another thread (of the
same component).
Hence, one could improve the `Stack_allocator` so that it keeps an unmapped
guard page between the stacks in order to cause a page fault on any stack
overflow. This comes at the cost of slightly increased complexity and possibly
also increased memory consumption (because it requires a second-level
translation table).
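
To make the guard-page idea concrete, here is a minimal sketch of the address
arithmetic only. It is not the actual `Stack_allocator` code: the names
`Slot_layout`, `slot_layout` and `in_guard_page` as well as the page and slot
sizes are assumptions I made up for illustration. The sketch assumes
downward-growing stacks and reserves the lowest page of each slot as a guard
that is never mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed values for illustration, not taken from Genode
constexpr std::uintptr_t page_size = 0x1000;    // 4 KiB page
constexpr std::uintptr_t slot_size = 0x100000;  // 1 MiB of virtual space per stack

// Hypothetical description of one stack slot
struct Slot_layout
{
    std::uintptr_t guard_base;  // lowest address of the slot, left unmapped
    std::uintptr_t stack_base;  // lowest usable stack address
    std::uintptr_t stack_top;   // one past the highest usable stack address
};

// Compute the layout of slot 'index' within a stack area starting at 'area_base'
Slot_layout slot_layout(std::uintptr_t area_base, std::size_t index)
{
    assert(area_base % page_size == 0);

    std::uintptr_t const slot_base = area_base + index * slot_size;

    return { slot_base, slot_base + page_size, slot_base + slot_size };
}

// True if 'addr' falls into the unmapped guard page of 'slot'
bool in_guard_page(Slot_layout const &slot, std::uintptr_t addr)
{
    return addr >= slot.guard_base && addr < slot.stack_base;
}
```

With this layout, the first access below `stack_base` hits the guard page and
faults instead of silently writing into the neighbouring stack, which ends
right where the guard begins. An overflow that jumps by more than one page
(e.g., a large local array) could still skip the guard.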
Alternatively, I can imagine a kernel-level (base-hw) approach that uses
canaries at the top of each stack. Every time the kernel switches to a user
thread, it checks whether the canary is still intact. If not, another thread's
stack must have overflowed. Of course, this method is only reliable if we can
assume that every memory word on the stack will be initialised (preferably
sequentially).
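
The canary check could look roughly like the following user-space simulation.
This is only a sketch under my own assumptions, not base-hw code: `Stack_area`,
`simulate_pushes`, the canary value and the stack sizes are all hypothetical.
Two stacks are placed back to back in one array, each with a reserved canary
word at its top, and a thread "pushes" words downward from just below its own
canary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed values for illustration
constexpr std::uintptr_t canary_value    = 0xC0FFEEu;
constexpr std::size_t    words_per_stack = 64;
constexpr std::size_t    num_stacks      = 2;

// Hypothetical stack area: stack i occupies words [i*64, (i+1)*64)
struct Stack_area
{
    std::array<std::uintptr_t, words_per_stack * num_stacks> words { };

    // The canary occupies the topmost (highest-index) word of each stack
    static std::size_t canary_index(std::size_t stack)
    {
        assert(stack < num_stacks);
        return (stack + 1) * words_per_stack - 1;
    }

    void plant_canary(std::size_t stack) { words[canary_index(stack)] = canary_value; }

    // What the kernel would check before switching to the thread owning 'stack'
    bool canary_alive(std::size_t stack) const
    {
        return words[canary_index(stack)] == canary_value;
    }
};

// Simulate the thread owning 'stack' pushing 'count' words, growing downward
// from just below its own canary without any bounds check on its own slot
void simulate_pushes(Stack_area &area, std::size_t stack,
                     std::size_t count, std::uintptr_t value)
{
    std::size_t idx = Stack_area::canary_index(stack);

    for (std::size_t i = 0; i < count && idx > 0; i++)
        area.words[--idx] = value;
}
```

In this model, stack 1 has 63 usable words below its canary. Pushing a 64th
word overwrites the canary at the top of stack 0, so the check for stack 0
fails although stack 1 is the culprit. The sketch writes every word
sequentially, which is exactly the assumption mentioned above; an overflow
that skips the canary word would go unnoticed.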
Note that I'm not eager to implement these techniques in the near future.
Nevertheless, I thought it would be great to start a discussion and collect
comments and additional ideas.
Looking forward to your feedback!
Cheers
Johannes
------------------------------------------------------------------------------