From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: sched: Delay task stack freeing on RT
Date: Tue, 28 Sep 2021 14:24:30 +0200

Anything which is done on behalf of a dead task at the end of
finish_task_switch() is preventing the incoming task from doing useful
work. While it is beneficial for fork heavy workloads to recycle the task
stack quickly, this is a latency source for real-time tasks.

Therefore delay the stack cleanup on RT enabled kernels.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210928122411.593486363@linutronix.de
---
 kernel/exit.c       |    5 +++++
 kernel/fork.c       |    5 ++++-
 kernel/sched/core.c |    8 ++++++--
 3 files changed, 15 insertions(+), 3 deletions(-)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@ kernel/exit.c:175 @ static void delayed_put_task_struct(stru
 	kprobe_flush_task(tsk);
 	perf_event_delayed_put(tsk);
 	trace_sched_process_free(tsk);
+
+	/* RT enabled kernels delay freeing the VMAP'ed task stack */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		put_task_stack(tsk);
+
 	put_task_struct(tsk);
 }
 
--- a/kernel/fork.c
+++ b/kernel/fork.c
@ kernel/fork.c:292 @ static inline void free_thread_stack(str
 			return;
 		}
 
-		vfree_atomic(tsk->stack);
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			vfree_atomic(tsk->stack);
+		else
+			vfree(tsk->stack);
 		return;
 	}
 #endif
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@ kernel/sched/core.c:4848 @ static struct rq *finish_task_switch(str
 		if (prev->sched_class->task_dead)
 			prev->sched_class->task_dead(prev);
 
-		/* Task is done with its stack. */
-		put_task_stack(prev);
+		/*
+		 * Release VMAP'ed task stack immediately for reuse. On RT
+		 * enabled kernels this is delayed for latency reasons.
+		 */
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			put_task_stack(prev);
 
 		put_task_struct_rcu_user(prev);
 	}