If the buggy init exits, the kernel panics. I see no point for this. It is very
possible that the system is still usable enough, at least to read the logs and
prepare the bug report.
Change exit_child_reaper() to do BUG() instead of panic().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- 25/kernel/exit.c~4_EXIT_DONT_PANIC 2008-03-16 16:49:30.000000000 +0300
+++ 25/kernel/exit.c 2008-03-16 16:49:35.000000000 +0300
@@ -861,8 +861,10 @@ static inline void exit_child_reaper(str
if (likely(tsk->group_leader != task_child_reaper(tsk)))
return;
- if (tsk->nsproxy->pid_ns == &init_pid_ns)
- panic("Attempted to kill init!");
+ if (unlikely(tsk->nsproxy->pid_ns == &init_pid_ns)) {
+ printk(KERN_EMERG "Kernel panic - init is killed!\n");
+ BUG();
+ }
/*
* @tsk is the last thread in the 'cgroup-init' and is exiting.
--
BUG() does not seem right to me. This does not diagnose any kernel bug. The kernel source location and backtrace are not useful. In fact, they are likely to mislead the user into reporting the bug to the wrong place (because it will look like a kernel bug). I gather your motivation is to get something "recoverable" rather than always rebooting. This might be useful for developers like you and me. I suspect that conservative administrators of production systems prefer the current behavior. If the boot init dies, that is reasonably likely to be a "catastrophic" failure of the system as a whole as far as the proprietor of a production system is concerned. That is, the system may no longer behave as expected in ways essential for its normal operation. If it sticks around in that condition, appearing to be available but not doing everything it should, that is usually worse than a quick and orderly crash (which the installation's procedures and monitoring infrastructure are often prepared to handle). panic is a bit extreme for the situation, where we have no reason yet to think kernel data structures are inconsistent. A sync+reboot or sync+crash without bust_spinlocks et al might be better. For letting init die and calling it recoverable for hacking purposes, a sysctl to disable the panic/crash makes sense. But I don't think we should change the default setting. Have you tested how recoverable it really is? I wonder what happens with init having exited when things get reparented to it. Don't the zombies just pile up? Thanks, Roland --
That's not very different from any other "important" program crash, we don't sync+reboot when httpd or sshd is killed. -- Krzysztof Halasa --
Yes sure, we leak the re-parented zombies, and nobody can take care of /etc/inittab. As expected. Well, I think the generic "if we have a chance to survive, we should try to survive" rule is good. If the boot init dies, at least the admin has a chance to figure out what has happened, and -o remount,ro /. Every BUG/BUG_ON in fact means the system is not useable, but still it does not panic(), but tries to proceed. In short, I can't see why panic() is better. Except we have panic_timeout, OK, I won't argue (not that I agree ;). Oleg. --
A long long time ago someone posted a patch which arranged that if the init process died it got respawned via the same list of process names as happens on boot. Or for that matter we could catch the dying init and replace it with a kernel side while(1) wait(NULL); so that we at least continue the cleanup. Alan --
> But panic() isn't better? It doesn't provide any useful info. It is not misleading in the same way. It's clear that going to look at the For me and you, I agree. I think the common case is that there is no admin prepared to do any such thing, but just someone expecting a reboot to fix things and preferring that a failing system reboot itself in the middle of Many production systems probably set panic_on_oops. Having the init panic behavior keyed on that seems fine to me. I just don't like the "kernel bug at this source line" output when it's not true. Thanks, Roland --
Ah, OK. We can change this to dump_stack() without BUG(). (but again! panic() isn't better, it also looks like a kernel bug). Oleg. --
> Ah, OK. We can change this to dump_stack() without BUG(). I'm not sure that's a whole lot better as far as confusion. And what use is it? You know init exited. Just print its exit_code and/or group_exit_code and you know what kind of kernel entry the exit was. Thanks, Roland --
Agreed. It was either killed, or exited. The exit code provides enough info. Thanks, I'll resend this patch. Oleg. --
This would be highly undesirable in a production system, since it would leave the machine an unusable zombie. In a production system, the panic can be made to reboot the system, bringing it back online. -hpa --
I can't agree. Following this logic, we should always use panic() instead of BUG(). I think the system should try to use any chance to survive, and we have panic_on_oops. That said, Stephen has a good reason to nack this patch. But still I hope it is possible to find a simple solution. Not that I think this is really important, but this panic() is silly, imho. Oleg. --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
