I posted an earlier version of this patch series before, and I believe this has fixed the quibbles there were with it before. These changes pave the way for cleaning up ptrace a lot more in the future. This series is a prerequisite for the following "tracehook" series, but only because of patch conflicts. They can be reviewed separately. This series of patches is also available in GIT at the URL below, and in a mail folder of patch messages (use git-am or formail -s patch -p1) at: http://people.redhat.com/roland/utrace/2.6-current/ptrace-cleanup.mbox The following changes since commit 33af79d12e0fa25545d49e86afc67ea8ad5f2f40: Chandra Seetharaman (1): scsi_dh: Verify "dev" is a sdev before accessing it. are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace.git ptrace-cleanup Roland McGrath (4): do_wait reorganization ptrace children revamp do_wait: return security_task_wait() error code in place of -ECHILD fix dangling zombie when new parent ignores children include/linux/init_task.h | 4 +- include/linux/sched.h | 26 ++-- kernel/exit.c | 451 +++++++++++++++++++++++++++----------------- kernel/fork.c | 6 +- kernel/ptrace.c | 37 +++-- 5 files changed, 318 insertions(+), 206 deletions(-) Thanks, Roland --
This breaks out the guts of do_wait into three subfunctions.
The control flow is less nonobvious without so much goto.
do_wait_thread and ptrace_do_wait contain the main work of the outer loop.
wait_consider_task contains the main work of the inner loop.
Signed-off-by: Roland McGrath <roland@redhat.com>
---
kernel/exit.c | 215 ++++++++++++++++++++++++++++++++++++---------------------
1 files changed, 135 insertions(+), 80 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index ceb2587..7453356 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1238,7 +1238,7 @@ static int wait_noreap_copyout(struct task_struct *p, pid_t pid, uid_t uid,
* the lock and this task is uninteresting. If we return nonzero, we have
* released the lock and the system call should return.
*/
-static int wait_task_zombie(struct task_struct *p, int noreap,
+static int wait_task_zombie(struct task_struct *p, int options,
struct siginfo __user *infop,
int __user *stat_addr, struct rusage __user *ru)
{
@@ -1246,7 +1246,10 @@ static int wait_task_zombie(struct task_struct *p, int noreap,
int retval, status, traced;
pid_t pid = task_pid_vnr(p);
- if (unlikely(noreap)) {
+ if (!likely(options & WEXITED))
+ return 0;
+
+ if (unlikely(options & WNOWAIT)) {
uid_t uid = p->uid;
int exit_code = p->exit_code;
int why, status;
@@ -1397,13 +1400,16 @@ static int wait_task_zombie(struct task_struct *p, int noreap,
* released the lock and the system call should return.
*/
static int wait_task_stopped(struct task_struct *p,
- int noreap, struct siginfo __user *infop,
+ int options, struct siginfo __user *infop,
int __user *stat_addr, struct rusage __user *ru)
{
int retval, exit_code, why;
uid_t uid = 0; /* unneeded, required by compiler */
pid_t pid;
+ if (!(p->ptrace & PT_PTRACED) && !(options & WUNTRACED))
+ return 0;
+
exit_code = 0;
spin_lock_irq(&p->sighand->siglock);
@@ -1421,7 +1427,7 @@ static int ...ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone. Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.
There should be no user-visible difference that matters. The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees. Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.
Signed-off-by: Roland McGrath <roland@redhat.com>
---
include/linux/init_task.h | 4 +-
include/linux/sched.h | 26 +++---
kernel/exit.c | 226 ++++++++++++++++++++++++---------------------
kernel/fork.c | 6 +-
kernel/ptrace.c | 37 +++++---
5 files changed, 160 insertions(+), 139 deletions(-)
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 9927a88..93c45ac 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -140,8 +140,8 @@ extern struct group_info init_groups;
.nr_cpus_allowed = NR_CPUS, \
}, \
.tasks = LIST_HEAD_INIT(tsk.tasks), \
- .ptrace_children= LIST_HEAD_INIT(tsk.ptrace_children), \
- .ptrace_list = LIST_HEAD_INIT(tsk.ptrace_list), \
+ .ptraced = LIST_HEAD_INIT(tsk.ptraced), \
+ .ptrace_entry = LIST_HEAD_INIT(tsk.ptrace_entry), \
.real_parent = &tsk, \
.parent = &tsk, \
.children = LIST_HEAD_INIT(tsk.children), \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ba2f859..1941d8b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1062,12 +1062,6 @@ struct task_struct {
#endif
struct list_head tasks;
- /*
- * ptrace_list/ptrace_children forms the list of my children
- * that were stolen by a ptracer.
- */
- struct list_head ptrace_children;
- struct list_head ptrace_list;
struct mm_struct *mm, *active_mm;
@@ -1089,18 +1083,25 @@ struct task_struct {
.../* * Nasty, nasty. * * We want to hold both the task-lock and the * tasklist_lock for writing at the same time. * But that's against the rules (tasklist_lock * is taken for reading by interrupts on other * cpu's that may have task_lock). hm, copying this code didn't do much to improve the world. Is there any prospect of "fixing" this somehow? Perhaps this code should be pulled up into a separate function, not --
Yes. After this settles, it will become tractable to change the whole locking plan for ptrace so we don't have this problem and get rid of this kludge altogether. Thanks, Roland --
This reverts the effect of commit f2cc3eb133baa2e9dc8efd40f417106b2ee520f3 "do_wait: fix security checks". That change reverted the effect of commit 73243284463a761e04d69d22c7516b2be7de096c. The rationale for the original commit still stands. The inconsistent treatment of children hidden by ptrace was an unintended omission in the original change and in no way invalidates its purpose. This makes do_wait return the error returned by security_task_wait() (usually -EACCES) in place of -ECHILD when there are some children the caller would be able to wait for if not for the permission failure. A permission error will give the user a clue to look for security policy problems, rather than for mysterious wait bugs. Signed-off-by: Roland McGrath <roland@redhat.com> --- kernel/exit.c | 30 ++++++++++++++++++++---------- 1 files changed, 20 insertions(+), 10 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index 1e90982..a2af6ca 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -1199,14 +1199,10 @@ static int eligible_child(enum pid_type type, struct pid *pid, int options, return 0; err = security_task_wait(p); - if (likely(!err)) - return 1; + if (err) + return err; - if (type != PIDTYPE_PID) - return 0; - /* This child was explicitly requested, abort */ - read_unlock(&tasklist_lock); - return err; + return 1; } static int wait_noreap_copyout(struct task_struct *p, pid_t pid, uid_t uid, @@ -1536,7 +1532,8 @@ static int wait_task_continued(struct task_struct *p, int options, * -ECHILD should be in *@notask_error before the first call. * Returns nonzero for a final return, when we have unlocked tasklist_lock. * Returns zero if the search for a child should continue; - * then *@notask_error is 0 if @p is an eligible child, or still -ECHILD. + * then *@notask_error is 0 if @p is an eligible child, + * or another error from security_task_wait(), or still -ECHILD. */ static int wait_consider_task(struct task_struct *parent, int ...
Acked-by: James Morris <jmorris@namei.org> -- James Morris <jmorris@namei.org> --
This fixes an arcane bug that we think was a regression introduced
by commit b2b2cbc4b2a2f389442549399a993a8306420baf. When a parent
ignores SIGCHLD (or uses SA_NOCLDWAIT), its children would self-reap
but they don't if it's using ptrace on them. When the parent thread
later exits and ceases to ptrace a child but leaves other live
threads in the parent's thread group, any zombie children are left
dangling. The fix makes them self-reap then, as they would have
done earlier if ptrace had not been in use.
Signed-off-by: Roland McGrath <roland@redhat.com>
---
kernel/exit.c | 26 ++++++++++++++++++++++++++
1 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index a2af6ca..93d2711 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -703,6 +703,23 @@ static void exit_mm(struct task_struct * tsk)
}
/*
+ * Return nonzero if @parent's children should reap themselves.
+ *
+ * Called with write_lock_irq(&tasklist_lock) held.
+ */
+static int ignoring_children(struct task_struct *parent)
+{
+ int ret;
+ struct sighand_struct *psig = parent->sighand;
+ unsigned long flags;
+ spin_lock_irqsave(&psig->siglock, flags);
+ ret = (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
+ (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT));
+ spin_unlock_irqrestore(&psig->siglock, flags);
+ return ret;
+}
+
+/*
* Detach all tasks we were using ptrace on.
* Any that need to be release_task'd are put on the @dead list.
*
@@ -711,6 +728,7 @@ static void exit_mm(struct task_struct * tsk)
static void ptrace_exit(struct task_struct *parent, struct list_head *dead)
{
struct task_struct *p, *n;
+ int ign = -1;
list_for_each_entry_safe(p, n, &parent->ptraced, ptrace_entry) {
__ptrace_unlink(p);
@@ -726,10 +744,18 @@ static void ptrace_exit(struct task_struct *parent, struct list_head *dead)
* release_task() here because we already hold tasklist_lock.
*
* If it's our own child, there is no notification ...