Futex Crash with Swap enabled on Arm Board

Submitted by Vijayendra Suman
on June 4, 2009 - 4:44am

It has been a long time since my last post. This time the problem is on an ARM board.

I got a crash like this:
BUG: scheduling while atomic: XXXXX/0x00000004/157, CPU#0
[] (dump_stack+0x0/0x14) from [] (__schedule+0x70/0x6f8)
[] (__schedule+0x0/0x6f8) from [] (schedule+0xbc/0xf4)
[] (schedule+0x0/0xf4) from [] (__down_read+0x100/0x118)
r4 = C046F600
[] (__down_read+0x0/0x118) from [] (compat_down_read+0x10/0)
r6 = C046F600 r5 = C027AB0C r4 = FFFFFFFF
[] (compat_down_read+0x0/0x14) from [] (do_page_fault+0x8c/)
[] (do_page_fault+0x0/0x228) from [] (do_DataAbort+0x3c/0xa)
[] (do_DataAbort+0x0/0xa0) from [] (__dabt_svc+0x4c/0x60)
r8 = 02279CFC r7 = 00000001 r6 = 02279CF8 r5 = C1293E2C
r4 = FFFFFFFF
[] (do_futex+0x0/0xfa0) from [] (sys_futex+0xe0/0xf4)
[] (sys_futex+0x0/0xf4) from [] (ret_fast_syscall+0x0/0x2c)
r8 = C001FFE8 r7 = 000000F0 r6 = 02279CF8 r5 = 04000001
r4 = 02279CF8

As the trace shows, the preemption count is 4 (0x00000004), yet schedule() is still called. The problem occurs because I had enabled a swap partition on the ARM target board.

The code in fault.c is:

static notrace int
do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
{
	struct task_struct *tsk;
	struct mm_struct *mm;
	int fault, sig, code;

	tsk = current;
	mm = tsk->mm;

	/*
	 * If we're in an interrupt or have no user
	 * context, we must not take the fault..
	 */
	if (in_interrupt() || !mm)
		goto no_context;

	/*
	 * As per x86, we may deadlock here. However, since the kernel only
	 * validly references user space from well defined areas of the code,
	 * we can bug out early if this is from code which shouldn't.
	 */
	if (!down_read_trylock(&mm->mmap_sem)) {
		if (!user_mode(regs) && !search_exception_tables(regs->ARM_pc))
			goto no_context;
		down_read(&mm->mmap_sem);
	}

The only check here is for in_interrupt(). In this case we are in atomic context but not in an interrupt, so the check does not catch it, and the page we need is out in the swap partition.
What I am not sure about is why the page cannot be brought in.
See this link; maybe someone there knows about it:
http://www.linux-mips.org/archives/linux-mips/2008-11/msg00038.html

I think this applies to my case as well. Either way, my workaround is to change the check.

In place of this:

	if (in_interrupt() || !mm)
		goto no_context;

I added this:

	if (in_atomic() || !mm)
		goto no_context;

in_interrupt() is a subset of in_atomic(): whenever in_interrupt() is true, in_atomic() is true as well.
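
To make the difference between the two checks concrete, here is a small user-space sketch that models how they look at the preemption count. The field widths, the position of the PREEMPT_ACTIVE bit, and all the model_* names are assumptions for illustration (loosely following include/linux/hardirq.h of 2.6-era kernels); the real values depend on the kernel version and architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the preemption count (illustrative, not authoritative):
 * bits 0-7 preempt-disable depth, bits 8-15 softirq count,
 * bits 16-27 hardirq count. */
#define MODEL_PREEMPT_BITS   8
#define MODEL_SOFTIRQ_BITS   8
#define MODEL_HARDIRQ_BITS   12

#define MODEL_PREEMPT_SHIFT  0
#define MODEL_SOFTIRQ_SHIFT  (MODEL_PREEMPT_SHIFT + MODEL_PREEMPT_BITS)
#define MODEL_HARDIRQ_SHIFT  (MODEL_SOFTIRQ_SHIFT + MODEL_SOFTIRQ_BITS)

#define MODEL_FIELD_MASK(bits, shift) \
	(((UINT32_C(1) << (bits)) - 1) << (shift))

#define MODEL_PREEMPT_MASK MODEL_FIELD_MASK(MODEL_PREEMPT_BITS, MODEL_PREEMPT_SHIFT)
#define MODEL_SOFTIRQ_MASK MODEL_FIELD_MASK(MODEL_SOFTIRQ_BITS, MODEL_SOFTIRQ_SHIFT)
#define MODEL_HARDIRQ_MASK MODEL_FIELD_MASK(MODEL_HARDIRQ_BITS, MODEL_HARDIRQ_SHIFT)

/* Hypothetical position of the PREEMPT_ACTIVE flag (arch-specific in reality). */
#define MODEL_PREEMPT_ACTIVE (UINT32_C(1) << 30)

/* Models in_interrupt(): true only if a hardirq or softirq count is non-zero. */
static inline bool model_in_interrupt(uint32_t count)
{
	return (count & (MODEL_HARDIRQ_MASK | MODEL_SOFTIRQ_MASK)) != 0;
}

/* Models in_atomic(): true if anything except PREEMPT_ACTIVE is set,
 * so it also covers a plain preempt-disable depth. */
static inline bool model_in_atomic(uint32_t count)
{
	return (count & ~MODEL_PREEMPT_ACTIVE) != 0;
}

/* Extracts the preempt-disable nesting depth from the count. */
static inline uint32_t model_preempt_depth(uint32_t count)
{
	return (count & MODEL_PREEMPT_MASK) >> MODEL_PREEMPT_SHIFT;
}
```

Under these assumptions, the value 0x00000004 from the trace gives model_in_interrupt() false and model_in_atomic() true, with a preempt-disable depth of 4. That would explain why the original check lets the fault handler continue to down_read() while the in_atomic() check sends it to no_context. Note that this only works if the kernel actually maintains the preemption count outside interrupts, which as far as I know requires a preemptible kernel configuration.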

Although this makes the symptom go away, I am not sure it really solves the underlying issue.

Now there is no crash :)

But one issue I still see: even though a page fault is raised for the page, its pte_present flag is still set, which indicates that the page is present in memory.
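
One possible explanation, offered only as a guess, is that not every fault means the page is missing: a present page can still fault on a write to a read-only mapping, or when the kernel needs to update its accessed/dirty bookkeeping. The user-space sketch below models that general decision. The bit positions, the model_pte_t type and all model_* names are hypothetical and do not match the real ARM page table layout.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t model_pte_t;

/* Hypothetical bit assignments, chosen only for this illustration. */
#define MODEL_PTE_PRESENT (UINT32_C(1) << 0)
#define MODEL_PTE_WRITE   (UINT32_C(1) << 1)

enum model_fault_kind {
	MODEL_FAULT_DEMAND,        /* empty entry: page was never mapped */
	MODEL_FAULT_SWAP_IN,       /* non-empty but not present: swap entry */
	MODEL_FAULT_WRITE_PROTECT, /* present, write to a read-only mapping */
	MODEL_FAULT_ACCESS_UPDATE  /* present, only bookkeeping is needed */
};

/* Classifies a fault from the entry contents and the access type. */
static inline enum model_fault_kind
model_classify_fault(model_pte_t pte, bool is_write)
{
	if (!(pte & MODEL_PTE_PRESENT))
		return pte == 0 ? MODEL_FAULT_DEMAND : MODEL_FAULT_SWAP_IN;

	if (is_write && !(pte & MODEL_PTE_WRITE))
		return MODEL_FAULT_WRITE_PROTECT;

	return MODEL_FAULT_ACCESS_UPDATE;
}
```

In this model only the first two cases involve a page that is absent from memory; the last two are faults on a page whose present flag is set. Whether that is what happens in my case I cannot say without checking the fault status and the access type.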

If anyone has another approach, suggestions are welcome.

Regards
Vijayendra Suman