Re: [PATCH] fix rcu vs hotplug race

From: Ingo Molnar
Date: Monday, June 30, 2008 - 11:52 pm

* Ingo Molnar <mingo@elte.hu> wrote:


this is the patch i picked up:

-------------------------->
Subject: rcu: fix hotplug vs rcu race
From: Gautham R Shenoy <ego@in.ibm.com>
Date: Fri, 27 Jun 2008 10:17:38 +0530

Dhaval Giani reported this warning during cpu hotplug stress-tests:

| On running kernel compiles in parallel with cpu hotplug:
|
| WARNING: at arch/x86/kernel/smp.c:118
| native_smp_send_reschedule+0x21/0x36()
| Modules linked in:
| Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1
| [...]
|  [<c0110355>] native_smp_send_reschedule+0x21/0x36
|  [<c014fe8f>] force_quiescent_state+0x47/0x57
|  [<c014fef0>] call_rcu+0x51/0x6d
|  [<c01713b3>] __fput+0x130/0x158
|  [<c0171231>] fput+0x17/0x19
|  [<c016fd99>] filp_close+0x4d/0x57
|  [<c016fdff>] sys_close+0x5c/0x97

IMHO the warning is a spurious one.

cpu_online_map is updated by _cpu_down() using stop_machine_run().
Since force_quiescent_state() is invoked from an irqs-disabled section,
stop_machine_run() cannot run while a cpu is executing
force_quiescent_state(). Hence cpu_online_map is stable while we're
in the irqs-disabled section.

However, a cpu might have been offlined _just_ before we disabled irqs
while entering force_quiescent_state(), and the rcu subsystem might not
yet have handled the CPU_DEAD notification. In that case the offlined
cpu's bit is still set in rcp->cpumask.

Hence we use cpumask = (rcp->cpumask & cpu_online_map) to avoid sending
smp_send_reschedule() to an offlined CPU.
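
To make the masking step concrete, here is a small user-space sketch of
the idea. It is not the kernel code: the *_sketch names, the 32-bit
mask type and the apply_fix switch are all made up for illustration,
and plain integers stand in for cpumask_t and the real IPI path.

```c
#include <assert.h>

/* Hypothetical stand-in for cpumask_t: one bit per cpu, up to 32 cpus. */
typedef unsigned int cpumask_sketch_t;

#define NR_CPUS_SKETCH 32

/* Stand-in for cpu_online_map. */
static cpumask_sketch_t cpu_online_map_sketch;

/* Stand-in for rcp->cpumask: cpus RCU still expects a quiescent state from. */
static cpumask_sketch_t rcp_cpumask_sketch;

/* Models the hotplug side clearing the cpu's bit in the online map. */
static void sketch_cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	cpu_online_map_sketch &= ~(1u << cpu);
}

/* Models RCU handling CPU_DEAD later on, which ends the stale window. */
static void sketch_rcu_cpu_dead(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	rcp_cpumask_sketch &= ~(1u << cpu);
}

/*
 * Models the loop in force_quiescent_state(). Returns how many cpus
 * would be sent a reschedule, and stores in *warnings how many of those
 * were offline (the case that trips the WARN_ON in the report).
 * With apply_fix set, the mask is first ANDed with the online map.
 */
static int sketch_force_quiescent_state(int apply_fix, int *warnings)
{
	cpumask_sketch_t mask = rcp_cpumask_sketch;
	int cpu, sent = 0;

	if (apply_fix)
		mask &= cpu_online_map_sketch;

	*warnings = 0;
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;
		if (!(cpu_online_map_sketch & (1u << cpu)))
			(*warnings)++;	/* target is offline */
		sent++;
	}
	return sent;
}
```

With cpus 0-3 set in both masks and cpu 2 then taken offline via
sketch_cpu_offline(2), the unmasked loop still targets all four cpus
and counts one offline target. The masked loop targets three cpus and
counts none. Once sketch_rcu_cpu_dead(2) has run, both variants agree.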

Here's the timeline:

CPU_A						 CPU_B
--------------------------------------------------------------
cpu_down():					.
.					   	.
.						.
stop_machine(): /* disables preemption,		.
		 * and irqs */			.
.						.
.						.
take_cpu_down();				.
.						.
.						.
.						.
cpu_disable(); /*this removes cpu 		.
		*from cpu_online_map 		.
		*/				.
.						.
.						.
restart_machine(); /* enables irqs */		.
------WINDOW DURING WHICH rcp->cpumask is stale ---------------
.						call_rcu();
.						/* disables irqs here */
.						.force_quiescent_state();
.CPU_DEAD:					.for_each_cpu(rcp->cpumask)
.						.   smp_send_reschedule();
.						.
.						.   WARN_ON() for offlined CPU!
