Re: [patch] generic-ipi: fix the smp_mb() placement

Previous thread: [GIT PATCH] STAGING patches for 2.6-git by Greg KH on Wednesday, October 29, 2008 - 3:38 pm. (50 messages)

Next thread: [PATCH] ring-buffer: add paranoid checks for loops by Steven Rostedt on Wednesday, October 29, 2008 - 3:48 pm. (8 messages)
From: Suresh Siddha
Date: Wednesday, October 29, 2008 - 3:42 pm

While looking at some other issue recently, we encountered this smp_mb()
placement issue. x86-specific code also needs similar fixes; a patch for
that will follow soon.

Please review the appended generic-ipi fix.

thanks,
suresh
---

From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: generic-ipi: fix the smp_mb() placement

smp_mb() is needed on the sender (to make the memory operations globally
visible) before sending the ipi, and the receiver (on Alpha at least) needs
smp_read_barrier_depends() in the handler before reading the call_single_queue
list in a lock-free fashion.

On x86, x2apic mode register accesses for sending IPIs don't have serializing
semantics. So the need for smp_mb() before sending the IPI becomes more
critical in x2apic mode.

Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
smp_mb() doesn't mean anything on the sender, when the ipi receiver is not
doing anything special (like a memory fence) after clearing the CSD_FLAG_WAIT.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---

diff --git a/kernel/smp.c b/kernel/smp.c
index f362a85..75c8dde 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -51,10 +51,6 @@ static void csd_flag_wait(struct call_single_data *data)
 {
 	/* Wait for response */
 	do {
-		/*
-		 * We need to see the flags store in the IPI handler
-		 */
-		smp_mb();
 		if (!(data->flags & CSD_FLAG_WAIT))
 			break;
 		cpu_relax();
@@ -76,6 +72,11 @@ static void generic_exec_single(int cpu, struct call_single_data *data)
 	list_add_tail(&data->list, &dst->list);
 	spin_unlock_irqrestore(&dst->lock, flags);
 
+	/*
+	 * Make the list addition visible before sending the ipi.
+	 */
+	smp_mb();
+
 	if (ipi)
 		arch_send_call_function_single_ipi(cpu);
 
@@ -157,7 +158,7 @@ void generic_smp_call_function_single_interrupt(void)
 	 * Need to see other stores to list head for checking whether
 	 * list is empty without holding q->lock
 	 */
-	smp_mb();
+	smp_read_barrier_depends();
 ...
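
The ordering argument in the patch can be sketched in user space with C11
atomics. This is only an analogy, not kernel code: demo_csd, demo_queue,
demo_send() and demo_receive() are hypothetical names, an atomic doorbell flag
stands in for the IPI, a lock-free push stands in for the list_add_tail() done
under dst->lock (so entries come out in LIFO order here), and
atomic_thread_fence(memory_order_seq_cst) plays the role of the smp_mb() added
before arch_send_call_function_single_ipi(). The acquire operations on the
receiver are stronger than smp_read_barrier_depends(); they are used because
C11 compilers do not implement a cheaper dependency-only ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct call_single_data. */
struct demo_csd {
	struct demo_csd *next;
	int payload;
};

/* Hypothetical stand-in for call_single_queue plus the IPI line. */
struct demo_queue {
	_Atomic(struct demo_csd *) head;
	atomic_bool doorbell;		/* plays the role of the IPI */
};

static void demo_queue_init(struct demo_queue *q)
{
	atomic_init(&q->head, NULL);
	atomic_init(&q->doorbell, false);
}

/* Sender side: queue the entry, fence, then "send the IPI". */
static void demo_send(struct demo_queue *q, struct demo_csd *csd)
{
	struct demo_csd *old;

	assert(csd != NULL);

	/* Publish the entry; release ordering makes its fields visible. */
	old = atomic_load_explicit(&q->head, memory_order_relaxed);
	do {
		csd->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&q->head, &old, csd,
							memory_order_release,
							memory_order_relaxed));

	/* Analogue of the new smp_mb(): queue update before the doorbell. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&q->doorbell, true, memory_order_relaxed);
}

/*
 * Receiver side: returns the sum of all queued payloads, or -1 if the
 * doorbell was not rung.
 */
static int demo_receive(struct demo_queue *q)
{
	struct demo_csd *csd;
	int sum = 0;

	/* Acquire pairs with the sender's fence, so the head is up to date. */
	if (!atomic_exchange_explicit(&q->doorbell, false,
				      memory_order_acquire))
		return -1;

	/* Detach the whole list in one step, then walk it privately. */
	csd = atomic_exchange_explicit(&q->head, NULL, memory_order_acquire);
	for (; csd != NULL; csd = csd->next)
		sum += csd->payload;

	return sum;
}
```

Called in sequence on one thread, two demo_send() calls with payloads 40 and 2
followed by demo_receive() return 42. The fence only matters once the two
sides run on different threads: without it, nothing orders the queue update
ahead of the doorbell store, which is the situation the patch describes for
x2apic IPIs.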
From: Jens Axboe
Date: Thursday, October 30, 2008 - 12:20 am

Ditto

-- 
Jens Axboe

--

From: Suresh Siddha
Date: Thursday, October 30, 2008 - 9:30 am

No. We want the ipi receiver to see the new consistent data rather than possibly
old consistent data.

And on x86, smp_wmb() is a simple barrier() (in !CONFIG_X86_OOSTORE), which
doesn't do much in this case.

On x86, mfence (smp_mb()) will ensure that MSR-based APIC (x2apic) accesses (ipi)
become visible only after the memory operations before the smp_mb() are made
visible.

thanks,
suresh
--
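
The difference between the two barriers discussed in the message above can be
approximated in portable C11. This is a sketch under assumptions, not the
kernel's definitions: demo_smp_wmb_x86(), demo_smp_mb() and demo_publish() are
hypothetical helpers, atomic_signal_fence() is used as the closest standard
counterpart of barrier() (it constrains only the compiler and emits no
instruction), and atomic_thread_fence(memory_order_seq_cst) is used as the
counterpart of smp_mb() (on x86, compilers typically emit mfence or a locked
instruction for it).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Analogue of x86 smp_wmb() without CONFIG_X86_OOSTORE: compiler-only. */
static inline void demo_smp_wmb_x86(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Analogue of smp_mb(): a real hardware fence. */
static inline void demo_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Store a value, then order it before whatever follows. Returns the value
 * now held in *slot.
 */
static int demo_publish(int *slot, int value, bool need_hw_order)
{
	*slot = value;

	if (need_hw_order)
		demo_smp_mb();		/* CPU and compiler both constrained */
	else
		demo_smp_wmb_x86();	/* only the compiler is constrained */

	/* The non-serializing x2apic MSR write (the IPI) would follow here. */
	return *slot;
}
```

Both paths produce the same result on a single thread; the difference is only
in what the CPU is allowed to reorder around a following non-serializing
operation, which is why the compiler-only variant "doesn't do much" for the
x2apic case.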

From: Jens Axboe
Date: Thursday, October 30, 2008 - 10:25 am

OK, I'm convinced. I'll queue up the patch, thanks!

-- 
Jens Axboe

--

From: Ingo Molnar
Date: Thursday, October 30, 2008 - 11:53 am

nice! Did you see an actual lockup due to this? Seems like a v2.6.28 
fix to me in any case.

	Ingo
--

From: Suresh Siddha
Date: Thursday, October 30, 2008 - 1:23 pm

We didn't see the lockup in our tests, but Xen folks reported similar failures.

Yes.

thanks,
suresh
--

From: Jeremy Fitzhardinge
Date: Thursday, October 30, 2008 - 10:10 pm

...really?  I don't remember anything like that, but perhaps I'm 
forgetting something.  In Xen the IPI is sent with a hypercall, which is 
definitely a solid enough barrier for these purposes.

    J
--

From: Ingo Molnar
Date: Friday, October 31, 2008 - 2:39 am

i think Suresh might be referring to some of the fragilities Xen had 
with generic-ipi. But those AFAICT were due to the on-stack lifetime 
bug that Nick fixed via the kmalloc? That was a v2.6.26-ish issue.

	Ingo
--

From: Jeremy Fitzhardinge
Date: Friday, October 31, 2008 - 4:12 am

Right, that's all I could think of.

    J
--

From: Suresh Siddha
Date: Friday, October 31, 2008 - 9:53 am

No. I am referring to a Xen hypervisor code fix recently done by the Xen
team at Intel.

http://xenbits.xensource.com/xen-unstable.hg?rev/50170dc8649c

thanks,
suresh
--

From: Jeremy Fitzhardinge
Date: Friday, October 31, 2008 - 1:30 pm

Ah, yes, OK then.

    J
--

From: Ingo Molnar
Date: Monday, November 3, 2008 - 3:17 am

ok - so that makes it a v2.6.28 item i guess.

	Ingo
--

From: Jeremy Fitzhardinge
Date: Monday, November 3, 2008 - 4:48 pm

The case Suresh is talking about was a fix to Xen itself, rather than on 
the kernel side, so it doesn't need to be a .28 issue on Xen's account.

    J
--

From: Ingo Molnar
Date: Tuesday, November 4, 2008 - 2:19 am

ok - but still the portion of the fix that strengthens barriers looks 
obvious to have and there's little downside that i can see.

Suresh, you might want to split the patch(es) in two: get the barrier 
strengthening changes into v2.6.28 (to fix the x2apic bug), while the 
aspects that _weaken_ barriers can wait for v2.6.29.

With that it would be a 100% safe change for v2.6.28-rc4.

	Ingo
--

From: Suresh Siddha
Date: Tuesday, November 4, 2008 - 3:25 pm

Ok. I just posted three patches (including the x86 specific change).

[patch 1/3] generic-ipi: add smp_mb() before sending the IPI
[patch 2/3] x86: Add smp_mb() before sending INVALIDATE_TLB_VECTOR
[patch 3/3] generic-ipi: fix the smp_mb() usage

First two patches are safe to go into v2.6.28. Third patch can wait for v2.6.29.

thanks,
suresh
--

From: Jens Axboe
Date: Wednesday, November 5, 2008 - 2:20 am

I already have the combined 1+3 patch queued up...

-- 
Jens Axboe

--
