Re: [RFC][RT][PATCH 3/4] rtmutex: Revert Optimize rt lock wakeup

From: Gregory Haskins
Date: Tuesday, December 28, 2010 - 7:06 am

>>> On 12/23/2010 at 11:54 PM, in message
<1293166464.22802.415.camel@gandalf.stny.rr.com>, Steven Rostedt
<rostedt@goodmis.org> wrote: 

Well, I think that would be a good datapoint and is one of the things I'd like to see.


This is why I am skeptical.  You are essentially asserting there are two issues here, IIUC:

1) The intent of avoiding a wakeup is broken and we take the double whammy of a mb()
plus the wakeup() anyway.

2) mb() is apparently slower than wakeup().

I agree (1) is plausible, though I would like to see the traces to confirm.  It's been a long time
since I looked at that code, but I think the original code either ran in RUNNING_MUTEX and was
inadvertently broken in the meantime, or the other cpu would have transitioned to RUNNING on
its own when we flipped the owner before the release-side check was performed.  Or perhaps
we just plain screwed this up and it was racy ;)  I'm not sure.  But as Peter (M) stated, it seems
like a shame to walk away from the concept without further investigation.  I think everyone can
agree that at the very least, if it is in fact taking a double whammy, we should fix that.
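
To make issue (1) concrete, here is a minimal userspace sketch of the two release paths being
compared.  It is not the rtmutex code: the names (struct waiter, WAITER_SLEEPING,
release_optimized, and so on) are hypothetical, the C11 fence is only a stand-in for mb(), and
fake_wakeup() is only a stand-in for a real wakeup.  It just counts what each path pays for.

```c
#include <stdatomic.h>

/* Hypothetical waiter states; stand-ins for the kernel task states. */
enum waiter_state { WAITER_SLEEPING, WAITER_RUNNING };

struct waiter {
    _Atomic int state;
};

/* What one release cost us: how many barriers and how many wakeups. */
struct release_stats {
    int barriers;
    int wakeups;
};

static void waiter_init(struct waiter *w, int state)
{
    atomic_init(&w->state, state);
}

/* Stand-in for a real wakeup: just marks the waiter as running. */
static void fake_wakeup(struct waiter *w)
{
    atomic_store(&w->state, WAITER_RUNNING);
}

/*
 * "Optimized" release: pay for a full barrier, then skip the wakeup
 * only if the waiter is already observed to be running.
 */
static void release_optimized(struct waiter *w, struct release_stats *st)
{
    atomic_thread_fence(memory_order_seq_cst); /* stand-in for mb() */
    st->barriers++;

    if (atomic_load(&w->state) != WAITER_RUNNING) {
        fake_wakeup(w);
        st->wakeups++;
    }
}

/* Reverted behavior: no check, always issue the wakeup. */
static void release_unconditional(struct waiter *w, struct release_stats *st)
{
    fake_wakeup(w);
    st->wakeups++;
}
```

If the waiter is never seen as running at the time of the check, release_optimized() pays for
one barrier plus one wakeup on every release (the double whammy), while
release_unconditional() pays for the wakeup alone.  The optimization only wins when the check
actually succeeds often enough, which is what the traces would show.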

For (2), I am skeptical in two parts ;).  You stated you thought mb() was just as expensive as a
wakeup, which seems suspect to me, given a wakeup needs to be a superset of a barrier
II[R|U]C.  Let's call this "2a".  In addition, your results, where you removed the logic, went
straight to a wakeup(), and found dbench was actually faster than the "fixed mb()" path, would
imply wakeup() is actually _faster_ than mb().  Let's call this "2b".

For (2a), I would like to see some traces that compare mb() to wakeup() (of a presumably
already running task that happens to be in the INTERRUPTIBLE state) to be convinced that
wakeup() is equal/faster.  I suspect it isn't.

For (2b), I would suggest that we don't rely on dbench alone in evaluating the merit of the
change.  In some ways, it's a great test for this type of change since it leans heavily on the coarse
VFS locks.  However, dbench is also pretty odd and thrives on somewhat chaotic behavior.  For
instance, it loves the "lateral steal" logic, even though this patch technically breaks fairness.  So
I would propose that a suite of benchmarks known for creating as much lock contention as
possible be run in addition to dbench.

Happy new year, all,
-Greg

