RE: [PATCH] x86: Export tsc related information in sysfs

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Dan Magenheimer
Date: Monday, May 17, 2010 - 8:23 am

(note important alternate proposal suggested below, please
don't get bored and stop reading yet ;-)


You've convinced ME that it is NOT a good idea in many many cases
and that some of those cases are very very hard to detect.  Unfortunately,
convincing ME is not sufficient.  The problem is that Pandora's
box is already open and it is getting ever more wide open. (Each
of you has implied that TSC on future systems is getting even
more likely to be reliable.) Not only is rdtsc unprivileged
but so is the CPUID Invariant TSC bit.  And the kernel already
publishes current_clocksource and puts a tempting MHz rate in
its logs.  While we all might hope that every system programmer
will read this long thread and also be convinced never to use
rdtsc, the problem is there is a high and increasing chance that
any given systems programmer will write code that uses rdtsc and,
in all of his/her test machines will NEVER see a problem.  But,
sadly, some of the customers using that app WILL see the problem
BUT neither the customers nor the systems programmer may ever know
what really went wrong (unless they hire Thomas to debug it :-)

So let me suggest inverting the logic of the patch and maybe
it will serve all of us better.  (see proposal below)



OK, so let's invert the sense of the sysfs file and call it (for now)
"tsc_detected_as_UNreliable".  Then anytime the kernel detects
a failed warp test (or any other suspicious condition), it changes
the bit from 0 to 1 effectively saying "if you are using rdtsc
against our recommendation, we told you that it might go bad and
it has, so consider yourself warned that some of the timestamps
you've taken since the last time you've checked this flag
may be b*rked"

IMHO, addressing the issue directly and clearly documenting it
(instead of trying to hide the dirty laundry in the kernel)
will result in far better education of systems programmers
and far fewer end user problems. Which raises another good analogy:

You are telling teenagers to abstain and I am proposing that we
instead encourage them to use a condom.

You are simply not going to stop systems programmers from using
rdtsc... let's at least allow them to practice "safe TSC usage"
which, to extend the analogy, we all know is still not 100.0%
effective but, if they are going to do "it" anyway, is still
far better than completely unprotected TSC usage.

(The remainder of this response is discussion of individual
points raised, so can be skipped by most readers.)

====


These are all very good reasons where the kernel might turn
on a "tsc_detected_as_unreliable" bit.  And CPU-hotplug_add
too.

And, by the way, are any of these valid in a virtual machine?


(and in the "tsc_detected_as_unreliable" == 1 case)
Well, some possibilities are:
- Disable the heavy TSC usage and log/notify that it has been
  disabled because the TSC is not reliable.
- Switch to gettimeofday and log/notify that, sorry, overall
  performance has become slower due to the fact that the
  TSC has become reliable
- Log a message telling the user to confirm that the hardware
  they are using is on the app's supported/compatibility list
  and that they are using the latest firmware and that their
  thermal logs are OK, because something might not be quite
  right with their system


That might have helped, because the enterprise app I mentioned
earlier was 32-bit... but I'll bet the genie is out of the
bottle now and the app has already shipped using rdtsc.


<Embarrassed to not know this rule, Dan goes off and googles
but fails to find any good matches before his TSC goes bad>


Agreed, I've measured 30-ish and 60-ish cycles on a
couple of machines.


I think vsyscall is an excellent idea and I'm very much in
favor of continuing to improve it and encouraging people
to use it.  But until it either "always" "just works" in "all"
of the software/hardware environments used by a systems
programmer in their development and testing and in the systems
the apps get deployed on (including virtual machines) OR until
it clearly provides an indicator that it is hiding dirty performance
laundry, IMHO it won't convince the (admittedly undereducated)
pro-TSC crowd.


Here I was referring to clock skew/drift, not the "fixed offset"
problem.  I'm far from an expert in PLL's etc, but I think if
the clock signal is delayed far enough to cause TSC to skew
significantly, eventually some critical cache coherency protocol
is eventually going to miss a beat and screw up and corrupt data.


That has never been my intent, though others might be interested
in that.


I believe (and this is strictly a personal opinion based on my
view of human psychology) that adding more obscure interfaces and
more obscure options is a losing battle.


As we both know, the ticking time bomb is already there.  IMHO this
revised proposal provides you with "plausible deniability"**
"The kernel now advertises when it detects the TSC is bad or has gone
bad... did your app vendor check the kernel-provided info?"

** http://en.wikipedia.org/wiki/Plausible_deniability 
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH] x86: Export tsc related information in sysfs, Venkatesh Pallipadi, (Fri May 14, 6:40 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Jaswinder Singh Rajput, (Sat May 15, 5:35 am)
Re: [PATCH] x86: Export tsc related information in sysfs, Venkatesh Pallipadi, (Sat May 15, 7:37 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Sat May 15, 3:32 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Arjan van de Ven, (Sat May 15, 10:43 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Thomas Gleixner, (Sun May 16, 2:20 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Sun May 16, 9:42 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Thomas Gleixner, (Sun May 16, 12:14 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Arjan van de Ven, (Sun May 16, 1:29 pm)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Sun May 16, 6:31 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Arjan van de Ven, (Sun May 16, 10:06 pm)
RE: [PATCH] x86: Export tsc related information in sysfs, Thomas Gleixner, (Mon May 17, 3:20 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Mon May 17, 8:23 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Thomas Gleixner, (Mon May 17, 3:36 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Peter Zijlstra, (Tue May 18, 2:58 am)
Re: [PATCH] x86: Export tsc related information in sysfs, Peter Zijlstra, (Tue May 18, 3:03 am)
Re: [PATCH] x86: Export tsc related information in sysfs, Peter Zijlstra, (Tue May 18, 4:58 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 8:13 am)
Re: [PATCH] x86: Export tsc related information in sysfs, H. Peter Anvin, (Tue May 18, 9:40 am)
Re: [PATCH] x86: Export tsc related information in sysfs, Peter Zijlstra, (Tue May 18, 9:52 am)
Re: [PATCH] x86: Export tsc related information in sysfs, H. Peter Anvin, (Tue May 18, 10:04 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 10:49 am)
Re: [PATCH] x86: Export tsc related information in sysfs, H. Peter Anvin, (Tue May 18, 11:46 am)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 12:00 pm)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 12:16 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, H. Peter Anvin, (Tue May 18, 12:26 pm)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 1:29 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, H. Peter Anvin, (Tue May 18, 1:34 pm)
RE: [PATCH] x86: Export tsc related information in sysfs, Dan Magenheimer, (Tue May 18, 2:02 pm)
Re: [PATCH] x86: Export tsc related information in sysfs, Peter Zijlstra, (Tue May 18, 11:26 pm)