On Fri, Aug 22, 2008 at 12:29:10PM +1000, Nick Piggin wrote:
Only from having to do this analysis over and over again trying to
understand what has changed in the elevator that has negated the
effect of some previous optimisation....
It's exactly this sort of complexity that is the problem. When the
behaviour of such things changes, filesystems that are optimised for
the previous behaviour are not updated - we're not even aware that
the elevator has been changed in some subtle manner that breaks
the optimisations that have been done.
To keep on top of this, we keep adding new variations and types and
expect the filesystems to make best use of them (without
documentation) to optimise for certain situations. Example - the
new(ish) BIO_META tag that only CFQ understands. I can change the
way XFS issues bios to use this tag to make CFQ behave the same way
it used to w.r.t. metadata I/O from XFS, but then deadline and
AS will probably regress because they don't understand that tag and
still need the old optimisations that just got removed. Ditto for
prioritised bio dispatch - CFQ supports it but none of the others
do.
IOWs, I am left with a choice - optimise for a specific elevator
(CFQ) to the detriment of all others (noop, as, deadline), or make
the filesystem work best with the simple elevator (noop) and
consider the smarter schedulers deficient if they are slower than
the noop elevator....
You're suggesting that I add complexity to solve the too-much-complexity
problem.... ;)
I get private email fairly often asking questions as to why XFS is
slower going from, say, 2.6.23 to 2.6.24 and then speeds back up in
2.6.25. I've seen a number of cases where the answer to this was to
avoid elevator 'x' with XFS in 2.6.x because for some reason it is much,
much slower than the others on that workload on that hardware.
As seen earlier in this thread, this can be caused by a problem with
the hardware, firmware, configuration, driver bugs, etc - there are
so many combinations of variables that can cause performance issues
that often the only 'macro' level change that you can make to avoid
them is to switch schedulers. IOWs, while a specific scheduler has
not changed, the code around it has changed sufficiently for a
specific elevator to show a regression compared to the other
elevators.....
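
For anyone who hasn't done it, switching elevators at runtime goes
through the per-device sysfs file /sys/block/<dev>/queue/scheduler,
which lists the available schedulers with the active one in square
brackets. A minimal sketch of reading and changing it follows; the
helper names are made up for illustration, the write needs root, and
the set of schedulers offered depends on the kernel build.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split a sysfs scheduler line into (active, all_names).

    The kernel marks the active scheduler with square brackets,
    e.g. "noop anticipatory deadline [cfq]".
    """
    active = None
    names = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        names.append(token)
    return active, names


def scheduler_path(dev):
    """Path of the scheduler control file for a block device name."""
    return Path("/sys/block") / dev / "queue" / "scheduler"


def switch_scheduler(dev, name):
    """Select scheduler `name` for `dev` (hypothetical helper, needs root)."""
    path = scheduler_path(dev)
    _, names = parse_schedulers(path.read_text())
    if name not in names:
        raise ValueError(f"{name!r} is not offered for {dev}: {names}")
    # Writing the bare scheduler name to the file selects it.
    path.write_text(name)
```

Something like switch_scheduler("sda", "deadline") is then the whole
workaround, which is exactly why these regressions rarely get reported.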
Basically, the complexity of the interactions between the
filesystems, elevators and the storage devices is such that there
are transient second order effects occurring that are not reported
widely because they are easily worked around by switching elevators.
Which can still change through things like changes in merging behaviour.
Granted, it is less complex, but still we can have subtle changes
having major impact in less commonly run workloads...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com