"After posting some benchmarks involving cfs, I got some feedback, so I decided to do a follow-up that'll hopefully fill in the gaps many people wanted to see filled," Rob Hussey began. He added, "this time around I've done the benchmarks against 2.6.21, 2.6.22-ck1, and 2.6.23-rc6-cfs-devel (latest git as of 12 hours ago)." Rob briefly summarized, "the only analysis I'll offer is that both sd and cfs are improvements, and I'm glad that there is a lot of work being done in this area of linux development. Much respect to Con Kolivas, Ingo Molnar, and Roman Zippel, as well all the others who have contributed."
Referring to a chart in which the blue line represented the CFS process scheduler and the green line represented the SD "staircase" process scheduler, Ingo Molnar noted, "heh - am i the only one impressed by the consistency of the blue line in this graph? :-) [ and the green line looks a bit like a .. staircase? ]" He acknowledged some slowdown in CFS compared to SD in one of the benchmarks, "-ck1 is 0.8% faster in this particular test." Ingo then explained, "many things happened between 2.6.22-ck1 and 2.6.23-cfs-devel that could affect performance of this test. My initial guess would be sched_clock() overhead." In further testing he applied a low-res-sched-clock patch that improved CFS performance, leading him to conclude, "the performance difference between -ck and -cfs-devel seems to be mostly down to the more precise (but slower) sched_clock() introduced in v2.6.23 and to the startup penalty of freshly created tasks." When asked if the low-res-sched-clock patch was likely to be merged, Ingo replied:
"I don't think so - we want precise/accurate scheduling before performance. (otherwise tasks working off the timer tick could steal away cycles without being accounted for them fairly, and could starve out all other tasks.) Unless the difference was really huge in real life - but it isn't."
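Ingo's point about tasks "working off the timer tick" can be illustrated with a toy model. The sketch below is not kernel code: the function name, the 1000-microsecond tick, and the fixed run pattern are all assumptions made for illustration. It compares precise accounting, which charges each task for the time it actually ran, with tick-sampled accounting, which charges a whole tick period to whichever task is on the CPU when the tick fires.

```python
# Toy model (hypothetical, not kernel code): precise vs. tick-sampled CPU accounting.
TICK_US = 1000  # assumed tick period in microseconds


def simulate(num_ticks, dodger_run_us=900):
    """Return (precise, sampled) CPU time in microseconds per task.

    Assumed schedule inside every tick period: the "dodger" task runs first
    for dodger_run_us, then sleeps just before the tick; the "victim" task
    runs for the remainder and is on the CPU when the tick fires.
    """
    precise = {"dodger": 0, "victim": 0}
    sampled = {"dodger": 0, "victim": 0}
    for _ in range(num_ticks):
        # Precise accounting: charge each task for what it really used.
        precise["dodger"] += dodger_run_us
        precise["victim"] += TICK_US - dodger_run_us
        # Tick-sampled accounting: the task running at the tick pays for
        # the entire period, no matter how little of it it used.
        running_at_tick = "victim" if dodger_run_us < TICK_US else "dodger"
        sampled[running_at_tick] += TICK_US
    return precise, sampled


if __name__ == "__main__":
    precise, sampled = simulate(100)
    print("precise:", precise)
    print("sampled:", sampled)
```

In this model the dodger uses 90% of the CPU yet is charged nothing under tick sampling, while the victim is billed for all of it. A scheduler that trusted the sampled numbers would keep favouring the dodger, which is the starvation risk the quote describes.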
Among the concerns expressed about the Completely Fair Scheduler were reports that it might not handle 3D games as well as the SD scheduler. In a recent thread, Ingo Molnar noted, "people are regularly testing 3D smoothness, and they find CFS good enough and that matches my experience as well (as limited as it may be). In general my impression is that CFS and SD are roughly on par when it comes to 3D smoothness." He noted that all known regressions had been reported against earlier versions of CFS and had long since been fixed, and that he was very interested in any new reports of regressions against the current version of the code, "what is more interesting (to me) is not the positive CFS feedback but negative CFS feedback (although positive feedback certain _feels_ good so don't hold it back intentionally ;-)," adding, "there are no open 3D related regressions for CFS at the moment." Ingo then offered benchmarks illustrating the improved 3D performance of CFS, with numbers showing it to perform as well as, and in some cases considerably better than, the SD scheduler.
Linus Torvalds noted, "I don't think _any_ scheduler is perfect, and almost all of the time, the RightAnswer(tm) ends up being not 'one or the other', but 'somewhere in between'." He noted that he was confident that he'd made the right decision in merging CFS, then added, "but at the same time, no technical decision is ever written in stone. It's all a balancing act. I've replaced the scheduler before, I'm 100% sure we'll replace it again. Schedulers are actually not at all that important in the end: they are a very very small detail in the kernel."
During the recent debates about the Completely Fair Scheduler, Ingo Molnar explained why he rewrote the scheduler, "CFS started out as an experiment to simplify the scheduler, to clean up the after-effects of a better-desktop-scheduling patch Mike Galbraith sent me. Had anyone told me at that time that I'd end up writing a new scheduler I'd have laughed at the suggestion and I'd have pointed to the large number of pending patches of mine in forms of the -rt tree, the syslet/threadlet code and other stuff that needs fixing a lot more urgent than the task scheduler." Regarding the recent debate he added, "there was simply no code in existence before CFS which has proven the code simplicity/design virtues of 'fair scheduling' - SD was more of an argument _against_ it than for it. I think maybe even Con might have been surprised by that simplicity: in his first lkml reaction to CFS he also wrote that he finds the CFS code 'beautiful', and my reply to Con's mail still addresses a good number of points raised in this thread i think." Ingo also described his development style:
"I don't typically write code because I'm particularly 'convinced' about an idea or because I 'believe in' an idea, I mostly write code to _check_ whether an idea is worth advancing or not. Writing code is my form of 'thinking', and releasing patches is my form of telling others about my 'thoughts'. I might have guesses about how well something will work out in practice (and I'd certainly be a fool to go out coding blindly), but surprises happen almost always, both in positive and in negative direction, and even with relatively simple patches."
Discussion continues regarding the choice to merge the CFS scheduler into the upcoming 2.6.23 kernel. A recent thread asked whether merging the plugsched code, which would have allowed the CFS and SD schedulers to coexist in the mainline kernel, could have avoided the recent flamewars. Linus Torvalds pointed to the ManagementStyle documentation and acknowledged that while it's better to avoid flamefests when possible, "at the same time, I don't like playing politics with technology. The kernel is a technical project, and I make technical decisions. So I absolutely detest adding code for 'political' reasons."
As to the technical reason why he wasn't interested in making the CPU scheduler pluggable, Linus explained, "this is one approach, but it's actually one that I personally think is often the worst possible choice. Why? Because it ends up meaning that you never get the cross-pollination from different approaches (they stay separate 'modes'), and it's also usually really bad for users in that it forces the user to make some particular choice that the user is usually not even aware of." He went on to note, "I personally think that it's much better to find a setup that works 'well enough' for people, without having modal behaviour. People complain and gripe now, but what people seem to be missing is that it's a journey, not an end-of-the-line destination. We haven't had a single release kernel with the new scheduler yet." He added, "this, btw, has nothing to do with schedulers per se. We have had these exact same issues in the memory management too - which is a lot more complex than scheduling."
"People who think SD was 'perfect' were simply ignoring reality," Linus Torvalds began in a succinct explanation as to why he chose the CFS scheduler written by Ingo Molnar instead of the SD scheduler written by Con Kolivas. He continued, "sadly, that seemed to include Con too, which was one of the main reasons that I never [entertained] the notion of merging SD for very long at all: Con ended up arguing against people who reported problems, rather than trying to work with them." He went on to stress the importance of working toward a solution that is good for everyone, "that was where the SD patches fell down. They didn't have a maintainer that I could trust to actually care about any other issues than his own." He then offered some praise to Ingo, "as a long-term maintainer, trust me, I know what matters. And a person who can actually be bothered to follow up on problem reports is a *hell* of a lot more important than one who just argues with reporters." Linus went on to note a comparison between the two schedulers:
"I realize that this comes as a shock to some of the SD people, but I'm told that there was a university group that did some double-blind testing of the different schedulers - old, SD and CFS - and that everybody agreed that both SD and CFS were better than the old, but that there was no significant difference between SD and CFS."
Con Kolivas continues to maintain the performance-oriented -ck patchset that he started in early 2004, "this patchset is designed to improve system responsiveness and interactivity. It is configurable to any workload but the default -ck patch is aimed at the desktop and -cks is available with more emphasis on serverspace." In Con's latest release, 2.6.21-ck1, he notes that he has updated the patchset to include his improved SD cpu scheduler, "the staircase-deadline cpu scheduler has replaced the old staircase design in this version."
Con goes on to explain, "the staircase-deadline cpu scheduler can be set in either purely forward-looking mode for absolutely rigid fairness and cpu distribution according to nice level, or it can allow a small per-process history to smooth out cpu usage perturbations common in interactive tasks by enabling this sysctl. While small fairness issues can arise with this enabled, overall fairness is usually still strongly maintained and starvation is never possible. Enabling this can significantly smooth out 3d graphics and games." Swap prefetch is also among the patches included in the -ck patchset.
2.4 kernel maintainer Willy Tarreau ran some tests to compare Con Kolivas's Staircase Deadline CPU scheduler with Ingo Molnar's new Completely Fair Scheduler. He summarized his experiences:
"I think that CFS is based on a more promising concept but is less mature and is dangerous right now with certain workloads. SD shows some strange behaviours like not using all CPU available and a little jerkyness, but is more robust and may be the less risky solution for a first step towards a better scheduler in mainline, but it may also probably be the last O(1) scheduler, which may be replaced sometime later when CFS (or any other one) shows at the same time the smoothness of CFS and the robustness of SD."
Some debate was raised by logic added since CFS version 3 to automatically renice the X process for better GUI responsiveness. The CFS changelog comment labels the change as a usability fix, explaining, "automatic renicing of kernel threads such as keventd, OOM tasks and tasks doing privileged hardware access (such as Xorg)." Ingo posted a standalone patch demonstrating how these processes are detected and automatically reniced, offering it for inclusion into Con's Staircase scheduler. Willy concurred that it was a good idea, "I think it could be a good idea since you recommend to renice X with SD. Most of the problem users are facing with renicing X is that they need to change their configs or scripts. If the kernel can reliably detect X and handle it differently, why not do it ?" Con was less convinced, "hmm well I have tried my very best to do all the changes without changing 'policy' as much as possible since that trips over so many emotive issues that noone can agree on, and I don't have a strong opinion on this as I thought it would be better for it to be a config option for X in userspace instead. Either way it needs to be turned on/off by admin and doing it by default in the kernel is... not universally accepted as good."