"It looks to be about 2.1% increase in time to do the make/mount/unmount operations with the marker patches in place and no blktrace operations," Alan Brunelle summarized some benchmarks testing the overhead of the kernel markers patches. He continued, "with the blktrace operations in place we see about a 3.8% decrease in time to do the same ops." Block layer maintainer Jens Axboe responded favorably, "thanks for running these numbers. I don't think you have to bother with it more. My main concern was a performance regression, increasing the overhead of running blktrace." He added, "I'd say the above is Good Enough for me," acking the kernel marker patches.
Jens went on to muse, "I do wonder about that performance _increase_ with blktrace enabled. I remember that we have seen and discussed something like this before, it's still a puzzle to me..." Mathieu Desnoyers agreed, "interesting question indeed," going on to suggest possible future tests to understand the unexpected performance increase. blktrace is a block layer IO tracing tool for providing detailed information about request queue operations, originally developed by Jens Axboe and merged into the mainline kernel in 2.6.17-rc1.
Thomas Gleixner and Ingo Molnar [interview] posted an update of their high-res timers kernel patches for the 2.6.17 kernel, "upon which we based a tickless kernel (dyntick) implementation and a 'dynamic HZ' feature as well". The patch currently works for x86, with ports to x86_64, PPC and ARM in the works. Thomas explains, "the high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX timers and nanosleep() to be as accurate as the hardware allows (around 1usec on typical hardware). This feature is transparent - if enabled it just makes these timers much more accurate than the current HZ resolution." He goes on to describe the tickless kernel:
"The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer interrupts: if there is no timer to be expired for say 1.5 seconds when the system goes idle, then the system will stay totally idle for 1.5 seconds. This should bring cooler CPUs and power savings: on our (x86) testboxes we have measured the effective IRQ rate to go from HZ to 1-2 timer interrupts per second.
"This feature is implemented by driving 'low res timer wheel' processing via special per-CPU high-res timers, which timers are reprogrammed to the next-low-res-timer-expires interval. This tickless-kernel design is SMP-safe in a natural way and has been developed on SMP systems from the beginning."
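The interrupt savings described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the patch's implementation: the function names are hypothetical, HZ=250 is an assumed configuration value, and the model counts timer interrupts only, ignoring every other interrupt source.

```python
import math


def periodic_tick_interrupts(hz, idle_seconds):
    """Timer interrupts raised by a fixed-rate tick during an idle period."""
    return math.floor(hz * idle_seconds)


def tickless_interrupts(timer_expiries, idle_seconds):
    """Timer interrupts when the hardware is programmed only for pending timers.

    timer_expiries holds expiry times in seconds, measured from the start of
    the idle period. Timers expiring at the same instant share one interrupt,
    and timers expiring after the idle period ends are not counted.
    """
    return len({t for t in timer_expiries if 0 <= t <= idle_seconds})


if __name__ == "__main__":
    HZ = 250      # assumed tick rate, not taken from the announcement
    idle = 1.5    # the idle period used as the example in the quote
    print("periodic:", periodic_tick_interrupts(HZ, idle))
    print("tickless:", tickless_interrupts([1.5], idle))
```

With the next timer due 1.5 seconds into the idle period, the periodic model wakes the CPU on every tick, while the tickless model wakes it once, when that timer expires.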
Linus Torvalds announced the release of the 2.6.17 Linux kernel this past weekend, following the previous stable kernel release by three months [story]. He noted, "not a lot of changes since the last -rc, the bulk is actually some last-minute MIPS updates and s390 futex changes, the rest tend to be various very small fixes that trickled in over the last week. Have fun with it". The latest kernel source can be downloaded from your nearest Linux Kernel Archive mirror [story]. You can browse through all the changes using the gitweb interface.
Andrew Morton [interview] offered a list of patches in his mm tree, summarizing for each his plans as to whether or not they will be pushed to Linus for inclusion in the upcoming 2.6.18 kernel. Comments on the patches range from the simple "will merge" to pushing them to others for review. One of the more entertaining comments followed a set of 33 patches where Andrew noted, "This is Oleg's romp through the core kernel. There's a ton of material here. I'll probably send it all to Linus and ask him to review it. (aka blame-shifting)." Later in the thread he explained, "it's just a whole lot of code in areas which are tricky and in which few people work and in which reviewing resources are slight."
One set of patches was refused with the comment, "still don't have a compelling argument for this, IMO": Con Kolivas' [interview] swap prefetching efforts [story]. The feature was discussed in a couple of follow-up threads. In response to some concerns raised by Jens Axboe, Con explained the implementation a little further, "If the system is idle it doesn't cost anything to bring those pages in (laptop mode disables any prefetching if you're thinking about power consumption on laptops). And if the system wants the ram that has been filled with prefetched pages wrongly, the prefetched pages are at the tail end of the inactive LRU list with a copy on backing store so if they're not accessed they'll be the first thing dropped in preference to anything else, without any I/O."
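Con's argument can be sketched with a toy model of an inactive list. Everything here is an assumption made for illustration: the `InactiveList` class, its method names, and the dictionary page records are hypothetical and do not reflect the kernel's actual data structures. The model only shows why pages placed at the tail with a clean copy on backing store are dropped first and need no writeback.

```python
from collections import deque


class InactiveList:
    """Toy inactive LRU list; reclaim takes pages from the tail."""

    def __init__(self):
        # Left end is the head, right end is the tail (reclaimed first).
        self.pages = deque()

    def add_normal(self, name, dirty):
        # Ordinary pages enter at the head, far from reclaim.
        self.pages.appendleft({"name": name, "dirty": dirty})

    def add_prefetched(self, name):
        # Prefetched pages are clean (the swap copy still exists)
        # and are placed at the tail.
        self.pages.append({"name": name, "dirty": False})

    def reclaim(self, count):
        """Drop up to `count` pages from the tail.

        Returns the dropped page names and the number of writes that
        dirty pages among them would have required.
        """
        dropped, writes = [], 0
        for _ in range(min(count, len(self.pages))):
            page = self.pages.pop()
            if page["dirty"]:
                writes += 1
            dropped.append(page["name"])
        return dropped, writes


if __name__ == "__main__":
    lru = InactiveList()
    lru.add_normal("a", dirty=True)
    lru.add_normal("b", dirty=False)
    lru.add_prefetched("p1")
    lru.add_prefetched("p2")
    print(lru.reclaim(2))
```

In this model, reclaiming two pages removes only the prefetched ones and reports zero writes, leaving the ordinary pages untouched.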