Andrew Morton replied to a commit message making 4K stacks the default, saying, "this patch will cause kernels to crash." Ingo Molnar replied, "what mainline kernels crash and how will they crash? Fedora and other distros have had 4K stacks enabled for years." He added, "we've conducted tens of thousands of bootup tests with all sorts of drivers and kernel options enabled and have yet to see a single crash due to 4K stacks." During the lengthy discussion it was suggested that nfs+xfs+raid kernel configurations and the use of ndiswrapper are the most common causes of 4K stack overflows.
Andi Kleen questioned the usefulness of 4k stacks, "as far as I can figure out they are not [a worthy goal]. They might have been a worthy goal on crappy 2.4 VMs, but these times are long gone." Arjan van de Ven suggested that though the 2.6 VM is much improved over the 2.4 VM, fragmentation with 8K stacks remains an unsolvable problem, "it's basic math; the Linux VM gets to deal with both short and long lasting allocations; no matter how hard you try to get some degree of fragmentation; especially due to the 15:1 acceleration you get due to the lowmem issue. And before you say 'you should use 64 bit on such machines'; I would love it if more people used 64 bit linux. Sadly the adoption rate of that is not very good still.... by far ;(" In another email, Arjan listed two advantages to 4K stacks, "1) less memory consumption in the lowmem zone (critical for enterprise use, also good for general performance), and 2) kernel stacks at 8K are one of the most prominent order-1 allocations in the kernel; again with big-memory systems the fragmentation of the lowmem zone is a problem (and the distros that ship 4K stacks went there because of customer complaints)".
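Arjan's second point can be made concrete with a toy model. The sketch below is not kernel code: it assumes 4 KiB pages and an allocator that, like a buddy allocator, can only hand out naturally aligned blocks of 2^order pages. The names `allocation_order` and `can_allocate` are invented for this illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on 32-bit x86


def allocation_order(size, page_size=PAGE_SIZE):
    """Return the smallest order such that 2**order pages hold `size` bytes."""
    pages = -(-size // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def can_allocate(free_pages, order):
    """Check a toy free-page map for one aligned run of 2**order free pages.

    `free_pages` is a list of booleans, True meaning the page is free.
    Only naturally aligned runs are considered, mirroring buddy-style blocks.
    """
    span = 1 << order
    for start in range(0, len(free_pages) - span + 1, span):
        if all(free_pages[start:start + span]):
            return True
    return False


# A fragmented zone: every other page is free, so half the memory is unused.
checkerboard = [i % 2 == 0 for i in range(16)]

print(allocation_order(4096))  # 4K stack: order 0
print(allocation_order(8192))  # 8K stack: order 1
print(can_allocate(checkerboard, allocation_order(4096)))
print(can_allocate(checkerboard, allocation_order(8192)))
```

In this model a 4K stack can be placed in any single free page, while an 8K stack needs two adjacent free pages and fails on the checkerboard map even though half the zone is free.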
Matt LaPlante reported that there are currently 151,809 bytes of trailing white space in the Linux kernel, requiring a 15 megabyte patch to remove it all. Andi Kleen argued that the white space didn't much matter, "you don't actually save anything on disk on most file systems (essentially everything except reiserfs on current Linux) because all files are rounded to block size (normally 4K). Same in page cache. And in tar files bzip2/gzip is very good at compacting them."
Andi went on to add that it's an issue that is slowly solving itself, "many kernel maintainers automatically remove trailing white space on any new lines these days. So as the kernel keeps changing it should eventually all disappear; except on essentially dead code." Pádraig Brady confirmed that things are naturally improving over time, as a similar report in 2001 found 224,654 bytes of trailing whitespace in the Linux kernel.
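A minimal sketch of how such a count might be taken, together with Andi's block-rounding argument. This is not Matt LaPlante's actual method, which the discussion does not describe. The function names are invented here, only spaces and tabs are treated as trailing white space, and the 4096-byte default block size follows the "normally 4K" figure quoted above.

```python
def trailing_whitespace_bytes(data):
    """Count spaces and tabs that sit at the end of each line in `data`."""
    total = 0
    for line in data.split(b"\n"):
        line = line.rstrip(b"\r")  # ignore CRs from CRLF line endings
        total += len(line) - len(line.rstrip(b" \t"))
    return total


def blocks_on_disk(size, block_size=4096):
    """Number of whole blocks needed to store `size` bytes."""
    return -(-size // block_size)  # ceiling division


sample = b"int x;  \n\treturn 0;\t\n}\n"
print(trailing_whitespace_bytes(sample))

# A hypothetical 5000-byte file carrying 30 bytes of trailing white space
# occupies the same number of blocks before and after cleanup.
print(blocks_on_disk(5000), blocks_on_disk(5000 - 30))
```

To scan a source tree, the first function could be applied to the bytes of each file and the results summed.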
Sam Ravnborg took a look at the x86 unification patches and commented, "from the mails and discussions I expected it be be obvious what was i386 only, what was shared and what was x86_64 only." He listed 16 files in x86/pci and noted, "in the filename there is NOTHING for most of the non-shared code that tell that this file is used by only i386 or x86_64." Andi Kleen concurred, "exactly my point from KS. The big mash-up will not really make much difference in terms of Makefile clarity or whatever Thomas' point was. Apparently he wanted to eliminate a few lines of code from the Makefile and merge the header files completely?"
Linus Torvalds disagreed, saying, "the problem right now is the *reverse* - even though they are in different subdirectories (and thus *look* like they are all separate), they aren't. So the merged end result is much better: at a first approximation everything is shared (largely true), and the ones that aren't shared can be made more obvious." He added, "at least things like 'grep' will work sanely, and people will be *aware* that 'Oh, this touches a file that may be used by the other word-size'." Linus continued:
"Right now, we have people changing 'i386-only' files that turn out to be used by x86-64 too - through very subtle Makefile things that the person who only looks into the i386 Makefile will never even *see*.
"THAT is the problem (well, at least part of it)."
Thomas Gleixner described an effort to create a unified x86 architecture tree, "the core idea behind our project is simple to describe: we introduce a new arch/x86/ and include/asm-x86/ file hierarchy that includes all the existing 32-bit and 64-bit x86 code and allows the building of either a 32-bit (i386) kernel or a 64-bit (x86_64) kernel." Andi Kleen expressed some concern, "I think it's a bad idea because it means we can never get rid of any old junk. IMNSHO arch/x86_64 is significantly cleaner and simpler in many ways than arch/i386 and I would like to preserve that. Also in general arch/x86_64 is much easier to hack than arch/i386 because it's easier to regression test and in general has to care about much less junk. And I don't know of any way to ever fix that for i386 besides splitting the old stuff off completely." Additional concerns about legacy issues were countered by Linus Torvalds, "there really isn't that much legacy crud. There are things like random quirks, but every time I hear the (theoretical) argument about how much time and effort we save by having it duplicated somewhere else, I think about all the time we definitely waste by fixing the same bug twice (and worry about the cases where we don't)." Among the justifications for a unified architecture, Thomas noted:
"We believe that the whole x86 CPU family is very much related and should be supported in a single architecture tree. All 64-bit CPUs implement the ability to execute pure 32-bit kernels, and will probably do so for the next couple of decades. So it's not like it will ever be possible to get rid of our legacies: for example even the latest 64-bit CPUs implement the legacy 'A20 line' feature that was already a weird outdated hack in the days of 16-bit x8086 CPUs."
Included in Andrew Morton's potential 2.6.23 merge list [story] was a series of patches to make the x86-64 architecture tickless. Andi Kleen, the x86-64 maintainer, replied, "I'm sceptical about the dynticks code. It just rips out the x86-64 timing code completely, which needs a lot more review and testing. Probably not .23." Linus Torvalds agreed, "we are *not* going to do another 'rip everything out, and replace it with new code' again. Over my dead body. We're going to do this thing gradually, or not at all." He went on to explain, "the patch-set itself actually looks fine, as far as I'm concerned. But it does seem to have that 'enable everything in one go' problem. I'd much rather see one time source at a time being converted, and enabled then and there, so that when people report problems and do a bisection, if it was HPET that broke, you get the commit that changed HPET."
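Linus's bisection argument can be illustrated with a simplified model. The sketch below treats history as a linear list of commits and runs a plain binary search for the first bad one. It is not how git implements bisection internally, and the commit descriptions are hypothetical, chosen only to mirror the HPET example in the quote.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    Assumes the state before commits[0] is good, the last commit is bad,
    and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the culprit is at mid or earlier
        else:
            lo = mid + 1    # the culprit is after mid
    return commits[lo]


# Hypothetical gradual series: one time source converted per commit.
gradual = ["convert PIT", "convert HPET", "convert local APIC timer"]
culprit = "convert HPET"


def kernel_is_bad(commit):
    # A kernel built at `commit` misbehaves once the culprit is included.
    return gradual.index(commit) >= gradual.index(culprit)


print(first_bad_commit(gradual, kernel_is_bad))

# Hypothetical all-at-once series: bisection can only blame the one commit.
print(first_bad_commit(["enable everything in one go"], lambda c: True))
```

With the gradual series the search lands on the commit that touched the broken time source. With a single large commit, the same search can only point at the whole change.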
In response to the pains caused by the original dyntick merge in 2.6.21, Ingo Molnar acknowledged, "we had 12 -hrt/dynticks merge related regressions between 2.6.21-rc1 and -final, and 4 after final." He went on to point out, "it's all pretty quiet today on the dynticks regressions front. (there are no open regressions in either the upstream i386 code or in the devel patches we are aware of)." As to the source of the bugs, he explained, "the majority of the above bugs were in the infrastructure code. (the worst was the generic resume/suspend one fixed in 2.6.21.2) And sadly, a fair number of the infrastructure bugs we introduced during the frentic clockevents/dynticks rewrites/redesigns we did between .20 and .21. That was a royally stupid mistake for us to do - instead of patiently waiting for the bugs to be shaken out we destabilized the infrastructure. (it was a 'lets make this thing so nice that it's impossible to reject' instintic gut reaction.)" Linus replied, "one thing I'll happily talk about is that while 2.6.21 was painful, you and Thomas in particular were both very responsible about the thing, so no, I'm not at all complaining or worried about it in that sense! I just really _really_ wish we could have two fairly stable releases in a row. I think 2.6.22 has the potential to be a pretty good setup, and I'd really like to avoid having another 2.6.21 immediately afterwards."
In the ongoing effort to reduce the power consumption of the Linux kernel [story] and take better advantage of the tickless kernel patch [story], Stephen Hemminger posted a patch to make it possible to unload the keyboard blink driver, "the blink driver wakes up every jiffy which wastes power unnecessarily. Using a notifier gives same effect. Also add ability to unload module." The blink driver was only recently merged, described as a "simple driver that blinks the keyboard LEDs when loaded. Useful for checking that the kernel is still alive or for crashdumping."
Linus Torvalds reviewed the driver and retorted, "I really get the feeling this thing should be removed entirely. Wasting power is the _least_ of its problems." When it was pointed out that the driver is only a debugging tool, Linus listed his complaints, "it has been a total disaster from beginning to end. It wastes power. It hangs machines when it tries to blink," going on to add, "its main problem is that PEOPLE SHOULD NOT USE IT, but it sounds cool, so people end up configuring the damn thing even though they shouldn't." Ultimately, Linus removed the driver before the 2.6.22 release [story], noting, "we could have just disabled it, but there's work on a new one that isn't as fundamentally broken, so there really doesn't seem to be any point in keeping it around."