H. Peter Anvin submitted a series of patches rewriting the x86 setup code, "this patch set replaces the x86 setup code, which is currently all in assembly, with a version written in C, using the '.code16gcc' feature of binutils (which has been present since at least 2001.)" He went on to explain why he did this, "the new code is vastly easier to read, and, I hope, debug. It should be noted that I found a fair number of minor bugs while going through this code, and have attempted to correct them."
Linus Torvalds reacted favorably, "I can't really argue against this on any sane grounds - not only is it removing more lines than it adds, but moving from mostly unreadable assembly to C seems a good idea." He went on to note, "so let's just get this merged. But the question is, do we put it in 2.6.23-rc1, or do we put it in -mm for a few weeks, which would imply waiting for the next merge window? Andrew?" Andrew Morton pointed out that the patches have been in -mm already for a couple of months, "this code has been in -mm since 11 May, as git-newsetup.patch. It has caused (for what it is) astonishingly few problems. Maybe a couple of build glitches and one runtime failure, all quickly fixed. I'd say it's ready." Linus agreed, "Ok. That makes it easy. I'll just merge it."
Following up to a bug report against the 2.6.22 kernel, Andrew Morton and Linus Torvalds offered some tips on how to debug kernel problems. Andrew first pointed to netconsole.txt for instructions on setting up a netconsole, "when the machine has stalled, see if you can get a task trace with ALT-SYSRQ-t. This will require CONFIG_MAGIC_SYSRQ=y and possibly setting ignore_loglevel on the kernel boot command line."
Linus Torvalds suggested "git bisect" as an alternative, "[it] will take some time, but is really a lot easier" He explains, "there's almost 7000 commits in between 2.6.21 and 22, but that still means that in about fourteen recompiles/reboots, "git bisect" should tell us where your problem starts, which will hopefully make it obvious what the problem is (or at least pinpoint it a *lot*)." He goes on to detail how to install git, obtain the latest kernel, and run "git bisect", "doing a git bisect isn't really that hard, but fourteen compiles/reboots will take some time (well, the compiles will, the reboots aren't that bad). But even if you're not a git user, it really is very simple". Specifically, he notes, "start the 'git bisect' with 'git bisect good v2.6.21', 'git bisect bad v2.6.22', and it will pick a kernel version about half-way between the two points, and you can now start testing. For each kernel you try, if it boots fine, do 'git bisect good', otherwise boot into a working kernel, and then do 'git bisect bad'. Git will then pick the next 'halfway' kernel for that case."
Included in Andrew Morton's potential 2.6.23 merge list [story] were a series of patches to make the x86-64 architecture tickless. Andi Kleen, the x86-64 maintainer replied, "I'm sceptical about the dynticks code. It just rips out the x86-64 timing code completely, which needs a lot more review and testing. Probably not .23." Linus Torvalds agreed, "we are *not* going to do another 'rip everything out, and replace it with new code' again. Over my dead body. We're going to do this thing gradually, or not at all." He went on to explain "the patch-set itself actually looks fine, as far as I'm concerned. But it does seem to have that 'enable everything in one go' problem. I'd much rather see one time source at a time being converted, and enabled then and there, so that when people report problems and do a bisection, if it was HPET that broke, you get the commit that changed HPET."
In response to the pains caused by the original dyntick merge in 2.6.21, Ingo Molnar acknowledged, "we had 12 -hrt/dynticks merge related regressions between 2.6.21-rc1 and -final, and 4 after final." He went on to point out, "it's all pretty quiet today on the dynticks regressions front. (there are no open regressions in either the upstream i386 code or in the devel patches we are aware of)." As to the source of the bugs, he explained, "the majority of the above bugs were in the infrastructure code. (the worst was the generic resume/suspend one fixed in 2.6.21.2) And sadly, a fair number of the infrastructure bugs we introduced during the frentic clockevents/dynticks rewrites/redesigns we did between .20 and .21. That was a royally stupid mistake for us to do - instead of patiently waiting for the bugs to be shaken out we destabilized the infrastructure. (it was a 'lets make this thing so nice that it's impossible to reject' instintic gut reaction.)" Linus replied, "one thing I'll happily talk about is that while 2.6.21 was painful, you and Thomas in particular were both very responsible about the thing, so no, I'm not at all complaining or worried about it in that sense! I just really _really_ wish we could have two fairly stable releases in a row. I think 2.6.22 has the potential to be a pretty good setup, and I'd really like to avoid having another 2.6.21 immediately afterwards."
In the ongoing effort to reduce the power consumption of the linux kernel [story] and take better advantage of the tickless kernel patch [story], Stephen Hemminger posted a patch to make it possible to unload the keyboard blink driver, "the blink driver wakes up every jiffy which wastes power unnecessarily. Using a notifier gives same effect. Also add ability to unload module." The blink driver was only recently merged, described as a "simple driver that blinks the keyboard LEDs when loaded. Useful for checking that the kernel is still alive or for crashdumping."
Linus Torvalds reviewed the driver and retorted, "I really get the feeling this thing should be removed entirely. Wasting power is the _least_ of its problems." When it was pointed out that the driver is only a debugging tool, Linus listed his complaints, "it has been a total disaster from beginning to end. It wastes power. It hangs machines when it tries to blink," going on to add, "its main problem is that PEOPLE SHOULD NOT USE IT, but it sounds cool, so people end up configuring the damn thing even though they shouldn't." Ultimately, Linus removed the driver before the 2.6.22 release [story] noting, "we could have just disabled it, but there's work on a new one that isn't as fundamentally broken, so there really doesn't seem to be any point in keeping it around."
Linux creator Linus Torvalds announced the official release of the 2.6.22 kernel, "it's out there now (or at least in the process of mirroring out - if you don't see everything, give it a bit of time)." He summarized the changes since 2.6.22-rc7 [story]:
"Not a whole lot of changes since -rc7: some small architecture changes (ppc, mips, blackfin), and most of those are defconfig updates. Various driver fixes: new PCI ID's along with some ide, ata and networking fixes (for example - the magic wireless libertas ioctl's got removed, they may be re-added later, hopefully in a more generic form, but in the meantime this doesn't make a release with new interfaces that aren't universally liked)."
The previous stable kernel, 2.6.21, was released a little over two months ago on April 25'th [story]. An overview of all the changes merged into the latest version of the kernel is maintained in the Kernel Newbies wiki. Included in the list of changes are the SLUB allocator which replaced the slab allocator, a new wireless stack, a new firewire stack [story], and support for the Blackfin architecture. Source level changes can be tracked via the gitweb interface to Linus' kernel tree.
Linux creator Linus Torvalds released the 2.6.22-rc7 kernel saying, "it's hopefully (almost certainly) the last -rc before the final 2.6.22 release, and we should be in pretty good shape. The flow of patches has really slowed down and the regression list has shrunk a lot." He briefly summarized the changes in this latest release candidate, "the patches are mostly trivial fixes, a few new device ID's, and the appended shortlog really does pretty much explain it," adding, "final testing always appreciated, of course".
The previous stable kernel, 2.6.21, was released a little over two months ago on April 25'th [story]. An overview of all the changes merged into the latest version of the kernel is maintained in the Kernel Newbies wiki. Included in the list of changes are the SLUB allocator which replaced the slab allocator, a new wireless stack, a new firewire stack [story], and support for the Blackfin architecture. Source level changes can be tracked via the gitweb interface to Linus' kernel tree.
Following a recent patch that translated Documentation/HOWTO into Japanese [story], a new patch offered a translation of the same document into Chinese. Li Yang noted, "currently Chinese involvement in Linux kernel is very low, especially [compared to China's large] population base. Language could be the main obstacle. Hopefully this document will help more Chinese to contribute to Linux kernel."
Linux creator Linus Torvalds noted that he was happy to see translations, "I think that the policies and processes parts of the documentation are things that make total sense to encourage translation of, because it's entirely possible that those are interesting and valid even to the people who aren't necessarily directly involved in the actual coding, and may well be relevant to managers etc who may not be _directly_ involed with the rest of the kernel developers." He then noted that he didn't see any reason to include the translations with the kernel proper, "that said, I don't think that merging the result into the standard kernel makes sense - like it or not, right now English ends up being required to be part of actually getting things into the 'standard' kernel, and as such, at _some_ point there has to be a connection point that switches over to English, and trying to make the translations be an in-kernel thing is thus kind of pointless." Rob Landley noted that he has started putting together a documentation site on which he will include these translations, "send me translations (preferably in HTML format), and I'll put 'em up. (I've already got the one that started this thread.)"
In another thread discussing the tracking of kernel regressions [story], Linux creator Linus Torvalds noted that the kernel is evolving so quickly it is inevitable that bugs will be introduced. He used a git query to determine that there are an average of over 65 patches being committed every single day, "that translates to five hundred commits a week, two _thousand_ commits per month, and 25 thousand commits per year. As a fairly constant stream. Will mistakes happen? Hell *yes*." He continued on to add, "and I'd argue that any flow that tries to 'guarantee' that mistakes don't happen is broken. It's a sure-fire way to just frustrate people, simply because it assumes a level of perfection in maintainers and developers that isn't possible." He then offered a number of calculations based on the number of lines of code added in the past 100 days, summarizing:
"Even by the most *stringent* reasonable rules, we add a new bug every four days. That's just something that people need to accept. The people who say 'we must never introduce a regression' aren't living on planet earth, they are living in some wonderful world of Blarney, where mistakes don't happen, developers are perfect, hardware is perfect, and maintainers always catch things."
A lengthy debate that began with a suggestion to dual license the Linux kernel under the GPLv2 and the GPLv3 [story] continues on the Linux Kernel Mailing List. Throughout the ongoing thread Linux creator Linus Torvalds has spoken out on the GPLv2, the upcoming GPLv3, the BSD license, Tivo, the Free Software Foundation, and much more. During the discussion, he was asked we he chose the GPLv2 over the BSD license when he's obviously not a big fan of the FSF. Linus explained:
"Because I think the GPLv2 is a great license. And I don't like the FSF's radical world-view, but I am able to separate the license (the GPLv2) from the author and source of the license (rms and the FSF). Why do people always confuse the two? The GPLv2 stands on its own. The fact that I disagree with the FSF on how to act has _zero_ relevance for my choice of license.
"[...] But for a project I actually care about, I would never choose the BSD license. The license doesn't encode my fundamental beliefs of 'fairness'. I think the BSD license encourages a 'everybody for himself' mentality, and doesn't encourage people to work together, and to merge."
"I was impressed in the sense that it was a hell of a lot better than the disaster that were the earlier drafts," Linus Torvalds explained in reply to a comment suggesting that he was impressed with the final draft of the GPLv3. He went on to add, "I still think GPLv2 is simply the better license." The discussion began with a suggestion that the Linux kernel be dual-licensed GPLv2 and GPLv3. Linus noted, "I consider dual-licensing unlikely (and technically quite hard), but at least _possible_ in theory. I have yet to see any actual *reasons* for licensing under the GPLv3, though. All I've heard are shrill voices about 'tivoization' (which I expressly think is ok) and panicked worries about Novell-MS (which seems way overblown, and quite frankly, the argument seems to not so much be about the Novell deal, as about an excuse to push the GPLv3)." In a followup email, Linus added:
"Btw, if Sun really _is_ going to release OpenSolaris under GPLv3, that _may_ be a good reason. I don't think the GPLv3 is as good a license as v2, but on the other hand, I'm pragmatic, and if we can avoid having two kernels with two different licenses and the friction that causes, I at least see the _reason_ for GPLv3. As it is, I don't really see a reason at all."
What started as the review of a bug report grew into an interesting debate as Linus Torvalds slammed the current suspend and resume [story] design in the Linux Kernel, "why the HELL cannot you realize that kernel threads are different? The right thing to do is AND HAS ALWAYS BEEN, to stop and start user threads only around the whole thing. Don't touch those kernel threads. Stop freezing them." Later in the discussion, Linus noted that he had no interest in Suspend to Disk (STD), and was only interested in a working Suspend to Ram (STR) implementation. He noted that complexity introduced by STD was infecting the STR logic, and that the two should be completely separated, "what irritates me is that STR really shouldn't have _had_ that bug at all. The only reason STR had the same bug as STD was exactly the fact that the two features are too closely inter-twined in the kernel. That irritates me hugely. We had a bug we should never had had! We had a bug because people are sharing code that shouldn't be shared! We had a bug because of code that makes no sense in the first place!" Linus noted that he doesn't use laptops much, but still likes STR on his desktop, "STR means they are quiet and don't waste energy when I don't use them, but they're instantly available when I care." He then went on to point to design flaws in the freezer:
"I actually don't think that processes should be frozen really at all. I agree that filesystems have to be frozen (and I think that checkpointing of the filesystem or block device is 'too clever'), but I just don't think that has anything to do with freezing processes. So I'd actually much prefer to freeze at the VFS (and socket layers, etc), and make sure that anybody who tries to write or do something else that we cannot do until resuming, will just be blocked (or perhaps just buffered)!"
In a humorous announcement for the latest release candidate of the upcoming 2.6.22 Linux kernel, Linus Torvalds noted that there were updates to the ARM, SH and Blackfin architectures. He also noted fixes to USB suspend, infiband, and the network stack, as well as updates to ATA, DVB and MMC, and network drivers. Noting that a three-day weekend was starting in the US he said, "so what's a pasty white nerd to do? You can't go out on the beach, because the goodlooking people will laugh at you, and kick sand in your face. I'm not bitter." Linus continued:
"But now you _can_ do something: you can download the latest -rc kernel, and smile smugly to yourself, knowing that you are running the latest and greatest on your machine. And suddenly it doesn't even matter that summer is coming, because you can just sit in the basement, and close the blinds, and bask in the warm light from your LCD, rather than the harsh glare of the daystar.."
Further information about what's new and changed in the upcoming 2.6.22 kernel can be found in the KernelNewbies wiki. The latest -rc can be downloaded from the Linux Kernel Archives [story], and the source changes can be browsed online using the gitweb interface.
In a recent lkml thread the concept of dumping an image of the kernel's memory to swap when the kernel hits a bug was discussed. Linus Torvalds pointed out that such a feature wasn't useful to an operating system like Linux that can ran on such a diverse assortment of computers, "yes, in a controlled environment, dumping the whole memory image to disk may be the right thing to do. BUT: in a controlled environment, you'll never get the kind of usage that Linux gets. Why do you think Linux (and Windows, for that matter) took away a lot of the market from traditional UNIX?" He went on to explain that there are systems where swap is not larger than the size of the core so collecting a crash dump would not be possible, that Linux instead tries to acknowledge bugs without crashing, and quite often the bug is actually in the drivers, "writing to disk when the biggest problem is a driver to begin with is INSANE." Comparing Linux to Solaris he added, "so the fact is, Solaris is crap, and to a large degree Solaris is crap exactly _because_ it assumes that it runs in a 'controlled environment'."
Alan Cox went on to point out that there are also privacy issues, "there is an additional factor - dumps contain data which variously is - copyright third parties, protected by privacy laws, just personally private, security sensitive (eg browser history) and so on. The only reasons you can get dumps back in the hands of vendors is because there are strong formal agreements controlling where they go and what is done with them." He went on to note that dump utilities are also not user friendly, "diskdump (and even more so netdump) are useful in the hands of a developer crashing their own box just like kgdb, but not in the the normal and rational end user response of 'its broken, hit reset'". Linus heartily agread, and suggested that anyone willing to use kernel dumps would be better off debugging through a firewire connection, " if you've ever picked through a kernel dump after-the-fact, I just bet you could have done equally well with firewire, and it would have had _zero_ impact on your kernel image. Now, contrast that with kdump, and ask yourself: which one do you think is worth concentrating effort on?"
"In no case is it ok to just 'shut up the warning'," Linus Torvalds exclaimed in response to a patch that stifled a compiler warning. Reminiscent of a thread on the lkml last year [story], Linus pointed out that it is very important to understand and properly fix compiler warnings [story]:
"Please, we do NOT fix compiler warnings without understanding the code! That's a sure way to just introduce _new_ bugs, rather than fix old ones. So please, please, please, realize that the compiler is _stupid_, and fixing warnings without understanding the code is bad.
"In this case, anybody who actually spends 5 seconds looking at the code should have realized that the warning is just another way of saying that the author of the code was on some bad drugs, and the warnings WERE BETTER OFF REMAINING! Because that code _should_ have warnings. Big fat warnings about incompetence!?"
"Ok, the merge window has closed, and 2.6.22-rc1 is out there," Linus Torvalds announced on the Linux Kernel Mailing List. He noted that there were a large number of changes, "almost seven thousand files changed, and that's not double-counting the files that got moved around." As to what was changed, Linus summarized, "architecture updates, drivers, filesystems, networking, security, build scripts, reorganizations, cleanups.. You name it, it's there." He went on to add:
"You want a new firewire stack? We've got it. New wireless networking infrastructure? Check. New infiniband drivers? Digital video drivers? A totally new CPU architecture (blackfin)? Check, check, check.
"That said, I think (and certainly hope) that this will not be nearly as painful as the big fundamental timer changes for 2.6.21, and while there are some pretty core changes there (like the new SLUB allocator, which hopefully will end up replacing both SLAB and SLOB), it feels pretty solid, and not as scary as ripping the carpet from under the timer infrastructure."