Linux news

Btrfs 0.16, Improved Scalability And Performance

Submitted by Jeremy
on August 8, 2008 - 12:02pm

"Btrfs v0.16 is available for download," began Chris Mason, announcing the latest release of his new Btrfs filesystem. He noted, "v0.16 has a shiny new disk format, and is not compatible with filesystems created by older Btrfs releases. But, it should be the fastest Btrfs yet, with a wide variety of scalability fixes and new features." Improved scalability and performance improvements include fine grained btree locking, pushing CPU intensive operations such as checksumming into their own background threads, improved data=ordered mode, and a new cache to reduce IO requirements when cleaning up old transactions. Other new features include support for ACLs, prevention of orphaned inodes so files won't be lost after a crash, and a more robust directory index format. Chris noted:

"There are still more disk format changes planned, but we're making every effort to get them out of the way as quickly as we can. You can see the major features we have planned on the development timeline. [...] the btrfs kernel module now weighs in at 30,000 LOC, which means we're getting very close to the size of ext[34]."

Comparing HAMMER And Tux3

Submitted by Jeremy
on August 7, 2008 - 8:25am

"The big advantage Hammer has over Tux3 is, it is up and running and released in the Dragonfly distro," began Daniel Phillips, offering a comparison between the two filesystem. He continued, "the biggest disadvantage is, it runs on BSD, not Linux, and it so heavily implements functionality that is provided by the VFS and block layer in Linux that a port would be far from trivial. It will likely happen eventually, but probably in about the same timeframe that we can get Tux3 up and stable." This led into a lengthy and interesting technical discussion between Daniel and HAMMER author Matthew Dillon, comparing the design of the two filesystems.

Matthew reviewed the Tux3 notes and replied, "it sounds like Tux3 is using many similar ideas [as HAMMER]. I think you are on the right track. I will add one big note of caution, drawing from my experience implementing HAMMER, because I think you are going to hit a lot of the same issues. I spent 9 months designing HAMMER and 9 months implementing it. During the course of implementing it I wound up throwing away probably 80% of the original design outright." Daniel noted that he's been working on the Tux3 design for around ten years, "and working seriously on the simplifying elements for the last three years or so, either entirely on paper or in related work like ddsnap and LVM3." Matthew cautioned, "I can tell you've been thinking about Tux for a long time. If I had one worry about your proposed implementation it would be in the area of algorithmic complexity. You have to deal with the in-memory cache, the log, the B-Tree, plus secondary indexing for snapshotted elements and a ton of special cases all over the place. Your general lookup code is going to be very, very complex. My original design for HAMMER was a lot more complex (if you can believe it!) than the end result. A good chunk of what I had to do going from concept to reality was deflate a lot of that complexity." The friendly conversation offers a very detailed look at the design choices made in each of these file systems.

2.6.27-rc2, "A Lot Of Random Changes"

Submitted by Jeremy
on August 6, 2008 - 11:44pm

"So it's been a week since -rc1, and -rc2 is out there," began Linux creator Linus Torvalds, announcing the 2.6.27-rc2 Linux kernel. He noted, "there's a lot of random changes in there, and I'm hoping we're starting to calm down, but one particular _kind_ of random change is probably worth pointing out explicitly due to the things it can result in: the fact that a number of architectures ended up using the 'lull' after -rc1 (hah!) to do the 'include/asm-xyz' => 'arch/xyz/include/asm' renames." Linus explained that for people actively developing and merging code with git, "be aware that we've recently had more renames than the rename detection limit in git defaults to, and as a result, if you have a rename<->data change conflict, you may want to increase the default limit." Linus noted that developers with sufficient ram can set "renamelimit=0" to completely disable the limit, and others can set it to a high value such as 5,000, "the default limit is pretty low just to not cause problems for people who have less memory in their machines than kernel developers tend to have..."

Linus continued, "the dirstat (with rename detection on, so as to not show the movement as huge changes) is fairly usual, with most of the changes in drivers, along with an ext4 and xfs update making 'fs' show up pretty high too". He added:

"The shortlog is still a tad too big to make it on the list (again, as usual - normally I end up posting shortlogs for -rc3 and later when they become more manageable) but let me just say that it isn't really all that interesting. Theres' a lot of small changes here, but nothing that makes you go 'Wow!'. Not that there _should_ be anything like that in -rc2, of course, so I'm not complaining."

Reiser4 Update

Submitted by Jeremy
on August 6, 2008 - 10:00am

"I have had to apply the reiser4 patches from -mm kernels to vanilla based patchset for over a year now. Reiser4 works fine, what will it take to get it included in vanilla?" began a brief thread on the Linux Kernel mailing list. Theodore Ts'o offered several links detailing the reamining issues with Reiser4, then suggested, "people who really like reiser4 might want to take a look at btrfs; it has a number of the same design ideas that reiser3/4 had --- except (a) the filesystem format has support for some advanced features that are designed to leapfrog ZFS, (b) the maintainer is not a crazy man and works well with other LKML developers (free hint: if your code needs to be reviewed to get in, and reviewers are scarce; don't insult and abuse the volunteer reviewers as Hans did --- Not a good plan!)."

Edward Shishkin noted that Reiser4 development continues, "I am working on the plugin design document. It will be ready approximately in September. I believe that it'll address all the mentioned complaints." He added, "This document [defines] plugins [and] primitives (like conversion of run-time objects) used in reiser4, and describes all reiser4 interfaces, so that it will be clear that VFS functionality is not duplicated, there are not VFS layers inside reiser4, etc."

Hans Reiser, the original developer of the Reiser4 filesystem, was convicted of first-degree murder on April 28, 2008. The latest Reiser4 patches currently live on kernel.org, as do the necessary support programs.

Reviewing Linux-next

Submitted by Jeremy
on August 5, 2008 - 12:33pm

"I do think 'next' as it is has a few issues that either need to be fixed (unlikely - it's not the point of next) or just need to be aired as issues and understood," noted Linus Torvalds about the linux-next development tree, originally designed as a way to get subsystem maintainers more involved in managing merge conflicts. Linus continued, "I don't think anybody wants it to go away. The question in my mind is more along the way of how/whether it should be changed. There was some bickering about patches that weren't there, and some about how _partial_ series were there but then the finishing touches broke things."

He listed his two primary concerns as, "I don't think it does 'quality control', and I think that's pretty fundamental," and, "I don't think the 'next' thing works as well for the occasional developer that just has a few patches pending as it works for subsystem maintainers that are used to it." Linus continued, "I don't think either of the above issues is a 'problem' - I just think they should be acknowledged. I think 'next' is a good way for the big subsystem developers to be able to see problems early, but I really hope that nobody will _ever_ see next as a 'that's the way into Linus' tree', because for the above two reasons I do not think it can really work that way." Andrew Morton noted, "a lot of the bugs which hit your tree would have been quickly found in linux-next too," then added, "but it's all shuffling deckchairs, really. Are we actually merging better code as a result of all of this? Are we being more careful and reviewing better and testing better? Don't think so."

2.6.27-rc1, "Pretty Dang Busy"

Submitted by Jeremy
on July 29, 2008 - 4:18pm

"It's two weeks (and one day), and the merge window is over," began Linus Torvalds, announcing the 2.6.27-rc1 kernel. He continued, "finally. I don't know why, but this one really did feel pretty dang busy. And the size of the -rc1 patch bears that out - at 12MB, it's about 50% bigger than 26-rc1 (but not that much bigger than 24/25-rc1, so it's not like it's anything unheard of)." He reflected, "the pure size of the -rc's _is_ making me a bit nervous, though. Sure, it means that we are good at merging it all, but I have to say that I sometimes wonder if we don't merge too much in one go, and even our current (fairly short) release cycle is actually too big." As for the actual changes, Linus explained:

"Much of -rc1 was in linux-next, but certainly not everything. We'll see how that whole thing ends up evolving - it certainly didn't solve all problems, and there was some bickering about things that weren't there (and some things that mostly were ;), but maybe it helped. There's a ton of new stuff in there, but at least personally the interesting things are the BKL pushdown and perhaps the introduction of the lockless get_user_pages_fast(). The build system also got updated to allow moving the architecture include files ('include/asm-xyz') into the architecture subdirectories ('arch/xyz/include/asm'), and sparc seems to have taken advantage of that already."

Other changes Linus highlighted included merging the UBI filesystem, as well as, "tracing, firmware loading, continued x86 arch merging, and moving more code to generic support (unified generic IPI handling, coherent dma memory allocation, show_mem etc). Bootmem rewrites. [And] some support for further scalability (ie 4k cpu cores)."

Tux3 Versioning Filesystem

Submitted by Jeremy
on July 25, 2008 - 4:00pm

"Since everybody seems to be having fun building new filesystems these days, I thought I should join the party, began Daniel Phillips, announcing the Tux3 versioning filesystem. He continued, "Tux3 is a write anywhere, atomic commit, btree based versioning filesystem. As part of this work, the venerable HTree design used in Ext3 and Lustre is getting a rev to better support NFS and possibly become more efficient." Daniel explained:

"The main purpose of Tux3 is to embody my new ideas on storage data versioning. The secondary goal is to provide a more efficient snapshotting and replication method for the Zumastor NAS project, and a tertiary goal is to be better than ZFS."

In his announcement email, Daniel noted that implementation work is underway, "much of the work consists of cutting and pasting bits of code I have developed over the years, for example, bits of HTree and ddsnap. The immediate goal is to produce a working prototype that cuts a lot of corners, for example block pointers instead of extents, allocation bitmap instead of free extent tree, linear search instead of indexed, and no atomic commit at all. Just enough to prove out the versioning algorithms and develop new user interfaces for version control."

Proposing Read-Only ZFS

Submitted by Jeremy
on July 22, 2008 - 6:42pm

A recent thread on the LKML discussed a blog entry stating that minimal ZFS support for GRUB was available under the GPL, "we could now use that code to implement support for ZFS in the Linux kernel." Alan Cox explained, "no we can't. The GPL ZFS bits don't include the various methods that would violate the patent so there is no grant. I've several times asked Sun to simply give permission and they don't even answer. I can only read the Sun motivation one way - they want to look open but know that ZFS is about the only thing that might save Solaris as a product in the data centre so are not truly prepared to let Linus use it." H. Peter Anvin added, "from what I can see, it is an absolutely-minimal read only implementation."

Christoph Hellwig offered, "adding a read-only for the start zfs driver for Linux would be useful for various purposes. And adding read-only filesystems to Linux is really easy." Referring to the individual who started the discussion, he added, "if Fred really cares about it I'd be very happy to mentor him implementing it. It should be a very good learning exercise for him." When asked if this offer applied to anyone else, Christoph replied, "yes, this offer is of course up to everyone interested. But it's not purely an integration effort in the traditional sense, the grub filesystem interface is quite different from the Linux one, and the code structure and style is quite different. But if you're willing to learn it should be very interesting."

Security Bugs and Full Disclosure

Submitted by Jeremy
on July 16, 2008 - 5:57am

In an announcement for the 2.6.25.10 stable kernel, Greg KH noted, "it contains a number of assorted bugfixes all over the tree. And once again, any users of the 2.6.25 kernel series are STRONGLY encouraged to upgrade to this release." The emphasis on the word "strongly" led to a lengthy discussion about how security fixes are handled in the Linux kernel. Linus Torvalds replied, "I personally consider security bugs to be just 'normal bugs'. I don't cover them up, but I also don't have any reason what-so-ever to think it's a good idea to track them and announce them as something special." Later in the thread he went on to explain, "one reason I refuse to bother with the whole security circus is that I think it glorifies - and thus encourages - the wrong behavior. It makes 'heroes' out of security people, as if the people who don't just fix normal bugs aren't as important. In fact, all the boring normal bugs are _way_ more important, just because there's a lot more of them. I don't think some spectacular security hole should be glorified or cared about as being any more 'special' than a random spectacular crash due to bad locking."

Theodore Ts'o pointed out that other developers had different beliefs about disclosure than Linus and referred to mailing lists such as the private security@ list described in the SecurityBugs documentation, originally created in early 2005. He then described Linus' stance, "if Linus finds out about a security bug, he will fix it and check it into the public git repository right away. But he's very honest in telling you that is what he will do --- so you can choose whether or not to include him in any disclosures that you might choose to make." Regarding whether Full Disclosure is the best policy, Ted highlighted the fact that the debate has been going on for several decades, "it is clear that we're not going settle this debate now, and certainly not on the Linux Kernel Mailing List." Later in the discussion, Linus offered a succinct summary of his viewpoint, "my responsibility is to do a good job. And not pander to the people who want to turn security into a media circus."

Kernel Release Numbering Redux

Submitted by Jeremy
on July 15, 2008 - 12:14pm

For many years, each Linux kernel release was assigned a series of three numbers, X.Y.Z, with an even Y indicating a "stable" release, and an odd Y indicating an "unstable" development release. Z was incremented for each individual kernel release. The "stable" 1.0.0 Linux kernel was released in March of 1994. New development then continued in the "unstable" 1.1.z branch, until the "stable" 1.2.0 Linux kernel was released in March of 1995. Major improvements in the kernel led to X being incremented to 2, and a "stable" 2.0 kernel was released in June of 1996. Active development then continued in the "unstable" 2.1 tree. This process continued with "stable" 2.2, 2.4 and 2.6 kernel trees, and each stable tree gained an official maintainer while Linux creator Linus Torvalds focused on newer features in the next "unstable" tree. Development in these "unstable" trees could go on for periods of multiple years before a "stable" tree was branched.

This long-standing odd/even development model was officially scrapped in 2004 thanks to the success that Linus and Andrew Morton were having working together, and significant "unstable" development began happening between each 2.6.Z release. In a recent thread it was asked what it would take for an "unstable" 2.7 development tree to be created, to which Linus replied:

"Nothing. I'm not going back to the old model. The new model is so much better that it's not even worth entertaining as a theory to go back. That said, I _am_ considering changing just the numbering. Not to go back to the old model, but because a constantly increasing minor number leads to big numbers. I'm not all that thrilled with '26' as a number: it's hard to remember. [..] I think the time-based releases (ie the '2 weeks of merge window until -rc1, followed by roughly two months of stabilization') has been so successful that I'd prefer to skip the version numbering model too. We don't do releases based on 'features' any more, so why should we do version _numbering_ based on 'features'?"

2.6.26, "A Longer-Than-Usual Release Cycle"

Submitted by Jeremy
on July 14, 2008 - 6:13am

"It's been almost three months since 2.6.25 (87 days to be exact, I think), making this a longer-than-usual release cycle. Or maybe it just feels that way, and we're always getting close to three months these days," said Linux creator Linus Torvalds, announcing the 2.6.26 Linux kernel, adding, "but it's out there now." He continued:

"The diffs from -rc9 are pretty small, with with the bulk actually being Documentation updates (almost 80% is just added docs). The rest tends to be one-liners for some regressions or otherwise pretty small patches. Several regressions did get fixed in the last few days, thanks to everybody involved."

Click the 2.6.26 tag to review all the previous release candidate announcements building up to this release. Source level changes can be reviewed via Linus' 2.6 gitweb kernel tree. The latest kernel can be downloaded from the Linux Kernel Archives.
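Linus' "87 days to be exact, I think" can be checked with simple date arithmetic. The sketch below assumes release dates of April 17, 2008 for 2.6.25 and July 13, 2008 for 2.6.26; neither date appears in the announcement quoted here, so treat both as assumptions to be verified against the kernel archives.

```python
from datetime import date

# Assumed release dates (not stated in the quoted announcement).
RELEASE_2_6_25 = date(2008, 4, 17)
RELEASE_2_6_26 = date(2008, 7, 13)


def days_between(earlier, later):
    """Return the number of whole days from `earlier` to `later`."""
    return (later - earlier).days


if __name__ == "__main__":
    cycle = days_between(RELEASE_2_6_25, RELEASE_2_6_26)
    print(f"2.6.25 -> 2.6.26: {cycle} days")
```

With those assumed dates the difference comes to 87 days, matching the figure in the announcement.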

POHMELFS Encryption

Submitted by Jeremy
on July 11, 2008 - 3:36pm

Evgeniy Polyakov announced the latest release of his Parallel Optimized Host Message Exchange Layered File System, POHMELFS. He noted that the big new feature in this release is strong crypto support, "one can specify [an] encryption method (like cbc(aes), hash or digest, or all of them to be performed on [the] whole data channel (except headers)." In his blog, Evgeniy added, "Cryptography support is [an] essential addition to the POHMELFS core. It was implemented with performance in mind, so that processing speeds would not drop noticeably even [during] very CPU-hungry operations." He explained, "POHMELFS utilizes [a configurable number of] pools of crypto threads, which perform data crypto processing and submit it either to [the] network or VFS layer." He included results from some performance benchmarks.

Evgeniy describes POHMELFS as "a high performance network filesystem with [a] locally coherent cache of data and metadata. Its main goal is distributed parallel processing of data. [The filesystem] supports [a] strong transaction model with failover recovery, allows encryption/hashing [of the entire] data channel, and performs read load balancing and write to multiple servers in parallel." When asked on his blog when he plans to push the new filesystem for mainline kernel inclusion, Evgeniy noted, "I do not know, maybe its time to push it upstream, but I do not want to bother with Linux kernel politics. We will see soon."

2.6.26-rc9, "Enough Changes That We Needed Another -rc"

Submitted by Jeremy
on July 7, 2008 - 7:14pm

"Ok, the last -rc obviously wasn't the last one after all, since here's a new one," noted Linus Torvalds, announcing the 2.6.26-rc9 kernel. He continued, "enough changes that we needed another -rc, and the regression list isn't emptying fast enough either (probably because a number of people, including reporters, are vacationing)." He went on to summarize:

"The actual bulk of this all is a new UVC video driver for the standard USB Video Class specification. It's a new driver, so shouldn't cause any regressions, but it's fairly sizable [...] ie 78% is just that one new driver, and almost 92% is driver updates in general (although some of them are reverts, so they show up as diffs against -rc8, but they actually cause the _total_ diff against 2.6.25 to shrink a bit). The fs updates are partly some minor updates to 9p, ecryptfs, proc and udf, but partly some delayed cleanup patches that went through Al. Bad Al. But when Al sends me patches, I apply them. I worry what would happen if I didn't. The rest is mainly small fixes (one-liners and 'few-liners') all over the place, many of them merged from Andrew's -mm queue."

Openmoko Neo FreeRunner Released

Submitted by Jeremy
on June 27, 2008 - 9:17am

"Thousands of Neo FreeRunners have been loaded into planes and fired around the world," announced Sean Moss-Pultz, the Openmoko CEO, in a frequently philosophical email titled "let us impact the material world", posted to the Openmoko community mailing list. He continued, "many of our distributors have already begun shipping. In about another week, Steve and Harry will announce the opening of our own webshop." The CAD files for building the smartphone hardware are available under the Creative Commons, and the software has been released under the GPL, including a patched 2.6.24 Linux kernel. Sean continued, "whenever I talk publicly about Openmoko, or so it seems, the following question is asked: How can you compete again the giants of this industry? For most of us, I'd like to think, the answer is obvious. Instead of answering, I usually return their question: How can they compete against us?" He explained:

"Openmoko is the collective creation of amateurs working on exactly what we love. They are professionals, some doing what they love, most working towards the next paycheck. At certain times, the amateur has a distinct advantage over the professional. A professional knows what they can deliver, and rarely goes beyond it. An amateur has no concept of their limitations and usually goes well beyond them. Experience teaches us our limits. When we have learned that and become complacent, we are finished, because our work can be calculated and measured. Our work ceases to be a weapon."

2.6.26-rc8, "A Pretty Small Set Of Changes"

Submitted by Jeremy
on June 26, 2008 - 12:16pm

"It hasn't been a week, I know, and this is a pretty small set of changes since -rc7, but I'm going to be mostly incommunicado for the next week or so, so I just released what will hopefully be the last -rc," began Linux creator Linus Torvalds, announcing the 2.6.26-rc8 kernel. He added, "or maybe not. It depends on how good you all are while I'm not looking." Regarding the latest release candidate, Linus explained:

"Most of the bulk of the changes here are to Xen and to KVM in particular, which shows up as a rather unusual dirstat: 65% is in arch/x86 (counting the asm-x86 changes too). The rest is mostly random stuff, the appended ShortLog gives a reasonable idea. Several bugzilla entries are hopefully now closed."