"History is a one way street, and you might as well have the fs known the way it is so that people remember 'reiser oh wasn't he the guy who..' - unless you are trying to market the fs I guess."
"I have had to apply the reiser4 patches from -mm kernels to vanilla based patchset for over a year now. Reiser4 works fine, what will it take to get it included in vanilla?" began a brief thread on the Linux Kernel mailing list. Theodore Ts'o offered several links detailing the remaining issues with Reiser4, then suggested, "people who really like reiser4 might want to take a look at btrfs; it has a number of the same design ideas that reiser3/4 had --- except (a) the filesystem format has support for some advanced features that are designed to leapfrog ZFS, (b) the maintainer is not a crazy man and works well with other LKML developers (free hint: if your code needs to be reviewed to get in, and reviewers are scarce; don't insult and abuse the volunteer reviewers as Hans did --- Not a good plan!)."
Edward Shishkin noted that Reiser4 development continues, "I am working on the plugin design document. It will be ready approximately in September. I believe that it'll address all the mentioned complaints." He added, "This document [defines] plugins [and] primitives (like conversion of run-time objects) used in reiser4, and describes all reiser4 interfaces, so that it will be clear that VFS functionality is not duplicated, there are not VFS layers inside reiser4, etc."
Hans Reiser, the original developer of the Reiser4 filesystem, was convicted of first-degree murder on April 28th, 2008. The latest Reiser4 patches currently live on kernel.org, as do the necessary support programs.
The future of Reiser4 was raised on the lkml, with the filesystem's creator, Hans Reiser [interview], awaiting his May 7th trial [story]. Concerns that the filesystem wasn't being maintained were laid to rest when Andrew Morton [interview] stated, "the namesys engineers continue to maintain reiser4 and I continue to receive patches for it." He further added, "the namesys guys are responsive and play well with others." As to why the filesystem hasn't yet been merged into the 2.6 kernel, Andrew explained, "to get it unstuck we'd need a general push, get people looking at and testing the code, get the vendors to have a serious think about it, etc. We could do that - it'd require that the namesys people (and I) start making threatening noises about merging it, I guess." He then made joking reference to the recent debate regarding the new CPU schedulers [story], "or we could move all the reiser4 code into kernel/sched.c - that seems to get people fired up."
Namesys developer and author of the Reiser4 encryption and compression plugins, Edward Shishkin, offered some updates. Replying to some comments about the need to remove plugins from the Reiser4 code, he explained, "the popular opinion that plugins make more sense in the VFS is a great delusion, as plugins are entities related to reiser4 disk layouts." In an earlier thread it had been suggested that the plugins were misnamed and would be better called an internal abstraction layer [story]. Edward went on to note, "currently there are two namesys employees working [on Reiser4] mostly on enthusiasm." He linked to a wiki page listing known issues with the code that need to be fixed before it is likely to be merged into the 2.6 kernel, "the main issues here are xattrs and support for blocksize != pagesize. I think that adding xattrs will take ~1 month of full-time working. Not sure about blocksize support." When it was noted that other filesystems have already been merged without support for either of these features, Edward said that they'd lower the priority of both, finish up the other issues remaining on the old todo list, and resume the merge discussion at that time.
With Namesys founder Hans Reiser [interview] recently arrested as the prime suspect in the disappearance of his estranged wife, a brief thread on the lkml discussed the future of ReiserFS. Alan Cox [interview] pointed out that, "reiserfs is written by a team of people at Namesys, and particularly with reiserfs3 people at SuSE and elsewhere as well."
Alexander Lyamin, listed on the Namesys website as their "hostmaster and sysadmin", noted that the team was "rather shaken and stressed at the moment". He confirmed that ReiserFS 3.6 is currently in maintenance mode, then continued to discuss Reiser4, "we are still going through revisions, thanks to [Andrew Morton]. Chunking out patches, fixing issues and generally cleaning the house." He explained that this was the short term plan, for at least the next 6 months. Regarding the future he noted it depends on the outcome of the trial, "if it goes [the] way we hope it will go. Well... We will do fine. If it goes bad. That is where it becomes tricky. We will try to appoint a proxy to run Namesys business."
The ongoing discussion about the Reiser4 filesystem [story] continues on the lkml. Jeff Garzik discussed the complexity introduced by a plugin layer [story], suggesting it is really a second VFS, "furthermore, it completely changes the notion of what a Linux filesystem is. Currently, each Linux filesystem is a tightly constrained set of metadata support. reiser4 changes 'tightly constrained' to 'infinity'. While that freedom is certainly liberating, it also has obvious support costs due to new admin paradigms and customer configuration possibilities."
Linux creator Linus Torvalds weighed in on the discussion, "as long you call them 'plugins' and treat them as such, I (and I suspect a lot of other people) are totally uninterested, and in fact, a lot of people will suspect that the primary aim is to either subvert the kernel copyright rules, or at best to create a mess of incompatible semantics with no sane overlying rules for locking etc." He went on to add, "as far as I'm concerned, the problem with reiser4 is that it hasn't tried to work with the VFS people. Now, I realize that the main VFS people aren't always easy to work with (Al and Christoph, take a bow), but that doesn't really change the basic facts. Al in particular is _always_ right. I don't think I've ever had the cojones to argue with Al.."
Later in the same thread, Andrew Morton [interview] noted that he's currently reviewing the code, "meanwhile here's poor old me trying to find another four hours to finish reviewing the thing." Regarding the code he added, "the writeout code is ugly, although that's largely due to a mismatch between what reiser4 wants to do and what the VFS/MM expects it to do. If it works, we can live with it, although perhaps the VFS could be made smarter." He then suggested, "I'd say that resier4's major problem is the lack of xattrs, acls and direct-io. That's likely to significantly limit its vendor uptake." As for the plugin debate, Andrew said, "the plugins appear to be wildly misnamed - they're just an internal abstraction layer which permits later feature additions to be added in a clean and safe manner. Certainly not worth all this fuss."
The discussion about why the Reiser4 filesystem has not been merged into the Linux kernel [story] continues on the lkml. Hans Reiser [interview] contrasted the struggles Reiser4 has had trying to get merged versus recent discussion about the up and coming ext4 filesystem [story], "the code isn't even written, benchmarked, or tested yet, and it is going into the kernel already so that its developers don't have to deal with maintaining patches separate from the tree. Wow. Kind of hard to argue that it is not politically differentiated, isn't it?"
Theodore Ts'o responded, "it is a development procedure that was developed after discussion and consensus building across LKML and the ext2/3/4 development team. It was not the original plan put forth by the ext2 developers, but after listening to the concerns and suggestions, we did not question the motives of the people making suggestions; we listened." He went on to note that parts of what will be ext4 were written a year ago, and have been heavily tested and reviewed. Others pointed out that the evolution between ext3 and ext4 will be a very public process, with patches being merged gradually, whereas Reiser4 is a completely different code base from Reiser3.
The latest chapter in this ongoing debate tends to be more about clashing personalities than the code in question. How this affects if and when the Reiser4 filesystem will be merged into the mainline Linux kernel is yet to be seen.
The question of if and when Reiser4 will be merged into the mainline Linux kernel has been an ongoing debate for a couple of years [story]. The filesystem was described as being "fairly stable for average users" by Hans Reiser [interview] over two years ago, in March of 2004 [story]. It has been merged into Andrew Morton [interview]'s -mm kernel [story], though issues such as Reiser4 plugins [story] and coding style [story] caused lengthy discussions last year. Two recent threads on the lkml raised the question again, asking at a non-technical level why Reiser4 has not been included in the Linux kernel. Some have offered theories that Reiser4 is being blocked for political reasons, others that the concern is that once Reiser4 is included Namesys might forget it and move on to another filesystem. Responses to these theories point out that in reality there are technical issues that must be resolved before the filesystem will be merged, and that much progress has been made toward this end. Additional discussion can be found on a relevant recently created kernel newbies wiki page.
Hans Reiser posted a "short term task list for Reiser4" to address the remaining technical issues. The todo list included getting batch_write merged into the -mm kernel [story], getting read optimization code merged into the -mm kernel, documenting everything in the Namesys wiki, exploring and addressing reports of system pauses when using Reiser4, a complete review of the crypt-compress code, a large effort in optimizing fsync, a review of installation instructions, and a review of the kernel documentation. Hans explains, "unfortunately, our code stability is going to decrease for a bit due to all these changes to the read and write code --- no way to cure that but passage of time. On the other hand, our CPU usage went way down. Reiser4's only performance weakness now is fsync. Once the crypt-compress code is ready, we will release Reiser4.1-beta (with plugins, releasing a beta means telling users that if they mount -o reiser4.1-beta then cryptcompress will be their default plugin, and if they don't, then they are using Reiser4.0 still). Doubling our performance and halving our disk usage is going to be fun."
Hans Reiser [interview] described a recently posted patch as, "it revises the existing reiser4 code to do a good job for writes that are larger than 4k at a time by assiduously adhering to the principle that things that need to be done once per write should be done once per write, not once per 4k." He went on to explain, "this code empirically proves that the generic code design which passes 4k at a time to the underlying FS can be improved. Performance results show that the new code consumes 40% less CPU when doing 'dd bs=1MB .....'" Referring to generic_file_write(), he further noted that currently when writing 64MB of data, "it may go to the kernel as a 64MB write, but VFS sends it to the FS as 64MB/4k separate 4k writes." It was acknowledged that this could also be accomplished in a non-generic way; however, earlier feedback had suggested that such improvements should be made available to all.
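The arithmetic behind the "64MB/4k separate 4k writes" remark can be sketched as a toy model. This is not kernel code: the function names below are hypothetical and exist only to count how often per-write setup work would run when a large write is split into 4k pieces versus handled once for the whole write.

```python
PAGE_SIZE = 4 * 1024  # the 4k unit mentioned in the quote above


def chunks_for_write(nbytes, chunk=PAGE_SIZE):
    """Number of chunk-sized pieces needed to cover nbytes (ceiling division)."""
    return (nbytes + chunk - 1) // chunk


def setup_calls(nbytes, batched):
    """Count how often per-write setup work runs in this hypothetical model.

    batched=False: the setup is repeated for every 4k chunk.
    batched=True:  the setup runs once for the whole write.
    """
    if nbytes == 0:
        return 0
    return 1 if batched else chunks_for_write(nbytes)


write_size = 64 * 1024 * 1024  # a single 64MB write from user space

print(chunks_for_write(write_size))            # 16384
print(setup_calls(write_size, batched=False))  # 16384
print(setup_calls(write_size, batched=True))   # 1
```

In this model a single 64MB write repeats its setup 16384 times when split into 4k pieces, which is the overhead the patch is described as avoiding. The model says nothing about how much CPU each repetition actually costs.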
Andrew Morton [interview] responded to the proposed changes saying, "there's nothing which leaps out and says 'wrong' in this. But there's nothing which leaps out and says 'right', either. It seems somewhat arbitrary, that's all." He pointed out that reiser4 was currently the only filesystem to benefit from the changes, "to be able to say 'yes, we want this' I think we'd need to understand which other filesystems would benefit from exploiting it, and with what results?" In the resulting discussion, it was determined that both FUSE [story] and XFS [story] would benefit from these changes, prompting Hans to ask, "Is it enough?" Andrew agreed, "Spose so. Let's see what the diff looks like?"
Hans Reiser [interview] sent an email to the lkml titled, "I request inclusion of reiser4 in the mainline kernel". He provided a list of objections raised earlier, noting that all had been addressed. Among the listed issues, Reiser4 now works with 4k stacks. "There have been no bug reports concerning the new code," Hans added.
The request was followed with some suggestions by Christoph Hellwig, including general comments about the coding style. This was one of many issues that led to debate in which Hans commented, "most of my customers remark that Namesys code is head and shoulders above the rest of the kernel code. So yes, it is different." Alan Cox [interview] replied that while the kernel coding style isn't his own style, he tries to follow it when working on the kernel, "one big reason we jump up and down so much about the coding style is that its the one thing that ensures someone else can maintain and fix code that the author has abandoned, doesn't have time to fix or that needs access to specific hardware the authors may not have." Much of the rest of the thread was less friendly, leaving the question of merging Reiser4 into the mainline kernel still up in the air.
Hans Reiser formed Namesys and began the development of Reiserfs ten years ago. The first release of the filesystem, Reiser3, is part of the mainline 2.4 and 2.6 Linux kernels. The more recent Reiser4 is a complete redesign and reimplementation of Reiserfs, aiming to soon be merged into the mainline 2.6 Linux kernel.
In this interview, Hans discusses his background and how he came to create Namesys and Reiserfs. He looks back at Reiser3, describing the advantages it had over other filesystems when it was released and its current state. He then explores the many improvements currently in Reiser4, describing the plugin architecture and its exciting potential for future semantic enhancements.
In the debate following Andrew Morton [interview] posting his plans for 2.6.13 [story], the existence of a plugin layer in Reiser4 was discussed. Jeff Garzik put it bluntly, "the plugin stuff is crap. This is not a filesystem but a filesystem new layer. IMO considered in that light, it duplicates functionality elsewhere." Andrew Morton went on to explain, "I think the concern here is that this is implemented at the wrong level. In Linux, a filesystem is some dumb thing which implements address_space_operations, filesystem_operations, etc."
Hans Reiser noted, "please remember that this is per file, per item, per node, per attribute, per disk format, per bitmap, per super block, etc., abstracting, not per filesystem abstracting." He explained that plugins offer a couple of advantages: they make it much easier for developers to change the disk format, and they allow for easy code reuse. He added, "the use of plugins forced all the programmers to think about reusability at every layer of design. V3 of reiserfs is way too hard to work on and modify. If you ask one of the team to code something for V3 instead of V4, they quietly groan at the thought. It is just so much easier to do in V4."
Andrew Morton replied, "advanced features such as those which you describe are implemented on top of the filesystem, not within it. reiser4 turns it all upside down. Now, some of the features which you envision are not amenable to above-the-fs implementations. But some will be, and that's where we should implement those." The lengthy discussion continued, an interesting read for Reiser4 supporters and detractors alike.
Continuing the debate over the right way to go about implementing some of the features found in the newly released Reiser4 [story], Hans Reiser asked Al Viro to clarify the issues he thinks could arise from the current implementation. The result was a brilliant explanation of what problems Al sees, specifically related to dentry aliasing, and how the current VFS architecture handles some of these problems.
Read on for Al's response and further clarification from Linus Torvalds. The interesting exchange provides some good insight into the Linux VFS layer.
Reading through the lengthy debate on the lkml titled "silent semantic changes with reiser4" [story] is a time investment. The thread comprises well over 500 emails and is still growing, so I include here only a tiny snippet, a discussion primarily between Hans Reiser, Andrew Morton [interview], and Linus Torvalds. Questions raised include whether or not the filesystem should ultimately be merged into the mainline kernel, and if so how to go about this. Much of the debate concerns extensions that are currently only available through reiser4, and perhaps not fully compatible with existing utilities. The thread within begins with some comments by Andrew, who suggests that if the provided feature set is the desired direction for the Linux kernel, his preference would be to "accept the reiser4-only extensions with a view to turning them into kernel-wide extensions at some time in the future, so all filesystems will offer the extensions (as much as possible)".
As quoted earlier [story], Hans stressed that it was important that the reiser4 functionality be merged so that Linux is capable of competing with WinFS and Spotlight. The argument was continued by others, and to these followup comments Linus retorted:
"Hell will freeze over before Microsoft does a filesystem right. Besides, WinFS is likely almost in user mode anyway, ie mostly a library, rather like the gnome people are already doing with gnome storage. So there's really no point in trying to push your agenda by trying to scare people with MS activities. Linux kernel developers do what's right because it is _right_, not because somebody else does it."
As expected, merging Reiser4 into Andrew Morton [interview]'s -mm tree [story] brought with it a lot of additional features and semantic changes. Christoph Hellwig expressed some unhappiness over these semantic changes, spawning a lengthy thread on the lkml. Specifically, he mentioned that the handling of files-as-directories (multiple streams within files) could cause problems for user-space applications, and could cause other dcache problems.
A lot of opposition was expressed. Some mentioned that the handling of multiple streams is really a userspace issue, whereas others mentioned that legacy applications may not properly handle multiple streams, which could lead to the loss of user data. This led Hans Reiser to say in support:
"Andrew, we need to compete with WinFS and Dominic Giampaolo's filesystem for Apple, and that means we need to put search engine and database functionality into the filesystem."
With the release of 18.104.22.168-mm2, in addition to a probable fix for the memory leak some reported when writing audio CDs [story], Andrew Morton [interview] announced that the long awaited [story] Reiser4 filesystem has been merged into his -mm patchset [story]. Hans Reiser provided some information about the history and usage of Reiser4, beginning:
"Reiser4 is a file system based on dancing tree algorithms, and is described at http://www.namesys.com. One should be able to get it up and running just like any of the other filesystems supported by Linux. Configure it to be compiled either builtin or as a module. Create reiser4 filesystem with mkfs.reiser4, mount and use it. More detailed info can be found at http://thebsh.namesys.com/snapshots/LATEST/READ.ME."
In question and answer format - questions from Andrew, answers from Hans - we go on to learn where to obtain the latest Reiser4 tools, and about the current limitations of the filesystem. Hans explains, "Reiser4 has [only] been tested on i386 [...]. Quota support is not ready yet. Should be ready soon. [...] Only the very core functionality is working. Exotic plugins, an API for multiple operation transactions and accessing multiple small files in one syscall, compression, inheritance, all have been postponed until after the core functionality is shipped." Read on for more information, including tips on benchmarking the new filesystem.