"Any benchmark is going to be a benchmark of the OS as much as it is going to be a benchmark of the filesystem. It's pretty hard to separate the two. ZFS is best tested on Open Solaris. UFS is best tested on FreeBSD, EXT3 is best tested on Linux, and HAMMER of course is best tested on DragonFly."
"I have had to apply the reiser4 patches from -mm kernels to vanilla based patchset for over a year now. Reiser4 works fine, what will it take to get it included in vanilla?" began a brief thread on the Linux Kernel mailing list. Theodore Ts'o offered several links detailing the remaining issues with Reiser4, then suggested, "people who really like reiser4 might want to take a look at btrfs; it has a number of the same design ideas that reiser3/4 had --- except (a) the filesystem format has support for some advanced features that are designed to leapfrog ZFS, (b) the maintainer is not a crazy man and works well with other LKML developers (free hint: if your code needs to be reviewed to get in, and reviewers are scarce; don't insult and abuse the volunteer reviewers as Hans did --- Not a good plan!)."
Edward Shishkin noted that Reiser4 development continues, "I am working on the plugin design document. It will be ready approximately in September. I believe that it'll address all the mentioned complaints." He added, "This document [defines] plugins [and] primitives (like conversion of run-time objects) used in reiser4, and describes all reiser4 interfaces, so that it will be clear that VFS functionality is not duplicated, there are not VFS layers inside reiser4, etc."
Hans Reiser, the original developer of the Reiser4 filesystem, was convicted of first-degree murder on April 28, 2008. The latest Reiser4 patches currently live on kernel.org, as do the necessary support programs.
"Since everybody seems to be having fun building new filesystems these days, I thought I should join the party," began Daniel Phillips, announcing the Tux3 versioning filesystem. He continued, "Tux3 is a write anywhere, atomic commit, btree based versioning filesystem. As part of this work, the venerable HTree design used in Ext3 and Lustre is getting a rev to better support NFS and possibly become more efficient." Daniel explained:
"The main purpose of Tux3 is to embody my new ideas on storage data versioning. The secondary goal is to provide a more efficient snapshotting and replication method for the Zumastor NAS project, and a tertiary goal is to be better than ZFS."
In his announcement email, Daniel noted that implementation work is underway, "much of the work consists of cutting and pasting bits of code I have developed over the years, for example, bits of HTree and ddsnap. The immediate goal is to produce a working prototype that cuts a lot of corners, for example block pointers instead of extents, allocation bitmap instead of free extent tree, linear search instead of indexed, and no atomic commit at all. Just enough to prove out the versioning algorithms and develop new user interfaces for version control."
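To make the prototype's shortcuts concrete, here is a minimal sketch of an allocation bitmap scanned by linear search, two of the simplifications Daniel lists. It is not Tux3 code: the class name `BitmapAllocator` and its methods are hypothetical, chosen only to illustrate the general technique.

```python
class BitmapAllocator:
    """Hypothetical illustration: one bit per block, 1 = in use, 0 = free."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        # Round up so every block has a bit, eight blocks per byte.
        self.bits = bytearray((total_blocks + 7) // 8)

    def _is_set(self, block: int) -> bool:
        return bool(self.bits[block >> 3] & (1 << (block & 7)))

    def allocate(self) -> int:
        """Linear search: scan from block 0 and claim the first free block."""
        for block in range(self.total_blocks):
            if not self._is_set(block):
                self.bits[block >> 3] |= 1 << (block & 7)
                return block
        raise RuntimeError("no free blocks")

    def free(self, block: int) -> None:
        """Clear the bit for a block that is currently allocated."""
        if not 0 <= block < self.total_blocks:
            raise ValueError("block out of range")
        if not self._is_set(block):
            raise ValueError("block is already free")
        self.bits[block >> 3] &= ~(1 << (block & 7)) & 0xFF


if __name__ == "__main__":
    alloc = BitmapAllocator(16)
    first, second = alloc.allocate(), alloc.allocate()
    alloc.free(first)
    # The freed block is found again by the next linear scan.
    print(first, second, alloc.allocate())
```

The scan costs time proportional to the number of blocks, which is the kind of corner a prototype can cut; a free extent tree or an index would trade that simplicity for faster allocation on large volumes.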
A recent thread on the Linux Kernel mailing list discussed a blog entry stating that minimal ZFS support for GRUB was available under the GPL, "we could now use that code to implement support for ZFS in the Linux kernel." Alan Cox explained, "no we can't. The GPL ZFS bits don't include the various methods that would violate the patent so there is no grant. I've several times asked Sun to simply give permission and they don't even answer. I can only read the Sun motivation one way - they want to look open but know that ZFS is about the only thing that might save Solaris as a product in the data centre so are not truly prepared to let Linus use it." H. Peter Anvin added, "from what I can see, it is an absolutely-minimal read only implementation."
Christoph Hellwig offered, "adding a read-only for the start zfs driver for Linux would be useful for various purposes. And adding read-only filesystems to Linux is really easy." Referring to the individual who started the discussion, he added, "if Fred really cares about it I'd be very happy to mentor him implementing it. It should be a very good learning exercise for him." When asked if this offer applied to anyone else, Christoph replied, "yes, this offer is of course up to everyone interested. But it's not purely an integration effort in the traditional sense, the grub filesystem interface is quite different from the Linux one, and the code structure and style is quite different. But if you're willing to learn it should be very interesting."
Pawel Dawidek first ported ZFS to FreeBSD from OpenSolaris in April of 2007. He continues to actively port new ZFS features from OpenSolaris, and focuses on improving overall ZFS stability. During the introduction to his talk at BSDCan, he explained that his goal was to offer an accessible view of ZFS internals. His discussion was broken into three sections: a review of the layers ZFS is built from and how they work together, a look at unique features found in ZFS and how they work internally, and a report on the current status of ZFS in FreeBSD.
The BSDCan website notes that Pawel is a FreeBSD committer, adding:
"In the FreeBSD project, he works mostly in the storage subsystems area (GEOM, file systems), security (disk encryption, opencrypto framework, IPsec, jails), but his code is also in many other parts of the system. Pawel currently lives in Warsaw, Poland, running his small company."
A recent thread on the FreeBSD-current mailing list discussed the stability of ZFS on FreeBSD. Scott Long noted that ZFS requires proper tuning to be stable:
"I guess what makes me mad about ZFS is that it's all-or-nothing; either it works, or it crashes. It doesn't automatically recognize limits and make adjustments or sacrifices when it reaches those limits, it just crashes. Wanting multiple gigabytes of RAM for caching in order to optimize performance is great, but crashing when it doesn't get those multiple gigabytes of RAM is not so great, and it leaves a bad taste in my mouth about ZFS in general."
ZFS was committed to FreeBSD in April of 2007 by Pawel Dawidek, who notes that he is using ZFS quite successfully on all of his systems. He then cautioned, "of course all this doesn't mean ZFS works great on FreeBSD. No. It is still an experimental feature." In response to some negative comments about ZFS on FreeBSD, Pawel noted, "in my opinion people are panicing in this thread much more than ZFS:) Let try to think how we can warn people clearly about proper tunning and what proper tunning actually means. I think we should advise increasing KVA_PAGES on i386 and not only vm.kmem_size. We could also warn that running ZFS on 32bit systems is not generally recommended."
"The universal need for tuning combined with the poorly understood problem reports tells me that administrators considering ZFS should expect to spend a fair amount of timing testing and tuning. Don't expect it to work out of the box for your situation."
"I am going to start committing bits and pieces of the HAMMER filesystem over the next two months," announced Matthew Dillon on the DragonFly BSD kernel mailing list. He noted that the filesystem should be functional by the 2.0 release in December, "I am making good progress and I believe it will be beta quality by the release. It took nearly the whole year to come up with a workable design. I thought I had it at the beginning of the year but I kept running into issues and had to redesign the thing several times since then." Matthew then posted a detailed design document for the new filesystem.
During the follow-up discussion, Matthew was asked if HAMMER would be a ZFS killer. He responded, "ZFS serves a different purpose and I think it is cool, but as time has progressed I find myself liking ZFS's design methodology less and less, and I am very glad I decided against trying to port it." He noted it is essential to have redundant copies of data, but added, "the problem ZFS has is that it is TOO redundant. You just don't need that scale of redundancy if you intend to operate in a multi-master replicated environment because you not only have wholely independant (logical) copies of the filesystem, they can also all be live and online at the same time." As for how DragonFly's new filesystem will address redundancy, he explained:
"HAMMER's approach to redundancy is logical replication of the entire filesystem. That is, wholely independant copies operating on different machines in different locations. Ultimately HAMMER's mirroring features will be used to further our clustering goals. The major goal of this project is transparent clustering and a major requirement for that is to have a multi-master replicated environment. That is the role HAMMER will eventually fill. We wont have multi-master in 2.0, but there's a good chance we will have it by the end of next year."
Mark Weinem offered a summary of NetBSD's six 2007 Summer of Code development projects. The projects included: the Automated Testing Framework, "the goal of the ATF project was to develop a testing framework to easily define test cases and run them in a completely automated way"; porting ZFS, "the primary goal of this project was to port volume emulation (ZVOL) functionality in order to mount ZFS file systems"; QoS framework for NetBSD's virtual memory system, "for delay sensitive systems such as streaming multimedia servers and back-end database systems, servicing the reader processes in a timely fashion is more important than the servicing the writers"; kernel file systems in userspace, as a result of the project, "almost all NetBSD kernel file systems can be compiled, mounted and run in userspace"; and hardware monitoring, "the aim of this project was to develop a kernel event notification framework to notify userland of hardware changes e.g. a new USB device being added". Mark added:
"NetBSD has been involved in the Google Summer of Code since its conception in 2005. This year we were glad to once again have the oppertunity to introduce six students to our operating system, to Open Source software development and get them sponsored by Google to work on projects defined by the NetBSD developers."
"I'm pleased to announce [the] fourth release of the distributed storage subsystem, which allows [you] to form a storage [block device] on top of remote and local nodes, which in turn can be exported to another storage [block device] as a node to form tree-like storage [block devices]," Evgeniy Polyakov stated on the Linux Kernel mailing list. The new release includes a new configuration interface and several bug fixes.
Network device driver and SATA subsystem maintainer Jeff Garzik was not impressed with the concept, "[distributed block devices] are not very useful, because it still relies on a useful filesystem sitting on top of the DBS." He went on to explain the problem, "it devolves into one of two cases: (1) multi-path much like today's SCSI, with distributed filesystem arbitrarion to ensure coherency, or (2) the filesystem running on top of the DBS is on a single host, and thus, a single point of failure (SPOF)." He proposed instead that time would be better spent developing a POSIX-only distributed filesystem, "in contrast, a distributed filesystem offers far more scalability, eliminates single points of failure, and offers more room for optimization and redundancy across the cluster." Jeff went on to caution, "a distributed filesystem is also much more complex, which is why distributed block devices are so appealing :)" When Lustre was pointed out as an existing option, Jeff noted, "Lustre is tilted far too much towards high-priced storage, and needs improvement before it could be considered for mainline."
A recent discussion on the Linux Kernel mailing list examined the possibility of a Linux implementation of Sun's ZFS. It was pointed out that the file system is released under the GPL-incompatible CDDL, and that Sun has filed numerous patents to prevent ZFS from being reverse engineered. Max Yudin pointed out, "according to Jeff Bonwick's blog Sun issued 56 patents on ZFS, but I have no idea what they patented. Sorry, binary compatible ZFS reimplementation with GPL license might not be legal." David Litwin noted that he had been told by a ZFS developer to talk to Linux developers to see about getting non-GPL'd code included with the kernel. Theodore Ts'o replied, "that was totally useless answer from the ZFS developers. What he should have told you is to contact Sun management, since they are the only ones who can decide whether or not to release ZFS under a GPL license, and more importantly, to give a patent license for any patents they may have filed in the course of developing ZFS."
Alan Cox suggested, "the real test of whether Sun were serious about ZFS being anywhere but Solaris is what they do to license it - they've patented everything they can, and made the code available only under licenses incompatible with other OS products. Their intent is quite clear, and quite sad. Compare it to what the old Sun company did with NFS, which is now a standard used everywhere." Theodore Ts'o added, "given that Sun has reportedly filed a huge number of patents covering ZFS and has refused to make them available for anything other than Solaris --- and there are senior Sun programmers who have on record stated that one of the reasons why Sun picked the CDDL was precisely because it was incompatible with GPL and Sun fears Linux ---- I wouldn't bet on Sun being willing to making a patent license available to a hypothetical alternate implementation of the ZFS format for Linux." He went on to note, "of course, this is all open source. If someone wants to work on reimplementing ZFS from scratch, either in userspace or in the kernel, certainly the Linux community won't stop them. Given the patent issues Linus might not feel comfortable including it in the mainline sources without a promise from Sun that they won't sue the pants off of him and The Linux Foundation, but again, that's Sun's decision, and no one else can help you there."