The Linux Kernel Archives are perhaps most familiar through their web interface, http://kernel.org/. The latest release of the Linux kernel is easily found here, along with patches by various Linux kernel hackers, and mirrors of other popular free and open source projects. Countless people worldwide happily rely on this archive without giving much thought to the effort behind it.
In a recent announcement to the Linux Kernel Mailing List, H. Peter Anvin detailed an upgrade of the infrastructure behind kernel.org. The new servers were donated by Hewlett-Packard, and are each quad Opterons with 24 gigabytes of RAM and 10 terabytes of disk space. Internet Systems Consortium, Inc. donates the bandwidth in the form of two independent gigabit-connected datacenters, PAIX Palo Alto and e200paul in San Francisco. H. Peter Anvin, Nathan Laredo, and Kees Cook all donate time to maintain the archives. KernelTrap recently spoke with Peter Anvin to learn more about the history behind the Linux Kernel Archives.
Peter Anvin has been involved with Linux since nearly the beginning. When Linus Torvalds purchased his first computer, on which he began writing the Linux kernel, the state-of-the-art PC with 4 megabytes of RAM and running at 33 megahertz was too expensive for him to buy outright. Therefore, he financed much of the nearly $3,500 price, planning to pay it off over three years. Within a year, as the Linux kernel began to evolve and a community of users formed, Peter organized an online collection that raised $3,000 and paid it off.
Later, when Linus graduated from the University of Helsinki, Peter convinced him to move to California to work for Transmeta Corporation, where Peter himself had been working for about a year. At this point, the Linux Kernel Archives was born. "I've been taking care of kernel.org since its inception in 1997," Peter explained. In the beginning, the archives were housed on a generic white-box PC running the Linux kernel, connected to the Internet through Transmeta's T1. The idea was to provide Linus with a local server.
The 'kernel.org' domain name was picked because by that time in 1997 the more logical seeming Linux dot names were already taken. The Transmeta domain was intentionally not used to avoid creating the false perception that Transmeta owned Linux. "So kernel.org was taken as sort of a second choice," Peter explained, "and it has worked out obviously very well, as today it's instantaneously recognized as its own thing."
The original PC was replaced in 1998 with a Dual PII 550 donated by VA Linux Systems (now VA Software). Around that same time, Globix donated colocation for the server, providing a dedicated 100 megabit link at their data center in Santa Clara. Within a couple of years, the website was drawing that much bandwidth on a regular basis, and Peter noted "the relationship was getting fairly strained." In 2000 when the telecommunications industry came crashing down, Globix found that they needed to trim costs. Peter summarized, "they pretty much asked us to leave on short notice."
At this point, Paul Vixie, who runs Internet Systems Consortium, Inc., contacted them to offer space at ISC's colocation in PAIX Palo Alto. "This was pretty much a dream colo for us," Peter said, "we were allowed to saturate a 100 megabit link into quite a few Internet backbones."
I emailed Paul Vixie asking for insight into how ISC is able to provide this hosting for kernel.org. He pointed out an impressive list of projects for which they currently provide hosting and explained, "we're a public benefit corporation and we do a lot of this kind of stuff. We recognize the Linux Kernel Archive project as a fellow traveler and it's clear to us that by helping Peter Anvin we help our own cause." He went on to add, "Peter Anvin's been great to work with. Kernel.org is one of our larger single traffic sources, and we're proud to be associated with it. ISC believes that our existence has an industry-wide and community-wide benefit, and that kernel.org's existence, likewise." Mr. Vixie went on to thank his friend Daryl Jones of SMRN who is helping pay for kernel.org's power and heat in the San Francisco location.
In 2001, Hewlett-Packard made their first hardware donation to the Linux Kernel Archives, a ProLiant DL380 G2 with Dual PIII's running at 1.1 gigahertz. "That machine had what then seemed an astounding 6 gigs of RAM, and a terabyte of disk," Peter said. "At that time it seemed like a way over-dimensioned system." ISC then upgraded kernel.org to a gigabit link, more bandwidth than the server could actually use. With a fair amount of tuning, they managed to get the server pushing out 600 megabits of data per second. That limitation, Peter explained, was because more data was being served than could fit in the 6 gigabytes of RAM, at which point disk bandwidth became the limiting factor.
Serving data with http and ftp is not very CPU intensive, but over time the amount of rsync traffic being fed by the kernel.org server continued to increase, and rsync is CPU intensive. "That's what rsync does," Peter said, "it trades bandwidth for CPU horsepower. We were getting to the point where we had all the bandwidth, but the Dual PIII 1.1's couldn't really keep up." He noted that the load average kept growing, well into triple digits. Referring to 32-bit systems, Peter noted, "we learned that the Linux load average rolls over at 1024. And we actually found this out empirically."
As it became more apparent that the hardware needed an upgrade, Peter began to think about preparing a request to Hewlett-Packard for new hardware. Before he even made his request, HP contacted him basically saying, "hey, we noticed that you guys have been kind of struggling lately, what do you need?" Peter provided them with his wish list, and within two weeks the decision was made and new hardware was on the way. Peter noted, "HP came to us from a quite high level. They have been absolutely great."
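The article does not say why the rollover happens at 1024. One plausible explanation, sketched below in Python, assumes the fixed-point load calculation used by kernels of that era: 11 fractional bits, a one-minute decay constant of 1884, and a value held in a 32-bit unsigned long. Under those assumptions the intermediate sum needs more than 32 bits once the average reaches 2^32 / 2^22 = 1024. The function names and the masking trick that emulates 32-bit arithmetic are illustrative and are not taken from the kernel source.

```python
FSHIFT = 11                # fractional bits in the fixed-point load value
FIXED_1 = 1 << FSHIFT      # fixed-point representation of 1.0
EXP_1 = 1884               # one-minute decay factor, in the same fixed point
MASK32 = (1 << 32) - 1     # emulates a 32-bit unsigned long
MASK64 = (1 << 64) - 1     # emulates a 64-bit unsigned long


def calc_load(load, active, mask=MASK32):
    """One update step: decay the old load and blend in the active count."""
    load = (load * EXP_1) & mask
    load = (load + active * (FIXED_1 - EXP_1)) & mask
    return load >> FSHIFT


def steady_state(n_tasks, steps=2000, mask=MASK32):
    """Load average reached with a constant number of runnable tasks."""
    active = n_tasks * FIXED_1
    load = 0
    for _ in range(steps):
        load = calc_load(load, active, mask)
    return load / FIXED_1


if __name__ == "__main__":
    print(steady_state(100))                # settles just under 100
    print(steady_state(2000))               # wrapped: always below 1024
    print(steady_state(2000, mask=MASK64))  # settles just under 2000
```

With the 32-bit mask, the result after the final shift can never reach 1024, whatever the true number of runnable tasks. This fits Peter's remark being specific to 32-bit systems, and the same reasoning would explain why a 64-bit machine is not affected at these loads.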
Matt Taggart, part of the R&D lab within HP's Open Source & Linux Organization, noted that HP is a large company and that the different donations to kernel.org actually came from different divisions. "There are plenty of people in HP that recognize the value that kernel.org provides and that benefit (both directly and indirectly through HP's customers) from having it perform well," he explained. "This time the donation came from HP's Open Source and Linux Operation R&D Lab, but in the past they have come from other places such as the Industry Standard Server Division (the folks that do ProLiant)." He went on to add, "HP's IT organizations also use Linux and are big users of kernel.org, so it benefits them as well."
As for why HP has made these donations, Matt explained, "when possible, HP likes to help Free and Open Source software projects at the source. For example, if HP wants to contribute driver fixes for a piece of equipment that we ship, it is a better use of our time to work at the kernel.org level rather than duplicating effort by working individually with each distributor (or not being able to work with some at all). Providing kernel.org hardware is an easy way for us to give back to the project that has helped save us a lot of effort."
On the wishlist that Peter provided to HP he had two main requirements: a 64-bit processor, and two identical servers, so that one could have scheduled downtime while the other continued to function. The new servers donated by HP are ProLiant DL585 4-way dual-core Opterons, with 24 gigabytes of RAM and 10 terabytes of disk space using a pair of MSA-30 arrays for each server. "The new machines can genuinely serve all the commonly requested files from RAM," Peter said. "That was a big reason why we asked for 24 gigabytes."
One of the new servers is located in San Francisco, the other in PAIX Palo Alto. As of April 9th, 2005, both of the new servers are online and serving traffic. Each of them should be capable of individually serving a full gigabit of traffic per second, though this hasn't happened yet. The CPU load average dropped from triple digits down to the low single digits.
"Each server is in a different ISC colo, connected to the Internet via gigabit fiber links," Peter summarized in an announcement to the Linux Kernel Mailing List. "Consequently, we should now see incredibly much better performance from kernel.org. Huge thanks to HP for the new hardware, and huge thanks to ISC for letting us quadruple our rack space requirements from 5U to 2x10U. We'll be saturating those links in no time :)"
Under The Hood:
The servers that power the Linux Kernel Archives have always used the Linux kernel. In the beginning, they ran the vanilla kernel, keeping up with the latest and greatest features providing the best performance. However, in the past couple of years, the archives have begun using vendor kernels. At this time, the servers run Fedora Core and use the 2.6 kernel provided by Red Hat. Peter explained, "it just comes down the upgrade pipe, which makes keeping it up to date a lot simpler." He added that for this reason they will continue to use vendor kernels so long as they're not lacking any critical features. The Linux Kernel Archives began serving data with the 2.6 kernel nearly a year ago, on May 24th, 2004.
The web pages are served by Apache, upgraded to Apache 2 on December 4th, 2004. FTP is served by vsftpd, which replaced proftpd on May 26th, 2004. Beyond that, Peter noted, "very little fancy is going on, and that is good because fancy is hard to maintain." He explained that the only fancy thing being done is that all filesystems are mounted noatime, meaning that the system doesn't have to write to the filesystem when files are simply being read, "that cut the load average in half." Beyond that, he explained that their main requirement is that everything use the sendfile system call, "which basically says take this file and herd it out this particular TCP port. That is 99 point something percent of what we do, so that is very important to us."
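To make the sendfile idea concrete, here is a minimal Python sketch, not taken from the kernel.org setup. It hands an open file to a connected socket using the standard library's socket.sendfile(), which relies on the sendfile system call where the platform provides it and falls back to ordinary send() calls elsewhere. The helper names serve_file and demo, the placeholder payload, and the use of a local socket pair in place of a real TCP client are all assumptions made for illustration.

```python
import os
import socket
import tempfile


def serve_file(conn, path):
    """Send the file at `path` over the connected socket; return bytes sent."""
    with open(path, "rb") as f:
        # The file is passed to the socket as an open file object, so the
        # program never reads the contents into its own buffer when the
        # sendfile system call is available.
        return conn.sendfile(f)


def demo():
    """Round-trip a small temporary file through a local socket pair."""
    payload = b"placeholder tarball bytes\n" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    server, client = socket.socketpair()
    try:
        sent = serve_file(server, path)
        server.close()  # signals end-of-stream to the reader
        chunks = []
        while True:
            chunk = client.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        client.close()
        server.close()
        os.unlink(path)
    return sent, b"".join(chunks), payload


if __name__ == "__main__":
    sent, received, payload = demo()
    print(sent, received == payload)
```

The appeal for a download server is that the data can move from the page cache to the network inside the kernel, which is why a site that does almost nothing but ship static files would want every daemon to use it.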
The Linux Kernel Archives Mirror System is managed by Kees Cook. Peter explained that this system originated back when kernel.org was using the Transmeta T1 link, "and horribly bogged down as a result." Several high bandwidth sites volunteered to act as mirrors, and a formal system was created. Essentially, each site agrees to a baseline of service, and links are provided from the kernel.org website. "We consistently have a little over 100 sites, and the number has been constant from pretty much the very beginning," Peter said. "Of course, the sites themselves change over time." When there was only one server running kernel.org, the mirrors would also take over when the main server needed maintenance.
In the current configuration there is no shortage of CPU or bandwidth, causing Peter to remark, "as far as the kernel is concerned, it wouldn't be a whole lot of skin off our teeth if the mirror system fell apart." However, for users downloading the kernel outside of North America, the mirrors are very helpful, providing them with a local source.
Other active services fall under the kernel.org domain. For example, the Linux Kernel Mailing List is run on a server called vger. However, physically that machine has nothing to do with the Linux Kernel Archives. Peter explained that originally there was a Linux Activist mailing list run in Finland. It was eventually replaced with the Majordomo-powered Linux Kernel Mailing List, managed by David Miller at Rutgers University. Later, when David went to work at Red Hat and the server moved with him, some people became quite concerned about the LKML having a redhat.com domain. Peter offered at that time, "if it makes people feel better, we can make it vger.kernel.org." And so, while the server still is physically housed at Red Hat, it is part of the kernel.org domain. "This is due to its function, not its location."
Bugzilla.kernel.org is another example of a server in the kernel.org domain that is housed elsewhere. In the case of the bugzilla server, it's run by OSDL. "It was because it kind of got blessed by Linus Torvalds and the general consensus of kernel developers that we put it in the kernel.org domain," Peter explained.
The normal bandwidth used by kernel.org is between 150 and 200 megabits per second, at times when "nothing major is happening," Peter said. "Quite honestly, the test releases aren't even a blip on our radar," he added, referring to the -pre and -rc kernels, explaining that they don't noticeably increase the amount of bandwidth that is consumed. Only when an official stable release is announced does kernel.org see a spike in traffic. For example, with the upcoming 2.6.12 release Peter predicted, "I expect it to go to the high 200's, for about a day." He noted that even a direct link from a busy website such as Slashdot produces only about as much bandwidth consumption as they see from a kernel release.
"What really drives up the load average is when one of the distributions that we mirror makes a release," he explained, "such as one of the Fedora cores. The kernel is only a few tens of megabytes, whereas a fedora core is a couple of gigabytes." With the upcoming release of Fedora Core 4, Peter predicts that both gigabit links will probably be saturated for 3 or 4 days. "This is largely speculation, because never before have we had the capability of serving that much traffic."
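A rough back-of-the-envelope calculation shows why a distribution release weighs so much more heavily on the links than a kernel release. The Python sketch below assumes a 40 megabyte kernel tarball and a 2 gigabyte distribution download, stand-ins for the "few tens of megabytes" and "couple of gigabytes" in the quote, and it ignores protocol overhead. The function name and both sizes are illustrative assumptions, not measured figures.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # one gigabit link, decimal units
SECONDS_PER_DAY = 86_400

KERNEL_BYTES = 40_000_000       # assumed size of a kernel tarball
FEDORA_BYTES = 2_000_000_000    # assumed size of a full distribution download


def downloads_per_day(file_bytes, link_bps=LINK_BITS_PER_SECOND):
    """Complete downloads a fully saturated link could deliver in one day."""
    bytes_per_day = link_bps // 8 * SECONDS_PER_DAY
    return bytes_per_day // file_bytes


if __name__ == "__main__":
    print(downloads_per_day(KERNEL_BYTES))   # 270000
    print(downloads_per_day(FEDORA_BYTES))   # 5400
```

Under these assumptions a saturated gigabit link delivers about 10.8 terabytes a day: hundreds of thousands of kernel tarballs, but only a few thousand full distribution downloads.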
When asked about viewing the actual access logs, Peter explained that although they do occasionally get requests from various sorts of researchers, they generally don't make them available for privacy reasons. "We've only allowed access to people who are intimately involved with Linux already," he said, "people we already know." There has been discussion about making the logs available in an anonymized form, but it's not the top priority. "It gets talked about," he noted, "but it's largely a people time issue."
Making It Happen:
Currently there are three people who manage kernel.org, all to some degree in their spare time. In addition to Peter Anvin, Nathan Laredo and Kees Cook also help out. Peter, who is employed by Orion Multisystems, is in charge of the overall architectural design, of providing developers with access to upload their patches, and of public relations. Nathan, who also works at Orion, maintains the system and server software as well as the web pages. And Kees, employed by the OSDL, is in charge of the mirror system. Day-to-day administration is done by whoever gets to it.
When I asked Linus Torvalds about the Linux Kernel Archives, he replied, "I have been very happily relying on others to do all the work with kernel.org." He went on to say, "I've literally never needed to lift a finger for kernel.org maintenance, which is wonderful (both for me - since I'm lazy, and for kernel.org - since I'm a total air-head when it comes to system management ;) The only thing I can add to anything is just a 'thanks for doing it' to Peter and the other people involved."
There has been talk of possibly creating a formal staff for managing kernel.org, though for several reasons it hasn't happened yet. "We would have to find a sponsor to pay for it," Peter explained, "It's not impossible, but just going out and hunting for that is a big job in itself." He went on to add that part of the reason this hasn't happened "is that both HP and ISC have been so great to deal with. They've been very low demand on our time, and they've been very forthcoming with what we need without going through rigmarole."
Officially, the site is run by the Kernel Dot Org Organization, Inc., a nonprofit corporation formed in 2001. However, a whois search reveals that the domain name is still registered to Transmeta Corporation. Peter is currently working with Transmeta to get it re-registered under the corporate entity. The transfer was originally under way back in 2001, but it stalled due to some turmoil at Transmeta at that time, and there really hasn't been a pressing need since.
The idea behind the non-profit organization was that the bandwidth consumed by kernel.org is very expensive. "I wouldn't be surprised to learn if it amounts to 1 million dollars a year," Peter said. "That's a lot of money, and we were thinking if we had to get another ISP sponsor, they'd need to be able to deduct this as a charitable expense. Currently we're doing this under ISC's umbrella." He added that the plan is to continue working with ISC as long as possible, "it's been an incredibly good relationship for us."