Ryan McBride works full time on OpenBSD development. His first contribution was adding IPv6 support to PF, OpenBSD's stateful packet filter. More recently he was the primary developer of CARP, the Common Address Redundancy Protocol, a patent-free alternative to HSRP and VRRP.
In this interview, Ryan discusses the development of CARP, explaining what it is and how it works. He reflects on patents and the difficulties OpenBSD has faced trying to work with IANA, as well as discussing several efforts to port CARP to other operating systems. Finally, he also highlights some of the new functionality that will be found in the upcoming May 1, 2004 release of OpenBSD 3.5.
Background:
Jeremy Andrews: Please share a little about yourself and your background.
Ryan McBride: I'm married, 26 years old, living part time in what amounts to a cabin in the woods on the coast of BC, north of Vancouver, and part time in an apartment in downtown Vancouver. I enjoy the contrast of city versus country, and whenever I'm in one, I can't wait to be in the other.
Regarding school and work: I studied Computer Science and Cognitive Science at Simon Fraser University until the dot-com boom lured me away to full-time work. For several years now I've been working as an Information Security consultant doing quite a variety of things, but since the summer of 2003 I've taken time off from that to hack OpenBSD full time.
JA: How long have you been living in the Vancouver area?
Ryan McBride: On and off for the past decade or so. I lived in Montreal for 3 years while my wife went to school at McGill, but besides that I've been living in Vancouver or its suburbs since I finished high school.
JA: How did you come to work full time on OpenBSD?
Ryan McBride: My wife and I had come to the understanding that after she finished her degree, she would work and I would get to take a year off. My intention was to basically spend the year relaxing and hacking on OpenBSD. I told Theo about this at last year's hackathon in May, and at some point over the summer, Theo offered to fund me. In retrospect this was critical to getting CARP done so quickly, because for various reasons my wife and I couldn't stick to the original plan and I would have had to go out and find a "real" job.
JA: How and when did you get started with OpenBSD?
Ryan McBride: I've been involved with OpenBSD as a user since 2.3 or so. I had seen Theo speak at DefCon 5 (where he was giving away 2.1 CDs), and again at the Black Hat Briefings the following year, and I decided that it was time to check it out. I was working as a network admin at the time, and it quickly came to be used in all kinds of places on networks I managed.
As a developer, my first real involvement with OpenBSD was coding IPv6 support for PF prior to its official birth in 3.0. I had just recently begun playing with IPv6 on my network, saw that this functionality was missing, and in the "Shut up and Hack" spirit of OpenBSD, I simply sat down and coded it.
IPv6:
JA: What was the reaction when you presented your code to the OpenBSD team?
Ryan McBride: I think they were fairly surprised, although they had already been putting some thought into IPv6 support. I got a lot of constructive criticism of course, and had to go back and redo some of the code, which was only to be expected for my first kernel hacking attempt.
JA: How complete is OpenBSD's support for IPv6?
Ryan McBride: It's reasonably complete - it's possible for the most part to run an OpenBSD network entirely with IPv6.
JA: What's missing?
Ryan McBride: RPC (and hence NFS) is the major thing that is missing, in my opinion. In general, IPv6 tends to lag behind v4 a little in many areas, because there are fewer people who understand it (and also fewer people who use it and are motivated to work on it). For example, PF still needs fragment reassembly for v6, and pfsync does not support IPv6 as a transport (though states for v6 connections get sync'd just fine).
JA: How long would you estimate before IPv6 finally replaces IPv4 on the internet?
Ryan McBride: This is something that I'm not willing to stick my neck out making a prediction on. IPv6 has been the Next Best Thing for a long, long time now. Should we take this to mean that it has staying power and will eventually come out on top, or that it's failed to provide the things needed to make it a compelling option?
Common Address Redundancy Protocol:
JA: Can you describe your involvement in the development of CARP, the Common Address Redundancy Protocol?
Ryan McBride: I'm the primary developer of CARP, but it was really a collaborative effort. From design to code, a number of other OpenBSD developers were involved. A large portion of the code came from a VRRP implementation by Michael Shalayeff - the protocol is different, but all the backend stuff that needs to be done with interfaces and ARP, etc. is the same. Markus Friedl was a great help with the cryptography, and Jun-Ichiro "Itojun" Hagino helped with the design of the IPv6 support. Still more help came from too many other developers to name them all.
JA: How functional is CARP's IPv6 support?
Ryan McBride: There is essentially no difference between how IPv4 and IPv6 are handled by CARP and they should work equally well.
JA: How does CARP utilize cryptography?
Ryan McBride: Each CARP pseudo-packet is cryptographically signed with a SHA-1 HMAC, which combines the data to be protected with a shared password. The pseudo-packet includes the data from the real packet on the network, as well as the IP addresses associated with the CARP interface.
The counter in the packet can be used for replay detection, though this is not yet implemented.
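To make the signing step concrete, here is a minimal Python sketch. It is not the OpenBSD code: the function names, the order in which the fields are fed into the HMAC, and the sample values are all assumptions for illustration. The interview only states that a SHA-1 HMAC, keyed with a shared password, covers the packet data and the IP addresses of the CARP interface.

```python
import hashlib
import hmac
import ipaddress


def sign_pseudo_packet(password, packet_data, addresses):
    """Return a SHA-1 HMAC over the packet data plus the virtual addresses.

    Assumption: the addresses are appended in sorted, packed binary form.
    The real pseudo-packet layout is not described in the interview.
    """
    mac = hmac.new(password, digestmod=hashlib.sha1)
    mac.update(packet_data)
    # IPv4 and IPv6 addresses are handled identically: both are just bytes.
    for packed in sorted(ipaddress.ip_address(a).packed for a in addresses):
        mac.update(packed)
    return mac.digest()


def verify(password, packet_data, addresses, received_digest):
    """Recompute the HMAC locally and compare it in constant time."""
    expected = sign_pseudo_packet(password, packet_data, addresses)
    return hmac.compare_digest(expected, received_digest)
```

Because the addresses go into the HMAC but not into the packet itself, a receiver configured with a different password or a different set of virtual addresses computes a different digest and rejects the advertisement.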
JA: How long has CARP been under development?
Ryan McBride: CARP has been under development for quite some time, longer than my involvement, but I first became involved in discussions about CARP with some of the other developers at the OpenBSD hackathon in May 2003. Real work on the protocol didn't begin in earnest until early fall.
CARP and patents:
JA: How does CARP manage to avoid any of Cisco's patents, while providing functionality similar to their HSRP, or the IETF's VRRP?
Ryan McBride: It's actually not terribly difficult to avoid the Cisco patent. I think that very few people actually go and carefully read the patent documents that companies use for intimidation - certainly it's clear that the US Patent office doesn't read them.
In fact by our analysis VRRP does not infringe on the HSRP patent either. The problem with VRRP is not so much that it infringes on the HSRP patent, but that everybody takes Cisco at their word when they say it does!
However, just to be safe, we made changes which further differentiate CARP from the Cisco patent, removing some fields from the advertisements and changing the semantics of others.
JA: Have you heard anything from Cisco regarding CARP?
Ryan McBride: Not since it was released. We had communications with Cisco's lawyer, Robert Barr, prior to implementing CARP, but they've not commented since then.
JA: Did CARP have to make any sacrifices to avoid infringing on these patents?
Ryan McBride: Not really, and the exercise of making the changes actually led to some improvements. For instance the HSRP patent specifically mentions having the virtual IP address in the advertisement message. We removed that (it is still part of the HMAC), and realized that CARP was now address family independent. So adding IPv6 support to CARP was relatively easy and didn't require changing the packet layout at all.
CARP functionality:
JA: From the name, it seems obvious that at some level CARP provides or manages highly available network services. Can you give a complete description of what CARP provides?
Ryan McBride: Essentially CARP provides the ability for one host to assume the network identity of another, and a mechanism to decide when that's necessary. The way it works is very simple: The master of the virtual addresses advertises its presence on a regular basis. The backups listen for this advertisement, and if the advertisements stop, one of them steps in and becomes master, takes over the virtual addresses, and starts sending out regular advertisements.
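The listen-and-take-over behaviour can be sketched in a few lines of Python. This is a toy model, not the kernel implementation: the `CarpHost` class, its method names, and the 3-second `dead_after` timeout are assumptions, since the interview does not say how long a backup waits before deciding the master is gone.

```python
class CarpHost:
    """Toy model of a CARP host that starts out as a backup."""

    def __init__(self, name, dead_after=3.0):
        self.name = name
        self.state = "BACKUP"
        self.dead_after = dead_after  # assumed timeout in seconds
        self.last_heard = 0.0

    def on_advertisement(self, now):
        """Record that the current master was heard at time `now`."""
        self.last_heard = now

    def tick(self, now):
        """Advance the clock; return True if this host should advertise."""
        silent_for = now - self.last_heard
        if self.state == "BACKUP" and silent_for > self.dead_after:
            # The advertisements stopped: take over the virtual addresses.
            self.state = "MASTER"
        return self.state == "MASTER"
```

A backup that keeps hearing advertisements stays in the `BACKUP` state; once they stop for longer than the timeout, `tick` flips it to `MASTER` and it begins advertising itself.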
JA: How do you prevent two servers from trying to become master at the same time?
Ryan McBride: When two hosts both decide to become master, they both advertise. When they receive the advertisements, the one which intends to advertise less frequently goes into the backup state.
JA: How does each host decide how frequently it will advertise?
Ryan McBride: It's a combination of two options specified by the user, the 'advintv' (advertisement interval) and the 'advskew' (advertisement skew). The following formula determines the actual interval: advintv + (advskew / 255).
Typically the 'advintv' value is left at the default of 1 second, so with this setting, an 'advskew' of 128 means the host is advertising every 1.5 seconds.
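The arithmetic and the tie-breaking rule from the previous answers can be written out as a short Python sketch. The formula is the one quoted in the interview; the function names and the handling of an exact tie are assumptions for illustration.

```python
def advertisement_interval(advintv=1, advskew=0):
    """Interval in seconds, using the formula quoted in the interview."""
    return advintv + advskew / 255


def resolve_conflict(local_interval, remote_interval):
    """Decide the local state after hearing another master's advertisement.

    The host that intends to advertise less frequently (the longer
    interval) backs off. Staying master on an exact tie is an assumption;
    the interview does not cover that case.
    """
    if local_interval > remote_interval:
        return "BACKUP"
    return "MASTER"


slow = advertisement_interval(advintv=1, advskew=128)  # roughly 1.5 seconds
fast = advertisement_interval(advintv=1, advskew=0)    # exactly 1 second
```

With these values, `resolve_conflict(slow, fast)` returns `"BACKUP"`, so giving a host a higher `advskew` makes it the less preferred master.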
JA: What information is in the advertisement packets?
Ryan McBride: Besides the IP header, the advertisement packets contain the virtual host id (you can have up to 255 virtual hosts on the same network segment) as well as the base advertisement interval and the skew on that interval, which taken together indicate how frequently the host expects to advertise. There is also a 64-bit counter, a field indicating the length of the authentication data in 4-byte words (always set to 7 in our implementation), and the SHA-1 HMAC.
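The fields listed above can be packed and unpacked with Python's `struct` module, as in the sketch below. The field order, the one-byte widths of the first four fields, and the 1 to 255 range check are assumptions, and the real wire format may contain fields not mentioned in the interview. One detail does check out arithmetically: 7 words of 4 bytes is 28 bytes, which matches an 8-byte counter plus a 20-byte SHA-1 digest, although reading the length field that way is also an assumption.

```python
import struct

# Hypothetical layout: vhid, advintv, advskew, auth length (in 4-byte words),
# 64-bit counter, 20-byte SHA-1 HMAC. Network byte order, no padding.
ADVERT = struct.Struct("!BBBBQ20s")
AUTH_LEN_WORDS = 7  # 7 * 4 = 28 bytes = 8 (counter) + 20 (SHA-1 digest)


def pack_advert(vhid, advintv, advskew, counter, digest):
    """Serialize the fields named in the interview into bytes."""
    if not 1 <= vhid <= 255:
        raise ValueError("virtual host id must be between 1 and 255")
    if len(digest) != 20:
        raise ValueError("a SHA-1 HMAC is exactly 20 bytes long")
    return ADVERT.pack(vhid, advintv, advskew, AUTH_LEN_WORDS, counter, digest)


def unpack_advert(data):
    """Parse bytes produced by pack_advert back into a dictionary."""
    vhid, advintv, advskew, authlen, counter, digest = ADVERT.unpack(data)
    return {
        "vhid": vhid,
        "advintv": advintv,
        "advskew": advskew,
        "authlen": authlen,
        "counter": counter,
        "hmac": digest,
    }
```

Note that no IP address appears anywhere in this layout, which is what makes the advertisement address family independent.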
JA: It sounds like CARP can provide much more than firewall redundancy. I assume that any network application can build upon CARP to provide network redundancy?
Ryan McBride: To varying degrees, yes. It works extremely well with stateless, sessionless protocols such as DNS - with these, you won't even notice the failover. With a protocol like HTTP, active downloads will be terminated, but simply hitting reload in your browser should clear it up. Databases need to have some kind of replication scheme with locking, etc. to work correctly in a setup like this, and it gets more complicated. Basically, if there is state involved, the application needs to be coded specially to handle it. In the case of firewalls, that's where pfsync comes in.
JA: Is CARP the only patent-free protocol that you are aware of providing this functionality?
Ryan McBride: Nothing that provides exactly the functionality that CARP does.
There are some toolkits available which play similar layer2/layer3 games to CARP for changing addresses, but do their detection in the opposite direction; instead of listening for an advertisement, they poll a service. "fake" is one such package.
JA: OpenBSD's stateful packet filter, PF, runs on a number of different architectures. Can CARP provide redundancy on a group of firewalls comprised of a combination of these different architectures?
Ryan McBride: This is really two separate questions.
Yes, CARP works with multiple architectures; we've run test clusters with as many as 4 architectures without any problems.
CARP, by itself, doesn't provide redundancy on a group of firewalls - you also need a way to synchronize the state table, otherwise when a failover happens with CARP, packets for existing sessions will not be recognized, and the sessions will die.
Luckily, the 3.5 release of OpenBSD also includes the pfsync protocol, which takes care of synchronizing the state table between a cluster of firewalls.
Like CARP, pfsync is a relatively simple protocol. Each firewall sends out state insertion, update, and deletion messages via multicast on a specified interface, using the PFSYNC protocol (IP Protocol 240). It also listens on that interface for similar messages from other firewalls, and imports them into the local state table.
A mechanism exists for pfsync to request all the state entries from the active firewall when the firewall is first booting, and it integrates with CARP, preventing CARP from preempting the address from other hosts until the bulk transfer is complete.
JA: This is very impressive functionality, but it sounds like it has potential to be difficult to set up a truly redundant firewall configuration?
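The message handling described above amounts to replaying a peer's insert, update, and delete events against the local state table. The Python sketch below models only that idea: the `FirewallNode` class, the dictionary-based message format, and the method names are hypothetical, and nothing here reflects the actual pfsync wire format.

```python
class FirewallNode:
    """Toy model of a firewall that mirrors a peer's state table."""

    def __init__(self):
        self.states = {}        # connection key -> state data
        self.bulk_done = False  # set once the initial bulk transfer ends

    def apply(self, msg):
        """Import one state message received from another firewall."""
        kind, key = msg["type"], msg["key"]
        if kind in ("insert", "update"):
            self.states[key] = msg["state"]
        elif kind == "delete":
            self.states.pop(key, None)
        else:
            raise ValueError("unknown message type: " + kind)

    def finish_bulk(self, snapshot):
        """Load the full table requested from the active firewall at boot."""
        self.states.update(snapshot)
        self.bulk_done = True

    def may_preempt(self):
        """CARP preemption is held back until the bulk transfer is done."""
        return self.bulk_done
```

A freshly booted node reports `may_preempt()` as false until `finish_bulk` has run, which mirrors the integration with CARP: a firewall with an empty state table should not take over the shared address.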
Ryan McBride: Not really. It involves setting up a CARP group for each interface on your firewall, configuring pfsync, which is a one-liner in /etc/hostname.pfsync0, and setting up your pf ruleset appropriately. There is a simple example in the pfsync(4) manpage.
JA: What was involved in choosing the acronym CARP?
Ryan McBride: A thesaurus :-)
CARP limitations:
JA: What limitations does CARP have?
Ryan McBride: It's a multicast protocol, so if your network card doesn't support multicast, you have problems, and because it does Strange Things™ with MAC addresses, it's possible some broken switches have problems with it.
It doesn't provide any failover of higher levels in the stack. I already mentioned the firewall problem, but connections which terminate on the CARP host will also die if the IP moves to a different host.
Finally, CARP by itself cannot deal with the situation that arises on a firewall, where there is a failure on some but not all of the interfaces. For example, if this happens with two firewalls, the backup will take over on the failed side, but not on the good side. The problem is that on the failed side neither firewall can see each other advertising, so they _both_ think that they are master on that side. There's no real way to resolve this without polling some third resource, like the gateway, to determine which firewall is in a position to forward traffic.
This situation is actually pretty rare, but we are developing a tool called ifstated which can be used to deal with this situation.
JA: Can you elaborate on what you mean by CARP doing strange things with MAC addresses, and how this can confuse some switches?
Ryan McBride: This is a problem that's shared by other failover protocols, such as VRRP, that use the same mechanism of passing a virtual MAC address around from host to host when a failover occurs. Some switches don't handle this cleanly, take too long to change the entry in their forwarding table, and thus extend the failover time. I've not seen this behavior personally, but it exists.
JA: Will ifstated be included with OpenBSD 3.5?
Ryan McBride: No.
JA: What else is planned for CARP?
Ryan McBride: Replay detection, further integration with pfsync and other tools, and the ability to configure CARP to use an interface with an address on a different subnet from the common address, or even no address at all. We also have some other ideas to help prevent CARP from going into a split-brain situation where the state of CARP interfaces on one system is not in sync. And there will of course be improvements that are not yet planned - I expect to get a lot of ideas from people as 3.5 rolls out and CARP is more widely deployed.
Porting CARP:
JA: As CARP was developed under the BSD license, it is therefore freely available to be used by other firewalls. Have other firewall groups expressed an interest in this new protocol?
Ryan McBride: There is a direct port of OpenBSD's implementation available for FreeBSD, and I believe there is a serious effort to integrate it into their main tree. A port has also been done for NetBSD but I'm not sure exactly what its status is.
There's also a userland implementation of the protocol which runs on *BSD and Linux.
JA: If the protocol can be implemented in userland, what is the advantage to having it in the kernel?
Ryan McBride: The usual: performance and complexity. Implementing it in the kernel avoids a lot of kernel<->userland copying, not so important for the CARP advertisements, but very important for regular traffic coming to the CARP MAC address. Additionally, CARP involves messing with MAC addresses and other low-level network things, which is somewhat easier to do inside the kernel, resulting in fewer lines of code and (hopefully) fewer bugs.
JA: What is the name of the userland CARP project?
Ryan McBride: UCARP. The website is http://www.ucarp.org/.
JA: Were you at all involved in porting CARP to the FreeBSD port of PF?
Ryan McBride: Not really, though I've been very happy to receive patches for problems that Max Laier (the person doing the port) finds during his efforts. I'm actually surprised that I haven't received more requests for information from porters. I only found out about the userland CARP implementation very recently, and I haven't had any contact with its author at all.
JA: Have you tried to contact the author?
Ryan McBride: Not yet, as I've been too busy dealing with the 3.5 release cycle. I'll probably get in touch with him once work on 3.6 begins in earnest.
IANA/IETF:
JA: Can you explain a little about the issues the OpenBSD team has had when trying to work with IANA?
Ryan McBride: The IANA has a heavily bureaucratic process for getting official number assignments. There are essentially two options for getting a protocol number assigned: The first is to run your protocol through the IETF on a standards track. This avenue is closed to us - the IETF has become monopolized by large corporate interests, and they have no problem with using patented protocols. They're perfectly happy using VRRP, and they won't support another standard. The second path is their proprietary path; you pay for "experts" to review your protocol and if they agree that it requires the numbers you're asking for, you get it. If you look at the list of assigned protocol numbers, this method appears to be the favored one. Getting a number allocation has more to do with having money. Obviously, since we're not a large multinational corporation, we can't afford to take this path. Since they were unable to help us by providing a real alternative, our only option is to simply pick an unused number and go with that.
JA: Are you concerned that at a future date the IANA might officially assign the protocol number you've chosen to another protocol?
Ryan McBride: Not extremely worried, no. Our hope is that they will accept it as an official assignment before that has to happen.
JA: What do you feel can be done to address this situation?
Ryan McBride: Anybody who cares about having standards that are actually free should express their concern about this to the IETF - community outcry prevented the World Wide Web Consortium from going the route of allowing standards contributors to retain all rights to patents injected into standards, and it could work here as well.
My personal opinion is that there should be no such thing as a software patent at all, but let's take it one step at a time.
OpenBSD 3.5:
JA: Other than CARP, what other new features have found their way into PF in OpenBSD 3.5?
Ryan McBride: Aside from the pfsync changes, and a bunch of bugfixes and cleanup, the following new features have been added:
JA: What are some other highlights of the upcoming OpenBSD 3.5 release?
Ryan McBride: That's a tough question to answer succinctly - the authoritative list is the release page at http://www.openbsd.org/35.html, but essentially the answer is this: like OpenBSD 3.4, but more so:
JA: What other development were you involved in with OpenBSD 3.5?
Ryan McBride: All PF. I was responsible for a number of the major changes to PF - source address tracking (sticky-address/source-tracking) and trimming the size of state table entries. I was also in charge of getting the redirect to localhost problem fixed, but didn't actually write most of the code in that case.
PF is approaching feature completeness, so I will be branching out now and getting involved in other areas - likely isakmpd will be my next target.
Hobbies:
JA: How do you enjoy spending your time when you're not working on OpenBSD?
Ryan McBride: Chopping firewood, fighting back the temperate rainforest, and digging in the garden. I try to take as many hiking trips as possible, often with other OpenBSD developers, and I practice Aikido (http://www.aikidofaq.com/) 3 or 4 days a week.
JA: Where have you gone on recent hikes with other OpenBSD developers?
Ryan McBride: Nothing more recent than last summer - the snow in the mountains and rain on the coast make hiking unpleasant and/or dangerous. Last summer's trips include: Mount Assiniboine on the BC-Alberta border with Theo and Peter Valchev, the Juan de Fuca trail on the west coast of Vancouver Island with Peter, and an attempt at the Jasper Skyline with Theo, Peter, Bob Beck, and Henning Brauer.
JA: Where do you plan to go for your next hike?
Ryan McBride: No plans yet, but something will come up soon enough.
JA: How long have you been practicing Aikido?
Ryan McBride: Not long at all, but it's something that I've been wanting to do since I was in high school.
Wrap up:
JA: What insight would you offer to someone just beginning to get interested in kernel hacking?
Ryan McBride: The best way to learn is by doing. Get in there, fix bugs, add features, whatever. Make mistakes - you'll learn from the process of fixing them.
If you have questions, try to ask smart ones - have a look at the document "How To Ask Questions The Smart Way" (http://www.catb.org/~esr/faqs/smart-questions.html)
And correct bug fixes have a better chance of getting accepted into the official tree than correct new features. :-)
JA: Is there anything else that you'd like to add?
Ryan McBride: I should mention that the only reason that I can afford to take time off from the "real world" to work on OpenBSD full-time is that my work is being funded by Theo, with money that comes from CD sales and donations.
Increased funding for OpenBSD means more cool features, so order your CDs now, and thanks to everyone who has supported OpenBSD in the past.
JA: Thank you for all your time answering my questions, and for all the effort you've put into CARP and PF development.
Ryan McBride: It's my pleasure!
OpenBSD 3.5 will be available on May 1st, 2004. It can be freely downloaded from a local FTP mirror, or purchased as a 3-CD set, which includes images for i386, vax, amd64, macppc, sparc and sparc64. Versions for alpha, hppa, hp300, mvme68k, mvme88k, mac68k and cats are available by FTP. The project's homepage states, "OpenBSD is developed by volunteers. The project funds development and releases by selling CDs and T-shirts, as well as receiving donations. Organizations and individuals donate and thus ensure that OpenBSD will continue to exist, and will remain free for everyone to use and reuse as they see fit."
Hmm
The McBride -for- free software.
And CARP gets to be the theme for 3.5
What a tribute for the open source way. Ryan "shuts up and codes", it gets integrated into OpenBSD, and becomes the theme for this release, including the motto for the shirt.
You too, dear reader, can rise to this level, provided you can "shut up and code". Now will some of you reverse engineer some wireless drivers for the main chipsets out there, so we can enjoy up-to-date wireless?
BTW I learned of this interview from the new OBSDJ site.
> "BTW I learned of this interview from the new OBSDJ site."
As I did. OBSDJ should make a great replacement for the original OpenBSD Journal. http://obsdj.baselabs.org/.
OBSDJ ROCKS! :D
IPv6 fragmentation and reassembly for pf
I believe that you cannot do fragment reassembly in pf for IPv6 as you do for IPv4, since in IPv6 fragmentation and reassembly can happen only at the end systems. You can do some sanity checking. We did some analysis in the 6NET project. If you want to contact me about the issues we found, write me at:
mohacsiNOSPAM_at_niif.hu
IETF replies
Since IETF is mentioned in this interview, and since the kerneltrap page was recently discussed on the IETF general mailing list, I take the liberty to indicate here pointers to relevant parts of this discussion (I don't have personal experience with port number assignments, so I just post links, I cannot vouch for the facts) :
* http://www1.ietf.org/mail-archive/web/ietf/current/msg48946.html (Ted Hardie says experts are not paid)
* http://www1.ietf.org/mail-archive/web/ietf/current/msg48948.html (IANA official reply)
* http://www1.ietf.org/mail-archive/web/ietf/current/msg48949.html (Bill Fenner fixes another bug)