carp: intermittent master/backup swapping

To: <misc@...>
Date: Friday, August 31, 2007 - 10:38 pm

have 2 sun netra t1s running sparc64 4.1-release as my firewalls and am
experiencing intermittent swapping of MASTER and BACKUP states on carp
interfaces. i have carp working fine in a number of other places and do
not see this behavior there, although the working setups are i386-based.

NOTE: i've included several tcpdumps and various outputs, so this is a
long message. have spent several hours at this without a resolution and
do appreciate folks taking the time to read through it =)

problems are most apparent when the internal interface drops packets,
but the most serious case is that of the public IPs that are carp-ed.
example:

- the external interface on both machines, hme1, is carp-ed. ifconfig
output for each machine's interfaces is as follows:

FW #1

hme1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 08:00:20:c2:21:45
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet6 fe80::a00:20ff:fec2:2145%hme1 prefixlen 64 scopeid 0x2
inet 208.70.19.203 netmask 0xfffffff8 broadcast 208.70.19.207
...
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:01
carp: MASTER carpdev hme1 vhid 1 advbase 1 advskew 0
groups: carp
inet 208.70.19.202 netmask 0xfffffff8 broadcast 208.70.19.207
inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0xd

FW #2

hme1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 08:00:20:f9:a8:8d
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet6 fe80::a00:20ff:fef9:a88d%hme1 prefixlen 64 scopeid 0x2
inet 208.70.19.204 netmask 0xfffffff8 broadcast 208.70.19.207
...
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:01
carp: BACKUP carpdev hme1 vhid 1 advbase 1 ad...

To: Jacob Yocom-Piatt <jy-p@...>
Cc: <misc@...>
Date: Saturday, September 1, 2007 - 5:52 am

this happens when you reconfigure IP addresses; workaround: ifconfig
carpXX destroy; sh /etc/netstart carpXX. the fix is in rev 1.132.2.1 of
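
The workaround above can be wrapped in a small helper, roughly as sketched here. The function name recreate_carp and the DRY_RUN switch are made up for illustration and are not part of the base system. The sketch assumes the interface has an /etc/hostname.carpN file for netstart to read.

```shell
#!/bin/sh
# Sketch of the workaround: destroy the carp interface, then let
# /etc/netstart recreate it from its hostname file.
# recreate_carp and DRY_RUN are hypothetical names used for illustration.
recreate_carp() {
    ifname="$1"
    # Refuse anything that does not look like a carp interface name.
    case "$ifname" in
        carp[0-9]*) ;;
        *) echo "usage: recreate_carp carpN" >&2; return 1 ;;
    esac
    # With DRY_RUN set, print each command instead of running it.
    run=${DRY_RUN:+echo}
    $run ifconfig "$ifname" destroy
    $run sh /etc/netstart "$ifname"
}
```

Run it as root on the affected box, e.g. `recreate_carp carp0`. Setting DRY_RUN=1 first only prints the two commands, which is a cheap way to check the interface name before touching a live firewall.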

my preference is 'no state' on things like carp and ospf. makes little
difference for many setups, but if I don't always do it, I tend to forget
it where I need it (e.g. where queues are involved).
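
For reference, a 'no state' rule for carp might look like the fragment below. This is only a sketch: hme1 is taken from the ifconfig output earlier in the thread, and where the rule sits in the ruleset is an assumption about a pf.conf that was not posted.

```
# pf.conf fragment (sketch, not the poster's actual ruleset).
# Pass carp advertisements on the carpdev without creating state entries.
pass quick on hme1 proto carp no state
```

Since 4.1 makes 'keep state' the default for pass rules, the explicit 'no state' is what keeps these packets out of the state table.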

To: <misc@...>
Date: Saturday, September 1, 2007 - 10:36 am

tried this and AFAICT, it has fixed the issue. thanks stuart!

since i'd rather not wait 2 months for this fix to show up in
4.2-release, is it possible to patch ip_carp.c up to get this fix in and
recompile the kernel? guess it shouldn't matter once i have everything

sounds like good advice, especially on "state free" protocols.
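
On the rebuild question, the usual kernel build steps for this release look roughly like the sketch below. It assumes /usr/src already contains the fixed ip_carp.c (how to fetch that revision is not shown here) and that the machines run a GENERIC sparc64 kernel. rebuild_kernel and DRY_RUN are hypothetical names for illustration.

```shell
#!/bin/sh
# Sketch: rebuild and install a kernel from an already-patched /usr/src.
# rebuild_kernel and DRY_RUN are hypothetical names; the defaults
# (sparc64, GENERIC) are assumptions about the machines in this thread.
rebuild_kernel() {
    arch="${1:-sparc64}"
    conf="${2:-GENERIC}"
    sys="/usr/src/sys/arch/$arch"
    # With DRY_RUN set, print each command instead of running it.
    run=${DRY_RUN:+echo}
    $run cd "$sys/conf" || return 1
    $run config "$conf" || return 1
    $run cd "$sys/compile/$conf" || return 1
    $run make clean && $run make depend && $run make || return 1
    # Installs the new kernel; a reboot is needed to start using it.
    $run make install
}
```

With DRY_RUN=1 set, `rebuild_kernel` just lists the steps, so the paths can be checked against the local source tree before building for real.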
