carp: intermittent master/backup swapping

From: Jacob Yocom-Piatt
Date: Friday, August 31, 2007 - 7:38 pm

have 2 sun netra t1s running sparc64 4.1-release as my firewalls and am 
experiencing intermittent swapping of MASTER and BACKUP states on carp 
interfaces. i have carp working fine in a number of other places and do 
not see this behavior there, although the working setups are i386-based.

NOTE: i've included several tcpdumps and various outputs, so this is a 
long message. have spent several hours at this without a resolution and 
do appreciate folks taking the time to read through it =)

problems are most apparent when the internal interface drops packets, 
but the most serious case is that of the public IPs that are carp-ed. 
example:

- have the external interface on both machines, hme1, carp-ed and the 
ifconfig output for each machine's interfaces is as follows

FW #1

hme1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:c2:21:45
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet6 fe80::a00:20ff:fec2:2145%hme1 prefixlen 64 scopeid 0x2
        inet 208.70.19.203 netmask 0xfffffff8 broadcast 208.70.19.207
...
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:00:5e:00:01:01
        carp: MASTER carpdev hme1 vhid 1 advbase 1 advskew 0
        groups: carp
        inet 208.70.19.202 netmask 0xfffffff8 broadcast 208.70.19.207
        inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0xd

FW #2

hme1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:f9:a8:8d
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet6 fe80::a00:20ff:fef9:a88d%hme1 prefixlen 64 scopeid 0x2
        inet 208.70.19.204 netmask 0xfffffff8 broadcast 208.70.19.207
...
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:00:5e:00:01:01
        carp: BACKUP carpdev hme1 vhid 1 advbase 1 advskew 100
        ...
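
[editor's sketch] To catch the MASTER/BACKUP flips as they happen, the "carp:" line of the ifconfig output above can be reduced to a short status string and logged from both firewalls. This is only a sketch: carp_state is a hypothetical helper name, and the field positions are assumed from the ifconfig layout shown in this message.

```shell
#!/bin/sh
# carp_state (hypothetical helper): read ifconfig output on stdin and
# print the state, vhid and advskew from the "carp:" line.
# Assumed layout: carp: STATE carpdev IF vhid N advbase N advskew N
carp_state() {
    awk '$1 == "carp:" { print $2, "vhid", $6, "advskew", $10 }'
}

# Sample line copied from the FW #1 output in this thread.
printf '        carp: MASTER carpdev hme1 vhid 1 advbase 1 advskew 0\n' | carp_state
```

On a live box this would be fed from "ifconfig carp0 | carp_state", for example once a second with a timestamp, so the logs from the two machines can be lined up afterwards.
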
From: Stuart Henderson
Date: Saturday, September 1, 2007 - 2:52 am

this happens when you reconfigure IP addresses; workaround: ifconfig
carpXX destroy; sh /etc/netstart carpXX. the fix is in rev 1.132.2.1 of
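
[editor's sketch] The workaround above, spelled out for one or more interfaces. carp_reset and the RUN variable are hypothetical names, not part of any OpenBSD tool; by default the helper only prints the two commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# carp_reset (hypothetical helper): for each carp interface named,
# emit the destroy / re-create sequence from the workaround.
# With RUN=1 in the environment the commands are executed instead
# of printed (needs root; assumes /etc/hostname.carpN exists).
carp_reset() {
    for ifc in "$@"; do
        for cmd in "ifconfig $ifc destroy" "sh /etc/netstart $ifc"; do
            if [ "${RUN:-0}" = 1 ]; then
                $cmd
            else
                echo "$cmd"
            fi
        done
    done
}

# Dry run for carp0, the interface shown earlier in the thread.
carp_reset carp0
```

Expect a short failover while the interface is gone, so it is probably safest to do this on the BACKUP machine first.
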

my preference is 'no state' on things like carp and ospf. makes little
difference for many setups, but if I don't always do it, I tend to forget
it where I need it (e.g. where queues are involved).
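
[editor's sketch] A pf.conf fragment along the lines of the 'no state' preference above. The ext_if macro name is an assumption for illustration; hme1 is the external interface from the original post, and the rules are not taken from anyone's actual ruleset.

```
# pf.conf fragment (sketch). ext_if is an assumed macro name;
# macros go at the top of pf.conf.
ext_if = "hme1"

# Pass carp and ospf without creating state entries.
pass quick on $ext_if proto carp all no state
pass quick on $ext_if proto ospf all no state
```

A ruleset edited this way can be syntax-checked with "pfctl -nf /etc/pf.conf" before it is loaded.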

From: Jacob Yocom-Piatt
Date: Saturday, September 1, 2007 - 7:36 am

tried this and AFAICT, it has fixed the issue. thanks stuart!

since i'd rather not wait 2 months for this fix to show up in 
4.2-release, is it possible to patch ip_carp.c up to get this fix in and 
recompile the kernel? guess it shouldn't matter once i have everything 

sounds like good advice, especially on "state free" protocols.
