(originally sent to netdev on aug 6th)

IPv6 initially works, but when I leave it alone overnight I'm unable to
ping even my default gw.

Static global IPv6 addresses configured on both ends. No access lists on
either end.

Kernel version: 2.6.35 mainline (amd64) and 2.6.33.6.
Kernel config: http://pastebin.com/raw.php?i=Y6S8iKW7
Dist: Debian Lenny (5.0.5), nothing special to my knowledge.

I seem to have the same issue that Mikael Abrahamsson encountered with
Ubuntu kernels 2.6.26.3, 2.6.26-5-generic and 2.6.27-2-generic, and
mainline kernels 2.6.25, 2.6.26 and 2.6.27:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263260

He got IPv6 running again without rebooting using "networking stop,
ifconfig eth0 down, networking start, kill dhclient", while I narrowed it
down to just deleting the ipv6 neighbor (ip ne del..., see below).
Rebooting also causes it to start working again.

It's very reproducible. I just leave it overnight and it breaks every
time. I am willing and able to try patches at any time, the box is not in
production.

No iptables, no ip6tables. IP6tables support is not even compiled in.

NIC is "Broadcom Corporation NetXtreme BCM5715 Gigabit ethernet (rev a3)"
according to lspci.

Other end is a directly connected Cisco 7600 (routed port) that I have
access to, but it's in production use.

IPv4 works perfectly over this same port.

Only lo and eth0 are UP.

Output when broken
------------------
$ uname -a
Linux XXXXX 2.6.35 #1 SMP Tue Aug 3 09:25:51 CEST 2010 x86_64 GNU/Linux
$ ip -6 a sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436
inet6 2a00:800:1000:64::1/128 scope global
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2a00:800:752:1::5c:2/112 scope global
valid_lft forever preferred_lft forever
inet6 fe80::224:81ff:fea3:4424/64 scope link
valid_lft ...
From: Thomas Habets <thomas@habets.pp.se> If you didn't get an answer on netdev, sending your query again to linux-kernel isn't going to help. Networking experts generally do not read this list. Nobody has simply had an opportunity to look into your problem yet, that is all. I personally have it saved in my inbox and plan to look at it when I get a chance unless someone else gets to it first. --
advmss 14 ? or is it a copy/paste error ? unreachable ? Solicitation comes from fe80::224:81ff:fea3:4424 instead of I am wondering if you have some low-level problem, say lost frames in an otherwise idle link, maybe a full/half duplex mismatch ? --
Copy/paste error, sorry.
advmss 1440, and no "unreachable"
Complete (correct) line:
2a00:800:752:1::5c:0/112 dev eth0 proto kernel metric 256 mtu 1500 advmss
Yes, that's where the "unreachable" belonged.
unreachable 2a00:800:1000:64::1 dev lo proto kernel metric 256 error
-101 mtu 1500 advmss 1440 hoplimit 4294967295
Am I not allowed to add addresses to lo? That I've deconfigured this
But at first it works perfectly, and then it doesn't work at all. The link
is ~idle both before and after, and IPv4 is unaffected. When I run "ip ne
del ..." it *immediately* starts working again. From 100% packet loss to
0%.
Duplex is full according to dmesg and ethtool (mii-tool says 1000BaseT-HD,
but I suppose mii-tool is not as reliable?).
Cisco router also says "Full-duplex, 1000Mb/s", so there doesn't seem to
be a mismatch. No errors in "show int giX/Y" either.
ethtool info right after reboot (when ipv6 is still working):
$ sudo ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Current message level: 0x000000ff (255)
Link detected: yes
No errors show in "ethtool -S eth0 | grep -v ': 0$'" now that it's
working.
I will re-check ethtool and Cisco router output for mismatches when it
breaks again to make sure that there's no change or errors counting ...
IPv6 is currently not working and it's still 1000 Full on both sides
("show int GiX/Y" and "ethtool eth0").
No errors in "ethtool -S eth0" or "show int GiX/Y".
"ethtool eth0" output is the same as yesterday.
Here's the addresses and routing table as they are now, and have been
since reboot:
$ ip -6 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2a00:800:752:1::5c:2/112 scope global
valid_lft forever preferred_lft forever
inet6 fe80::224:81ff:fea3:4424/64 scope link
valid_lft forever preferred_lft forever
$ ip -6 r sh
2a00:800:752:1::5c:0/112 dev eth0 proto kernel metric 256 mtu 1500
advmss 1440 hoplimit 4294967295
fe80::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440
hoplimit 4294967295
default via 2a00:800:752:1::5c:1 dev eth0 metric 1024 mtu 1500 advmss
1440 hoplimit 4294967295
$ ip -6 ne sh
2a00:800:752:1::5c:1 dev eth0 lladdr 00:22:55:17:4b:80 router STALE
[try ping6 again, no reply]
$ ip -6 ne sh
2a00:800:752:1::5c:1 dev eth0 lladdr 00:22:55:17:4b:80 router DELAY
[try ping6 again, no reply]
$ ip -6 ne sh
2a00:800:752:1::5c:1 dev eth0 lladdr 00:22:55:17:4b:80 router REACHABLE
[try ping6 again, no reply]
Configured network with /etc/network/interfaces:
auto lo
iface lo inet loopback
allow-hotplug eth0
iface eth0 inet static
address x.x.x.x
netmask 255.255.255.252
broadcast x.x.x.x
gateway x.x.x.x
up ip a a 2a00:800:752:1::5c:2/112 dev eth0
up ip r a default via 2a00:800:752:1::5c:1
dns-nameservers x.x.x.x
dns-search xxxx.net
---------
typedef struct me_s {
char name[] = { "Thomas Habets" };
char email[] = { "thomas@habets.pp.se" };
char kernel[] = { "Linux" };
char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" };
char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" };
...This seems a bit different than previous mail. Apparently discovery now works ? Could you have a tcpdump on both sides ? --
Aha! New development: The Cisco router can't discover the address of the
Linux box because Linux doesn't seem to be listening to ff02::1
(all-nodes).
-----------
cisco#ping ff02::1
Output Interface: GigabitEthernet1/2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to FF02::1, timeout is 2 seconds:
Packet sent with a source address of FE80::222:55FF:FE17:4B80%GigabitEthernet1/2
Request 0 timed out
Request 1 timed out
Request 2 timed out
Request 3 timed out
Request 4 timed out
Success rate is 0 percent (0/5)
0 multicast replies and 0 errors.
------------
If I set promisc mode on the interface (tcpdump without -p or "ip link set
promisc on eth0") it starts working (both normal ping and the above ping
from the Cisco to ff02::1). It continues working until I guess the
neighbor table on the cisco times out (leaving it overnight seems to be
enough idle time) or I manually do a "clear ipv6 neig".

So great news! I can reproduce it at will with no waiting time! Right
after rebooting the Linux box I run "clear ipv6 neighbors" and Linux can
no longer ping the router. Tested reproducing it immediately after reboot.

The Linux box itself can ping ff02::1%eth0 with no problem, and gets
replies from the fe80:: link-local of itself and the Cisco router.

So could this be that for some reason the NIC isn't listening on multicast
MAC address 33:33:ff:5c:00:02 ? Is there a way to see the list of
addresses that get past the NIC? Or can this perhaps be filtered after the
NIC, but before tcpdump -p?

Since this now looks like a NIC thing, here's some info about eth0:
$ dmesg | grep eth0
[...]
tg3 0000:03:04.0: eth0: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address 00:24:81:a3:44:24
tg3 0000:03:04.0: eth0: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
tg3 0000:03:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
tg3 0000:03:04.0: eth0: dma_rwctrl[76148000] dma_mask[40-bit]
[...]
$ sudo lspci -v -s ...
That would be very surprising, but who knows... Can you try : "ifconfig eth0 allmulti" If you let a "tcpdump" running with -p option, do you receive the packet sent to ethernet dest 33:33:ff:5c:00:02 ? If you can see it with tcpdump, then NIC gave the frame to us. --
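For reference, the MAC address 33:33:ff:5c:00:02 asked about here follows from the standard IPv6-over-Ethernet mapping: the solicited-node group is ff02::1:ff00:0 plus the low 24 bits of the unicast address, and the Ethernet destination is 33:33 followed by the low 32 bits of the group address. A minimal Python sketch of that mapping, using the addresses from this thread (the helper names solicited_node and multicast_mac are made up for illustration):

```python
import ipaddress


def solicited_node(unicast):
    """Return the solicited-node multicast group for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(str(unicast))) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group):
    """Return the Ethernet destination MAC used for an IPv6 multicast group."""
    low32 = int(ipaddress.IPv6Address(str(group))) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{octet:02x}" for octet in octets)


# Addresses configured on eth0 in this thread (global and link-local).
for addr in ("2a00:800:752:1::5c:2", "fe80::224:81ff:fea3:4424"):
    group = solicited_node(addr)
    print(addr, "->", group, "->", multicast_mac(group))

# The all-nodes group that the Cisco router pings.
print("ff02::1 ->", multicast_mac("ff02::1"))
```

If the NIC drops frames sent to these MAC addresses, the host never sees the neighbor solicitation for its own address, which is consistent with the symptoms described here.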
That didn't help. "ifconfig eth0" and "ip l" show that allmulti is now
set, but no other difference. Can't ping router, and router gets no answer
when pinging ff02::1. No message in dmesg saying allmulti isn't supported.

Seems to be invisible unless I or tcpdump set promisc mode. But when
promisc mode is set I can immediately see the 33:33:ff:5c:00:02 packet (ND
solicitation) and I see that Linux is answering it.

Here's a tcpdump from the Linux host. It's slightly trimmed to fit in an
email, but the full dump is at http://www.habets.pp.se/tmp/ipv6.pcap

$ sudo tcpdump -pnli eth0 -s0 -w ipv6.pcap ip6
[...]
$ tcpdump -nlr ipv6.pcap
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
[ here I run "clear ipv6 neighbors" on the Cisco router ]
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
[ ... more repeated echo requests, no reply ... ]
[ here I run "ip l set promisc on eth0" ]
2a00:800:752:1::5c:1 > ff02::1:ff5c:2: ICMP6, neighbor solicitation, who has 2a00:800:752:1::5c:2
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, neighbor advertisement, tgt is 2a00:800:752:1::5c:2
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:1 > 2a00:800:752:1::5c:2: ICMP6, echo reply
[ here I run "clear ipv6 neighbors" again ]
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, echo request
2a00:800:752:1::5c:2 > 2a00:800:752:1::5c:1: ICMP6, ...
I suspect it's time to ask the Broadcom guys for some help :)
I have the same adapter here (Hewlett-Packard Company NC326m PCIe Dual Port
Adapter) and could not reproduce the problem.
Try the following patch to check that tg3 receives the correct multicast
list (it's OK for me, as seen in the dmesg output):
[17162.120238] add mc_addr(ha->addr=33:33:00:00:00:01)
[17162.120270] add mc_addr(ha->addr=01:00:5e:00:00:01)
[17162.120298] add mc_addr(ha->addr=33:33:ff:87:96:ce)
[17162.120326] add mc_addr(ha->addr=33:33:ff:5c:00:02)
[17162.120355] filters=80000001 00000000 00400000 40000000
But if the problem remains even with "ifconfig eth0 allmulti", I suspect a
NIC firmware problem (allmulti sets all 128 bits of the filters to 1).
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index bc3af78..34510f5 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -9317,12 +9317,14 @@ static void __tg3_set_rx_mode(struct net_device *dev)
u32 crc;
netdev_for_each_mc_addr(ha, dev) {
+ pr_err("add mc_addr(ha->addr=%pM)\n", ha->addr);
crc = calc_crc(ha->addr, ETH_ALEN);
bit = ~crc & 0x7f;
regidx = (bit & 0x60) >> 5;
bit &= 0x1f;
mc_filter[regidx] |= (1 << bit);
}
+ pr_err("filters=%08X %08x %08x %08x\n", mc_filter[0], mc_filter[1], mc_filter[2], mc_filter[3]);
tw32(MAC_HASH_REG_0, mc_filter[0]);
tw32(MAC_HASH_REG_1, mc_filter[1]);
--
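As a cross-check of the filters= lines this patch prints, the hash slot for each address can be recomputed offline. The sketch below is Python, not driver code. It assumes that tg3's calc_crc() is the ordinary reflected Ethernet CRC-32 (the one zlib.crc32 computes), which the hunk above does not show; the bit and regidx arithmetic is copied from the hunk, and the function names are hypothetical.

```python
import zlib


def tg3_hash_slot(mac):
    """Return (regidx, bit) for one multicast MAC, following the hunk above."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    # Assumption: calc_crc() equals the standard CRC-32 computed by zlib.crc32.
    bit = ~zlib.crc32(raw) & 0x7F
    regidx = (bit & 0x60) >> 5
    return regidx, bit & 0x1F


def tg3_filters(macs):
    """Build the four 32-bit hash words that would go into MAC_HASH_REG_0..3."""
    mc_filter = [0, 0, 0, 0]
    for mac in macs:
        regidx, bit = tg3_hash_slot(mac)
        mc_filter[regidx] |= 1 << bit
    return mc_filter


def fmt(mc_filter):
    """Format the words the way the pr_err() line does (lower-case hex)."""
    return " ".join(f"{word:08x}" for word in mc_filter)


macs = ["33:33:00:00:00:01", "01:00:5e:00:00:01", "33:33:ff:5c:00:02"]
for mac in macs:
    print(mac, tg3_hash_slot(mac))
print("filters=" + fmt(tg3_filters(macs)))
# prints: filters=80000001 00000000 00000000 40000000
```

The filter is a 128-bit hash, not an exact-match list, so several addresses can share a bit. That is also why allmulti can be implemented by setting all 128 bits.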
Right after boot:
$ dmesg | egrep 'eth0|^add mc|^filters='
tg3 0000:03:04.0: eth0: Tigon3 [partno(N/A) rev 9003] (PCIX:133MHz:64-bit) MAC address 00:24:81:a3:44:24
tg3 0000:03:04.0: eth0: attached PHY is 5714 (10/100/1000Base-T Ethernet) (WireSpeed[1])
tg3 0000:03:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
tg3 0000:03:04.0: eth0: dma_rwctrl[76148000] dma_mask[40-bit]
add mc_addr(ha->addr=33:33:00:00:00:01)
filters=80000000 00000000 00000000 00000000
add mc_addr(ha->addr=33:33:00:00:00:01)
filters=80000000 00000000 00000000 00000000
add mc_addr(ha->addr=33:33:00:00:00:01)
filters=80000000 00000000 00000000 00000000
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
filters=80000000 00000000 00000000 40000000
ADDRCONF(NETDEV_UP): eth0: link is not ready
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
filters=80000000 00000000 00000000 40000000
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
filters=80000000 00000000 00000000 40000000
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
add mc_addr(ha->addr=33:33:ff:5c:00:02)
filters=80000001 00000000 00000000 40000000
tg3 0000:03:04.0: eth0: Link is up at 1000 Mbps, full duplex
tg3 0000:03:04.0: eth0: Flow control is off for TX and off for RX
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
add mc_addr(ha->addr=33:33:ff:5c:00:02)
add mc_addr(ha->addr=33:33:ff:a3:44:24)
filters=80020001 00000000 00000000 40000000
eth0: no IPv6 routers present
[ ifconfig eth0 allmulti (ip l and ifconfig say ALLMULTI is on) ]
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
add mc_addr(ha->addr=33:33:ff:5c:00:02)
add mc_addr(ha->addr=33:33:ff:a3:44:24)
filters=80020001 00000000 00000000 40000000
[ $ sudo ifconfig eth0 -allmulti
Warning: Interface eth0 still in ...
I suspect Eric is right. "allmulti" has the effect of enabling all 128 bits of the multicast hash --
$ sudo ethtool -i eth0
driver: tg3
version: 3.110
firmware-version: 5715-v3.28, UMP 1.15
bus-info: 0000:03:04.0
---------
typedef struct me_s {
char name[] = { "Thomas Habets" };
char email[] = { "thomas@habets.pp.se" };
char kernel[] = { "Linux" };
char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" };
char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" };
char coolcmd[] = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;
--
Thanks. I put the question out to the firmware developer. While we wait, can you keep Eric's patch in place and give me the results along with the output of 'ethtool -d eth0 | grep 0x047' after the problem happens? Eric's patch shows the hash registers at the time they are programmed. I'm interested to see if the values change (by firmware) after the failure. --
Sure.
I think the problem occurs shortly after booting, or is triggered by
Linux getting a neighbor table entry for the router. The reason it took a
while for everything to actually stop working is that the router was
caching and presumably updating its neighbor cache when it saw traffic.
That is, maybe it only works if the router sets up its neighbor table
first, and not otherwise.
The problem is there now. Last output in the kernel log about this is:
$ dmesg | egrep 'eth0|^add mc|^filters='
[...]
add mc_addr(ha->addr=33:33:00:00:00:01)
add mc_addr(ha->addr=01:00:5e:00:00:01)
add mc_addr(ha->addr=33:33:ff:5c:00:02)
add mc_addr(ha->addr=33:33:ff:a3:44:24)
filters=80020001 00000000 00000000 40000000
$ sudo ethtool -d eth0 | grep 0x047
0x0470 0x80020001
0x0474 0x00000000
0x0478 0x00000000
They look the same.
But a strange thing is that if I delete the ipv6 neighbor on the Linux
box (ip ne del 2a00:800:752:1::5c:1 dev eth0) it suddenly answers an ND
solicitation. I tried it just now and it "wakes it up".
Nothing was written to the kernel log when I ran this command, and the
ethtool -d output is the same afterwards as it was before. So unless
there's another code path that changes the registers when I do "ip ne
del" it may still be something else.
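The by-eye comparison in this message (last filters= line against the ethtool -d dump) can also be scripted, which helps when repeating it before and after a failure. This is only a sketch: it assumes MAC_HASH_REG_0 sits at offset 0x0470 with the other hash registers following in 4-byte steps, as the dump in this message suggests, and the function names are invented for illustration.

```python
MAC_HASH_REG_0 = 0x0470  # assumed offset, inferred from the dump in this thread


def parse_filters(line):
    """Parse a 'filters=XXXXXXXX ...' line printed by the debug patch."""
    return [int(word, 16) for word in line.split("=", 1)[1].split()]


def parse_regdump(text):
    """Parse 'offset value' rows as shown by: ethtool -d eth0 | grep 0x047"""
    regs = {}
    for row in text.splitlines():
        fields = row.split()
        if len(fields) == 2:
            regs[int(fields[0], 16)] = int(fields[1], 16)
    return regs


def compare(filters, regs):
    """Return (mismatched, missing) hash registers relative to the driver log."""
    mismatched, missing = [], []
    for index, programmed in enumerate(filters):
        offset = MAC_HASH_REG_0 + 4 * index
        if offset not in regs:
            missing.append(offset)
        elif regs[offset] != programmed:
            mismatched.append((offset, programmed, regs[offset]))
    return mismatched, missing


programmed = parse_filters("filters=80020001 00000000 00000000 40000000")
dump = parse_regdump("0x0470 0x80020001\n0x0474 0x00000000\n0x0478 0x00000000")
mismatched, missing = compare(programmed, dump)
print("mismatched:", mismatched)
print("not in dump:", [hex(offset) for offset in missing])
```

With the values quoted in this message nothing mismatches, but the fourth register is reported as not in the dump, because only three rows are shown.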
--
Do you have access to any diagnostic software that might have come with your machine? --
I don't know what diagnostic software that would be, nor do other
people here. So "no", I guess.
--
I've continued this a bit off-list but thought I would summarize for the
archives.

Summary
-------
It looks like a firmware issue on the network card. When ILO is enabled it
shares the first network card with the OS. When it does this multicast is
broken. When multicast (at the L2 level) is broken IPv6 neighbor discovery
breaks. Only eth0 breaks, eth1 is unaffected.

System
------
HP Proliant DL320 G5p
Xeon 3GHz
1GB RAM
Arch: amd64
NIC: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
Debian Lenny (5.0.5)
Kernels: 2.6.35 mainline, 2.6.33.6
Config: http://pastebin.com/raw.php?i=Y6S8iKW7

Problem
-------
Buggy box will not answer IPv6 ND or ping to ff02::1. May work at some
point in the boot process, but once box is fully booted it does not.
If I run "clear ipv6 neighbors" on the neighboring Cisco router (or it
times out), that router cannot re-acquire the neighborship with the buggy
box. Instant IPv6 breakage until I do one of:
* Turn on promisc mode long enough for IPv6 ND to do its thing
* ip ne del <address of neighbor> on the buggy host.

Workarounds
-----------
Either one of these will hide the problem:
* Set promisc mode on interface (ip link set promisc on eth0) forever
* Disable ILO
* Use eth1 instead of eth0.

Troubleshooting
---------------
Got patch for kernel from Eric Dumazet (eric.dumazet@gmail.com) to output
what MAC addresses are being subscribed to, and some registers from the
card. Output is earlier in this thread, along with "ethtool -i eth0" and
some other data.

Managed to get diagnostic tool[1] booting from stick (no CD drive in
server), but did not set up memory (himem.sys etc..). Running b57udiag it
therefore failed due to insufficient memory at test "Group D. Driver
Associated tests". Card is assumed to be OK anyway.

Matt Carlson (mcarlson@broadcom.com) suspected firmware bug and asked me
to try disabling ASF and/or IPMI using the diagnostic tool, but running
"setasf -d" and "setipmi -d" ...
Thanks a lot Thomas for this very detailed report ! --
Hi Thomas, So are you running with this set to "Shared Network Port" mode? I'm There was another report on netdev back in 11/2008 on this exact hardware, I dug-up my notes on the problem, and from what I can tell, the receive multicast filters on the NIC were getting removed, causing both incoming IPv6 and IPv4 multicast packets to get dropped. I'm not sure if there was ever a fix developed, or if we ever came to a conclusion on where the bug was - iLO, tg3, or some other area. -Brian --
Sorry for the late reply. I've been swamped.
I can't seem to find it. Do you happen to have the subject line or
Sounds about right. From what I understand the relevant registers were
still the same for me when it wasn't working though (if that indeed is
how the filter is implemented).
--
It was actually a month earlier in 2008, I mis-typed, here's the link: One of the outcomes of that investigation was to update the firmware and/or iLO, I'm not sure if either fixed the problem. -Brian --
Nope, the patch displays mc list and filters bits only if not promiscuous
and not allmulti (normal ethernet mode)
If promiscuous -> a special PROMISC bit is selected on NIC (no display)
If allmulti -> all 128 bits are set (but not displayed in my patch)
I wanted to make sure the correct list of mc addrs is handled on your
machine. It seems to be the case, so there might be a hardware problem
with the multicast rx on this particular NIC
--
I should add that I turned promisc mode back off at about here. Otherwise
--
