Hi! I have a box with two external ISDN modems attached to it that acts as an outgoing modem pool to a number of remote located ISDN routers (zyxel P-202H Plus v2). Recently I was given the go-ahead to upgrade this 3.6 box so I swapped the disk with a new one with a fresh install of 4.1 (4.2 cds haven't yet arrived then). Unfortuantely, this upgrade caused the outgoing ppp connections to fail. My initial investigation didn't show me a clear reason for this so I went back to 3.6 and then upgraded version by version, until I found the moment when the unchanged very simple ppp.conf (at the bottom of this mail) didn't work anymore. To cut a long story shirt, something happened between 3.8 and 3.9 that changes the behaviour of the LCP negotiation protocol in my specific case: 3.8: Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: LayerStart Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: SendConfigReq(1) state = Stopped Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: State change Stopped --> Req-Sent Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: RecvConfigRej(1) state = Req-Sent Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: SendConfigReq(2) state = Req-Sent Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: RecvConfigAck(2) state = Req-Sent 3.9: Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: ...
I managed to narrow down this isssue significantly. My hardware setup on this box is a soekris 4801 board + a 4-port serial card by Sunix ( http://www.sunix.com.tw/it/en/Product_Detail.php?cate=2&class_a_id=34&sid=36 1 ), full dmesg at the end. The serial ports are: .... puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo ... pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo pccom0: console pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo ..... OpenBSD 3.6-3.8: modem comms using an ISDN modem over /dev/cua03 to cua06 work OK. OpenBSD 3.9 and later up to -current of a few days ago: using /dev/cua0[3,4] fails, /dev/cua01 works OK. When I use the sunix ports, on the remote end it appears that my ppp process is sending corrupted data mixed with valid data fields in LCP proposals. The fact that the on-board isa-attached pccom1 works shows that userland ppp is not at fault for this breakage. So far everything points at the changes in puc(4) between 3.8 and 3.9. Before I start reverting the (luckily few) changes one by one, does anyone have a better idea? Or a hint on how to debug this better. Thanks, Mitja (and apologies for not submiting a dmesg the first time, but I felt that the original mail was long enough even without it) OpenBSD 4.2-current (GENERIC) #10: Sat Oct 20 15:59:16 CEST 2007 root@test.xxxxxxx.net:/usr/src/sys/arch/i386/compile/GENERIC cpu0: Geode(TM) Integrated Processor by National Semi ("Geode by NSC" 586-class) 267 MHz cpu0: FPU,TSC,MSR,CX8,CMOV,MMX cpu0: TSC disabled real mem = 133787648 (127MB) avail mem = 121565184 (115MB) mainbus0 at root bios0 at mainbus0: AT/286+ BIOS, date 20/70/08, BIOS32 rev. 0 @ 0xf7840 pcibios0 at bios0: rev 2.0 @ 0xf0000/0x10000 pcibios0: pcibios_get_intr_routing - ...
Can you try a more recent snap? It has fifo probe code enabled, and it is interesteing to see if the fifo probe reports a different depth. That commit was done to resolve an issue for a 2 port Sunix card. For this card (4036A) the actual fifo depth is 32 instead of 64. See http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/ic/com_subr.c.diff?r1=1.10& r2=1.11
Good catch! ... puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo pccom3: probed fifo depth: 32 bytes pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo pccom4: probed fifo depth: 32 bytes pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo pccom5: probed fifo depth: 32 bytes pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo pccom6: probed fifo depth: 32 bytes .... And guess what? The ppp modem connection now works! Now, is there a way to add a quirk for this specific card for 4.2? Or should I take my chances and uncomment com_fifo_probe() in the 4.2 source? I would rather not run -current in production at this early stage o
