Hello,

PROBLEM: transfer speed is only half if a queue is defined in pf.conf, although the queue is 950Mbit (1000Mbit - 5%).

pf disabled: 768 Mbits/sec
pf enabled, queue 950Mbit: 337 Mbits/sec

ANALYSIS:
- OpenBSD 4.8 default installation.
- Test made between OpenBSD 4.8 and Debian Linux (between two Debian systems the speed is more than 900Mbit/s).

*********************************************************
LAN interface: Intel PRO/1000 PT Desktop Adapter (PCIe, model: EXPI9300PTBLK)
DMESG:
em0 at pci1 dev 0 function 0 "Intel PRO/1000 PT (82572EI)" rev 0x06: apic 1 int 16 (irq 5), address 00:1b:21:05:1f:39
*********************************************************
Default settings of TCP window size:
net.inet.tcp.recvspace=16384
net.inet.tcp.sendspace=16384
*********************************************************

1a) pf disabled
root@router-test (/root)# iperf -i 1 -t 3 -c 10.0.0.6
------------------------------------------------------------
Client connecting to 10.0.0.6, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.8 port 27600 connected with 10.0.0.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 54.7 MBytes 459 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 1.0- 2.0 sec 54.7 MBytes 458 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 2.0- 3.0 sec 54.7 MBytes 459 Mbits/sec

1b) pf enabled, no queue
root@router-test (/root)# iperf -i 1 -t 3 -c 10.0.0.6
------------------------------------------------------------
Client connecting to 10.0.0.6, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.8 port 46912 connected with 10.0.0.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 53.9 MBytes 452 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 1.0- 2.0 sec 52.6 MBytes 441 Mbits/sec
[ ID] ...
The default length for a queue is 50 packets - this only allows you to queue
around 75,000 bytes, and the burstiness of TCP slow-start is likely to well
exceed this in your configuration (due to the BDP). I'd suggest increasing
the queue length - also run 'pfctl -vvs queue' or 'systat queue' and see
what's happening with regard to packet drops.
--
"Stop assuming that systems are secure unless demonstrated insecure;
start assuming that systems are insecure unless designed securely."
- Bruce Schneier
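
The "50 packets is around 75,000 bytes" estimate in the reply above can be reproduced with a little arithmetic. The sketch below is only an illustration: the 1500-byte packet size and the 1 ms round-trip time are assumptions chosen for the example, not values measured in this thread.

```python
# Rough arithmetic behind the "50 packets ~ 75,000 bytes" estimate.
# MTU_BYTES and the RTT used below are illustrative assumptions,
# not values measured in this thread.

MTU_BYTES = 1500  # assumed full-size Ethernet payload


def queue_capacity_bytes(qlimit_packets: int, packet_bytes: int = MTU_BYTES) -> int:
    """Bytes a queue can hold if every slot carries a full-size packet."""
    return qlimit_packets * packet_bytes


def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return bandwidth_bits_per_s * rtt_s / 8


if __name__ == "__main__":
    print(queue_capacity_bytes(50))    # default qlimit
    print(queue_capacity_bytes(500))   # a larger qlimit
    print(bdp_bytes(950e6, 0.001))     # hypothetical 1 ms RTT
```

If the bandwidth-delay product for the real RTT is larger than the queue capacity, a burst can overflow the queue, which is why checking the drop counters is the natural next step.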
root@router-test (/root)# systat queue
QUEUE     BW     SCH  PRIO  PKTS     BYTES     DROP_P  DROP_B  QLEN  BORROW  SUSPEN  P/S    B/S
root_em0  1000M  cbq  0     1947967  2879364K  0       0       0     0       0       29412  44525K
q_lan     950M   cbq        1947967  2879364K  0       0       0     0       0       29412  44525K
root@router-test (/root)# pfctl -vvs queue
queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {q_lan}
  [ pkts: 4793481 bytes: 7256036778 dropped pkts: 0 bytes: 0 ]
  [ qlength: 0/ 50 borrows: 0 suspends: 0 ]
  [ measured: 29385.4 packets/s, 355.86Mb/s ]
queue q_lan on em0 bandwidth 950Mb cbq( default )
  [ pkts: 4793481 bytes: 7256036778 dropped pkts: 0 bytes: 0 ]
  [ qlength: 0/ 50 borrows: 0 suspends: 0 ]
  [ measured: 29385.4 packets/s, 355.86Mb/s ]
best regards,
Robert Lewandowski
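
As a sanity check on the pfctl counters above, the "measured" packet rate and bit rate can be combined into an average packet size. This is a minimal sketch that assumes pfctl's "Mb/s" means 10^6 bits per second; the function name is made up for the example.

```python
def avg_packet_bytes(bits_per_s: float, packets_per_s: float) -> float:
    """Average packet size implied by a measured bit rate and packet rate."""
    return bits_per_s / 8 / packets_per_s


if __name__ == "__main__":
    # "measured: 29385.4 packets/s, 355.86Mb/s" from the pfctl output
    print(round(avg_packet_bytes(355.86e6, 29385.4), 1))
```

The result is a little over 1500 bytes per packet, which is consistent with full-size Ethernet frames. Under that assumption the queue is carrying large packets with no drops, so neither small-packet overhead nor queue overflow looks like the limiting factor.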
If I am reading it right, no packets are dropped.

Changing values like:
kern.somaxconn
net.inet.ip.maxqueue
net.bpf.bufsize
net.bpf.maxbufsize
net.inet.ipcomp.enable
net.inet.tcp.ackonpush
net.inet.tcp.ecn
does not help either. They only have some influence on network speed with PF disabled. With PF enabled the speed is always around 350mbit/s :((

So, any new ideas about debugging the problem, or a possible solution?

best regards,
Robert Lewandowski
OK, I set qlimit to 200 and then to 500; no difference.

queue root_em0 on em0 bandwidth 1Gb priority 0 qlimit 500 cbq( wrr root ) {q_lan}
  [ pkts: 858820 bytes: 1300055838 dropped pkts: 0 bytes: 0 ]
  [ qlength: 0/500 borrows: 0 suspends: 0 ]
  [ measured: 29222.9 packets/s, 353.90Mb/s ]
queue q_lan on em0 bandwidth 950Mb qlimit 500 cbq( borrow default )
  [ pkts: 858820 bytes: 1300055838 dropped pkts: 0 bytes: 0 ]
  [ qlength: 0/500 borrows: 0 suspends: 0 ]
  [ measured: 29222.9 packets/s, 353.90Mb/s ]
best regards,
RLW
What does CPU usage look like when this is happening? Are there any other resources that appear to be constrained?

J
Thanks for all the answers, but the problem still exists.
To sum up:
OpenBSD 4.8 default install
cpu0: Intel(R) Celeron(R) CPU 2.80GHz ("GenuineIntel" 686-class) 2.80 GHz
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
em0 at pci1 dev 0 function 0 "Intel PRO/1000 PT (82572EI)" rev 0x06:
apic 1 int 16 (irq 5), address XX:XX:XX:XX:XX:XX
1.
- tcp send and receive window @ default
- pf disabled
- transfer speed on em0 tested by iperf: 458 Mbits/sec
- top shows:
load averages: 0.90, 0.42, 0.18
25 processes: 1 running, 23 idle, 1 on processor
CPU states: 0.6% user, 0.0% nice, 51.5% system, 33.5% interrupt, 14.4%
idle
Memory: Real: 9164K/43M act/tot Free: 442M Swap: 0K/759M used/tot
2.
- tcp send and receive window: 131072
- pf disabled
- transfer speed on em0 tested by iperf: 767 Mbits/sec
- top shows:
load averages: 0.85, 0.56, 0.29
25 processes: 1 running, 23 idle, 1 on processor
CPU states: 1.6% user, 0.0% nice, 71.5% system, 26.9% interrupt, 0.0%
idle
Memory: Real: 9172K/43M act/tot Free: 442M Swap: 0K/759M used/tot
3.
- tcp send and receive window: 131072
- pf enabled, no queue
- transfer speed on em0 tested by iperf: 677 Mbits/sec
- top shows:
load averages: 0.84, 0.58, 0.38
25 processes: 1 running, 23 idle, 1 on processor
CPU states: 1.4% user, 0.0% nice, 70.1% system, 28.5% interrupt, 0.0%
idle
Memory: Real: 9184K/44M act/tot Free: 441M Swap: 0K/759M used/tot
4.
- tcp send and receive window: 131072
- pf enabled
- to default pf.conf added (as Joel Sing suggested qlimit changed from
50 to 500):
altq on em0 cbq bandwidth 1Gb qlimit 500 queue { q_lan }
queue q_lan bandwidth 950Mb qlimit 500 cbq (default)
- transfer speed on em0 tested by iperf: 337 Mbits/sec
- top shows:
load averages: 0.94, 0.68, 0.45
25 processes: 1 running, 23 idle, 1 on processor
CPU states: 0.6% user, 0.0% nice, 79.4% system, 20.0% interrupt, 0.0%
idle
Memory: Real: 9184K/44M act/tot Free: 441M Swap: ...

Hello,

I see that while I am testing network speed with iperf, 100% CPU is being used, but is that normal for a default install of OpenBSD 4.8 with the default pf.conf??

I have a second computer exactly like that one (IBM ThinkCentre A51P), on which I am running these tests, but with a P4 3GHz CPU with 2MB cache (not a Celeron 2.8), and the same thing happens (100% CPU).

The LAN interface is an Intel PRO/1000 PT Desktop Adapter (PCIe, model: EXPI9300PTBLK), and this is the only PCIe adapter in the computer (maybe the integrated Broadcom NIC is also PCIe, but it is not used).

So the conclusion might be:
- there is a problem with my Intel NIC model/chipset
- there is a problem with the em driver
- there is a problem with my hardware (I need a server motherboard with PCIe and PCI 64-bit 66MHz)
- I need a faster CPU than a P4 3GHz ??

------------
best regards,
RLW
root@router-test (/root)# vmstat -i
interrupt          total     rate
irq0/clock         4836013   100
irq83/em0          2771337   57
irq83/bge0         1         0
irq81/pciide0      4790      0
irq85/ichiic0      172044    3
Total              7784185   160

http://erydium.pl/upload/vmstat.gif
http://erydium.pl/upload/systat.gif

----
best regards,
RLW
or
- there is a problem with iperf :)

How about measuring with something else? Did you try tcpbench? Or something even simpler, like scp-ing from /dev/null to /dev/null? With pf and queues enabled you can monitor the B/S rate.
There is no tcpbench in packages for 4.8 or for Debian Linux.

1. transferring a file by scp from router-test to the Linux machine:
transfer speed: 16.1MB/s ~ 128.8 Mbits/s

root@router-test (/root)# top
load averages: 1.59, 1.05, 0.67
28 processes: 2 running, 25 idle, 1 on processor
CPU states: 33.3% user, 0.0% nice, 47.9% system, 14.2% interrupt, 4.6% idle
Memory: Real: 11M/80M act/tot Free: 405M Swap: 0K/759M used/tot

root@router-test (/root)# systat queue
2 users Load 1.49 0.99 0.74 Thu Nov 18 16:27:25 2010
QUEUE     BW     SCH  PR  PKTS  BYTES  DROP_P  DROP_B  QLEN  BORR  SUSP  P/S  B/S
root_em0  1000M  cbq  0   10M   14G    0       0       0     0     0     122  17M
q_lan     950M   cbq      10M   14G    0       0       0     0     0     122  17M

2. transferring the file back from the Linux machine to router-test:
transfer speed: 19.9MB/s ~ 159.2 Mbits/s

root@router-test (/root)# top
load averages: 1.13, 0.95, 0.69
25 processes: 1 running, 23 idle, 1 on processor
CPU states: 40.1% user, 0.0% nice, 33.5% system, 26.3% interrupt, 0.0% idle
Memory: Real: 11M/80M act/tot Free: 405M Swap: 0K/759M used/tot

3. as a comparison, transfer speed between two Debian boxes:
- tested by iperf: 940 Mbits/sec
- transferring a file by scp: 42.6MB/s ~ 340.8 Mbits/s

top:
Tasks: 81 total, 1 running, 80 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.8%us, 13.8%sy, 0.0%ni, 79.0%id, 0.0%wa, 1.0%hi, 4.3%si, 0.0%st
Mem: 1028836k total, 924868k used, 103968k free, 23800k buffers
Swap: 1951856k total, 51524k used, 1900332k free, 617788k cached

----
best regards
RLW
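
The scp figures above are reported in megabytes per second, while iperf reports megabits per second. The conversion used in the message is simply a factor of 8, as this small sketch shows (the function name is made up for the example):

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert a rate in MB/s to Mbit/s (8 bits per byte)."""
    return mbytes_per_s * 8


if __name__ == "__main__":
    # scp rates quoted in the message: to Linux, from Linux, Debian to Debian
    for rate in (16.1, 19.9, 42.6):
        print(rate, "MB/s ~", round(mbytes_to_mbits(rate), 1), "Mbits/s")
```

Note that scp also spends CPU time on encryption (visible as the higher "user" percentage in top), so its rates are not directly comparable with the iperf numbers.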
Because it's in base: /usr/bin/tcpbench

ciao,
david
I removed Intel NIC and run test on broadcom integrated Gbit NIC to see
if there is problem with em driver
bge0 at pci1 dev 11 function 0 "Broadcom BCM5705K" rev 0x03, BCM5705 A3
(0x3003): apic 1 int 16 (irq 5), address XX:XX:XX:XX:XX:XX
brgphy0 at bge0 phy 1: BCM5705 10/100/1000baseT PHY, rev. 2
1.
pf enabled, queue 950mbit, qlimit 500
iperf test: 410 Mbits/sec
root@router-test (/root)# top
load averages: 0.95, 0.53, 0.26
23 processes: 1 running, 21 idle, 1 on processor
CPU states: 1.2% user, 0.0% nice, 84.4% system, 14.4% interrupt, 0.0%
idle
Memory: Real: 8972K/42M act/tot Free: 443M Swap: 0K/759M used/tot
2. test made between two OpenBSD 4.8 boxes (there is no tcpbench for debian)
transfers by tcpbench:
Conn: 1 Mbps: 399.972 Peak Mbps: 406.093 Avg Mbps: 399.972
133996 45932008 370.419 100.00%
Conn: 1 Mbps: 370.419 Peak Mbps: 406.093 Avg Mbps: 370.419
134999 46833528 373.920 100.00%
Conn: 1 Mbps: 373.920 Peak Mbps: 406.093 Avg Mbps: 373.920
136074 43531224 323.953 100.00%
Conn: 1 Mbps: 323.953 Peak Mbps: 406.093 Avg Mbps: 323.953
137002 41013960 353.950 100.00%
Conn: 1 Mbps: 353.950 Peak Mbps: 406.093 Avg Mbps: 353.950
137996 50500448 406.442 100.00%
Conn: 1 Mbps: 406.442 Peak Mbps: 406.442 Avg Mbps: 406.442
root@router-test (/root)# top (while running tcpbench)
load averages: 1.26, 0.80, 0.49
22 processes: 1 running, 20 idle, 1 on processor
CPU states: 0.0% user, 0.0% nice, 77.2% system, 15.6% interrupt, 7.2%
idle
Memory: Real: 8752K/43M act/tot Free: 442M Swap: 0K/759M used/tot
root@router-test (/root)# systat queue (while running tcpbench)
2 users Load 0.82 0.69 0.51 Thu Nov 18 17:13:10 2010
QUEUE BW SCH PR PKTS BYTES DROP_P DROP_B QLEN BORR SUSP
P/S B/S
root_bge0 ...

No, the problem is altq. Altq(4) was written when 100Mbps was common and people shaped traffic in the low megabit range. It seems to hit a wall when doing hundreds of megabits. I guess someone needs to run a profiling kernel, see where all that time is spent, and then optimize altq(4).
It's nice to hear from an OpenBSD developer on this matter. I am wondering who is going to be that "someone"? ;) And when could it happen?

Claudio, can you add this problem as a bug to fix, maybe in the next release?

----
best regards,
RLW
The "someone" running a profiling kernel to identify the hot spots could be you.

cd /sys/arch/<arch>/config
config -p <kernelname>
build a kernel from the ../compile/<kernelname>.PROF directory in the usual way
kgmon -b to start profiling
(generate some traffic)
kgmon -h to stop profiling
kgmon -p to dump stats
gprof /bsd gmon.out to read stats...

Assuming you're interested in routed traffic (rather than queuing traffic generated on a box itself), make sure you run the traffic source and sink on other machines routing through the altq box; don't source/sink traffic on the altq box itself.
Sorry to the other developers for not recognizing them.

I am interested in getting network traffic on a gbit LAN of around 800-900mbit/s, not 350mbit/s, with altq, so it's rather the option you mentioned in brackets ;)

I am a rather standard OpenBSD user, and to be honest I think it would be faster and simpler if I just gave some OpenBSD developer access to this box.

----
best regards,
RLW
Hi,

Could somebody include this in the FAQ? I found Daniel Hartmeier's personal page, which shows how to get a stack trace and line numbers. I know that the stack trace info is included somewhere on openbsd.org, but the way Daniel presented it was much better. That particular presentation, or a link to it, should be in the FAQ too. They would be really useful.
Like this one: http://www.openbsd.org/faq/faq2.html#Bugs ? There is an example at the end.
Hello again ;)
I finally had time to do kernel profiling.
So we have:
- default OpenBSD 4.8 install
- em0 nic (at pci express slot)
- default sysctl
- definition of queue in pf.conf:
altq on em0 cbq bandwidth 1Gb queue { q_lan }
queue q_lan bandwidth 950Mb cbq (default)
- low speed between Linux Debian box (as iperf server) and OpenBSD box
(as iperf client):
[ ID] Interval Transfer Bandwidth
[ 3] 41.0-42.0 sec 17.1 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 42.0-43.0 sec 17.2 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 43.0-44.0 sec 17.1 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 44.0-45.0 sec 17.2 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 45.0-46.0 sec 17.2 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 46.0-47.0 sec 17.1 MBytes 143 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 47.0-48.0 sec 17.2 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 48.0-49.0 sec 17.1 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 49.0-50.0 sec 17.1 MBytes 144 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-50.0 sec 858 MBytes 144 Mbits/sec
- stats from kernel profiling at:
http://erydium.pl/upload/20101230_profiling.txt
best regards,
RLW
From the profile output:

index %time self descendents called+self name index
[1] 83.9 0.00 359.49 sched_idle [1]

You spent > 80% in idle. So while forwarding all that traffic the box was mostly idle.

Interesting are:
[6] 3.3 0.02 14.26 4028087 acpi_get_timecount [6]
[7] 3.2 0.14 13.66 3854116 binuptime [7]
I guess these are that high up in the profile because of altq.

These seem to be altq related:
[10] 2.9 0.05 12.22 3234667 cbq_pfattach [10]
[13] 2.2 0.03 9.18 2386574 tbr_dequeue [13]
[14] 2.1 0.38 8.71 2386574 rmc_dequeue_next [14]

Now this profile shows one thing: the problem is not CPU bound; rather, it seems that the TBR (token bucket regulator) is the problem. What seems to happen is that the TBR runs low and returns NULL, and then requires a timeout to fire to move the packets.
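
To make the suspected mechanism concrete, here is a toy model of a token bucket whose drained state is only retried on timer ticks. It is not the altq(4) implementation, just a crude estimate under stated assumptions: the sender can emit roughly one bucketful of data per tick, the bucket depth of 40000 bytes is a made-up value, and the tick rate of 100 per second is taken from the clock interrupt rate shown in the vmstat -i output earlier in the thread.

```python
def tbr_ceiling_bits_per_s(rate_bps: float, depth_bytes: int, hz: int) -> float:
    """Crude throughput ceiling for a tick-driven token bucket.

    Toy model: once the bucket is drained, sending resumes only on the next
    timer tick, so at most about one bucketful goes out per tick.
    """
    per_tick_limit_bps = depth_bytes * 8 * hz
    return min(rate_bps, per_tick_limit_bps)


def depth_needed_bytes(rate_bps: float, hz: int) -> float:
    """Bucket depth at which the per-tick limit stops being the bottleneck."""
    return rate_bps / (8 * hz)


if __name__ == "__main__":
    HZ = 100               # ticks per second (clock rate seen in vmstat -i)
    DEPTH_BYTES = 40000    # hypothetical bucket depth, for illustration only
    print(tbr_ceiling_bits_per_s(1e9, DEPTH_BYTES, HZ))
    print(depth_needed_bytes(1e9, HZ))
```

In this model the configured bandwidth is only reachable when the bucket is deep enough to cover a whole tick at line rate; otherwise the link sits idle waiting for the timeout, which would match a profile dominated by idle time rather than by altq code.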
Although the kernel profiling stats show that the system spends 80% of its time in idle, top during the iperf test shows:

load averages: 0.40, 0.16, 0.11 15:43:02
27 processes: 1 running, 25 idle, 1 on processor
CPU states: 0.6% user, 0.0% nice, 77.6% system, 21.8% interrupt, 0.0% idle
Memory: Real: 10M/83M act/tot Free: 402M Swap: 0K/759M used/tot
PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND

I don't know what TBR is, but can I do something about it??

best regards,
RLW
I forget if the profile counts interrupts as coming from idle. If you're mostly forwarding traffic, the idle loop will be the top function on the stack just about always.
Any new ideas about where the problem is and how to fix it?

best regards,
RLW
dmesg missing! Your computer's horsepower will definitely affect the maximum bandwidth pf will be able to manage.
