Re: Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 1:44 pm

I am trying to improve performance and fix a problem with httpd, but it
looks like I am hitting the roof whether I test in the lab on an old
850MHz i386 or a new AMD64 at 1.6GHz. Both have more than 2GB of RAM, and
both show the same issue: I can't get past ~300 to 325 simultaneous
httpd processes before timeouts jump sharply.

So I guess the limit may be in the connection handling of the TCP
stack more than in httpd itself, but I am at a loss as to where to
look. I tested on both 4.1 and 3.9 just to see.

Where are the OS bottlenecks that I might be able to improve here?

Please read on for more details; more can be provided as well.

I need some help, as I even went as far as ordering 4x X4100, each with
2x dual-core 2.4GHz processors, 2x 10K SAS drives and 8GB of RAM (so
4GB per processor), and I am afraid of hitting the same limitations.
There isn't any reason I shouldn't be able to get past these limits.

I don't have the new Suns yet, it may be a week before I have them, but I
am trying to get ahead on the setup to fix my problem and test in the lab.
It really is a capacity issue, and it looks like throwing more powerful
hardware at it will not fix it.

I have:

# sysctl kern.maxproc
kern.maxproc=2048

Both also have noatime set on the partition that the web files come
from, and I even send the httpd logs to /dev/null to be sure it's not
log writing that slows it down.

I use http_load to test my configuration and changes, but I have not
been able to improve it further. It looks like connections are timing out
and I can't get more than ~300 httpd processes serving. Yes, I have
also recompiled httpd to raise the hard limit of 250, and I can start
1500 httpd processes if I want and they do run, but it looks like they
do not serve traffic and I still get timeouts.

Even if I set "StartServers 2500" to be sure I don't run out of httpd
processes, or that starting additional ones is not the limit here, I
can't get more than ...

first, are you sure you are testing the server and not the client?

second, what happens if you start another web server on port 8080 and
test simultaneously?

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 2:47 pm

I will try a different server. For now, I use a Sun V120 with nothing
running on it as the client. I will use a beefier one to be sure and
report back.

Also, PF is not running on either the client or the servers for these
tests.

I also tried these changes:

net.inet.ip.maxqueue=300 -> 1000

and

kern.somaxconn: 128 -> 512

In any case, what I see is that I can't get past 5.8Mb/sec on the old
i386 server and 9.0Mb/sec on the HP145 AMD64 one, regardless of whether
I use 100 parallel connections or 400. More than 400 really brings all
the numbers down.

No, but I will. I am really looking for any ideas as I am at a loss, and
I will use heavier clients to be sure it's not the problem here.

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 3:04 pm

Yes, confirmed, it's not the client. I just ran it from an IBM e365 with
a dual-core processor. Its dmesg is further down, but the results below
for the Sun and the IBM look similar. So, no client issue that I can see:

IBM e365 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.33069e+07 bytes, in 19.0603 seconds
5322.74 mean bytes/connection
131.163 fetches/sec, 698146 bytes/sec
msecs/connect: 140.559 mean, 6014.22 max, -7.799 min
msecs/first-response: 919.846 mean, 8114.42 max, -3.572 min
HTTP response codes:
   code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.39552e+07 bytes, in 18.2373 seconds
5582.08 mean bytes/connection
137.082 fetches/sec, 765203 bytes/sec
msecs/connect: 814.221 mean, 18006.5 max, -7.838 min
msecs/first-response: 1248.39 mean, 11165.7 max, -3.433 min
HTTP response codes:
   code 200 -- 2500


Sun V120 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.37375e+07 bytes, in 19.137 seconds
5494.99 mean bytes/connection
130.637 fetches/sec, 717851 bytes/sec
msecs/connect: 232.358 mean, 6005.86 max, 0.439 min
msecs/first-response: 872.213 mean, 10733.2 max, 3.409 min
HTTP response codes:
   code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.37627e+07 bytes, in 18.6019 seconds
5505.09 mean bytes/connection
134.395 fetches/sec, 739854 bytes/sec
msecs/connect: 1182 mean, 18013.3 max, 0.502 min
msecs/first-response: 1001.47 mean, 9873.65 max, 3.435 min
HTTP response codes:
   code 200 -- 2500


http_load Client dmesg:

# dmesg
OpenBSD 4.0 (GENERIC.MP) #967: Sat Sep 16 20:38:15 MDT 2006
     deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1072672768 (1047532K)
avail mem = 907272192 (886008K)
using 22937 buffers containing 107474944 bytes (104956K) of memory
mainbus0 (root)
bios0 at ...
From: Joachim Schipper
Date: Tuesday, May 8, 2007 - 3:30 pm

Just a question - what do you see when trying from localhost? That
would eliminate quite a few networking issues, at least.

		Joachim

-- 
TFMotD: factor, primes (6) - factor a number, generate primes

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 4:00 pm

Not that much different. I would even say that it may be not as good
locally. Plus I sent another example for two different servers with the
test done locally as well. It should show up on marc very soon; it's not
there yet.

Local:
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 52 max parallel, 1.42596e+07 bytes, in 20.8623 seconds
5703.82 mean bytes/connection
119.833 fetches/sec, 683507 bytes/sec
msecs/connect: 107.61 mean, 6061.48 max, 1.224 min
msecs/first-response: 39.1055 mean, 6008.52 max, 3.384 min
HTTP response codes:
   code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 82 max parallel, 1.35499e+07 bytes, in 20.7909 seconds
5419.97 mean bytes/connection
120.245 fetches/sec, 651724 bytes/sec
msecs/connect: 290.4 mean, 6059.02 max, 1.253 min
msecs/first-response: 33.4435 mean, 6004.2 max, 3.459 min
HTTP response codes:
   code 200 -- 2500

Remote:

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.34383e+07 bytes, in 18.4801 seconds
5375.32 mean bytes/connection
135.281 fetches/sec, 727177 bytes/sec
msecs/connect: 1016.4 mean, 18012.9 max, 0.406 min
msecs/first-response: 1104.19 mean, 10505.5 max, 3.455 min
HTTP response codes:
   code 200 -- 2500
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.36846e+07 bytes, in 23.4292 seconds
5473.85 mean bytes/connection
106.704 fetches/sec, 584083 bytes/sec
msecs/connect: 391.978 mean, 6006.38 max, 0.486 min
msecs/first-response: 742.048 mean, 10497.9 max, 3.403 min
HTTP response codes:
   code 200 -- 2500

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 3:35 pm

Even when run locally, the numbers don't look much better. Even in this
case, it looks like it can't reach the requested number of parallel
connections:

old i386
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 94 max parallel, 1.37816e+07 bytes, in 20.7814 seconds
5512.65 mean bytes/connection
120.3 fetches/sec, 663172 bytes/sec
msecs/connect: 326.667 mean, 6062.79 max, 1.248 min
msecs/first-response: 36.5991 mean, 6071.86 max, 3.419 min
HTTP response codes:
   code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 90 max parallel, 1.38708e+07 bytes, in 20.9679 seconds
5548.31 mean bytes/connection
119.23 fetches/sec, 661525 bytes/sec
msecs/connect: 346.224 mean, 6130.06 max, 1.228 min
msecs/first-response: 43.7965 mean, 6055.29 max, 3.392 min
HTTP response codes:
   code 200 -- 2500


new amd64
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 64 max parallel, 1.33453e+07 bytes, in 14.2911 seconds
5338.11 mean bytes/connection
174.934 fetches/sec, 933819 bytes/sec
msecs/connect: 107.002 mean, 6016.89 max, 0.802 min
msecs/first-response: 19.2824 mean, 512.538 max, 1.706 min
HTTP response codes:
   code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds
5495.84 mean bytes/connection
176.291 fetches/sec, 968869 bytes/sec
msecs/connect: 106.943 mean, 6022.11 max, -8.932 min
msecs/first-response: 21.5082 mean, 3041.49 max, 1.716 min
HTTP response codes:
   code 200 -- 2500
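
For comparing runs like the ones above, the first line of each http_load summary can be pulled apart mechanically. The following is only a minimal sketch: the regular expression is written against the summary lines as pasted in this thread, and the names `SUMMARY_RE` and `parse_summary` are made up for illustration, not part of http_load.

```python
import re

# Pattern written against the summary lines pasted in this thread, e.g.
# "2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds".
# It is an assumption that other http_load versions print the same format.
SUMMARY_RE = re.compile(
    r"(?P<fetches>\d+) fetches, (?P<parallel>\d+) max parallel, "
    r"(?P<bytes>[\d.e+]+) bytes, in (?P<seconds>[\d.]+) seconds"
)


def parse_summary(line: str) -> dict:
    """Extract fetch count, achieved parallelism, bytes and elapsed time."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not an http_load summary line: {line!r}")
    return {
        "fetches": int(match["fetches"]),
        "parallel": int(match["parallel"]),
        "bytes": float(match["bytes"]),
        "seconds": float(match["seconds"]),
    }


# Summary line from the local amd64 run started with "-parallel 400".
run = parse_summary(
    "2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds"
)
requested_parallel = 400  # value passed on the command line for that run
print("achieved parallel:", run["parallel"], "of", requested_parallel)
print("fetches/sec:", run["fetches"] / run["seconds"])
```

The "max parallel" field is the number actually reached, so comparing it with the value given to -parallel shows directly how far a run fell short of the requested concurrency.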

From: Wijnand Wiersma
Date: Tuesday, May 8, 2007 - 2:30 pm

Daniel,

Maybe I am about to say something really stupid, but ok, here I go:
are you testing from one location only? Maybe that host is the
bottleneck itself.

Wijnand

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 2:50 pm

Nothing is stupid for me right now. I am looking for any ideas that can
help. Even if it looks stupid, I am willing to test it.

As for the test setup, all servers and clients are connected directly to
the same Cisco switch.

From: Wijnand Wiersma
Date: Tuesday, May 8, 2007 - 3:52 pm

I meant the client being the bottleneck ;-)
Sorry for not being clear.


Wijnand

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 4:13 pm

Nope. I sent updates on that too, using a more powerful machine. I am
now testing with three clients at once, and I can get a few more
processes running on the server side, but still no more output from
that server.

It is capped somehow and I am not sure yet by what.

From: Douglas Allan Tutty
Date: Tuesday, May 8, 2007 - 6:53 pm

I'm new at this so please ignore if it's not helpful.

Is this a bandwidth (hardware) limitation on the computer itself?  If so
then a faster processor won't help.  Bus contention?

Doug.

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 8:35 pm

Could always be a possibility, but if you take the data sent and the
time spent sending it, you will see that one server caps at around
5.8Mb/sec in all tests and the other one at 9.0Mb/sec. These numbers
are surely way too low to be a bus problem. The same goes for drive
speed: drives these days can deliver data a lot faster than this.
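
As a rough check of figures like these, the throughput can be recomputed from the byte count and elapsed time that http_load prints. This is a minimal sketch; the helper name `megabits_per_second` is made up for illustration, and the sample numbers are copied from one of the remote 400-parallel runs posted earlier in this thread.

```python
def megabits_per_second(total_bytes: float, seconds: float) -> float:
    """Convert total_bytes transferred in seconds into Mb/sec (10^6 bits)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return total_bytes * 8 / seconds / 1_000_000


# From the summary line posted earlier:
# "2500 fetches, 400 max parallel, 1.34383e+07 bytes, in 18.4801 seconds"
rate = megabits_per_second(1.34383e07, 18.4801)
print(f"{rate:.1f} Mb/sec")
```

For that run the result is roughly 5.8 Mb/sec, far below what a bus or a disk of that era would be expected to limit.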

I am trying so many different things without success so far, but I am
sure there has to be something I am overlooking. It doesn't make
sense to me that a server would be capped at that level. I don't believe
it anyway, but on the other hand, I am running out of ideas to check and
Google doesn't give me much more to try that I haven't done already.

I am sure Henning can get more out of his servers than this, but I am
not sure how he does it, to be honest.

From: Otto Moerbeek
Date: Tuesday, May 8, 2007 - 10:10 pm

Look at the memory usage. 300 httpd processes could take up 3000M
easily, especially with stuff like php. In that case, the machine
starts swapping and you hit the roof. As a general rule, do not allow
more httpd processes than your machine can handle without swapping. Also,
a long KeepAliveTimeout can work against you, by holding slots.

	-Otto

From: Daniel Ouellet
Date: Tuesday, May 8, 2007 - 10:30 pm

Thanks Otto,

I am still doing tests and tweaks, but as far as swap goes, I checked
that, and the same for keep alive in httpd.conf. I even changed it in:

net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30

For testing only. I am not saying the values above are any good, but I am
testing multiple things and reading a lot on sysctl and what each one does.

KeepAliveTimeout is at 5 seconds.

No swapping is happening, even with 1000 httpd running.

load averages: 123.63, 39.74, 63.3285                      01:26:47
1064 processes:1063 idle, 1 on processor
CPU states:  0.8% user,  0.0% nice,  3.1% system,  0.8% interrupt, 95.4% idle
Memory: Real: 648M/1293M act/tot  Free: 711M  Swap: 0K/4096M used/tot

From: Otto Moerbeek
Date: Tuesday, May 8, 2007 - 11:15 pm

These parameters do not have a lot to do with what you are seeing.

I was talking about the KeepAliveTimeout of apache. It's by default
15s. With a long timeout, any process that has served a request will
wait 15s to see if the client issues more requests on the same
connection before it becomes available to serve other requests. For


From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 12:55 am

Here are more tests, always with repeatable results.

I increase the number of concurrent connections by only 5, from 305 to
310, and I get a 3 times slower response for exactly the same thing,
every time. Very consistent, and from different clients as well.

You can do any variation from 10 to 300 connections and you will always
get the same results, or very close to it. See the end as well
for proof.

So, I know I am hitting a hard limit someplace, but can't find where.

Note that I use a difference of 5 here, but I can reproduce the results
almost all the time just by increasing the number of connections by 1.
From 307 to 308 I get the same results as below 75% of the time,
meaning sometimes it's 6.7 seconds for the same transfer and other times
18.1 seconds.

See below. Always the same transfer size, always the same number of
requests, always 100% success, but 3x slower.

Also, if I continue to increase it more, then I also start to get dropped
replies, etc.

So far I have played with 26 different sysctl settings that may affect
this, based on various possibilities from the man pages and Google. I
can improve it some, but not to the point of being able to use 500
connections or more, for example.

What is it that limits the number of connections that badly and
that hard?

===================
305 parallel

# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds
13098 mean bytes/connection
74.4481 fetches/sec, 975121 bytes/sec
msecs/connect: 1813.57 mean, 6007.53 max, 0.418 min
msecs/first-response: 509.309 mean, 1685.92 max, 3.606 min
HTTP response codes:
   code 200 -- 500
# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.8586 seconds
13098 mean bytes/connection
72.9012 fetches/sec, 954860 bytes/sec
msecs/connect: 1957.35 mean, 6007.17 max, 0.445 min
msecs/first-response: 485.676 mean, 1559.27 max, ...
From: Srebrenko Sehic
Date: Wednesday, May 9, 2007 - 4:46 am

You've assumed that Apache is the bottleneck, but perhaps your
benchmark tool could be limited in some way. I suggest you try with
apache benchmark or some other tool just to verify the results.

Apache (especially in the prefork model) is known to have concurrency
issues. I doubt that there are knobs you can twist OpenBSD-wise that
will compensate for Apache and somehow magically make it scale.

From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 4:51 am

Actually I have found a few things that fixed it tonight.

I spent the last 24 hours reading like crazy and all night testing and
reading more.

I can now have two clients using 1000 parallel connections to one i386
850MHz server, my old one that I was testing with, and it handles all of
that with no problem now. No delay, and I can even push it more, but I
figure at 2000 parallel connections I should have some breathing room now.

I will send the results soon.

All only in sysctl.conf

Now, I am still seeing some drops, not many, when I put pf into
action. So that would be the next step I guess, but not now. I need
some sleep.

Thanks

Daniel

From: Karsten McMinn
Date: Wednesday, May 9, 2007 - 11:33 am

I've spent considerable time with tuning apache on openbsd to
consume all available resources in OpenBSD. Here's the
relevant httpd.conf sections:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 5000
KeepAliveTimeout 15

MinSpareServers 20
MaxSpareServers 30
StartServers 50
MaxClients 5000
MaxRequestsPerChild 0

I had statically compiled php into my httpd binary and obviously
raised HARD_LIMIT to 5000, using OpenBSD's apache.

This netted me the ability to serve a maximum of about 3000
requests per second on a 1.6GHz athlon with 256MB of memory.

hth.

From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 1:24 pm

Thanks. My configuration is more aggressive than yours and I can tell
you for a fact that the problem and limitations were not in the httpd
configuration, but in the OS part, in my case anyway.

Some of your values, I think, could crash your system. Especially:

MaxKeepAliveRequests 5000
MaxClients 5000

I don't think you could reach that high. Why? Simply from a memory usage
standpoint. That was my next exploration, but it's possible that one
apache process could take as much as 11MB:

  6035 www        2    0   11M 9392K sleep    netcon   0:56  0.00% httpd

Obviously not all processes would use that much. It really depends on
the content. If it's small images and lots of them, then each
process uses less memory. But if it is serving all big files, then it's
possible to use a good amount of memory per process. I don't have
that answer here and I am not sure how to come up with some logic for it,
but even if each process was using only 1MB, then 5000 would give you
5GB of RAM, which is more than what OpenBSD supported until not so
long ago, so you will start to swap and god knows what will happen then.
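
The arithmetic above can be turned around to estimate how many httpd processes fit in RAM before swapping starts. This is a deliberately pessimistic sketch: it ignores pages shared between httpd children, and the function names and the 512MB reserved for the rest of the system are hypothetical values picked for illustration.

```python
def total_memory_mb(processes: int, mb_per_process: float) -> float:
    """Worst-case memory if every process used mb_per_process privately."""
    return processes * mb_per_process


def max_processes(ram_mb: float, reserved_mb: float, mb_per_process: float) -> int:
    """How many processes fit in RAM after setting reserved_mb aside."""
    usable = ram_mb - reserved_mb
    if usable <= 0 or mb_per_process <= 0:
        return 0
    return int(usable // mb_per_process)


# 5000 processes at only 1MB each, as in the paragraph above.
print(total_memory_mb(5000, 1), "MB")

# A 2GB machine with a hypothetical 512MB reserve and the 11MB
# worst-case process size seen in the top output above.
print(max_processes(2048, 512, 11), "processes")
```

Because real httpd children share a good part of their memory, the true ceiling should be higher than this bound; the point is only that MaxClients has to be sized against RAM.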


I use KeepAliveTimeout 5 and I am considering to reduce it.

If you think about your suggestion here, you have KeepAliveTimeout 15
and then MaxKeepAliveRequests 5000. Don't you see the paradox here?

If your server is really busy, with lots of images on one page for
example, then you would have a lot of processes stuck waiting out the
KeepAliveTimeout, so that's most likely why you increased your MaxClients
to 5000 to compensate, but that's wrong I believe. It makes your
server use more resources and be slower to react.
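
The effect of KeepAliveTimeout on the number of busy processes can be sketched with a simple occupancy estimate: processes held at once is roughly the arrival rate of new connections multiplied by how long each connection holds a process. This is a simplified worst-case model (every client leaves its connection idle for the full timeout), and the rate of 50 connections/sec and 0.5 seconds of serving time are hypothetical numbers, not measurements from this thread.

```python
import math


def busy_slots(connections_per_sec: float, serve_seconds: float,
               keepalive_timeout: float) -> int:
    """Estimate httpd processes held at once under the worst-case model
    where each connection keeps its process for the serving time plus
    the full KeepAliveTimeout."""
    hold_time = serve_seconds + keepalive_timeout
    return math.ceil(connections_per_sec * hold_time)


# Hypothetical load: 50 new connections/sec, 0.5s spent serving each.
for timeout in (15, 5):
    print(f"KeepAliveTimeout {timeout}: ~{busy_slots(50, 0.5, timeout)} processes")
```

Under these assumptions, dropping the timeout from 15 to 5 seconds cuts the processes needed from about 775 to about 275, which is the reasoning behind lowering KeepAliveTimeout rather than raising MaxClients.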

I use some logic here to pick the values.

MaxKeepAliveRequests I think should be set based on how many
additional requests a URL could generate from a browser that supports
keep alive and multiple requests at once. How many? Well, I think it's
based on how many elements your web page can have. That's ...
From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 3:41 pm

Hi,

I am passing around my findings on the sysctl.conf configuration that
removed the bottleneck I found with httpd, as I couldn't get more than
300 httpd processes without things crapping out badly; above that, the
server simply went out of whack.

Everything is a default install and the tests are done on an old
server. dmesg is at the end in case you are interested. This is on OpenBSD
4.0, and I picked that server just to see what's possible, as it's not
really a very powerful one.

You can also see the iostat output and the vmstat as well with the 
changes in place.

You can certainly see a few page faults as I am really pushing the server
hard, but even then I get decent results and the bottleneck is removed,
even with 2000 parallel connections. In that case I had to use two
different clients, as http_load only supports up to 1021 parallel
connections, so to test past that, I used more than one client to push
the server harder.

But in all, the results are much better than a few days ago, and now it
looks like we get more bang for the buck and more powerful hardware
will be put to better use instead of suffering the same limitations.

I also include the values changed in sysctl.conf to arrive at this final
setup.

I am not saying the values are the best possible choice, but they work
well in the test situation, and there are many, as you will see. Some are
very surprising to me, like the change in net.inet.ip.portfirst. Yes I
know, but if I leave it at the default, then I can't get full success in
the tests below; I get timeouts and some errors, and efficiency is not as
good. Maybe that's because of the random port range calculations, I can't
say, but in any case, the effect is there and tested.

I tried to stay safe in my choices and comments are welcome, but I have to
point out as well that ALL the values below need to be changed to the
new value for things to work well. If even one of them is not at the
level below, the results in the tests start to be affected pretty badly at ...
From: Ted Unangst
Date: Wednesday, May 9, 2007 - 10:24 pm

never mind the rest, but these two really make no sense.  none.

From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 11:31 pm

Make no sense in the test and improving results, or make no sense in 
setting them as such here?

net.inet.ip.redirect=0

This is to disable ICMP routing redirects. Otherwise, your system could
have its routing table misadjusted by an attacker. Wouldn't it be wise to
do so? Maybe if PF is turned on, then there is no reason for this, but
with PF on, I get drops and need to address that. Didn't pursue it yet as
dead however.

As for net.bpf.bufsize, I am looking again at my notes and tests.
It's used by the Berkeley Packet Filter (BPF) to maintain an internal
kernel buffer for storing packets received off the wire.

Yes, in that case it makes sense not to have that here. I redid the tests
with the default value and yes, you are right! This one is wrong here.
Maybe lack of sleep. (;> Thanks for correcting me!

I also have to revise my statement on the effect of
net.inet.ip.portfirst=32768. In a series of new tests, it doesn't have
the impact noted in the first test runs. So, I would keep it at the
default value as well now. Maybe it had more of an impact when PF was
enabled, but my notes are not clear on that specific one.

Anything else you see that may be questionable in what I sent? I am
doing more tests with different hardware to be sure these are all sane
values in the end.

Otherwise, many thanks for having taken the time to look it over and
give me your feedback on it!

I sure appreciate it big time!

Best

Daniel

From: Claudio Jeker
Date: Thursday, May 10, 2007 - 1:40 am

net.inet.ip.redirect has only an effect if you enable
net.inet.ip.forwarding. As you are running a server and not a router I
doubt this is the case. Additionally net.inet.ip.redirect does not modify

With many short-lived connections you have a lot of sockets in TIME_WAIT.
Because you are testing from one host only, you start to hit these entries
more and more often. This often results in a retry from the client.
Additionally, by filling all available ports, the port allocation algorithm
starts to get slower, but that's a problem that you will only see on

I think there are a few knobs that you should reconsider. I will write
another mail about that.
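
The TIME_WAIT point above can be illustrated with a back-of-the-envelope bound for a benchmark run from a single host. This is a simplified model, not a description of the OpenBSD implementation: it assumes every finished connection pins one client port for a fixed TIME_WAIT period, and both the port count and the 60 second period below are hypothetical values chosen for illustration.

```python
def max_new_connections_per_sec(usable_ports: int,
                                time_wait_seconds: float) -> float:
    """Upper bound on sustained new connections/sec from ONE client IP to
    ONE server IP:port, if each finished connection keeps its client port
    unusable for time_wait_seconds."""
    if usable_ports <= 0 or time_wait_seconds <= 0:
        raise ValueError("ports and TIME_WAIT period must be positive")
    return usable_ports / time_wait_seconds


# Hypothetical figures, not taken from this thread or from OpenBSD defaults.
HYPOTHETICAL_PORTS = 16384
HYPOTHETICAL_TIME_WAIT = 60.0

limit = max_new_connections_per_sec(HYPOTHETICAL_PORTS, HYPOTHETICAL_TIME_WAIT)
print(f"about {limit:.0f} new connections/sec before ports are reused")
```

Spreading the load over several client hosts raises this bound in proportion, which is one reason single-client benchmarks can look worse than real traffic.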


From: Daniel Ouellet
Date: Thursday, May 10, 2007 - 2:29 am

More reading in the man pages did the trick on that one and yes you are 

I did test it with a few more hosts and as stated, the OpenBSD default 

That sure would be welcome. I would be curious to see what else, or
what differences, you may see. I did lots of tests in different setups,
but I am always happy to see improvements.

I have my somewhat final version done for now and it looks pretty good.
Much better than before for sure, anyway. Now I can enjoy seeing traffic
coming in instead of worrying about complaints. (;>

But more improvements and suggestions with explanations would be welcome,
to help my own understanding anyway.

Many thanks!

Daniel

From: Daniel Ouellet
Date: Thursday, May 10, 2007 - 2:18 am

As requested a few times in private, I am making the results available.
Here you go with what works for me. Hope this helps some, anyway.

Use what makes sense to you based on your setup, hardware and traffic.

The final values in use after testing are now set as follows for me,
assuming a good amount of memory to allow so many processes to run. I use
a minimum of 2GB; some have 4GB.

Recompile httpd with a higher process limit. I put 2048 to allow more
room in the future if needed, but I still want to be safe and keep the
process limit lower than that. If php is in use, for example, static
compilation would improve things, but I chose to keep the system as close
as possible to the default for many reasons, including maintenance,
support and regular upgrades. Your choice may vary.

In fstab
========
A partition for the files used by the sites, mounted with noatime
to avoid updating the last access time of each file. Definitely
improves access time a lot under heavy load!

httpd logs could be on their own partition as well, mounted softdep to
gain some efficiency in log updates for very busy sites.

For httpd.conf
==============
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0


In sysctl.conf
==============
# Below are values added to improve performance of httpd after
# testing with http_load under parallel and rate setting.

kern.maxclusters=12000		# The maximum number of mbuf(9) clusters
				# that may be allocated.

kern.maxfiles=4096		# The maximum number of open files that
				# may be open in the system.

kern.maxproc=2048		# The maximum number of simultaneous
				# processes the system will allow.

kern.seminfo.semmni=1024	# The maximum number of semaphore
				# identifiers allowed.

kern.seminfo.semmns=4096	# The maximum number of semaphores
				# allowed in the system.

kern.shminfo.shmall=16384	# The maximum amount of total shared
				# memory ...
From: Claudio Jeker
Date: Thursday, May 10, 2007 - 2:16 am

What does netstat -m tell you about the peak usage of clusters is it

Is httpd really so slow in accepting sockets that you had to increase this


Are you sure you need to tune the IP fragment queue? You are using TCP
which does PMTU discovery and sets the DF flag by default so no IP


These values are super aggressive especially the keepidle and keepintvl
values are doubtful for your test. Is your benchmark using SO_KEEPALIVE? I
doubt that and so these two values have no effect and are actually

This is another knob that should not be changed unless you really know
what you are doing. The mss calculation uses this value as a safe default
that is always accepted. Pushing that up to this value may have unpleasant
side effects for people behind IPSec tunnels. The used mss is the max
between mssdflt and the MTU of the route to the host minus IP and TCP

If you need to tune the syncache in such extreme ways you should consider
adjusting TCP_SYN_HASH_SIZE and leaving synbucketlimit as is. The
synbucketlimit is there to limit attacks on the hash list by overloading
the bucket list. On your system it may be necessary to traverse 420 nodes
on a lookup. Honestly, the syncachelimit and synbucketlimit knobs are
totally useless. If anything we should allow resizing the hash and
calculate both limits from there.


From: Daniel Ouellet
Date: Thursday, May 10, 2007 - 2:55 am

I will do another series of tests in the next few days and be sure of
it before putting my foot in my mouth. But at 10000, I was getting drops 

Yes, I was doing tests using a few clients and pushing the server at 
2000 parallel connections to test with. That was in lab test and in real 
life, I assume that half should be fine. But I wanted to be safe. So, 


With smaller queue I was getting slower responses and drop. May be a 


Yes, aggressive I was/am. Keep Alive was/is in use, yes. I will play
with it more in the lab and see if I was too aggressive, as it looks like
you think I am. The default values do not give me results as good, however.
More tests are needed specifically on this and I will do them. Maybe the
defaults are fine; I will see if I can find a way to be more objective 

I will review and read more on it. I based my changes on results seen
with the setup under heavy load. There is always room for improvement.

Interesting! I will retest with that in mind. Didn't see that
explanation in my reading so far. Thanks for this!

You are most helpful and this gives me something to research more, and I
sure appreciate your time in passing on the information.

Looks like a few more days of testing needed.

Many thanks!

Daniel

From: Daniel Ouellet
Date: Thursday, May 10, 2007 - 3:23 am

You are right again! (;>

# netstat -m
14140 mbufs in use:
         1098 mbufs allocated to data
         12527 mbufs allocated to packet headers
         515 mbufs allocated to socket names and addresses
585/694/4096 mbuf clusters in use (current/peak/max)
4976 Kbytes allocated to network (94% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

I was not looking at the right place. Back to default value.
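
To make the same check repeatable, the cluster line from netstat -m can be parsed and the peak compared against the maximum. This is a minimal sketch written against the output pasted above; the names `CLUSTER_RE` and `cluster_peak_ratio` are made up for illustration, and other netstat versions may format the line differently.

```python
import re

# Matches the line as it appears above:
# "585/694/4096 mbuf clusters in use (current/peak/max)"
CLUSTER_RE = re.compile(r"(\d+)/(\d+)/(\d+) mbuf clusters in use")


def cluster_peak_ratio(netstat_output: str) -> float:
    """Return peak cluster usage as a fraction of the configured maximum."""
    match = CLUSTER_RE.search(netstat_output)
    if match is None:
        raise ValueError("no mbuf cluster line found")
    _current, peak, maximum = map(int, match.groups())
    return peak / maximum


line = "585/694/4096 mbuf clusters in use (current/peak/max)"
ratio = cluster_peak_ratio(line)
print(f"peak cluster usage: {ratio:.1%} of max")
```

With the numbers above the peak is well under a fifth of the maximum, which is why raising the cluster limit made no difference here.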

Thanks for the help!

Daniel

From: Douglas Allan Tutty
Date: Wednesday, May 9, 2007 - 7:52 am

How does this server do with 1000 non-httpd processes running?  Perhaps
I need a newer Nemeth et al, but in my 3rd edition, pg 759 middle of the
page says "Modern systems do not deal well with load averages over
about 6.0".

Could your bottleneck be in context-switching between so many processes?
With so many, the memory cache will be faulting during the context
switching and have to be retrieved from main memory.  I don't think that
such slow-downs appear in top, and I don't know about vmstat.  I don't
know if there's a tool to measure this on i386.

I've never run httpd but it looks to me like a massively parallel
problem where each connection is trivial to serve (hence low CPU usage,
no disk-io waiting) but there are just so many of them.

How does the server do with other connection services, e.g. pop or ftp?

Doug.

From: Daniel Ouellet
Date: Wednesday, May 9, 2007 - 12:39 pm

Be careful when reading these numbers. Don't forget that I am doing
this in the lab with abuse, etc. I am trying to push the server as hard
as I can here. In production, I do see some servers reaching 10, 18, and
I sometimes saw up to 25, but all of these were extreme cases; most of
the time, it's below 10.

I can't answer this question with proper knowledge here as I don't 
pretend to know that answer. May be someone else can speak knowingly 

Wasn't. However, yes there is, and I can see faulting. I check both
vmstat and iostat to see what's up. Obviously the numbers are higher on
older hardware as it runs out of horsepower. But the problem
was being able to handle more than 300 parallel connections, and why it
got 3x slower when only 2 more processes were added. So, no, I don't
think context-switching had anything to do with it here.

You will see when I post the changes I did and the test I did. Some are 

One multi core and multi processor hardware with proper memory, it 

I only run one application per server, always did and most likely
always will. So a mail server is only a mail server, and a web server is
only a web server here. Even the DNS servers only run DNS, etc.
