I am trying to improve performance and fix a problem with httpd, but it looks like I am hitting a ceiling regardless of whether I test in the lab on an old 850MHz i386 or a new AMD64 at 1.6GHz. Both have more than 2GB of RAM, so memory should not be the issue on either. I can't get past roughly 300 to 325 simultaneous httpd processes before timeouts jump way up. So I am guessing the limit may be in the connection handling of the TCP stack rather than in httpd itself, but I am at a loss as to where to look. I tested on both 4.1 and 3.9 just to see. Where is the OS bottleneck that I could maybe improve here? Please read on for more details, and more can be provided as well.

I need some help, as I even went as far as ordering 4x X4100 with 2x dual-core 2.4GHz processors, 2x 10K SAS drives and 8GB of RAM each (so 4GB per processor), and I am afraid of hitting the same limitations. There isn't any reason I shouldn't be able to pass these limits. I don't have the new Suns yet, maybe a week before they arrive, but I am trying to get ahead on the setup, fix the problem and test it in the lab. It really is a capacity issue, and it looks like throwing more powerful hardware at it will not fix it.

I have:

# sysctl kern.maxproc
kern.maxproc=2048

Both servers also have noatime set on the partition the web files come from, and I even send the httpd logs to >/dev/null to be sure log writing is not slowing things down. I use http_load to test my configuration and changes, but I am not successful at improving things further. It looks like connections are timing out and I can't get more than ~300 processes serving for httpd. Yes, I have also increased the hard limit of 250 and recompiled httpd, and I can start 1500 httpd processes if I want and they do run, but they do not seem to serve traffic and I still get timeouts. Even if I start "StartServers 2500" httpd processes to be sure I don't run out, or that starting additional ones is not the limit here, I can't get more than ...
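For anyone who wants to reproduce the baseline, this is roughly how I check the knobs and drive the load; a minimal sketch only, the URL file path and numbers are just examples, not my real setup:

# kernel limits that matter once you run hundreds of httpd processes
sysctl kern.maxproc kern.maxfiles kern.somaxconn kern.maxclusters

# hammer the server; the file contains one URL per line
http_load -parallel 300 -fetches 2500 -timeout 60 /tmp/urls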
first, are you sure you are testing the server and not the client? second, what happens if you start another web server on port 8080 and test simultaneously?
I will try a different server. For now I use a Sun V120 with nothing else running on it as the client. I will use a beefier one to be sure and report back. Also, PF is not running on either the client or the servers for these tests. I also tried these changes:

net.inet.ip.maxqueue: 300 -> 1000
kern.somaxconn: 128 -> 512

In any case, what I see is that I can't pass 5.8Mb/sec on the old i386 server and 9.0Mb/sec on the HP145 AMD64 one, regardless of whether I use 100 parallel connections or 400. More than 400 really pushes all the numbers down.

No, but I will. I am really looking for any ideas as I am at a loss, and I will use heavier clients to be sure they are not the problem here.
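In case it helps someone follow along, this is how I applied those two; the values are only what I am testing with, not a recommendation:

# apply right away
sysctl net.inet.ip.maxqueue=1000
sysctl kern.somaxconn=512

# and persist across reboots by adding the same lines to /etc/sysctl.conf
net.inet.ip.maxqueue=1000
kern.somaxconn=512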
Yes, confirmed: it's not the client. I just ran it from an IBM e365 with dual-core processors. dmesg is below, but the results for the Sun and the IBM look similar, so no client issue that I can see:

IBM e365 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.33069e+07 bytes, in 19.0603 seconds
5322.74 mean bytes/connection
131.163 fetches/sec, 698146 bytes/sec
msecs/connect: 140.559 mean, 6014.22 max, -7.799 min
msecs/first-response: 919.846 mean, 8114.42 max, -3.572 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.39552e+07 bytes, in 18.2373 seconds
5582.08 mean bytes/connection
137.082 fetches/sec, 765203 bytes/sec
msecs/connect: 814.221 mean, 18006.5 max, -7.838 min
msecs/first-response: 1248.39 mean, 11165.7 max, -3.433 min
HTTP response codes: code 200 -- 2500

Sun V120 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.37375e+07 bytes, in 19.137 seconds
5494.99 mean bytes/connection
130.637 fetches/sec, 717851 bytes/sec
msecs/connect: 232.358 mean, 6005.86 max, 0.439 min
msecs/first-response: 872.213 mean, 10733.2 max, 3.409 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.37627e+07 bytes, in 18.6019 seconds
5505.09 mean bytes/connection
134.395 fetches/sec, 739854 bytes/sec
msecs/connect: 1182 mean, 18013.3 max, 0.502 min
msecs/first-response: 1001.47 mean, 9873.65 max, 3.435 min
HTTP response codes: code 200 -- 2500

http_load client dmesg:

# dmesg
OpenBSD 4.0 (GENERIC.MP) #967: Sat Sep 16 20:38:15 MDT 2006
firstname.lastname@example.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1072672768 (1047532K)
avail mem = 907272192 (886008K)
using 22937 buffers containing 107474944 bytes (104956K) of memory
mainbus0 (root)
bios0 at ...
Just a question - what do you see when trying from localhost? That would eliminate quite a few networking issues, at least.

Joachim

--
TFMotD: factor, primes (6) - factor a number, generate primes
Not that much different. I would even say maybe not as good locally. Plus, I sent another example for two different servers with the test done locally as well. It should show up on marc very soon; it's not there yet.

Local:

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 52 max parallel, 1.42596e+07 bytes, in 20.8623 seconds
5703.82 mean bytes/connection
119.833 fetches/sec, 683507 bytes/sec
msecs/connect: 107.61 mean, 6061.48 max, 1.224 min
msecs/first-response: 39.1055 mean, 6008.52 max, 3.384 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 82 max parallel, 1.35499e+07 bytes, in 20.7909 seconds
5419.97 mean bytes/connection
120.245 fetches/sec, 651724 bytes/sec
msecs/connect: 290.4 mean, 6059.02 max, 1.253 min
msecs/first-response: 33.4435 mean, 6004.2 max, 3.459 min
HTTP response codes: code 200 -- 2500

Remote:

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.34383e+07 bytes, in 18.4801 seconds
5375.32 mean bytes/connection
135.281 fetches/sec, 727177 bytes/sec
msecs/connect: 1016.4 mean, 18012.9 max, 0.406 min
msecs/first-response: 1104.19 mean, 10505.5 max, 3.455 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.36846e+07 bytes, in 23.4292 seconds
5473.85 mean bytes/connection
106.704 fetches/sec, 584083 bytes/sec
msecs/connect: 391.978 mean, 6006.38 max, 0.486 min
msecs/first-response: 742.048 mean, 10497.9 max, 3.403 min
HTTP response codes: code 200 -- 2500
Even run locally, the numbers don't look much better. Even in this case it looks like it can't reach the requested number of parallel connections:

old i386:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 94 max parallel, 1.37816e+07 bytes, in 20.7814 seconds
5512.65 mean bytes/connection
120.3 fetches/sec, 663172 bytes/sec
msecs/connect: 326.667 mean, 6062.79 max, 1.248 min
msecs/first-response: 36.5991 mean, 6071.86 max, 3.419 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 90 max parallel, 1.38708e+07 bytes, in 20.9679 seconds
5548.31 mean bytes/connection
119.23 fetches/sec, 661525 bytes/sec
msecs/connect: 346.224 mean, 6130.06 max, 1.228 min
msecs/first-response: 43.7965 mean, 6055.29 max, 3.392 min
HTTP response codes: code 200 -- 2500

new amd64:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 64 max parallel, 1.33453e+07 bytes, in 14.2911 seconds
5338.11 mean bytes/connection
174.934 fetches/sec, 933819 bytes/sec
msecs/connect: 107.002 mean, 6016.89 max, 0.802 min
msecs/first-response: 19.2824 mean, 512.538 max, 1.706 min
HTTP response codes: code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds
5495.84 mean bytes/connection
176.291 fetches/sec, 968869 bytes/sec
msecs/connect: 106.943 mean, 6022.11 max, -8.932 min
msecs/first-response: 21.5082 mean, 3041.49 max, 1.716 min
HTTP response codes: code 200 -- 2500
Daniel, Maybe I am about to say something really stupid, but ok, here I go: are you testing from one location only? Maybe that host is the bottleneck itself. Wijnand
Nothing is stupid for me right now; I am looking for any ideas that can help. Even if it looks stupid, I am willing to test it. As for the test setup, all servers and clients are connected directly to the same Cisco switch.
I meant the client being the bottleneck ;-) Sorry for not being clear. Wijnand
Nope. I sent updates on that too, with a more powerful client. I am also doing tests now with three clients at once: I can get a few more processes running on the server side, but still no more throughput out of that server. It is capped somehow and I am not sure what is doing it yet.
I'm new at this, so please ignore this if it's not helpful. Is this a bandwidth (hardware) limitation of the computer itself? If so, a faster processor won't help. Bus contention?

Doug.
It could always be a possibility, but if you take the data sent and the time spent sending it, you can see that across all the tests one server looks capped at around 5.8Mb/sec and the other at 9.0Mb/sec. Those numbers are way too low to be a bus problem. Even drive speed: drives these days can surely push data a lot faster than this. I am trying so many different things without success so far, but I am sure there has to be something I am overlooking here. It doesn't make sense to me that a box would be capped at that level; I don't believe it anyway. On the other hand, I am running out of ideas to check, and Google doesn't give me much more to try that I haven't done already. I am sure Henning can get more out of his servers than this, but to be honest I am not sure how he does it.
Look at the memory usage. 300 httpd processes could take up 3000M easily, especially with stuff like php. In that case the machine starts swapping and you hit the roof. As a general rule, do not allow more httpd processes than your machine can handle without swapping. Also, a long KeepAliveTimeout can work against you by holding slots.

	-Otto
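P.S. A quick way to check is to look at the resident size of the httpd processes themselves, something along these lines (just a sketch, the columns that matter for you may differ):

# resident and virtual size of every httpd process, biggest first
ps -ax -o rss,vsz,command | grep httpd | sort -rn | head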
Thanks Otto. I am still doing tests and tweaks, but as far as swap goes, I checked that, and the same for keep-alive in httpd.conf. I even changed these:

net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30

For testing only. I am not saying the values above are any good, but I am testing multiple things and reading a lot about sysctl and what each one does. KeepAliveTimeout is at 5 seconds. No swapping is happening, even with 1000 httpd running:

load averages: 123.63, 39.74, 63.3285 01:26:47
1064 processes: 1063 idle, 1 on processor
CPU states: 0.8% user, 0.0% nice, 3.1% system, 0.8% interrupt, 95.4% idle
Memory: Real: 648M/1293M act/tot Free: 711M Swap: 0K/4096M used/tot
These parameters do not have a lot to do with what you are seeing. I was talking about the KeepAliveTimeout of apache. By default it's 15s. With a long timeout, any process that has served a request will wait 15s to see if the client issues more requests on the same connection before it becomes available to serve other requests. For
Here are more tests, with results that repeat every time. I increase the number of concurrent connections by only 5, from 305 to 310, and I get a response 3 times slower for exactly the same workload, every time. Very consistent, and from different clients as well. You can do any variation from 10 to 300 connections and you will always get the same results, or very close to them; see the end for proof. So I know I am hitting a hard limit someplace, but I can't find where.

Note that I use a difference of 5 here, but I can reproduce the results almost every time just by increasing the number of connections by 1. From 307 to 308, 75% of the time I get the same results as below, meaning sometimes it's 6.7 seconds for the same transfer and other times it's 18.1 seconds. See below. Always the same transfer size, always the same number of requests, always 100% success, but 3x slower. Also, if I keep increasing it, I start to get dropped replies as well, etc. So far I have played with 26 different sysctl settings that might affect this, based on various possibilities, the man pages and Google, and I can improve it some, but not to the point of being able to use 500 connections or more, for example. What is it that limits the number of connections that badly and that hard?

===================
305 parallel

# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds
13098 mean bytes/connection
74.4481 fetches/sec, 975121 bytes/sec
msecs/connect: 1813.57 mean, 6007.53 max, 0.418 min
msecs/first-response: 509.309 mean, 1685.92 max, 3.606 min
HTTP response codes: code 200 -- 500

# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.8586 seconds
13098 mean bytes/connection
72.9012 fetches/sec, 954860 bytes/sec
msecs/connect: 1957.35 mean, 6007.17 max, 0.445 min
msecs/first-response: 485.676 mean, 1559.27 max, ...
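For what it's worth, this is the kind of sweep I run to find the knee; a simple sh loop, and the URL file and step values are just the ones I happened to use:

#!/bin/sh
# step the parallel connection count and watch where the response time jumps
for p in 300 305 310 315 320; do
    echo "=== parallel $p ==="
    http_load -parallel $p -fetches 500 -timeout 30 /tmp/test
done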
You've assumed that Apache is the bottleneck, but perhaps your benchmark tool could be limited in some way. I suggest you try with apache benchmark or some other tool just to verify the results. Apache (especially in the prefork model) is known to have concurrency issues. I doubt that there are knobs you can twist OpenBSD-wise that will compensate for Apache and somehow magically make it scale.
Actually, I found a few things that fix it tonight. I spent the last 24 hours reading like crazy and all night testing and reading more. I can now have two clients using 1000 parallel connections each against one i386 850MHz server (the old one I was testing with) and it handles all of that without a problem now. No delay, and I could push it further, but I figure at 2000 parallel connections I have some breathing room now. I will send the results soon. All of it was done in sysctl.conf only.

Now, I am still getting some drops, not many, but some, when I put pf into action. So that will be the next step I guess, but not now. I need some sleep.

Thanks

Daniel
I've spent considerable time tuning apache on OpenBSD to consume all available resources. Here are the relevant httpd.conf sections:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 5000
KeepAliveTimeout 15
MinSpareServers 20
MaxSpareServers 30
StartServers 50
MaxClients 5000
MaxRequestsPerChild 0

I had statically compiled php into my httpd binary and obviously raised HARD_LIMIT to 5000, using OpenBSD's apache. This netted me the ability to serve a max of about 3000 requests per second on a 1.6GHz Athlon with 256MB of memory. hth.
Thanks. My configuration is more aggressive than yours, and I can tell you for a fact that the problem and limitations were not in the httpd configuration but in the OS part, in my case anyway. Some of your values I think would/could crash your system, especially:

MaxKeepAliveRequests 5000
MaxClients 5000

I don't think you could actually reach that high. Why? Simply from a memory usage standpoint. That was my next exploration, but it's possible for one apache process to take as much as 11MB:

6035 www 2 0 11M 9392K sleep netcon 0:56 0.00% httpd

Obviously not all processes would use that much. It really depends on the content. If it's small images, and lots of them, then each process uses less memory. But if it serves only big files, then it's possible to use a good amount of memory per process. I don't have that answer here and I am not sure how to come up with some logic for it, but even if each process used only 1MB, 5000 of them would take 5GB of RAM, which is more than what OpenBSD was supporting until not so long ago, so you would start to swap and god knows what happens then.

I use KeepAliveTimeout 5 and I am considering reducing it. If you think about your suggestion here, you have KeepAliveTimeout 15 and then MaxKeepAliveRequests 5000: don't you see the paradox? If your server is really busy, with lots of images on one page for example, then you will have a lot of processes stuck waiting out the KeepAliveTimeout, so that's most likely why you increased MaxClients to 5000, to compensate for that, but I believe that's wrong. It makes your server use more resources and be slower to react. I use some logic for picking the value: MaxKeepAliveRequests, I think, should be set based on how many additional requests a keep-alive capable browser could possibly issue for one URL. How many? Well, I think it depends on how many elements your web page can have. That's ...
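To make the memory side concrete, here is a rough budget; the per-process sizes are assumptions for illustration only, measure your own with top or ps:

5000 clients x  1 MB/process ~=  5 GB   (already more RAM than these boxes have)
5000 clients x 11 MB/process ~= 55 GB   (hopeless)
 768 clients x  4 MB/process ~=  3 GB   (leaves headroom on a 4 GB machine)

That is why I would size MaxClients from available RAM divided by a measured per-process footprint rather than picking a big round number.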
Hi,

I am passing around my findings on the sysctl.conf configuration that removed the bottleneck I found with httpd, since I couldn't get more than 300 httpd processes without things crapping out badly; above that, the server simply got out of whack. Everything is a default install and the tests are done on an old server; dmesg is at the end in case you are interested. This is on OpenBSD 4.0, and I picked that server just to see what's possible, as it's not a very powerful one. You can also see the iostat and vmstat output with the changes in place. You can certainly see a few page faults, as I am really pushing the server hard, but even then I get decent results and the bottleneck was removed, even with 2000 parallel connections. In that case I had to use two different clients, as http_load only supports up to 1021 parallel connections; so to test past that, I used more than one client to push the server harder. All in all, the results are much better than a few days ago, and it now looks like we get more for the buck: adding more powerful hardware will be put to better use instead of suffering the same limitations.

I also included the values changed in sysctl.conf to arrive at this final setup. I am not saying these values are the best possible choice, but they work well in the test situation, and there are many of them as you will see. Some are very surprising to me, like the change in net.inet.ip.portfirst. Yes, I know, but if I leave it at the default I can't get full success in the tests below; I get timeouts, some errors, and efficiency is not as good. Maybe that's because of the random port range calculations, I can't say, but in any case the effect is there and tested. I try to stay safe in my choices and comments are welcome, but I also have to point out that ALL the values below need to be changed to the new values to get things working well. If even one of them is not at the level below, the test results start to be affected pretty badly at ...
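As an aside, since http_load tops out around 1021 parallel connections per process, this is roughly how I ran the 2000-connection tests, for anyone who wants to repeat it (paths and counts are just the ones I used):

# on client A and client B at the same time, against the same server
http_load -parallel 1000 -fetches 10000 -timeout 60 /tmp/urls

# meanwhile on the server, keep an eye on mbuf clusters and socket states
netstat -m
netstat -an | grep -c ESTABLISHED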
never mind the rest, but these two really make no sense. none.
Do they make no sense in the tests and in improving results, or make no sense to set as such here?

net.inet.ip.redirect=0 is there to disable ICMP routing redirects; otherwise your system could have its routing table misadjusted by an attacker. Wouldn't it be wise to do so? Maybe if PF is turned on there is no reason for this, but with PF on I get drops and need to address that; I haven't pursued that one yet, however.

As for net.bpf.bufsize, looking again at my notes and tests: it's used by the Berkeley Packet Filter (BPF) to maintain an internal kernel buffer for storing packets received off the wire. Yes, in that case it makes sense not to have it here. I redid the tests with the default value and yes, you are right! This one is wrong here. Maybe lack of sleep. (;> Thanks for correcting me!

I also have to revise my statement on the effect of net.inet.ip.portfirst=32768. In a series of new tests it doesn't have the impact I noted in the first test runs, so I would keep that at the default value as well now. Maybe it was when PF was enabled that it had more of an impact, but my notes are not clear on that specific one.

Anything else you see that may be questionable in what I sent? I am doing more tests with different hardware to be sure they are all sane values in the end. Otherwise, many thanks for having taken the time to look it over and give me your feedback on it! I sure appreciate it big time!

Best

Daniel
net.inet.ip.redirect only has an effect if you enable net.inet.ip.forwarding. As you are running a server and not a router, I doubt this is the case. Additionally, net.inet.ip.redirect does not modify

With many short-lived connections you have a lot of sockets in TIME_WAIT. Because you are testing from one host only, you start to hit these entries more and more often, and this often results in a retry from the client. Additionally, by filling all available ports the port allocation algorithm starts to get slower, but that's a problem that you will only see on

I think there are a few knobs that you should reconsider. I will write another mail about that.
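A quick way to see this for yourself is to watch the TIME_WAIT count and the ephemeral port range while a test runs; just an illustration, the numbers will vary:

# how many sockets are currently waiting out TIME_WAIT
netstat -an | grep -c TIME_WAIT

# the ephemeral port range the test clients are burning through
sysctl net.inet.ip.portfirst net.inet.ip.portlast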
More reading in the man pages did the trick on that one, and yes you are

I did test it with a few more hosts and, as stated, the OpenBSD default

That sure would be welcome. I would be curious to see what else, or what differences, you may see. I did lots of tests in different setups, but I am always happy to see improvements. I have my somewhat final version done for now and it looks pretty good, much better than before for sure anyway. Now I can enjoy seeing traffic coming in instead of worrying about complaints. (;> But more improvements and suggestions, with explanations, would be welcome, if only for my own understanding.

Many thanks!

Daniel
As requested a few times in private to make the results available, here is what works for me. I hope this helps some anyway; use what makes sense to you based on your setup, hardware and traffic. The final values in use after testing are set as follows for me, assuming a good amount of memory to allow that many processes to run. I use a minimum of 2GB; some boxes have 4GB.

Recompile httpd with a higher upper limit for processes. I put 2048 to allow more room in the future if needed, but I still want to be safe and limit the processes lower than that. If php is in use, for example, static compilation would improve things, but I chose to keep the system as close to default as possible for many reasons, including maintenance, support and regular upgrades. Your choice may vary.

In fstab
========
The partition holding the files used by the sites has noatime set on it, to avoid updating the last access time for each file. It definitely improves access times a lot under heavy load! The httpd logs could be on their own partition as well, mounted with softdep, to gain some efficiency in log updates on very busy sites.

For httpd.conf
==============
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0

In sysctl.conf
==============
# Below are values added to improve performance of httpd after
# testing with http_load under parallel and rate settings.
kern.maxclusters=12000      # The maximum number of mbuf(9) clusters
                            # that may be allocated.
kern.maxfiles=4096          # The maximum number of open files that
                            # may be open in the system.
kern.maxproc=2048           # The maximum number of simultaneous
                            # processes the system will allow.
kern.seminfo.semmni=1024    # The maximum number of semaphore
                            # identifiers allowed.
kern.seminfo.semmns=4096    # The maximum number of semaphores
                            # allowed in the system.
kern.shminfo.shmall=16384   # The maximum amount of total shared
                            # memory ...
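One practical note on applying these: /etc/sysctl.conf is only read at boot, so when trying a value I set it by hand first and only add it to the file once it has proven itself, e.g.:

# try a value live before committing it to /etc/sysctl.conf
sysctl kern.maxclusters=12000
sysctl kern.maxfiles=4096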
What does netstat -m tell you about the peak usage of clusters? Is it

Is httpd really so slow in accepting sockets that you had to increase this

Are you sure you need to tune the IP fragment queue? You are using TCP, which does PMTU discovery and sets the DF flag by default, so no IP fragmentation should occur.

These values are super aggressive; especially the keepidle and keepintvl values are doubtful for your test. Is your benchmark using SO_KEEPALIVE? I doubt that, and so these two values have no effect and are actually

This is another knob that should not be changed unless you really know what you are doing. The mss calculation uses this value as a safe default that is always accepted. Pushing it up to this value may have unpleasant side effects for people behind IPsec tunnels. The mss used is the max between mssdflt and the MTU of the route to the host minus the IP and TCP header sizes.

If you need to tune the syncache in such extreme ways you should consider adjusting TCP_SYN_HASH_SIZE and leave synbucketlimit as is. The synbucketlimit is there to limit attacks on the hash list by overloading the bucket list. On your system it may be necessary to traverse 420 nodes on a lookup. Honestly, the syncachelimit and synbucketlimit knobs are totally useless. If anything we should allow resizing the hash and calculate both limits from there.
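If you want to see whether the syncache is actually under pressure before touching those knobs, the protocol counters are a reasonable first stop; a sketch only, the exact counter names vary between releases:

# SYN cache activity: entries added, bucket overflows, drops, etc.
netstat -s -p tcp | grep -i syn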
I will do another series of tests in the next few days and be sure of it before putting my foot in my mouth. But at 10000 I was getting drops

Yes, I was doing tests using a few clients and pushing the server with 2000 parallel connections. That was a lab test; in real life I assume half of that should be fine. But I wanted to be safe. So,

With a smaller queue I was getting slower responses and drops. Maybe a

Yes, aggressive I was/am. Keep-alive was/is in use, yes. I will have more to play with in the lab and see if I was too aggressive, as you seem to think I am. The default values give me not-as-good results, however. More tests are needed specifically on this and I will do them. Maybe the defaults are fine; I will see if I can find a way to be more objective

I will review and read more on it. I based my changes on results seen with the setup under heavy load. There is always room for improvement.

Interesting! I will retest with that in mind. I didn't see that explanation in my reading so far. Thanks for this! You are most helpful; this gives me something to research more, and I sure appreciate your time in passing along the information. Looks like a few more days of testing are needed.

Many thanks!

Daniel
You are right again! (;>

# netstat -m
14140 mbufs in use:
        1098 mbufs allocated to data
        12527 mbufs allocated to packet headers
        515 mbufs allocated to socket names and addresses
585/694/4096 mbuf clusters in use (current/peak/max)
4976 Kbytes allocated to network (94% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

I was not looking in the right place. Back to the default value. Thanks for the help!

Daniel
How does this server do with 1000 non-httpd processes running? Perhaps I need a newer Nemeth et al., but my 3rd edition, pg 759, middle of the page, says "Modern systems do not deal well with load averages over about 6.0". Could your bottleneck be in context switching between so many processes? With so many, the memory cache will be faulting during the context switches and data will have to be retrieved from main memory. I don't think such slow-downs appear in top, and I don't know about vmstat. I don't know if there's a tool to measure this on i386. I've never run httpd, but it looks to me like a massively parallelized problem where each connection is trivial to serve (hence low CPU usage, no disk-io waiting) but there are just so many of them. How does the server do with other connection services, e.g. pop or ftp?

Doug.
Be careful when reading these numbers. Don't forget that I am doing this in the lab, with abuse, etc.; I am trying to push the server as much as I can here. In production I do see some servers reaching 10, 18 and at times I have seen up to 25, but all of those were extreme cases; most of the time it's below 10.

I can't answer this question with proper knowledge, as I don't pretend to know the answer. Maybe someone else can speak knowingly

Wasn't. However, yes there is, and I can see faulting. I check both vmstat and iostat to see what's up. Obviously the numbers are higher on older hardware as it runs out of horsepower. But the problem was being able to handle more than 300 parallel connections, and why it got three times slower when only two more were added. So no, I don't think the context switching had anything to do with it here. You will see when I post the changes I made and the tests I did. Some are

On multi-core and multi-processor hardware with proper memory, it

I only run one application per server, always did and most likely always will. So, any mail server is only a mail server, and a web server is only a web server here. Even the DNS boxes only run DNS, etc.
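For reference, this is roughly what I run to watch it live while a test is going; just a sketch of my habit, the flags are simply how I invoke the tools here:

# one-second samples: under "faults", in = interrupts, sy = syscalls, cs = context switches
vmstat -w 1 -c 10
iostat -w 1 -c 10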