A recent report on the Linux kernel mailing list (lkml) described improved IO/writeback performance in the newly released 2.6.24-rc1 kernel compared with the earlier 2.6.19.2 and 2.6.22.6 kernels. In Martin Knoblauch's tests, a 16 GB buffered "dd" copy to a local filesystem went from 28 and 50 MB/sec on the older kernels to 96 MB/sec on 2.6.24-rc1. Credit was given to some patches by Peter Zijlstra. Ingo Molnar replied, "wow, really nice results! Peter does know how to make stuff fast :) Now lets pick up some of Peter's other, previously discarded patches as well :-)" He pointed to several patches "as a starter", then quipped, "I think the MM should get out of deep-feature-freeze mode - there's tons of room to improve :-/"
Andrew Morton was more cautious about the numbers: "Those changes seem suspiciously large to me. I wonder if there's less physical IO happening during the timed run, and correspondingly more afterwards." To Ingo's quip he replied, "Kidding. We merge about 265 MM patches in 2.6.24-rc1: 482 files changed, 8071 insertions(+), 5142 deletions(-)". In a follow-up he added, "A lot of that was new functionality. That's easier to add than things which change long-standing functionality." Of the patches Ingo pointed to, Peter noted he was currently working on polishing the swap-over-NFS patch, "will post that one again, soonish.... Esp. after Linus professed liking to have swap over NFS." Rik van Riel also replied regarding rewriting the page replacement code, "at the moment I only have the basic 'plumbing' of the split VM working and am fixing some bugs in that. Expect a patch series with that soon, so you guys can review that code and tell me where to beat it into shape some more :)"
From: Ingo Molnar
Subject: Re: 2.6.24-rc1: First impressions
Date: Oct 26, 8:22 am 2007

* Martin Knoblauch <spamtrap@knobisoft.de> wrote:

> Hi ,
>
> just to give some feedback on 2.6.24-rc1. For some time I am tracking
> IO/writeback problems that hurt system responsiveness big-time. I
> tested Peters stuff together with Fenguangs additions and it looked
> promising. Therefore I was very happy to see Peters stuff going into
> 2.6.24 and waited eagerly for rc1. In short, I am impressed. This
> really looks good. IO throughput is great and I could not reproduce
> the responsiveness problems so far.
>
> Below are a some numbers of my brute-force I/O tests that I can use
> to bring responsiveness down. My platform is a HP/DL380g4, dual CPUs,
> HT-enabled, 8 GB Memory, SmartaArray6i controller with 4x72GB SCSI
> disks as RAID5 (battery protected writeback cahe enabled) and gigabit
> networking (tg3). User space is 64-bit RHEL4.3
>
> I am basically doing copies using "dd" with 1MB blocksize. Local
> Filesystem ist ext2 (noatime). IO-Scheduler is dealine, as it tends
> to give best results. NFS3 Server is a Sun/T2000/Solaris10. The tests
> are:
>
> dd1 - copy 16 GB from /dev/zero to local FS
> dd1-dir - same, but using O_DIRECT for output
> dd2/dd2-dir - copy 2x7.6 GB in parallel from /dev/zero to local FS
> dd3/dd3-dir - copy 3x5.2 GB in parallel from /dev/zero lo local FS
> net1 - copy 5.2 GB from NFS3 share to local FS
> mix3 - copy 3x5.2 GB from /dev/zero to local disk and two NFS3 shares
>
> I did the numbers for 2.6.19.2, 2.6.22.6 and 2.6.24-rc1. All units
> are MB/sec.
>
> test       2.6.19.2     2.6.22.6    2.6.24.-rc1
> ----------------------------------------------------------------
> dd1        28           50          96
> dd1-dir    88           88          86
> dd2        2x16.5       2x11        2x44.5
> dd2-dir    2x44         2x44        2x43
> dd3        3x9.8        3x8.7       3x30
> dd3-dir    3x29.5       3x29.5      3x28.5
> net1       30-33        50-55       37-52
> mix3       17/32        25/50       96/35  (disk/combined-network)

wow, really nice results! Peter does know how to make stuff fast :) Now
lets pick up some of Peter's other, previously discarded patches as well
:-)

Such as the rewritten reclaim (clockpro) patches:

  http://programming.kicks-ass.net/kernel-patches/page-replace/

The improve-swap-performance (swap-token) patches:

  http://programming.kicks-ass.net/kernel-patches/swap_token/

His enable-swap-over-NFS [and other complex IO transports] patches:

  http://programming.kicks-ass.net/kernel-patches/vm_deadlock/

And the concurrent pagecache patches:

  http://programming.kicks-ass.net/kernel-patches/concurrent-pagecache/

as a starter :-) I think the MM should get out of deep-feature-freeze
mode - there's tons of room to improve :-/

        Ingo "runs and hides" Molnar
-
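As a reading aid for the table in Martin's report, the following minimal sketch converts the buffered-write rows into aggregate throughput and a speedup relative to 2.6.19.2. It is not part of the thread. It assumes that an entry such as "2x44.5" means two parallel streams at 44.5 MB/sec each, and the helper name `aggregate_mb_s` is made up for illustration.

```python
def aggregate_mb_s(entry: str) -> float:
    """Turn a table entry such as "2x44.5" or "96" into total MB/sec.

    ASSUMPTION: "NxV" means N parallel streams at V MB/sec each.
    """
    if "x" in entry:
        streams, per_stream = entry.split("x")
        return int(streams) * float(per_stream)
    return float(entry)


# Buffered-write rows copied from the report: (2.6.19.2, 2.6.22.6, 2.6.24-rc1)
rows = {
    "dd1": ("28", "50", "96"),
    "dd2": ("2x16.5", "2x11", "2x44.5"),
    "dd3": ("3x9.8", "3x8.7", "3x30"),
}

for test, entries in rows.items():
    oldest, middle, newest = (aggregate_mb_s(e) for e in entries)
    print(f"{test}: {oldest:.1f} -> {middle:.1f} -> {newest:.1f} MB/sec "
          f"({newest / oldest:.1f}x vs 2.6.19.2)")
```

Read this way, the buffered results on 2.6.24-rc1 land close to the O_DIRECT rows (dd1-dir, dd2-dir, dd3-dir), which barely change across the three kernels.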
From: Andrew Morton
Subject: Re: 2.6.24-rc1: First impressions
Date: Oct 26, 12:21 pm 2007

On Fri, 26 Oct 2007 17:22:21 +0200 Ingo Molnar <mingo@elte.hu> wrote:

>
> * Martin Knoblauch <spamtrap@knobisoft.de> wrote:
>
> > Hi ,
> >
> > just to give some feedback on 2.6.24-rc1. For some time I am tracking
> > IO/writeback problems that hurt system responsiveness big-time. I
> > tested Peters stuff together with Fenguangs additions and it looked
> > promising. Therefore I was very happy to see Peters stuff going into
> > 2.6.24 and waited eagerly for rc1. In short, I am impressed. This
> > really looks good. IO throughput is great and I could not reproduce
> > the responsiveness problems so far.
> >
> > Below are a some numbers of my brute-force I/O tests that I can use
> > to bring responsiveness down. My platform is a HP/DL380g4, dual CPUs,
> > HT-enabled, 8 GB Memory, SmartaArray6i controller with 4x72GB SCSI
> > disks as RAID5 (battery protected writeback cahe enabled) and gigabit
> > networking (tg3). User space is 64-bit RHEL4.3
> >
> > I am basically doing copies using "dd" with 1MB blocksize. Local
> > Filesystem ist ext2 (noatime). IO-Scheduler is dealine, as it tends
> > to give best results. NFS3 Server is a Sun/T2000/Solaris10. The tests
> > are:
> >
> > dd1 - copy 16 GB from /dev/zero to local FS
> > dd1-dir - same, but using O_DIRECT for output
> > dd2/dd2-dir - copy 2x7.6 GB in parallel from /dev/zero to local FS
> > dd3/dd3-dir - copy 3x5.2 GB in parallel from /dev/zero lo local FS
> > net1 - copy 5.2 GB from NFS3 share to local FS
> > mix3 - copy 3x5.2 GB from /dev/zero to local disk and two NFS3 shares
> >
> > I did the numbers for 2.6.19.2, 2.6.22.6 and 2.6.24-rc1. All units
> > are MB/sec.
> >
> > test       2.6.19.2     2.6.22.6    2.6.24.-rc1
> > ----------------------------------------------------------------
> > dd1        28           50          96
> > dd1-dir    88           88          86
> > dd2        2x16.5       2x11        2x44.5
> > dd2-dir    2x44         2x44        2x43
> > dd3        3x9.8        3x8.7       3x30
> > dd3-dir    3x29.5       3x29.5      3x28.5
> > net1       30-33        50-55       37-52
> > mix3       17/32        25/50       96/35  (disk/combined-network)
>
> wow, really nice results!

Those changes seem suspiciously large to me. I wonder if there's less
physical IO happening during the timed run, and correspondingly more
afterwards.

> I think the MM should get out of deep-feature-freeze
> mode - there's tons of room to improve :-/

Kidding. We merge about 265 MM patches in 2.6.24-rc1:

 482 files changed, 8071 insertions(+), 5142 deletions(-)
-
From: Ingo Molnar
Subject: Re: 2.6.24-rc1: First impressions
Date: Oct 26, 12:33 pm 2007

* Andrew Morton <akpm@linux-foundation.org> wrote:

> > > dd1 - copy 16 GB from /dev/zero to local FS
> > > dd1-dir - same, but using O_DIRECT for output
> > > dd2/dd2-dir - copy 2x7.6 GB in parallel from /dev/zero to local FS
> > > dd3/dd3-dir - copy 3x5.2 GB in parallel from /dev/zero lo local FS
> > > net1 - copy 5.2 GB from NFS3 share to local FS
> > > mix3 - copy 3x5.2 GB from /dev/zero to local disk and two NFS3 shares
> > >
> > > I did the numbers for 2.6.19.2, 2.6.22.6 and 2.6.24-rc1. All units
> > > are MB/sec.
> > >
> > > test       2.6.19.2     2.6.22.6    2.6.24.-rc1
> > > ----------------------------------------------------------------
> > > dd1        28           50          96
> > > dd1-dir    88           88          86
> > > dd2        2x16.5       2x11        2x44.5
> > > dd2-dir    2x44         2x44        2x43
> > > dd3        3x9.8        3x8.7       3x30
> > > dd3-dir    3x29.5       3x29.5      3x28.5
> > > net1       30-33        50-55       37-52
> > > mix3       17/32        25/50       96/35  (disk/combined-network)
> >
> > wow, really nice results!
>
> Those changes seem suspiciously large to me. I wonder if there's less
> physical IO happening during the timed run, and correspondingly more
> afterwards.

so a final 'sync' should be added to the test too, and the time it takes
factored into the bandwidth numbers?

> > I think the MM should get out of deep-feature-freeze mode - there's
> > tons of room to improve :-/
>
> Kidding. We merge about 265 MM patches in 2.6.24-rc1:
>
>  482 files changed, 8071 insertions(+), 5142 deletions(-)

impressive :)

        Ingo
-
From: Andrew Morton
Subject: Re: 2.6.24-rc1: First impressions
Date: Oct 26, 12:42 pm 2007

On Fri, 26 Oct 2007 21:33:40 +0200 Ingo Molnar <mingo@elte.hu> wrote:

>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > > > dd1 - copy 16 GB from /dev/zero to local FS
> > > > dd1-dir - same, but using O_DIRECT for output
> > > > dd2/dd2-dir - copy 2x7.6 GB in parallel from /dev/zero to local FS
> > > > dd3/dd3-dir - copy 3x5.2 GB in parallel from /dev/zero lo local FS
> > > > net1 - copy 5.2 GB from NFS3 share to local FS
> > > > mix3 - copy 3x5.2 GB from /dev/zero to local disk and two NFS3 shares
> > > >
> > > > I did the numbers for 2.6.19.2, 2.6.22.6 and 2.6.24-rc1. All units
> > > > are MB/sec.
> > > >
> > > > test       2.6.19.2     2.6.22.6    2.6.24.-rc1
> > > > ----------------------------------------------------------------
> > > > dd1        28           50          96
> > > > dd1-dir    88           88          86
> > > > dd2        2x16.5       2x11        2x44.5
> > > > dd2-dir    2x44         2x44        2x43
> > > > dd3        3x9.8        3x8.7       3x30
> > > > dd3-dir    3x29.5       3x29.5      3x28.5
> > > > net1       30-33        50-55       37-52
> > > > mix3       17/32        25/50       96/35  (disk/combined-network)
> > >
> > > wow, really nice results!
> >
> > Those changes seem suspiciously large to me. I wonder if there's less
> > physical IO happening during the timed run, and correspondingly more
> > afterwards.
>
> so a final 'sync' should be added to the test too, and the time it takes
> factored into the bandwidth numbers?

That's one way of doing it. Or just run the test for a "long" time. ie:
much longer than (total-memory / disk-bandwidth). Probably the latter
will give a more accurate result, but it can get boring.

> > > I think the MM should get out of deep-feature-freeze mode - there's
> > > tons of room to improve :-/
> >
> > Kidding. We merge about 265 MM patches in 2.6.24-rc1:
> >
> >  482 files changed, 8071 insertions(+), 5142 deletions(-)
>
> impressive :)

A lot of that was new functionality. That's easier to add than things
which change long-standing functionality.
-
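The two measurement fixes discussed above can be put into numbers. The following is a minimal sketch and not something posted in the thread. It assumes the O_DIRECT figure for dd1-dir on 2.6.24-rc1 (86 MB/sec) is a fair stand-in for sustained disk bandwidth, and the copy and sync timings in the second calculation are hypothetical values chosen only to show the arithmetic.

```python
def cache_fill_seconds(total_memory_mb: float, disk_bandwidth_mb_s: float) -> float:
    """Andrew's rule of thumb: total-memory / disk-bandwidth, in seconds."""
    return total_memory_mb / disk_bandwidth_mb_s


def bandwidth_including_sync(data_mb: float, copy_seconds: float,
                             sync_seconds: float) -> float:
    """Ingo's suggestion: charge the final sync to the run before dividing."""
    return data_mb / (copy_seconds + sync_seconds)


# From the report: 8 GB of memory.
# ASSUMPTION: 86 MB/sec (dd1-dir, O_DIRECT) approximates disk bandwidth.
threshold = cache_fill_seconds(8 * 1024, 86.0)
print(f"a run should last much longer than about {threshold:.0f} s")

# HYPOTHETICAL timings, not from the thread: a 16 GB copy that takes 170 s,
# followed by a sync that takes 30 s.
rate = bandwidth_including_sync(16 * 1024, 170.0, 30.0)
print(f"{rate:.1f} MB/sec once the sync is included")
```

Under these assumptions, a 16 GB copy at the reported 96 MB/sec lasts roughly 171 s, which is less than twice the threshold, so it is unclear whether dd1 counts as a "long" run in Andrew's sense.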
From: Peter Zijlstra
Subject: Re: 2.6.24-rc1: First impressions
Date: Oct 26, 8:29 am 2007

On Fri, 2007-10-26 at 17:22 +0200, Ingo Molnar wrote:
> * Martin Knoblauch <spamtrap@knobisoft.de> wrote:
>
> > Hi ,
> >
> > just to give some feedback on 2.6.24-rc1. For some time I am tracking
> > IO/writeback problems that hurt system responsiveness big-time. I
> > tested Peters stuff together with Fenguangs additions and it looked
> > promising. Therefore I was very happy to see Peters stuff going into
> > 2.6.24 and waited eagerly for rc1. In short, I am impressed. This
> > really looks good. IO throughput is great and I could not reproduce
> > the responsiveness problems so far.
> >
> > Below are a some numbers of my brute-force I/O tests that I can use
> > to bring responsiveness down. My platform is a HP/DL380g4, dual CPUs,
> > HT-enabled, 8 GB Memory, SmartaArray6i controller with 4x72GB SCSI
> > disks as RAID5 (battery protected writeback cahe enabled) and gigabit
> > networking (tg3). User space is 64-bit RHEL4.3
> >
> > I am basically doing copies using "dd" with 1MB blocksize. Local
> > Filesystem ist ext2 (noatime). IO-Scheduler is dealine, as it tends
> > to give best results. NFS3 Server is a Sun/T2000/Solaris10. The tests
> > are:
> >
> > dd1 - copy 16 GB from /dev/zero to local FS
> > dd1-dir - same, but using O_DIRECT for output
> > dd2/dd2-dir - copy 2x7.6 GB in parallel from /dev/zero to local FS
> > dd3/dd3-dir - copy 3x5.2 GB in parallel from /dev/zero lo local FS
> > net1 - copy 5.2 GB from NFS3 share to local FS
> > mix3 - copy 3x5.2 GB from /dev/zero to local disk and two NFS3 shares
> >
> > I did the numbers for 2.6.19.2, 2.6.22.6 and 2.6.24-rc1. All units
> > are MB/sec.
> >
> > test       2.6.19.2     2.6.22.6    2.6.24.-rc1
> > ----------------------------------------------------------------
> > dd1        28           50          96
> > dd1-dir    88           88          86
> > dd2        2x16.5       2x11        2x44.5
> > dd2-dir    2x44         2x44        2x43
> > dd3        3x9.8        3x8.7       3x30
> > dd3-dir    3x29.5       3x29.5      3x28.5
> > net1       30-33        50-55       37-52
> > mix3       17/32        25/50       96/35  (disk/combined-network)
>
> wow, really nice results! Peter does know how to make stuff fast :) Now
> lets pick up some of Peter's other, previously discarded patches as well
> :-)
>
> Such as the rewritten reclaim (clockpro) patches:
>
> http://programming.kicks-ass.net/kernel-patches/page-replace/

I think riel is taking over that stuff with his split vm and policies
per type.

> The improve-swap-performance (swap-token) patches:
>
>
