Re: kswapd min order, slub max order [was Re: -mm merge plans for 2.6.24]

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Christoph Lameter
Date: Tuesday, October 2, 2007 - 5:37 pm

On Tue, 2 Oct 2007, Christoph Lameter wrote:


A patch like this? This is based on the number of page structs on the 
system. Maybe it needs to be based on the number of MAX_ORDER blocks
for antifrag?


SLUB: Determine slub_max_order depending on the number of pages available

Determine the maximum order to be used for slabs and the mininum
desired number of objects in a slab from the amount of pages that
a system has available (like SLAB does for the order 1/0 distinction).

For systems with less than 128M only use order 0 allocations (SLAB does 
that for <32M only). The order 0 config is useful for small systems to 
minimize the memory used. Memory easily fragments since we have less than 
32k pages to play with. Order 0 insures that higher order allocations are 
minimized (Larger orders must still be used for objects that do not fit 
into order 0 pages).

Then step up to order 1 for systems < 256000 pages (1G)

Order 2 limit to systems < 1000000 page structs (4G)

Order 3 for systems larger than that.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/slub.c |   49 +++++++++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2007-10-02 09:26:16.000000000 -0700
+++ linux-2.6/mm/slub.c	2007-10-02 16:40:22.000000000 -0700
@@ -153,25 +153,6 @@ static inline void ClearSlabDebug(struct
 /* Enable to test recovery from slab corruption on boot */
 #undef SLUB_RESILIENCY_TEST
 
-#if PAGE_SHIFT <= 12
-
-/*
- * Small page size. Make sure that we do not fragment memory
- */
-#define DEFAULT_MAX_ORDER 1
-#define DEFAULT_MIN_OBJECTS 4
-
-#else
-
-/*
- * Large page machines are customarily able to handle larger
- * page orders.
- */
-#define DEFAULT_MAX_ORDER 2
-#define DEFAULT_MIN_OBJECTS 8
-
-#endif
-
 /*
  * Mininum number of partial slabs. These will be left on the partial
  * lists even if they are empty. kmem_cache_shrink may reclaim them.
@@ -1718,8 +1699,9 @@ static struct page *get_object_page(cons
  * take the list_lock.
  */
 static int slub_min_order;
-static int slub_max_order = DEFAULT_MAX_ORDER;
-static int slub_min_objects = DEFAULT_MIN_OBJECTS;
+static int slub_max_order;
+static int slub_min_objects = 4;
+static int manual;
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
@@ -2237,7 +2219,7 @@ static struct kmem_cache *kmalloc_caches
 static int __init setup_slub_min_order(char *str)
 {
 	get_option (&str, &slub_min_order);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2246,7 +2228,7 @@ __setup("slub_min_order=", setup_slub_mi
 static int __init setup_slub_max_order(char *str)
 {
 	get_option (&str, &slub_max_order);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2255,7 +2237,7 @@ __setup("slub_max_order=", setup_slub_ma
 static int __init setup_slub_min_objects(char *str)
 {
 	get_option (&str, &slub_min_objects);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2566,6 +2548,16 @@ int kmem_cache_shrink(struct kmem_cache 
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
+/*
+ * Table to autotune the maximum slab order based on the number of pages
+ * that the system has available.
+ */
+static unsigned long __initdata phys_pages_for_order[PAGE_ALLOC_COSTLY_ORDER] = {
+	32768,		/* >128M if using 4K pages, >512M (16k), >2G (64k) */
+	256000,		/* >1G if using 4k pages, >4G (16k), >16G (64k) */
+	1000000		/* >4G if using 4k pages, >16G (16k), >64G (64k) */
+};
+
 /********************************************************************
  *			Basic setup of slabs
  *******************************************************************/
@@ -2575,6 +2567,15 @@ void __init kmem_cache_init(void)
 	int i;
 	int caches = 0;
 
+	if (!manual) {
+		/* No manual parameters. Autotune for system */
+		for (i = 0; i < PAGE_ALLOC_COSTLY_ORDER; i++)
+			if (num_physpages > phys_pages_for_order[i]) {
+				slub_max_order++;
+				slub_min_objects <<= 1;
+			}
+	}
+
 #ifdef CONFIG_NUMA
 	/*
 	 * Must first have the slab cache available for the allocations of the
-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
-mm merge plans for 2.6.24, Andrew Morton, (Mon Oct 1, 2:22 pm)
wibbling over the cpuset shed domain connnection, Paul Jackson, (Mon Oct 1, 2:34 pm)
x86 patches was Re: -mm merge plans for 2.6.24, Andi Kleen, (Mon Oct 1, 11:18 pm)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andrew Morton, (Mon Oct 1, 11:32 pm)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andi Kleen, (Tue Oct 2, 12:01 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andrew Morton, (Tue Oct 2, 12:18 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, KAMEZAWA Hiroyuki, (Tue Oct 2, 12:36 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Ingo Molnar, (Tue Oct 2, 12:37 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andrew Morton, (Tue Oct 2, 12:43 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andi Kleen, (Tue Oct 2, 12:46 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Matt Mackall, (Tue Oct 2, 12:55 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Thomas Gleixner, (Tue Oct 2, 12:58 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andi Kleen, (Tue Oct 2, 12:59 am)
v4l-stk11xx* [Was: -mm merge plans for 2.6.24], Jiri Slaby, (Tue Oct 2, 12:59 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, KAMEZAWA Hiroyuki, (Tue Oct 2, 1:16 am)
per BDI dirty limit (was Re: -mm merge plans for 2.6.24), Peter Zijlstra, (Tue Oct 2, 1:17 am)
writeback fixes, Fengguang Wu, (Tue Oct 2, 1:39 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Andy Whitcroft, (Tue Oct 2, 2:26 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Yasunori Goto, (Tue Oct 2, 3:48 am)
Re: wibbling over the cpuset shed domain connnection, Nick Piggin, (Tue Oct 2, 5:36 am)
Re: wibbling over the cpuset shed domain connnection, Nick Piggin, (Tue Oct 2, 6:12 am)
Re: -mm merge plans for 2.6.24, Pekka Enberg, (Tue Oct 2, 9:12 am)
new aops merge [was Re: -mm merge plans for 2.6.24], Hugh Dickins, (Tue Oct 2, 9:21 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Nish Aravamudan, (Tue Oct 2, 9:40 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Lee Schermerhorn, (Tue Oct 2, 10:17 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Lee Schermerhorn, (Tue Oct 2, 10:25 am)
remove zero_page (was Re: -mm merge plans for 2.6.24), Nick Piggin, (Tue Oct 2, 10:45 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Christoph Lameter, (Tue Oct 2, 11:16 am)
Re: x86 patches was Re: -mm merge plans for 2.6.24, Christoph Lameter, (Tue Oct 2, 11:18 am)
Re: kswapd min order, slub max order [was Re: -mm merge pl ..., Christoph Lameter, (Tue Oct 2, 11:28 am)
Re: kswapd min order, slub max order [was Re: -mm merge pl ..., Christoph Lameter, (Tue Oct 2, 5:37 pm)
Re: wibbling over the cpuset shed domain connnection, Paul Jackson, (Tue Oct 2, 10:21 pm)
Re: wibbling over the cpuset shed domain connnection, Paul Jackson, (Wed Oct 3, 12:00 am)
Re: wibbling over the cpuset shed domain connnection, Andrew Morton, (Wed Oct 3, 3:57 am)
Re: per BDI dirty limit (was Re: -mm merge plans for 2.6.24), Martin Knoblauch, (Wed Oct 3, 4:00 am)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Wed Oct 3, 8:21 am)
r/o bind mounts, was Re: -mm merge plans for 2.6.24, Christoph Hellwig, (Tue Oct 9, 2:19 am)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Tue Oct 9, 7:52 am)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Tue Oct 9, 7:22 pm)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Tue Oct 9, 8:06 pm)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Tue Oct 9, 10:20 pm)
Re: remove zero_page (was Re: -mm merge plans for 2.6.24), Linus Torvalds, (Wed Oct 10, 8:04 am)
Re: -mm merge plans for 2.6.24, Borislav Petkov, (Sat Oct 13, 1:44 am)
Re: -mm merge plans for 2.6.24, Andrew Morton, (Sat Oct 13, 1:52 am)
Re: -mm merge plans for 2.6.24, Borislav Petkov, (Sat Oct 13, 4:45 am)
Re: per BDI dirty limit (was Re: -mm merge plans for 2.6.24), Trond Myklebust, (Fri Oct 26, 9:37 am)
[PATCH] mm: sysfs: expose the BDI object in sysfs, Peter Zijlstra, (Fri Nov 2, 7:59 am)