Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator added

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Mel Gorman
Date: Thursday, August 26, 2010 - 6:47 am

On Fri, Aug 20, 2010 at 11:50:42AM +0200, Michal Nazarewicz wrote:

Please do not consider this a proper review. I'm only glancing through
it.


So more than 6MB of memory means the page allocator cannot automatically
grant the requests. That's fine but I'd still like to be as close to the
page allocator if possible.


An important consideration is if the alignment is always a natural
alignment? i.e. a 64K buffer must be 64K aligned, 128K must be 128K aligned
etc. I ask because the buddy allocator is great at granting natural alignments
but is difficult to work with for other alignments.


If drivers are using bootmem and custom allocators, I agree that some common
framework is needed. If every device depended on bootmem, there would be huge
chunks of unusable memory. i.e. At first glance, I think this is important
in concept.


It'd be very nice if the shared regions could also be used by normal movable
memory allocations to minimise the amount of wastage. I imagine this would
be particularly important on memory-constrained devices. So right now,
we have
 
#define MIGRATE_UNMOVABLE     0
#define MIGRATE_RECLAIMABLE   1
#define MIGRATE_MOVABLE       2
#define MIGRATE_PCPTYPES      3 /* the number of types on the pcp lists */
#define MIGRATE_RESERVE       3
#define MIGRATE_ISOLATE       4 /* can't allocate from here */
#define MIGRATE_TYPES         5

Conceptually speaking we also want

MIGRATE_MOVABLE_STICKY	  /* Set by CMA, used by CMA and GFP_MOVABLE */
MIGRATE_MOVABLE_EXCLUSIVE /* Set by CMA, exclusive use of a device */

Sticky would be usable by the page allocator and other than forcing
the migrate_type to be MIGRATE_MOVABLE, it would otherwise be normal
memory. Exclusive would be isolated from normal usage by taking the pages from
the normal free lists and putting them on a free list managed by CMA. Normally
the page allocator uses zone->free_area[] for its free lists.  The allocator
would need to handle either free_area from a zone or one provided by a device
using CMA. Would be tricky to pass through admittedly.

I recognise this is not straight-forward so consider these to be suggestions,
not requirements. Glancing through, I don't see why these patches could not
evolve to be closer to the page allocator for example rather than happening
at the very start.


My ideal would be the default allocator was the buddy allocator
beginning at __rmqueue_smallest() with a custom allocator provided only
if absolutly required.

Custom allocators are not easy to get right :/


Based on these requirements I guess it would go something like

For camera=region
	1. Allocate free_area for free lists and associate with cma_region
	2. Find contiguous range of MIGRATE_MOVABLE blocks
	3. Mark MIGRATE_MOVABLE_STICKY
	4. Remove pages from zone free lists and add to cma_region freelist

	On allocation, cma_alloc passes a cma_control structure
	including the cma_region. Bypass the per-cpu allocator and all
	that. Use the normal allocator but use the cma_region free
	lists.

	All allocations for CMA must be compound so that the page has a
	destructor. Store what cma_region the compound page belongs on the
	struct page. This is tricky for single pages so it would be
	ideal if the page could always be compound.

	On free, the destructor adds the page back onto the cma_region
	free list. Hugetlbfs does something like this

	So, other than where the free list is, allocation is using the
	core page allocator

	That all said, you could just always go with your BEST_FIT
	allocator when the use is exclusive.

For *=common
	1. Find contiguous range of MIGRATE_MOVABLE blocks
	2. Mark MIGRATE_MOVABLE_STICKY

	On allocation for < MAX_ORDER_NR_PAGES, just specify __GFP_CMA. This
	will allow allocation from regions marked MIGRATE_MOVABLE_STICKY.
	If a suitable page is not found, compaction is used to vacate all
	MOVABLE pages from all MIGRATE_MOVABLE_STICKY regions (you could be
	smarter about it but as a start, move everything)

	If the allocation is > MAX_ORDER_NR_PAGES, start by migrating all
	movable pages out of the MIGRATE_MOVABLE_STICKY region and then fall
	back to a linear scan. You could fall back to BEST_FIT if and only
	if all the other MOVABLE pages that the best fit algorithm is not
	aware of got moved out of the way or that the best fit algorithm
	was informed where the unmovable pages happen to be.

Again, I recognise this is not easy and there will be some weird interaction
with page reclaim which will need to take the number of CMA regions into
account. It's just a suggestion on what direction you could take it to avoid
a mess of custom allocators. The current design looks like it could migrate
towards the core page allocator and share pages to minimise wastage so it's
not a blocker to merging.


s/lever/level/ ?


Unnecessary whitespace. I'll keep these comments to a minimum. They are
distracting at best but I suggest you take a pass at cleaning up stuff
like this. It'll avoid your feedback being a mess of trivial cleanups
and no "proper" feedback :)


So, lets say hypothetically speaking you used the core page allocator
where possible, you would be tranlating a size to an order. This is not
a problem but you'd have to watch the alignment because the page
allocator is only suitable for natural alignments.


Don't put it in the header then :/


Nice one, must be a power of two implies natural alignment to the page
allocator!


Is it not an error to free a non-existant chunk? Hope it WARN()s at
least.


How lower? If it can be hidden, put it in a private header.


Ideally named regions would be managed by default by free_area and the core
page allocator.


So, I would hope that a default allocator would be something that sits above
__rmqueue_smallest with some juggling to allow __rmqueue_smallest to take
an arbitrary free_area. The CMA wrapper around it would need to know things
like how to call compaction to move pages out of MIGRATE_MOVABLE_STICKY
if necessary.


Is there any scope for reusing parts of kernel/resource.c? Frankly, I
didn't look at your requirements closely enough or at kernel/resource.c
capabilities but at a glance, there appears to be some commonality.

Minimally, if there is a good reason to *not* use resource.c, it should
be in the changelog or I guarantee that in 3 months time, someone else
will ask you exactly the same question :)


Again, with some jiggery pokery I think you could make the guts of the page
allocator the default cma_allocator.


As an aside, it does not seem necessary to have everything CMA related
in the same header. Maybe split it out to minimise the risk of drivers
abusing the layers. Up to you really, I don't feel very strongly on
header layout.


and hopefully there will be a CMA_CORE_PGALLOC in the future. The
principal advantage again of such a move is that the pages would be
usable for normal allocations while the device is inactive.


You don't need to default y this if CMA is selecting it, right?

also CMA should default n.


I confess I didn't review this part at all. I'm only commenting on how
you might better integrate with the core allocator in the future.


I'm not certain about the "any later version" part of this license and
how it applies to kernel code but I'm no licensing guru. I know we have
duel licensing elsewhere for BSD but someone should double check this
license is ok.


Odd comment.


For the most part other than style issues, nothing horrible jumped out.
There is nothing "surprising" about the allocator or how it is
structured as such. At least, not at first glance :)

There are concepts it shares with a standard arena allocator and the managing
of buffer information is similar to how slab manages objects.  There might
be some scope with getting closer to slab in the future but it would be
premature as a starting point.

I am curious about one thing though. Have you considered reusing the bootmem
allocator code to manage the regions instead of your custom stuff here? Instead
of the cma_regions core structures, you would associate cma_region with
a new bootmem_data_t, keep the bootmem code around and allocate using its
allocator. It's a bitmap allocator too and would be less code in the kernel?


I'm afraid I ran out of beans reading the patch so it isn't a detailed
review. Nothing horrible jumps out but I'd be interested in hearing your
thoughts on suggestions for bringing it closer to the page allocator and
reusing the bootmem allocator instead of introducing new code for CMA.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH/RFCv4 0/6] The Contiguous Memory Allocator framework, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 1/6] lib: rbtree: rb_root_init() function added, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator added, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 3/6] mm: cma: Added SysFS support, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 4/6] mm: cma: Added command line parameters s ..., Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 5/6] mm: cma: Test device and application added, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
[PATCH/RFCv4 6/6] arm: Added CMA to Aquila and Goni, Michal Nazarewicz, (Fri Aug 20, 2:50 am)
Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator ..., Konrad Rzeszutek Wilk, (Wed Aug 25, 1:32 pm)
Re: [PATCH/RFCv4 3/6] mm: cma: Added SysFS support, Konrad Rzeszutek Wilk, (Wed Aug 25, 1:37 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Wed Aug 25, 5:58 pm)
Re: [PATCH/RFCv4 3/6] mm: cma: Added SysFS support, Michał Nazarewicz, (Wed Aug 25, 6:20 pm)
Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator ..., Michał Nazarewicz, (Wed Aug 25, 6:22 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 6:28 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 6:38 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 6:49 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 7:12 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 7:40 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Wed Aug 25, 7:50 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Wed Aug 25, 8:44 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Wed Aug 25, 9:01 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Wed Aug 25, 9:30 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Wed Aug 25, 9:39 pm)
[PATCH/RFCv4.1 2/6] mm: cma: Contiguous Memory Allocator added, Michal Nazarewicz, (Wed Aug 25, 11:25 pm)
Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator ..., Mel Gorman, (Thu Aug 26, 6:47 am)
Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator ..., Michał Nazarewicz, (Thu Aug 26, 7:09 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., Michał Nazarewicz, (Thu Aug 26, 7:41 pm)
Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator fram ..., KAMEZAWA Hiroyuki, (Fri Aug 27, 1:16 am)
Re: [PATCH/RFCv4 2/6] mm: cma: Contiguous Memory Allocator ..., Michał Nazarewicz, (Sat Aug 28, 6:48 pm)