Hi Ingo, Peter, small question. It was a patch recently posted which removes memset(x, 0, x) after __alloc_bootmem call. There are a few another code snippets who still call memset(x, 0, x). And who is responsible for memory clearing? bootmem allocator or caller? - Cyrill - --
hm, bootmem allocator is supposed to clear memory. We have a couple of places that rely on that. Ingo --
I was actually considering to change that for the GB pages hugetlbfs patchkit, because memset for 1G is a little slow and not needed (will be cleared later anyways) and it might be a problem for very large systems with a lot of such pages at boot. -Andi --
I think it would be a good idea ;) Btw maybe would be better to call memset on the code witch relies on "clear" memory explicitly? So we will clear memory allocated *only* if we really need this. Cyrill --
Are there any users of bootmem which will allocate a significant amount of memory and don't need it zeroed? -hpa --
Actually, I don't know without reading the code... will take a look today evening (have no access to sources in my office). --
Well, the only really significant allocation I found is in
arch/ia64/kernel/iosapi.c:
---[...]---
static struct iosapic_rte_info * __init_refok iosapic_alloc_rte (void)
{
int i;
struct iosapic_rte_info *rte;
int preallocated = 0;
if (!iosapic_kmalloc_ok && list_empty(&free_rte_list)) {
---> rte = alloc_bootmem(sizeof(struct iosapic_rte_info) *
NR_PREALLOCATE_RTE_ENTRIES);
if (!rte)
return NULL;
for (i = 0; i < NR_PREALLOCATE_RTE_ENTRIES; i++, rte++)
list_add(&rte->rte_list, &free_rte_list);
}
if (!list_empty(&free_rte_list)) {
rte = list_entry(free_rte_list.next, struct iosapic_rte_info,
rte_list);
list_del(&rte->rte_list);
preallocated++;
} else {
rte = kmalloc(sizeof(struct iosapic_rte_info), GFP_ATOMIC);
if (!rte)
return NULL;
}
memset(rte, 0, sizeof(struct iosapic_rte_info));
if (preallocated)
rte->flags |= RTE_PREALLOCATED;
return rte;
}
---[...]---
but it requires zeroed memory too. So, no, I didn't found any large
number of bytes allocated by bootmem scheme without needing of its
clearing.
- Cyrill -
--
changing the default behavior of bootmem alloc to be non-clearing is a really bad idea that will only cause unrobustness. The proper approach is to add an _opt-in_ API that does not clear memory (bootmem_alloc_dontclear() or whatever), available to callers that know it for sure that they dont need the clearing. Ingo --
I was considering that too, but we have so many weird variants of bootmem with opt in and opt out and even combinations of both now that the whole thing is starting to look really pear shaped (I admit I added some of them in the past myself but I'm not proud). Would be a great project for someone to consolidate that all a bit. -Andi --
[Andi Kleen - Tue, Apr 08, 2008 at 10:12:06AM +0200] | On Tue, Apr 08, 2008 at 10:04:46AM +0200, Ingo Molnar wrote: | > | > * Andi Kleen <andi@firstfloor.org> wrote: | > | > > > hm, bootmem allocator is supposed to clear memory. We have a couple | > > > of places that rely on that. | > > | > > I was actually considering to change that for the GB pages hugetlbfs | > > patchkit, because memset for 1G is a little slow and not needed (will | > > be cleared later anyways) and it might be a problem for very large | > > systems with a lot of such pages at boot. | > | > changing the default behavior of bootmem alloc to be non-clearing is a | > really bad idea that will only cause unrobustness. The proper approach | > is to add an _opt-in_ API that does not clear memory | | I was considering that too, but we have so many weird variants of bootmem | with opt in and opt out and even combinations of both now that the whole thing | is starting to look really pear shaped (I admit I added some of them | in the past myself but I'm not proud). Would be a great project for | someone to consolidate that all a bit. | | -Andi | OK, thanks to all of you - Cyrill - --
[Andi Kleen - Tue, Apr 08, 2008 at 10:12:06AM +0200] | On Tue, Apr 08, 2008 at 10:04:46AM +0200, Ingo Molnar wrote: | > | > * Andi Kleen <andi@firstfloor.org> wrote: | > | > > > hm, bootmem allocator is supposed to clear memory. We have a couple | > > > of places that rely on that. | > > | > > I was actually considering to change that for the GB pages hugetlbfs | > > patchkit, because memset for 1G is a little slow and not needed (will | > > be cleared later anyways) and it might be a problem for very large | > > systems with a lot of such pages at boot. | > | > changing the default behavior of bootmem alloc to be non-clearing is a | > really bad idea that will only cause unrobustness. The proper approach | > is to add an _opt-in_ API that does not clear memory | | I was considering that too, but we have so many weird variants of bootmem | with opt in and opt out and even combinations of both now that the whole thing | is starting to look really pear shaped (I admit I added some of them | in the past myself but I'm not proud). Would be a great project for | someone to consolidate that all a bit. | | -Andi | Andi, could you a bit clarify what exactly do you mean? - Cyrill - --
True, but if the gbpages hugetlbfs case is the *only* case which wants a nonclearing interface (which it sounds like it is), then it seems like a really bad idea to change the default. hugetlbfs is a pretty special case. -hpa --
Yes, changing the default of bootmem_alloc is a bad idea. I just changed a bunch of static arrays to bootmem alloc's and it was pointed out early that not only does bootmem_alloc clear memory, but also panics if it's not A specialized call to bootmem_alloc probably needs it's own API... ;-) Thanks, Mike --
There are more and more bootmem calls that don't want the panic actually. That is why _nopanic was invented (and gets more and more variants) At some point the default could be even switched. I think the right way would be to survey the callers (there are not that many) and then come up with a sane single API that caters to the majority of them by default and passes flags for the special cases. -Andi --
Hi Andi,
I really don't care(*), but there's lot's of code that expects a certain
behavior. Either all the source calls have to be modified en masse (and you
well know that's difficult given the _zillion_ source trees), or you have to
introduce the new API transparently. That means leaving a backdoor for old
calls:
#define bootmem_alloc_low(...) \
__new_bootmem_alloc(..., FLAGS_FOR_OLD_BOOTMEM_ALLOC_LOW);
Then I think you're free to optimize away... d;-) [happy face with a baseball cap]
Cheers,
Mike
(*) As long as I don't have to debug problems as a result of the change...! ;-)
--
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| Linux Kernel Mailing List | iSeries: fix section mismatch in iseries_veth |
| Linux Kernel Mailing List | ixbge: remove TX lock and redo TX accounting. |
| Linux Kernel Mailing List | ixgbe: fix several counter register errata |
| Linux Kernel Mailing List | b43: fix build with CONFIG_SSB_PCIHOST=n |
| Linux Kernel Mailing List | 9p: block-based virtio client |
| Michael Breuer |
