On 08.05.2008 [18:51:11 +0200], Hans Rosenfeld wrote:

Right, I agree *if* we allow huge pages in the walker. But AIUI, things
are broken now with hugepages in the process' address space. This is a
bug upstream and leads to hugepages leaking out of the kernel when
/proc/pid/pagemap is read. Why not, instead (as a short-term fix), skip
hugepage mappings altogether in the page-walker code?

Hrm, upon further investigation, this seems to be a pretty clear
limitation of walk_page_range(). One that is avoided in the other two
callers, i.e.

    static int show_smap(struct seq_file *m, void *v)
    {
            ...
            if (vma->vm_mm && !is_vm_hugetlb_page(vma))
                    walk_page_range(vma->vm_mm, vma->vm_start, vma->vm_end,
                                    &smaps_walk, &mss);
            ...
    }

    static ssize_t clear_refs_write(struct file *file, const char __user *buf,
                                    size_t count, loff_t *ppos)
    {
            ...
            for (vma = mm->mmap; vma; vma = vma->vm_next)
                    if (!is_vm_hugetlb_page(vma))
                            walk_page_range(mm, vma->vm_start, vma->vm_end,
                                            &clear_refs_walk, vma);
            ...
    }

No such protection exists for

    static ssize_t pagemap_read(struct file *file, char __user *buf,
                                size_t count, loff_t *ppos);

So, is there any way to add an is_vm_hugetlb_page(vma) check to
pagemap_read()? Or can we modify walk_page_range() to take a vma and
skip the walk if is_vm_hugetlb_page(vma) is true [to avoid complications
down the road until hugepage walking is fixed]? I guess the latter isn't
possible for pagemap_read(), since we are just looking at arbitrary
addresses in the process space?

Dunno, it seems quite clear that the bug is in pagemap_read(), not in
any hugepage code, and that the simplest fix is to make pagemap_read()
do what the other walker-callers do and skip hugepage regions.

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
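
For illustration only, here is a minimal userspace sketch of the "skip hugepage regions" idea applied to an arbitrary address range, which is the case pagemap_read() has to handle. It is not kernel code and not a proposed patch: struct toy_vma, toy_is_hugetlb(), toy_walk_range() and TOY_PAGE_SIZE are hypothetical stand-ins invented for this example, and the "walk" just counts the base pages it would visit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed base page size for this toy model only. */
#define TOY_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a VMA: [start, end) plus a hugetlb flag. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	bool hugetlb;
	struct toy_vma *next;
};

/* Hypothetical analogue of the is_vm_hugetlb_page() check. */
static bool toy_is_hugetlb(const struct toy_vma *vma)
{
	return vma->hugetlb;
}

/*
 * Visit every mapping that overlaps [start, end), clip it to the
 * requested range, and skip hugetlb mappings entirely. Returns the
 * number of base pages that would have been walked.
 */
static unsigned long toy_walk_range(const struct toy_vma *head,
				    unsigned long start, unsigned long end)
{
	unsigned long walked = 0;
	const struct toy_vma *vma;

	assert(start <= end);

	for (vma = head; vma; vma = vma->next) {
		unsigned long lo = vma->start > start ? vma->start : start;
		unsigned long hi = vma->end < end ? vma->end : end;

		if (lo >= hi)
			continue;	/* no overlap with the request */
		if (toy_is_hugetlb(vma))
			continue;	/* same guard the other callers use */

		walked += (hi - lo) / TOY_PAGE_SIZE;
	}
	return walked;
}
```

The point of the sketch is that the request is split per mapping, so the hugetlb check can be applied to each piece even though the caller only supplied raw addresses.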
