Keeping Ramdisk Pages Dirty

Submitted by Jeremy
on October 11, 2007 - 1:01pm

"We have seen ramdisk based install systems, where some pages of mapped libraries and programs were suddenly zeroed under memory pressure. This should not happen, as the ramdisk avoids freeing its pages by keeping them dirty all the time," Christian Borntraeger began, explaining the need for his small patch to the ramdisk driver. He continued, "it turns out that there is a case, where the VM makes a ramdisk page clean, without telling the ramdisk driver. On memory pressure shrink_zone runs and it starts to run shrink_active_list. There is a check for buffer_heads_over_limit, and if true, pagevec_strip is called. pagevec_strip calls try_to_release_page. If the mapping has no releasepage callback, try_to_free_buffers is called. try_to_free_buffers has now a special logic for some file systems to make a dirty page clean, if all buffers are clean. That's what happened in our test case."

He provided two methods for reproducing the reported problem: "you have to make buffer_heads_over_limit true," either by lowering max_buffer_heads or by running on a system with lots of high memory. He then described the fix: "The solution is to provide a noop-releasepage callback for the ramdisk driver. This avoids try_to_free_buffers for ramdisk pages."


From: Christian Borntraeger
Subject: [PATCH] ramdisk: fix zeroed ramdisk pages on memory pressure
Date: Oct 9, 4:21 am 2007

From: Christian Borntraeger <borntraeger@de.ibm.com>

We have seen ramdisk based install systems, where some pages of mapped 
libraries and programs were suddenly zeroed under memory pressure. This 
should not happen, as the ramdisk avoids freeing its pages by keeping them 
dirty all the time.

It turns out that there is a case, where the VM makes a ramdisk page clean, 
without telling the ramdisk driver.
On memory pressure shrink_zone runs and it starts to run shrink_active_list. 
There is a check for buffer_heads_over_limit, and if true, pagevec_strip is 
called. pagevec_strip calls try_to_release_page. If the mapping has no 
releasepage callback, try_to_free_buffers is called. try_to_free_buffers has 
now a special logic for some file systems to make a dirty page clean, if all 
buffers are clean. That's what happened in our test case.

The solution is to provide a noop-releasepage callback for the ramdisk driver.
This avoids try_to_free_buffers for ramdisk pages. 

To trigger the problem, you have to make buffer_heads_over_limit true, which
means:
- lower max_buffer_heads 
or
- have a system with lots of high memory

Andrew, if there are no objections - please apply. The patch applies against
2.6.23-rc9.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

---
 drivers/block/rd.c |   13 +++++++++++++
 1 files changed, 13 insertions(+)

Index: linux-2.6/drivers/block/rd.c
===================================================================
--- linux-2.6.orig/drivers/block/rd.c
+++ linux-2.6/drivers/block/rd.c
@@ -189,6 +189,18 @@ static int ramdisk_set_page_dirty(struct
 	return 0;
 }
 
+/*
+ * releasepage is called by pagevec_strip/try_to_release_page if
+ * buffer_heads_over_limit is true. Without a releasepage function
+ * try_to_free_buffers is called instead. That can unset the dirty
+ * bit of our ram disk pages, which will be eventually freed, even
+ * if the page is still in use.
+ */
+static int ramdisk_releasepage(struct page *page, gfp_t dummy)
+{
+	return 0;
+}
+
 static const struct address_space_operations ramdisk_aops = {
 	.readpage	= ramdisk_readpage,
 	.prepare_write	= ramdisk_prepare_write,
@@ -196,6 +208,7 @@ static const struct address_space_operat
 	.writepage	= ramdisk_writepage,
 	.set_page_dirty	= ramdisk_set_page_dirty,
 	.writepages	= ramdisk_writepages,
+	.releasepage	= ramdisk_releasepage,
 };
 
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,

Ramdisk is a thing of the past,

muwlgr on October 12, 2007 - 7:13pm

currently it is advised to use tmpfs instead.
Ramdisk needlessly replicates block device functionality in RAM, increasing complexity, overhead, and room for errors.

intgr on October 13, 2007 - 1:54am

Well, there are a couple of reasons why one would want to use a ramdisk, since it holds a normal block-based file system. For example, you can simply copy a whole partition onto a ramdisk (sequential read => very fast), instead of copying individual files onto tmpfs.

But you are right that tmpfs is generally more useful, more space-efficient, and faster with file operations. That is why ramdisk has been subject to bitrot.

can tmpfs be used as initrd

Anonymous (not verified) on October 13, 2007 - 2:00am

Can tmpfs be used as an initrd?

muwlgr on October 13, 2007 - 3:30am

With kernel 2.6, you have initramfs, which at boot unpacks the initrd contents, stored in cpio format, into tmpfs. The usual initrd-like processing then follows, usually finishing with pivot_root.
More at http://linuxdevices.com/articles/AT4017834659.html
