A university student studying operating systems asked why the Linux kernel uses two linked lists in its LRU (least recently used) page replacement algorithm. Andrea Arcangeli [interview], whose virtual memory subsystem was merged into the 2.4.10 kernel, explained, "back then I designed it with two lru lists because by splitting the active from the inactive cache allows to detect the cache pollution before it starts discarding the working set." He added, "a page in the inactive list will be collected much more quickly than a page in the active list, so the pollution will be collected more quickly than the working set. Then the VM while freeing cache tries to keep a balance between the size of the two lists to avoid being too unfair, obviously at some point the active list have to be de-activated too."
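The behavior Andrea describes can be illustrated with a toy model. The following is a minimal sketch, not the kernel's implementation: the class name `TwoListLRU`, the `max_active` balance limit, and the rule that a second reference promotes a page are all assumptions made for illustration. New pages start on the inactive list, only pages referenced again reach the active list, and reclaim always takes the oldest inactive page first.

```python
from collections import OrderedDict


class TwoListLRU:
    """Toy two-list page cache (hypothetical model, not kernel code)."""

    def __init__(self, capacity, max_active=None):
        self.capacity = capacity
        # Assumed balance rule: the active list may hold at most half the cache.
        self.max_active = capacity // 2 if max_active is None else max_active
        self.active = OrderedDict()    # oldest entry first
        self.inactive = OrderedDict()  # oldest entry first; reclaimed first

    def access(self, page):
        if page in self.active:
            # Already part of the working set: refresh its position.
            self.active.move_to_end(page)
        elif page in self.inactive:
            # Referenced again while inactive: promote to the active list.
            del self.inactive[page]
            self.active[page] = True
            self._rebalance()
        else:
            # First touch: the page starts out inactive.
            self.inactive[page] = True
            self._reclaim()

    def _rebalance(self):
        # De-activate the oldest active pages when the active list grows too large.
        while len(self.active) > self.max_active:
            page, _ = self.active.popitem(last=False)
            self.inactive[page] = True

    def _reclaim(self):
        # Free pages from the inactive list until the cache fits again.
        while len(self.active) + len(self.inactive) > self.capacity:
            if not self.inactive:
                page, _ = self.active.popitem(last=False)
                self.inactive[page] = True
            self.inactive.popitem(last=False)


def demo():
    cache = TwoListLRU(capacity=4)
    for page in ["a", "b", "a", "b"]:   # working set, touched twice
        cache.access(page)
    for i in range(1, 11):              # streaming "pollution", touched once
        cache.access(f"p{i}")
    return list(cache.active), list(cache.inactive)
```

In this model, `demo()` leaves the twice-referenced pages `a` and `b` on the active list while the ten streamed pages cycle through the inactive list and are discarded. A single LRU list of the same capacity would have evicted `a` and `b` after the first few streamed pages.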
Rik van Riel [interview], author of the reverse mapping virtual memory code that was merged into the 2.5 kernel [story], noted, "since memory size has increased a lot more than disk speed over the last decade (and this is likely to continue for the next decades), the quality of page replacement algorithms is likely to become more and more important over time." In response to a proposal to split the LRU into two parts, one for the page cache and the other for mapped pages, Nick Piggin [interview] replied, "I actually had patches to do 'split active lists' a while back. They worked by lazily moving the page at reclaim-time, based on whether or not it is mapped. This isn't too much worse than the kernel's current idea of what a mapped page is." Rik offered some ideas on how to tune it further: "for each list we keep track of: 1) the size of the list 2) the rate at which we scan the list 3) the fraction of (non new) pages that get referenced. That way we can determine which list has the largest fraction of 'idle' pages sitting around and consequently which list should be scanned more aggressively."
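Rik's bookkeeping can be sketched in a few lines. This is only an illustration under stated assumptions: the names `ListStats`, `most_idle`, and `scan_weights` are hypothetical, the scan rate is modeled as a simple per-interval counter, and weighting scan effort in proportion to each list's idle fraction is one possible reading of "scanned more aggressively", not something the discussion specifies.

```python
from dataclasses import dataclass


@dataclass
class ListStats:
    """Per-list counters for the three quantities Rik lists (hypothetical names)."""
    name: str
    size: int = 0        # 1) number of pages on the list
    scanned: int = 0     # 2) pages scanned in the current interval
    referenced: int = 0  # 3) scanned (non-new) pages that were found referenced

    def record_scan(self, was_referenced):
        self.scanned += 1
        if was_referenced:
            self.referenced += 1

    def idle_fraction(self):
        # Assumption: with no scans yet there is no evidence of idle pages.
        if self.scanned == 0:
            return 0.0
        return 1.0 - self.referenced / self.scanned


def most_idle(lists):
    """Return the list with the largest fraction of idle pages."""
    return max(lists, key=lambda s: s.idle_fraction())


def scan_weights(lists):
    """Share of scanning effort per list, proportional to its idle fraction."""
    total = sum(s.idle_fraction() for s in lists)
    if total == 0:
        # No list shows idle pages: fall back to equal shares.
        return {s.name: 1.0 / len(lists) for s in lists}
    return {s.name: s.idle_fraction() / total for s in lists}


# Made-up sample numbers, for illustration only.
page_cache = ListStats("page_cache", size=1000, scanned=100, referenced=20)
mapped = ListStats("mapped", size=500, scanned=100, referenced=60)
weights = scan_weights([page_cache, mapped])
```

With these sample counters, the page cache list shows the larger idle fraction, so `most_idle` selects it and `scan_weights` gives it roughly two thirds of the scanning effort.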