Well, swap control is another subject. I guess for that you'll need
to track which cgroup each swap page belongs to (rather more expensive
than the current swap_map of unsigned shorts). And I doubt it'll be
swap control as such that's required, but control of rss+swap.
But here I'm just worrying about how the existence of swap makes
something of a nonsense of your rss control.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'd appreciate it if you'd teach me the right name for that!
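To make the cost of the per-swap-page tracking mentioned at the top a bit more concrete, here is a rough userspace sketch, not kernel code. It assumes a 16-bit cgroup id kept in an array parallel to the per-slot use count; the names swap_area, swap_owner and the helpers are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: a 16-bit cgroup id, with 0 meaning "no owner". */
typedef unsigned short cgroup_id_t;

struct swap_area {
    size_t nr_slots;
    unsigned short *swap_map;   /* per-slot use count, modelled on swap_map */
    cgroup_id_t *swap_owner;    /* assumed extra array: owning cgroup per slot */
};

static int swap_area_init(struct swap_area *sa, size_t nr_slots)
{
    sa->nr_slots = nr_slots;
    sa->swap_map = calloc(nr_slots, sizeof(*sa->swap_map));
    sa->swap_owner = calloc(nr_slots, sizeof(*sa->swap_owner));
    if (!sa->swap_map || !sa->swap_owner) {
        free(sa->swap_map);
        free(sa->swap_owner);
        return -1;
    }
    return 0;
}

static void swap_area_destroy(struct swap_area *sa)
{
    free(sa->swap_map);
    free(sa->swap_owner);
}

/* Metadata bytes for the whole area: the owner array adds to swap_map's cost. */
static size_t swap_meta_bytes(const struct swap_area *sa)
{
    return sa->nr_slots * (sizeof(*sa->swap_map) + sizeof(*sa->swap_owner));
}

/* Record the owner when a page is written out to a slot. */
static void swap_slot_alloc(struct swap_area *sa, size_t slot, cgroup_id_t owner)
{
    sa->swap_map[slot]++;
    sa->swap_owner[slot] = owner;
}

/* Drop one reference; forget the owner once the slot is unused. */
static void swap_slot_free(struct swap_area *sa, size_t slot)
{
    if (sa->swap_map[slot] && --sa->swap_map[slot] == 0)
        sa->swap_owner[slot] = 0;
}

/* Swapin readahead or swapoff could look the owner up here. */
static cgroup_id_t swap_slot_owner(const struct swap_area *sa, size_t slot)
{
    return sa->swap_map[slot] ? sa->swap_owner[slot] : 0;
}
```

With a 16-bit id this doubles the per-slot metadata; storing a pointer per slot instead would cost a good deal more.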
Yes, readahead is problematic anyway. My guess is that in the file
cache case, you'll tend not to go too far wrong by charging to the
cgroup that added the page - though we're all aware that's fairly
unsatisfactory.
My point is that in the swap cache case, it's badly wrong: there's
no page more obviously owned by a cgroup than its anonymous pages
(forgetting for a moment that minority shared between cgroups
until copy-on-write), so it's very wrong for swapin readahead
or swapoff to go charging those to another or to no cgroup.
Imagine a cgroup at its rss limit, with more out on swap. Then
another cgroup does some swap readahead, bringing pages private
to the first into cache. Or it runs swapoff, which actually plugs
them into the rss of the first cgroup, so it goes over limit.
Those are pages we'd want to swap out when the first cgroup
faults to go further over its limit; but they're now not even
identified as belonging to the right cgroup, so won't be found.
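The scenario above can be modelled in a few lines. This is only a toy userspace model under my own assumptions: the names page_model, swapin_charge_adder and the two counting helpers are hypothetical, and "reclaim" here just means counting the pages charged to a cgroup.

```c
#include <assert.h>
#include <stddef.h>

enum { CG_NONE = 0, CG_A = 1, CG_B = 2, NR_CG = 3 };

struct page_model {
    int true_owner;   /* cgroup whose anonymous memory this really is */
    int charged_to;   /* cgroup the accounting charged it to */
    int in_memory;    /* 0 while the page is out on swap */
};

/* "Charge to the one that added": whoever triggers the swapin pays. */
static void swapin_charge_adder(struct page_model *pg, long usage[NR_CG],
                                int adding_cg)
{
    pg->in_memory = 1;
    pg->charged_to = adding_cg;
    usage[adding_cg]++;
}

/* Pages reclaim would find when scanning on behalf of cg. */
static size_t reclaim_candidates(const struct page_model *pages, size_t n,
                                 int cg)
{
    size_t i, found = 0;

    for (i = 0; i < n; i++)
        if (pages[i].in_memory && pages[i].charged_to == cg)
            found++;
    return found;
}

/* Pages in memory that really belong to cg, whoever was charged. */
static size_t resident_owned(const struct page_model *pages, size_t n, int cg)
{
    size_t i, found = 0;

    for (i = 0; i < n; i++)
        if (pages[i].in_memory && pages[i].true_owner == cg)
            found++;
    return found;
}
```

If cgroup A has two pages resident and two on swap, and cgroup B's readahead brings in the latter two, then resident_owned() for A reports four pages while reclaim_candidates() for A still reports only two: the other two are charged to B and invisible to A's reclaim.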
My answer is definitely yes. I'm not suggesting that you need
general migration between cgroups at this stage (something for
later quite likely); but I am suggesting you need one pseudo-cgroup
to hold these cases temporarily, and that you cannot properly track
rss without it (if there is any swap).
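Roughly what I have in mind for the pseudo-cgroup, again as a toy userspace sketch rather than a proposal for actual code: readahead and swapoff charge to a holding cgroup, and the charge moves to the real cgroup only when that cgroup faults the page in. CG_HOLD, held_page and both functions are names I've invented for illustration.

```c
#include <assert.h>

enum { CG_HOLD = 0, CG_A = 1, CG_B = 2, NR_CG = 3 };

struct held_page {
    int charged_to;   /* CG_HOLD until a real cgroup maps the page */
};

/* Readahead or swapoff: park the charge on the pseudo-cgroup. */
static void swapin_to_holder(struct held_page *pg, long usage[NR_CG])
{
    pg->charged_to = CG_HOLD;
    usage[CG_HOLD]++;
}

/*
 * Fault by a real cgroup: move the charge from the holder to it.
 * Returns -1 if the cgroup is at its limit, in which case the caller
 * is expected to reclaim from that cgroup first and retry.
 */
static int fault_recharge(struct held_page *pg, long usage[NR_CG],
                          const long limit[NR_CG], int cg)
{
    if (pg->charged_to != CG_HOLD)
        return 0;               /* already charged to a real cgroup */
    if (usage[cg] >= limit[cg])
        return -1;              /* over limit: reclaim before mapping */
    usage[CG_HOLD]--;
    usage[cg]++;
    pg->charged_to = cg;
    return 0;
}
```

The point of the design is that the limit check happens at the moment the owning cgroup actually takes the page, so another cgroup's readahead can neither be billed for it nor push the owner over its limit behind its back.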
Umm, please explain what's excellent about that.
Hugh