On Fri, 2010-11-19 at 15:24 +0100, Samuel Thibault wrote:
Colour me unconvinced. Measuring shared cache footprint using PMUs might
help (and people have actually implemented and played with that at
various times in the past), but again, the added overhead of doing so
will hurt a lot more workloads than it might benefit.
I'm not at all convinced using the process hierarchy will really help
much, but feel free to write the patch and test it. But making the
migration condition very complex will definitely hurt some workloads.
Pushing them away permits them to use a larger part of that same L3
cache, allowing them to work on larger data sets. Most of the MPI apps
have a large compute to communication ratio because that is what allows
them to run in parallel so well (traditionally the interconnects were
terribly slow to boot). That suggests that working on larger data sets
is a good thing and running on the same node really doesn't matter, since
communication is assumed slow anyway.
There really is no simple solution to this.