The per cpu memory used by subsystems is typically quite small. We already
have an 8k limit on percpu space for modules, and that does not seem
to be a problem.
Typically these are fairly small: 8 bytes * 5000 is only about 40k.
We could do that, yes.
But then per cpu data is not frequently allocated and freed.
Moving away from allocpercpu saves a lot of memory. We could make this
128k or so to be safe?
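
To put rough numbers on the sizes above, here is a minimal sketch of the
arithmetic, assuming 5000 objects of 8 bytes each per cpu. The macro and
function names are made up for illustration and are not kernel interfaces;
the 8k and 128k figures are the ones mentioned in this mail.

```c
#include <assert.h>

/* Hypothetical names for the sizes discussed in this thread. */
#define MODULE_PERCPU_LIMIT   (8UL * 1024)    /* existing 8k module limit */
#define PROPOSED_PERCPU_AREA  (128UL * 1024)  /* the "128k or so" proposal */

/* Bytes needed in each cpu's area for nr_objects items of obj_size bytes. */
static unsigned long percpu_bytes(unsigned long nr_objects,
				  unsigned long obj_size)
{
	return nr_objects * obj_size;
}

/* Nonzero if a request of 'bytes' fits into an area of 'area_size' bytes. */
static int fits_in_area(unsigned long bytes, unsigned long area_size)
{
	return bytes <= area_size;
}

/* Sanity check of the example numbers: 5000 * 8 bytes per cpu. */
static void percpu_example_check(void)
{
	unsigned long need = percpu_bytes(5000, 8);

	assert(need == 40000);                            /* about 40k */
	assert(!fits_in_area(need, MODULE_PERCPU_LIMIT)); /* too big for 8k */
	assert(fits_in_area(need, PROPOSED_PERCPU_AREA)); /* fine in 128k */
}
```

So under these assumptions the example would not fit in the 8k module
space, but would use less than a third of a 128k area.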
--