[PATCH 12/12] perf: move sched perf functions on top of tracepoints

Previous thread: [PATCH 08/12] sched: add switch_in and tick tracepoints by Tejun Heo on Tuesday, May 4, 2010 - 5:38 am. (1 message)

Next thread: [PATCH 09/12] perf: factor out perf_event_switch_clones() by Tejun Heo on Tuesday, May 4, 2010 - 5:38 am. (1 message)
From: Tejun Heo
Date: Tuesday, May 4, 2010 - 5:38 am

Now that all sched perf functions are colocated with tracepoints,
those perf functions can be moved on top of tracepoints instead of
being called directly.  After this patch, if both perf and tracepoints
are enabled, the four sched perf macros become no-ops and the backend
functions are defined static and registered as tracepoint probes on
demand.

The enable part is relatively simple.  Perf functions are registered
as tp probes.  sched_in is registered last so that contexts don't
get scheduled in before all the other functions are active.
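
The ordering constraint can be illustrated with a small userspace
model.  This is only a sketch, not the code in the patch: the probe
identifiers, the probe_active[] array and enable_sched_probes() are
hypothetical names, and PROBE_OTHER is a placeholder for the fourth
probe, which this description does not name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the four sched perf probes. */
enum probe_id {
	PROBE_SCHED_OUT,
	PROBE_TICK,
	PROBE_OTHER,		/* placeholder, not named in the description */
	PROBE_SCHED_IN,
	NR_PROBES
};

/* Model state: which probes are currently registered. */
static bool probe_active[NR_PROBES];

/* In this model, contexts can only be scheduled in via the sched_in probe. */
static bool contexts_can_schedule_in(void)
{
	return probe_active[PROBE_SCHED_IN];
}

/* True when every probe except sched_in is registered. */
static bool all_other_probes_active(void)
{
	for (int i = 0; i < NR_PROBES; i++)
		if (i != PROBE_SCHED_IN && !probe_active[i])
			return false;
	return true;
}

/* Register the probes with sched_in last. */
static void enable_sched_probes(void)
{
	static const enum probe_id order[NR_PROBES] = {
		PROBE_SCHED_OUT, PROBE_TICK, PROBE_OTHER, PROBE_SCHED_IN,
	};

	for (size_t i = 0; i < NR_PROBES; i++) {
		probe_active[order[i]] = true;
		/* Invariant: sched_in is never active before the rest. */
		assert(!contexts_can_schedule_in() ||
		       all_other_probes_active());
	}
}
```

Registering sched_in at any earlier position in order[] would trip
the assertion in this model, which is the situation the ordering is
meant to avoid.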

Disable is a bit more involved.  First, all probes other than
sched_out are unregistered and drained, and online cpus are recorded
in a cpumask.  With zero nr_events, sched_out always switches out task
context and records that there's no task context for the cpu.  A
periodic timer is set up to watch the cpumask and when it sees that
all cpus have switched out their contexts, the sched_out probe is
unregistered.

The timer trick is necessary because unregistering a probe requires
thread context while neither workqueue nor tasklet can be directly
used from sched_out which is called under rq lock.
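
The disable sequence can be sketched as a userspace model.  This is
an illustration under stated assumptions rather than the code in the
patch: struct sched_perf_model and all function names are
hypothetical, the cpumask is modeled as a plain 32-bit bitmap, and
the step from the timer to the actual unregistration is reduced to
clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* model size, chosen arbitrarily */

/* Hypothetical model of the state involved in disabling. */
struct sched_perf_model {
	int nr_events;			/* zero while disabling */
	uint32_t pending_cpus;		/* bit n set: cpu n not switched out */
	bool has_task_ctx[NR_CPUS];	/* per-cpu "task context present" */
	bool sched_out_registered;
};

/*
 * Start of disable.  The other probes are assumed to have been
 * unregistered and drained already; record the online cpus.
 */
static void begin_disable(struct sched_perf_model *m, uint32_t online_mask)
{
	m->nr_events = 0;
	m->pending_cpus = online_mask;
}

/*
 * sched_out probe.  The real probe runs under rq lock, so the model
 * only flips state here and never tries to unregister anything.
 */
static void sched_out_probe(struct sched_perf_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (m->nr_events == 0) {
		m->has_task_ctx[cpu] = false;
		m->pending_cpus &= ~(UINT32_C(1) << cpu);
	}
}

/*
 * Periodic timer callback.  Returns true if the timer should be
 * rearmed, false once every recorded cpu has switched out and the
 * sched_out probe has been marked unregistered.
 */
static bool disable_timer_tick(struct sched_perf_model *m)
{
	if (m->pending_cpus != 0)
		return true;
	m->sched_out_registered = false;
	return false;
}
```

The point of the split is that sched_out_probe() only clears a bit,
which is safe under rq lock, while the decision to unregister is
deferred to the periodic check.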

This results in reduced overhead when both tracepoints and perf are
enabled and opens up possibilities for further optimization.  Although
the sched functions are the frequently called ones, other perf
functions can also be converted to use TPs in a similar manner.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 include/linux/perf_event.h |    2 +-
 kernel/perf_event.c        |  152 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 152 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0ad898b..66f2cba 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -745,7 +745,7 @@ struct perf_output_handle {
 ...