[PATCH 3/3] tools, perf: Documentation for the power events API

Previous thread: [PATCH] tracing, perf : add cpu hotplug trace events by Vincent Guittot on Tuesday, January 4, 2011 - 2:50 am. (1 message)

Next thread: [RFC 1/2] drivers:misc:ti-st: change protocol parse logic by pavan_savoy on Tuesday, January 4, 2011 - 3:59 am. (2 messages)
From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:17 am

From: Jean Pihet <j-pihet@ti.com>

Provides:
. calls to machine_suspend trace point,
. OMAP support,
. API Documentation

Applies on top of Thomas's 8 latest power trace API patches, cf.
http://marc.info/?l=linux-kernel&m=129130827309354&w=2

Jean Pihet (3):
  perf: add calls to suspend trace point
  perf: add OMAP support for the new power events
  tools, perf: Documentation for the power events API

 Documentation/trace/events-power.txt |   90 ++++++++++++++++++++++++++++++++++
 arch/arm/mach-omap2/pm34xx.c         |    7 +++
 arch/arm/mach-omap2/powerdomain.c    |    3 +
 arch/arm/plat-omap/clock.c           |   13 ++++-
 kernel/power/suspend.c               |    3 +
 5 files changed, 113 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/trace/events-power.txt

-- 
1.7.2.3

--

From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:17 am

From: Jean Pihet <j-pihet@ti.com>

Uses the machine_suspend trace point, called from the
generic kernel suspend_enter function.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
CC: Thomas Renninger <trenn@suse.de>
---
 kernel/power/suspend.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index ecf7705..0650596 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
+#include <trace/events/power.h>
 
 #include "power.h"
 
@@ -164,7 +165,9 @@ static int suspend_enter(suspend_state_t state)
 	error = sysdev_suspend(PMSG_SUSPEND);
 	if (!error) {
 		if (!suspend_test(TEST_CORE) && pm_check_wakeup_events()) {
+			trace_machine_suspend(state);
 			error = suspend_ops->enter(state);
+			trace_machine_suspend(PWR_EVENT_EXIT);
 			events_check_enabled = false;
 		}
 		sysdev_resume();
-- 
1.7.2.3

--

From: Ingo Molnar
Date: Tuesday, January 4, 2011 - 3:39 am

Please use scripts/get_maintainer.pl to construct a proper Cc: list and to
gather the necessary Acked-bys:

  scripts/get_maintainer.pl -f kernel/power/suspend.c

Thanks,

	Ingo
--

From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:17 am

From: Jean Pihet <j-pihet@ti.com>

The patch adds the new power management trace points for
the OMAP architecture.

The trace points are for:
- default idle handler. Since the cpuidle framework is
  instrumented in the generic way there is no need to
  add trace points in the OMAP specific cpuidle handler;
- cpufreq (DVFS),
- clock changes (enable, disable, set_rate),
- changes to the power domains' next power states.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
 arch/arm/mach-omap2/pm34xx.c      |    7 +++++++
 arch/arm/mach-omap2/powerdomain.c |    3 +++
 arch/arm/plat-omap/clock.c        |   13 ++++++++++---
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 0ec8a04..0ee0b0e 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -29,6 +29,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/console.h>
+#include <trace/events/power.h>
 
 #include <plat/sram.h>
 #include <plat/clockdomain.h>
@@ -506,8 +507,14 @@ static void omap3_pm_idle(void)
 	if (omap_irq_pending() || need_resched())
 		goto out;
 
+	trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+	trace_cpu_idle(1, smp_processor_id());
+
 	omap_sram_idle();
 
+	trace_power_end(smp_processor_id());
+	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+
 out:
 	local_fiq_enable();
 	local_irq_enable();
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 6527ec3..73cbe9a 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -23,6 +23,7 @@
 #include <linux/errno.h>
 #include <linux/err.h>
 #include <linux/io.h>
+#include <trace/events/power.h>
 
 #include <asm/atomic.h>
 
@@ -440,6 +441,8 @@ int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
 	pr_debug("powerdomain: setting next powerstate for %s to %0x\n",
 		 pwrdm->name, pwrst);
 
+	trace_power_domain_target(pwrdm->name, pwrst, ...
From: Ingo Molnar
Date: Tuesday, January 4, 2011 - 3:42 am

I suspect the gents and mailing lists listed by:

  scripts/get_maintainer.pl -f arch/arm/plat-omap/clock.c
  scripts/get_maintainer.pl -f arch/arm/mach-omap2/pm34xx.c

would want to be Cc:-ed as well. That will also get you the right Acked-bys (if you
want these commits to go upstream via the perf tree).

Thanks,

	Ingo
--

From: Pihet-XID, Jean
Date: Tuesday, January 4, 2011 - 3:58 am

Yes the idea is to get those upstream via the tip tree, since it now

Thanks,
Jean
--

From: Nishanth Menon
Date: Tuesday, January 4, 2011 - 11:03 am

jean.pihet@newoldbits.com had written, on 01/04/2011 04:17 AM, the 
following:

Dumb question: this just tells me which C-state was attempted, not whether
we actually succeeded in hitting it, right? Doesn't this give us false data?

(From an offline discussion on a related topic:) Would it also be nice
to hook into the mach-omap2/clock.c points as well, to catch indirect changes?
[..]

-- 
Regards,
Nishanth Menon
--

From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:17 am

From: Jean Pihet <j-pihet@ti.com>

Provides documentation for the following:
- the new power trace API,
- the old (legacy) power trace API,
- the DEPRECATED Kconfig option usage.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
 Documentation/trace/events-power.txt |   90 ++++++++++++++++++++++++++++++++++
 1 files changed, 90 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/trace/events-power.txt

diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt
new file mode 100644
index 0000000..8a50653
--- /dev/null
+++ b/Documentation/trace/events-power.txt
@@ -0,0 +1,90 @@
+
+			Subsystem Trace Points: power
+
+The power tracing system captures events related to power transitions
+within the kernel. Broadly speaking there are three major subheadings:
+
+  o Power state switch which reports events related to suspend (S-states),
+     cpuidle (C-states) and cpufreq (P-states)
+  o System clock related changes
+  o Power domains related changes and transitions
+
+This document describes what each of the tracepoints is and why they
+might be useful.
+
+Cf. include/trace/events/power.h for the events definitions.
+
+1. Power state switch events
+============================
+
+1.1 New trace API
+-----------------
+
+A 'cpu' event class gathers the CPU-related events: cpuidle and
+cpufreq.
+
+cpu_idle		"state=%lu cpu_id=%lu"
+cpu_frequency		"state=%lu cpu_id=%lu"
+
+A suspend event is used to indicate the system going in and out of the
+suspend mode:
+
+machine_suspend		"state=%lu"
+
+
+Note: the value of '-1' or '4294967295' for state means an exit from the current state,
+i.e. trace_cpu_idle(4, smp_processor_id()) means that the system
+enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id())
+means that the system exits the previous idle state.
+
+The event which has 'state=4294967295' in the trace is very important to the user
+space tools which are using it to detect the end of the ...
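
The note in the documentation above says that an exit shows up as
'state=4294967295', which is -1 read as an unsigned 32-bit value. A minimal
sketch of how a user-space tool might detect that marker, assuming the payload
follows the "state=%lu cpu_id=%lu" format quoted above and that the state field
is 32 bits wide. The struct, macro and function names are hypothetical and are
not taken from any existing tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the state field is 32 bits wide, so an exit event is
 * printed as (uint32_t)-1, i.e. 4294967295. */
#define PWR_EVENT_EXIT_U32 ((uint32_t)-1)

/* Hypothetical container for one parsed cpu_idle/cpu_frequency event. */
struct cpu_event {
	unsigned long state;
	unsigned long cpu_id;
};

/* Parse an event payload of the form "state=%lu cpu_id=%lu".
 * Returns 0 on success, -1 if the text does not match the format. */
static int parse_cpu_event(const char *payload, struct cpu_event *ev)
{
	assert(payload != NULL && ev != NULL);
	if (sscanf(payload, "state=%lu cpu_id=%lu",
		   &ev->state, &ev->cpu_id) != 2)
		return -1;
	return 0;
}

/* Non-zero when the event marks the exit from the previous state. */
static int is_exit_event(const struct cpu_event *ev)
{
	return ev->state == PWR_EVENT_EXIT_U32;
}
```

A tool would call parse_cpu_event() on each payload and treat any event for
which is_exit_event() returns non-zero as the end of the state opened by the
previous event on the same CPU.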
From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:48 am

From: Jean Pihet <j-pihet@ti.com>

Uses the machine_suspend trace point, called from the
generic kernel suspend_enter function.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
CC: Thomas Renninger <trenn@suse.de>
---
 kernel/power/suspend.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index ecf7705..0650596 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
+#include <trace/events/power.h>
 
 #include "power.h"
 
@@ -164,7 +165,9 @@ static int suspend_enter(suspend_state_t state)
 	error = sysdev_suspend(PMSG_SUSPEND);
 	if (!error) {
 		if (!suspend_test(TEST_CORE) && pm_check_wakeup_events()) {
+			trace_machine_suspend(state);
 			error = suspend_ops->enter(state);
+			trace_machine_suspend(PWR_EVENT_EXIT);
 			events_check_enabled = false;
 		}
 		sysdev_resume();
-- 
1.7.2.3

--

From: Pavel Machek
Date: Tuesday, January 4, 2011 - 4:29 am

Ok... why this place? I mean, perhaps suspend time should include
device suspend?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Jean Pihet
Date: Tuesday, January 4, 2011 - 8:00 am

Hi,

This trace has been placed here because it traces the machine low
That makes sense. We have a few options here:
1) keep the traces as proposed to trace the low level machine code only,
2) move the traces to the entry and exit of suspend_enter so that it
includes the prepare and late_prepare (+ the associated wake-up)
callbacks as well,
3) move the traces to suspend_devices_and_enter so that it includes 2)
and the handling of the console and the devices,
4) move the traces to enter_state so that it includes 3), the call to
sys_sync and the user space freeze.

Note that the SNAPSHOT_S2RAM ioctl code also calls
suspend_devices_and_enter, so if only 4) is used no trace will be
generated in that case.

I am in favor of 3) or 4).
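
The four placements listed above differ only in how much of the suspend path
falls between the entry and exit trace events. The following is a rough
user-space model of that nesting, not kernel code: the function names mirror
the ones mentioned in this message, but the bodies are stand-in steps, and the
selector enum, counters and run() helper are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical selector, numbered like options 1) to 4) above. */
enum trace_scope {
	SCOPE_MACHINE = 1,       /* 1) around suspend_ops->enter only */
	SCOPE_SUSPEND_ENTER,     /* 2) entry/exit of suspend_enter */
	SCOPE_DEVICES_AND_ENTER, /* 3) suspend_devices_and_enter */
	SCOPE_ENTER_STATE        /* 4) enter_state */
};

static enum trace_scope scope; /* placement under test */
static int tracing;            /* non-zero between entry and exit events */
static int entry_events;       /* entry events emitted during this run */
static int traced_steps;       /* steps that ran inside the traced window */

static void trace_entry(enum trace_scope s)
{
	if (s == scope) {
		tracing = 1;
		entry_events++;
	}
}

static void trace_exit(enum trace_scope s)
{
	if (s == scope)
		tracing = 0;
}

/* One unit of work in the simplified suspend path. */
static void step(const char *what)
{
	if (tracing) {
		traced_steps++;
		printf("  traced: %s\n", what);
	}
}

static void suspend_enter(void)
{
	trace_entry(SCOPE_SUSPEND_ENTER);
	step("prepare callbacks");
	trace_entry(SCOPE_MACHINE);
	step("suspend_ops->enter");
	trace_exit(SCOPE_MACHINE);
	step("wake-up callbacks");
	trace_exit(SCOPE_SUSPEND_ENTER);
}

static void suspend_devices_and_enter(void)
{
	trace_entry(SCOPE_DEVICES_AND_ENTER);
	step("suspend console and devices");
	suspend_enter();
	step("resume devices and console");
	trace_exit(SCOPE_DEVICES_AND_ENTER);
}

static void enter_state(void)
{
	trace_entry(SCOPE_ENTER_STATE);
	step("sys_sync");
	step("freeze user space");
	suspend_devices_and_enter();
	trace_exit(SCOPE_ENTER_STATE);
}

/* Stand-in for the snapshot ioctl path, which bypasses enter_state(). */
static void snapshot_ioctl_path(void)
{
	suspend_devices_and_enter();
}

/* Run one path with one placement; returns the number of traced steps. */
static int run(enum trace_scope s, void (*path)(void))
{
	assert(path != NULL);
	scope = s;
	tracing = 0;
	entry_events = 0;
	traced_steps = 0;
	path();
	return traced_steps;
}
```

In this model, run(SCOPE_ENTER_STATE, snapshot_ioctl_path) emits no entry event
at all, which is the gap described in the note above, while
run(SCOPE_DEVICES_AND_ENTER, ...) covers the same steps on both paths.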

Thanks,
Jean
--

From: Rafael J. Wysocki
Date: Tuesday, January 4, 2011 - 4:01 pm

Why don't we keep the tracepoints as proposed _and_ add two additional
tracepoints around device suspend-resume?

Rafael
--

From: jean.pihet
Date: Tuesday, January 4, 2011 - 3:54 am

From: Jean Pihet <j-pihet@ti.com>

The patch adds the new power management trace points for
the OMAP architecture.

The trace points are for:
- default idle handler. Since the cpuidle framework is
  instrumented in the generic way there is no need to
  add trace points in the OMAP specific cpuidle handler;
- cpufreq (DVFS),
- clock changes (enable, disable, set_rate),
- changes to the power domains' next power states.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
 arch/arm/mach-omap2/pm34xx.c      |    7 +++++++
 arch/arm/mach-omap2/powerdomain.c |    3 +++
 arch/arm/plat-omap/clock.c        |   13 ++++++++++---
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 0ec8a04..0ee0b0e 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -29,6 +29,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/console.h>
+#include <trace/events/power.h>
 
 #include <plat/sram.h>
 #include <plat/clockdomain.h>
@@ -506,8 +507,14 @@ static void omap3_pm_idle(void)
 	if (omap_irq_pending() || need_resched())
 		goto out;
 
+	trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+	trace_cpu_idle(1, smp_processor_id());
+
 	omap_sram_idle();
 
+	trace_power_end(smp_processor_id());
+	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+
 out:
 	local_fiq_enable();
 	local_irq_enable();
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 6527ec3..73cbe9a 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -23,6 +23,7 @@
 #include <linux/errno.h>
 #include <linux/err.h>
 #include <linux/io.h>
+#include <trace/events/power.h>
 
 #include <asm/atomic.h>
 
@@ -440,6 +441,8 @@ int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
 	pr_debug("powerdomain: setting next powerstate for %s to %0x\n",
 		 pwrdm->name, pwrst);
 
+	trace_power_domain_target(pwrdm->name, pwrst, ...
From: Paul Walmsley
Date: Tuesday, January 4, 2011 - 11:48 am

Hello Jean,


A question about these.  Are these only meant to track calls to these 
functions from outside the clock code?  Or meant to track actual hardware 
clock changes?  If the latter, then it might make sense to put these 
trace points into the functions that actually change the hardware 
registers, e.g., omap2_dflt_clk_{enable,disable}(), etc., since a 
clk_enable() on a leaf clock may result in many internal system clocks 
being enabled up the clock tree.
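
The difference between the two placements can be illustrated with a toy clock
tree. This is only a sketch under simplifying assumptions: the struct, the
use-count logic and both counters are hypothetical stand-ins, not the OMAP
clock code, and hw_clk_enable() merely represents "a function that actually
changes the hardware registers".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy clock node; only the fields needed for the illustration. */
struct clk {
	const char *name;
	struct clk *parent;
	int usecount;
};

static int api_events; /* events seen if tracing sits at the API entry point */
static int hw_events;  /* events seen if tracing sits at the hardware write */

/* Stand-in for the function that writes the enable bit in hardware. */
static void hw_clk_enable(struct clk *clk)
{
	hw_events++;
	printf("hw enable: %s\n", clk->name);
}

/* Enable parents first; touch hardware only on the 0 -> 1 transition. */
static void clk_enable_tree(struct clk *clk)
{
	if (clk->usecount++ > 0)
		return;
	if (clk->parent)
		clk_enable_tree(clk->parent);
	hw_clk_enable(clk);
}

/* API-level entry point, as called from outside the clock code. */
static void clk_enable(struct clk *clk)
{
	assert(clk != NULL);
	api_events++;
	clk_enable_tree(clk);
}
```

With a three-level tree (root, intermediate, leaf), a single clk_enable() on
the leaf counts as one API-level event but three hardware-level events.
Enabling a second leaf under the same intermediate clock adds one more of
each, because the parents are already running.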




- Paul
--
