[PATCH -tip 5/7] perf probe: Query basic types from debuginfo

Previous thread: [PATCH] mfd: Add WM8994 interrupt controller support by Mark Brown on Monday, March 29, 2010 - 1:28 pm. (6 messages)

Next thread: [PATCH tip/core/urgent] rcu: protect fork-time cgroup access by Paul E. McKenney on Monday, March 29, 2010 - 2:15 pm. (13 messages)
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:37 pm

Hi Ingo,

Here are several updates of perf-probe. This series improves
data structure accessing.

- Set the name of argument which traces a data structure member
  as the last member of the data structure reference (e.g. f_mode
  of file->f_mode). This allows us to use perf-trace for tracin
  data-structure members.

- Add the basic type support. This allows us to fetch the memory
  with specified bitwidth. Usually, data-structure members are
  packed on the memory, this means if we want to read a member
  from memory, we have to access it with type casting.
  kprobe-tracer now support tracing argument with basic types
  (u8,u16,u32,u64,s8,s16,s32,s64), and perf-probe decodes the type
  information of the members.

- Support canonical frame address, which is used for refering
  frame-base on the kernel built without CONFIG_FRAME_POINTER.

TODOs (possible features):
  - Support array element (var[N])
  - Support string/dynamic arrays (*var, var[N..M])
  - Support tracing static variables (non global)
  - Support dynamic array-indexing (var[var2])
  - Support force type-casting ((type)var)
  - Show what deta-structure member is assigned to each argument.
  - Better support for probes on modules
  - More debugger like enhancements(%next, --disasm, etc.)

I decided to drop non-root user support (for --add/--del),
because this can set probes for getting sensitive data...

Thank you,

---

Masami Hiramatsu (7):
      perf probe: Support DW_OP_call_frame_cfa in debuginfo
      perf probe: Support basic type casting
      perf probe: Query basic types from debuginfo
      trace/kprobes: Support basic types
      perf probe: Use the last field name as the argument name
      perf probe: Support argument name
      [BUGFIX] perf probe: Fix to close dwarf when failing to analyze it


 Documentation/trace/kprobetrace.txt     |    4 
 kernel/trace/trace.h                    |   16 -
 kernel/trace/trace_kprobe.c             |  535 +++++++++++++++++++------------
 ...
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:37 pm

Set given names to event arguments. The syntax is same as kprobe-tracer,
you can add 'NAME=' right before each argument.

e.g.
  ./perf probe vfs_read foo=file

 then, 'foo' is set to the argument name as below.

  ./perf probe -l
  probe:vfs_read       (on vfs_read@linux-2.6-tip/fs/read_write.c with foo)


Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/Documentation/perf-probe.txt |   10 ++++++++-
 tools/perf/builtin-probe.c              |    4 ++--
 tools/perf/util/probe-event.c           |   34 ++++++++++++++++++++++---------
 tools/perf/util/probe-event.h           |    1 +
 tools/perf/util/probe-finder.c          |   27 +++++++++++++++----------
 5 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index 0f944b3..0657cf4 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -79,7 +79,15 @@ Probe points are defined by following syntax.
 'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function. Currently, event group name is set as 'probe'.
 'FUNC' specifies a probed function name, and it may have one of the following options; '+OFFS' is the offset from function entry address in bytes, ':RLN' is the relative-line number from function entry line, and '%return' means that it probes function return. And ';PTN' means lazy matching pattern (see LAZY MATCHING). Note that ';PTN' must be the end of the probe point definition.  In addition, '@SRC' specifies a source file which has that function.
 It is also possible to specify a probe point by the source line number or lazy matching by using 'SRC:ALN' or 'SRC;PTN' syntax, where 'SRC' is the ...
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:37 pm

Fix to close libdw routine when failing to analyze it in
find_perf_probe_point().

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/util/probe-finder.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index db52ec2..5f6cecc 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -812,8 +812,10 @@ int find_perf_probe_point(int fd, unsigned long addr,
 		return -ENOENT;
 
 	/* Find cu die */
-	if (!dwarf_addrdie(dbg, (Dwarf_Addr)addr, &cudie))
-		return -EINVAL;
+	if (!dwarf_addrdie(dbg, (Dwarf_Addr)addr, &cudie)) {
+		ret = -EINVAL;
+		goto end;
+	}
 
 	/* Find a corresponding line */
 	line = dwarf_getsrc_die(&cudie, (Dwarf_Addr)addr);


-- 
Masami Hiramatsu
e-mail: mhiramat@redhat.com
--

From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:38 pm

Set the last field name to the argument name when the argument
is refering a data-structure member.

e.g.
 ./perf probe --add 'vfs_read file->f_mode'
 Add new event:
   probe:vfs_read       (on vfs_read with f_mode=file->f_mode)

 This probe records file->f_mode, but the argument name becomes "f_mode".

This enables perf-trace command to parse trace event format correctly.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/Documentation/perf-probe.txt |    2 +-
 tools/perf/util/probe-event.c           |    4 ++++
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index 0657cf4..d781fc5 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -87,7 +87,7 @@ Each probe argument follows below syntax.
 
  [NAME=]LOCALVAR|$retval|%REG|@SYMBOL
 
-'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), or kprobe-tracer argument format (e.g. $retval, %ax, etc).
+'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
 
 LINE SYNTAX
 -----------
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index ab6f53d..19de8b7 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -481,6 +481,10 @@ static void parse_perf_probe_arg(const ...
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:38 pm

Query the basic type information (byte-size and signed-flag) from
debuginfo and pass that to kprobe-tracer. This is especially useful
for tracing the members of data structure, because each member has
different byte-size on the memory.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/util/probe-event.c  |    9 +++++
 tools/perf/util/probe-event.h  |    1 +
 tools/perf/util/probe-finder.c |   78 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 80 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 19de8b7..05ca4a9 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -740,6 +740,13 @@ static int synthesize_kprobe_trace_arg(struct kprobe_trace_arg *arg,
 		buf += ret;
 		buflen -= ret;
 	}
+	/* Print argument type */
+	if (arg->type) {
+		ret = e_snprintf(buf, buflen, ":%s", arg->type);
+		if (ret <= 0)
+			return ret;
+		buf += ret;
+	}
 
 	return buf - tmp;
 }
@@ -848,6 +855,8 @@ void clear_kprobe_trace_event(struct kprobe_trace_event *tev)
 			free(tev->args[i].name);
 		if (tev->args[i].value)
 			free(tev->args[i].value);
+		if (tev->args[i].type)
+			free(tev->args[i].type);
 		ref = tev->args[i].ref;
 		while (ref) {
 			next = ref->next;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index a73ede6..0759db6 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -23,6 +23,7 @@ struct kprobe_trace_arg_ref {
 struct kprobe_trace_arg {
 	char				*name;	/* Argument name */
 	char				*value;	/* Base value */
+	char				*type;	/* Type name */
 	struct kprobe_trace_arg_ref	*ref;	/* Referencing offset */
 };
 
diff --git ...
From: Arnaldo Carvalho de Melo
Date: Monday, March 29, 2010 - 1:51 pm

Question, should we use the equivalent to panic'ing the kernel in the
userspace bits in tools/?

I tend to see all code there as potentially part of a library, i.e. to
be callable by some unantecipated new tool or library that would rather
receive a return value telling it that the operation can't be performed
for some reason so that it can inform the user instead of having the
whole tool exit to the command line.

It may well be that some specific operation needs lots of resources but
many other don't, panic'c because the one that requires lots of
resources can't be performed, bringing down a gui/tui is really nasty.

Perhaps the answer will be the same as for when people added panic calls
in the kernel in the past, "you're screwed up anyway if that happens",
but it just feels wrong :-\

In the DWARF bits it has even the added twist namespace collapse with
DIE 8-)

- Arnaldo
--

From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 2:22 pm

OK, so that you want to see 

ret = library_func();
if (ret < 0)	/* Something wrong happened */
	return ret;

instead of 

library_func();

Hmm, agreed. But I'd like to have some API for storing dying^H^H^H^H^Herror
message. (maybe can we use setjump/longjump approach - like try/cache - for die()?)


-- 
Masami Hiramatsu
e-mail: mhiramat@redhat.com
--

From: Arnaldo Carvalho de Melo
Date: Monday, March 29, 2010 - 2:50 pm

Can't we do just as we do in the kernel and propagate the error back to
callers? All the way to userspace, that in this case would be the user
instead of an app started by a user?  :-)

Wrt api for storing messages, we have pr_{warning,error,debug}, etc, in
the TUI mode it is even redirected to the bottom line and I plan to have
them get into something browseable if the user wants to see the last pr_
messages.

IOW, just act like you're writing kernel code, that will make it more
likely that people that are used to writing code like that will feel at
ease while hacking tools/.

- Arnaldo
--

From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 3:05 pm

Ah, that's very nice:)
OK, I'll try to use it and remove all 'die()'s.


-- 
Masami Hiramatsu
e-mail: mhiramat@redhat.com
--

From: Arnaldo Carvalho de Melo
Date: Monday, March 29, 2010 - 3:23 pm

Thanks a lot!

- Arnaldo
--

From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:38 pm

Support basic types of integer (u8, u16, u32, u64, s8, s16, s32, s64) in
kprobe tracer. With this patch, users can specify above basic types on
each arguments after ':'. If omitted, the argument type is set as
unsigned long (u32 or u64, arch-dependent).

 e.g.
  echo 'p account_system_time+0 hardirq_offset=%si:s32' > kprobe_events

  adds a probe recording hardirq_offset in signed-32bits value on the
  entry of account_system_time.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 Documentation/trace/kprobetrace.txt |    4 
 kernel/trace/trace.h                |   16 -
 kernel/trace/trace_kprobe.c         |  535 ++++++++++++++++++++++-------------
 3 files changed, 334 insertions(+), 221 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index a9100b2..ec94748 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -40,7 +40,9 @@ Synopsis of kprobe_events
   $stack	: Fetch stack address.
   $retval	: Fetch return value.(*)
   +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
-  NAME=FETCHARG: Set NAME as the argument name of FETCHARG.
+  NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
+  FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
+		  (u8/u16/u32/u64/s8/s16/s32/s64) are supported.
 
   (*) only for return probe.
   (**) this is useful for fetching a field of data structures.
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index bec2c97..3ebdb6b 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -102,29 +102,17 @@ struct syscall_trace_exit {
 	long			ret;
 };
 
-struct kprobe_trace_entry {
+struct ...
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:38 pm

Add basic type casting for arguments to perf probe. This allows
users to specify the actual type of arguments. Of course, if
user sets invalid types, kprobe-tracer rejects that.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/Documentation/perf-probe.txt |    3 ++-
 tools/perf/util/probe-event.c           |   23 ++++++++++++++++++++++-
 tools/perf/util/probe-event.h           |    1 +
 tools/perf/util/probe-finder.c          |   10 ++++++++--
 4 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index d781fc5..21820c3 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -85,9 +85,10 @@ PROBE ARGUMENT
 --------------
 Each probe argument follows below syntax.
 
- [NAME=]LOCALVAR|$retval|%REG|@SYMBOL
+ [NAME=]LOCALVAR|$retval|%REG|@SYMBOL[:TYPE]
 
 'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
+'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo.
 
 LINE SYNTAX
 -----------
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 05ca4a9..bef2805 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -435,7 +435,7 @@ static void parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 }
 
 /* Parse perf-probe event argument ...
From: Masami Hiramatsu
Date: Monday, March 29, 2010 - 1:38 pm

When building kernel without CONFIG_FRAME_POINTER, gcc uses
CFA (canonical frame address) for frame base. With this patch,
perf probe just gets CFI (call frame information) from debuginfo
and search corresponding CFA from the CFI. IOW, this allows
perf probe works correctly on the kernel without CONFIG_FRAME_POINTER.

<Before>
 ./perf probe -fn sched_slice:12 lw.weight
  Fatal: DW_OP 156 is not supported.
              (^^^ DW_OP_call_frame_cfa)

<After>
./perf probe -fn sched_slice:12 lw.weight
Add new event:
  probe:sched_slice    (on sched_slice:12 with weight=lw.weight)

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
---

 tools/perf/util/probe-finder.c |   14 +++++++++++---
 tools/perf/util/probe-finder.h |    1 +
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 11d6592..26f02b9 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -396,7 +396,6 @@ static void convert_location(Dwarf_Op *op, struct probe_finder *pf)
 	const char *regs;
 	struct kprobe_trace_arg *tvar = pf->tvar;
 
-	/* TODO: support CFA */
 	/* If this is based on frame buffer, set the offset */
 	if (op->atom == DW_OP_fbreg) {
 		if (pf->fb_ops == NULL)
@@ -623,11 +622,17 @@ static void convert_probe_point(Dwarf_Die *sp_die, struct probe_finder *pf)
 	/* Get the frame base attribute/ops */
 	dwarf_attr(sp_die, DW_AT_frame_base, &fb_attr);
 	ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
-	if (ret <= 0 || nops == 0)
+	if (ret <= 0 || nops == 0) {
 		pf->fb_ops = NULL;
+	} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
+		   pf->cfi != NULL) {
+		Dwarf_Frame *frame;
+		ret = ...
Previous thread: [PATCH] mfd: Add WM8994 interrupt controller support by Mark Brown on Monday, March 29, 2010 - 1:28 pm. (6 messages)

Next thread: [PATCH tip/core/urgent] rcu: protect fork-time cgroup access by Paul E. McKenney on Monday, March 29, 2010 - 2:15 pm. (13 messages)