Hello Andrew,
On Thu, 2010-09-23 at 13:11 -0700, Andrew Morton wrote:
Or it can be used as an alternative. Since procfs has its drawbacks
(e.g. performance), an alternative could be helpful.
And the taskstats interface with the TASKSTATS_CMD_ATTR_PID command
already exists and can be used. So we already have a second mechanism,
besides procfs, to query task accounting information.
I agree.
I already thought about that problem. Another problem is that, depending
on the kernel config options (e.g. CONFIG_TASK_DELAY_ACCT or
CONFIG_TASK_XACCT), some taskstats fields may not be initialized.
Currently there is no good interface that lets userspace query which
fields are valid.
Regarding the taskstats versions, I described a possible solution in the
README.libtaskstats file of the userspace tarball:
The "struct taskstats" structure contains accounting information for one
Linux task. This structure is defined in "/usr/include/linux/taskstats.h".
With new kernel versions new fields can be added to that structure.
In that case the kernel taskstats version number defined with the macro
TASKSTATS_VERSION will be increased.
The taskstats library distinguishes between two taskstats versions:
* Kernel taskstats version (KV)
* Program compile taskstats version (CV)
Depending on the taskstats version CV that is used for compiling the program,
these version numbers can be different:
* KV > CV:
  The libtaskstats library only copies the CV taskstats fields and the fields
  that belong to version > CV will be ignored.
* KV < CV:
  The libtaskstats library only copies the version KV fields and the fields
  that belong to version > KV remain uninitialized.
If a program wants to support multiple taskstats versions, it can call the
ts_version() function and process the fields according to the version number
that is returned.
Example:
	if (ts_version() < 7) {
		fprintf(stderr, "Error: kernel taskstats version too low\n");
		exit(1);
	}
	if (ts_version() >= 7)
		print_attrs_v7();
	if (ts_version() >= 8)
		print_attrs_v8();
In this example the program has to be compiled with a taskstats.h header file
that has at least version 8.
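
The KV/CV copy rule described above can be sketched as follows. This is
only an illustration, not the libtaskstats code: ts_copy_common() is a
hypothetical helper, and it assumes that new fields are only ever
appended to the end of "struct taskstats", so that a newer version is a
strict extension of an older one and the common part is a byte prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libtaskstats): copy only the leading
 * bytes that the kernel layout (src, KV) and the compiled layout
 * (dst, CV) have in common.
 *
 * KV > CV: src is larger, the trailing kernel fields are ignored.
 * KV < CV: dst is larger, the trailing dst fields are left untouched.
 *
 * Returns the number of bytes copied.
 */
static size_t ts_copy_common(void *dst, size_t dst_size,
			     const void *src, size_t src_size)
{
	size_t n = src_size < dst_size ? src_size : dst_size;

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, n);
	return n;
}
```

A caller would pass the size reported for the kernel structure as
src_size and sizeof(struct taskstats) from its own header as dst_size.
For KV < CV the bytes after the returned length are not written, which
matches the "remain uninitialized" case above.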
Sure, but if we could add the /proc/taskstats approach, this dependency
would not be there.
The system is a virtual machine and has three CPUs.
The update period is two seconds.
When I run that testcase on my laptop, 2 CPUs (Intel Core 2 - 2.33GHz),
I get about 1-2% system time for top.
The current idea is the following:
1. Open /proc/taskstats
2. Set the requested command (e.g. TASKSTATS_CMD_ATTR_PIDS) using
an ioctl. For the TASKSTATS_CMD_ATTR_PIDS ioctl the following
structure is sent:
	struct taskstats_cmd_pids {
		__u64	time_ns;
		__u32	pid;
		__u32	cnt;
	};
3. After the command has been set, a read() executes it and returns
the result in the user's read buffer.
We could replace step 2 with a write() that transfers the command.
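
From userspace, the three steps could look roughly like the sketch
below. It is only meant to show the open/ioctl/read sequence:
/proc/taskstats is a proposal, the ioctl request number
TS_IOC_ATTR_PIDS is made up for this example, and ts_query_pids() is a
hypothetical helper. The structure is redeclared with <stdint.h> types
so the example builds without the patched kernel header, and the caller
is expected to fill in time_ns, pid and cnt.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Same layout as the proposed structure, with userspace types. */
struct taskstats_cmd_pids {
	uint64_t time_ns;
	uint32_t pid;
	uint32_t cnt;
};

/* Assumption: a made-up request number, not taken from any patch. */
#define TS_IOC_ATTR_PIDS _IOW('T', 0x40, struct taskstats_cmd_pids)

/*
 * Hypothetical helper: open the file, set the command with an ioctl,
 * then read the result into buf. Returns the number of bytes read,
 * or -1 if any of the three steps fails.
 */
static ssize_t ts_query_pids(const char *path,
			     const struct taskstats_cmd_pids *cmd,
			     void *buf, size_t len)
{
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);			/* step 1 */
	if (fd < 0)
		return -1;
	if (ioctl(fd, TS_IOC_ATTR_PIDS, cmd) < 0) {	/* step 2 */
		close(fd);
		return -1;
	}
	n = read(fd, buf, len);				/* step 3 */
	close(fd);
	return n;
}
```

With the write() variant, step 2 would become a write() of the same
structure and the file would have to be opened with O_RDWR instead.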
Yes
Patch 04/10 updates the taskstats version number from 7 to 8.
I didn't want to update the version number with each patch.
Yes, at least a proposal for that.
To be honest, I have not tested that. I assumed that the current
taskstats code does this correctly. E.g. it uses find_task_by_vpid() for
TASKSTATS_CMD_ATTR_PID and this function uses
"current->nsproxy->pid_ns". So I would assume that we get only tasks
from the caller's namespace. The new TASKSTATS_CMD_ATTR_PIDS command
also uses only functions that work with "current->nsproxy->pid_ns".
Good question. I probably have to learn a bit more about the PID
namespace implementation. Are PIDs unique across all namespaces?
Michael
--