[Resent with fixed netdev@ address] I think, tc should've no problem with displaying summary stats of multiqueue qdiscs or even all of them separately, as mentioned by Patrick. And, maybe I still miss something, but there should be As I wrote earlier, it's a nice work, but since it makes a new API I guess we shouldn't hurry with merging this patch, and publish it as an RFC first. Then my humble suggestions would be to reserve more space for most of the columns to make it readable not only for scripts when more TX#, bytes, packets etc. Users of non-default qdiscs would also miss things like: q->ops->id, q->handle, and q->parent at least. Plus, as I mentioned earlier, q->qstats.qlen update with q->q.qlen (or using it directly) is needed. --
I made a patch, but for a 8 queue device (bnx2), here is the "tc -s -d qdisc" result : $ tc -s -d qdisc show qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 51814 bytes 459 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 Same name "eth0" is displayed, that might confuse parsers... What naming convention should we choose for multiqueue devices ? --
From: Eric Dumazet <eric.dumazet@gmail.com> We could give an index field to multiple root qdiscs assigned to a device. --
Here is a patch then :)
Only point is that I am iterating from 0 to dev->real_num_tx_queues
instead of dev->num_tx_queues. I hope it's fine, because there are
allocated qdisc, but not really used.
Next patches to allow selective qdisc change/fetch (providing a TCA_QINDEX
selector value to kernel)
Thanks
[PATCH net-next-2.6] tc: report informations for multiqueue devices
qdisc and classes are not yet displayed by "tc -s -d {qdisc|class} show"
for multiqueue devices.
We use a new TCA_QINDEX attribute, to report queue index to user space.
iproute2 tc should be changed to eventually display this queue index as in :
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 52498 bytes 465 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth0 qindex 1 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/linux/rtnetlink.h | 1
net/sched/sch_api.c | 118 ++++++++++++++++++++----------------
2 files changed, 67 insertions(+), 52 deletions(-)
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index ba3254e..b80e0f6 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -490,6 +490,7 @@ enum
TCA_FCNT,
TCA_STATS2,
TCA_STAB,
+ TCA_QINDEX,
__TCA_MAX
};
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 24d17ce..74cde83 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -35,9 +35,9 @@
#include <net/pkt_sched.h>
static int qdisc_notify(struct sk_buff *oskb, struct nlmsghdr *n, u32 clid,
- struct Qdisc *old, struct Qdisc *new);
+ struct Qdisc *old, struct Qdisc *new, int qnum);
static int tclass_notify(struct sk_buff *oskb, struct nlmsghdr *n,
- struct Qdisc *q, unsigned long cl, int event);
+ ...Here is the iproute2 patch as well, to display queue indexes
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index ba3254e..b80e0f6 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -490,6 +490,7 @@ enum
TCA_FCNT,
TCA_STATS2,
TCA_STAB,
+ TCA_QINDEX,
__TCA_MAX
};
diff --git a/tc/tc_class.c b/tc/tc_class.c
index 9d4eea5..1bc4bc6 100644
--- a/tc/tc_class.c
+++ b/tc/tc_class.c
@@ -196,6 +196,13 @@ int print_class(const struct sockaddr_nl *who,
if (filter_ifindex == 0)
fprintf(fp, "dev %s ", ll_index_to_name(t->tcm_ifindex));
+ if (tb[TCA_QINDEX]) {
+ int qindex = 0;
+
+ memcpy(&qindex, RTA_DATA(tb[TCA_QINDEX]), MIN(RTA_PAYLOAD(tb[TCA_QINDEX]), sizeof(int)));
+ fprintf(fp, "qindex %d ", qindex);
+ }
+
if (t->tcm_parent == TC_H_ROOT)
fprintf(fp, "root ");
else {
diff --git a/tc/tc_qdisc.c b/tc/tc_qdisc.c
index c7f2988..3ec2cf4 100644
--- a/tc/tc_qdisc.c
+++ b/tc/tc_qdisc.c
@@ -232,6 +232,14 @@ int print_qdisc(const struct sockaddr_nl *who,
fprintf(fp, "qdisc %s %x: ", (char*)RTA_DATA(tb[TCA_KIND]), t->tcm_handle>>16);
if (filter_ifindex == 0)
fprintf(fp, "dev %s ", ll_index_to_name(t->tcm_ifindex));
+
+ if (tb[TCA_QINDEX]) {
+ int qindex = 0;
+
+ memcpy(&qindex, RTA_DATA(tb[TCA_QINDEX]), MIN(RTA_PAYLOAD(tb[TCA_QINDEX]), sizeof(int)));
+ fprintf(fp, "qindex %d ", qindex);
+ }
+
if (t->tcm_parent == TC_H_ROOT)
fprintf(fp, "root ");
else if (t->tcm_parent) {
--
On Wed, 02 Sep 2009 14:48:27 +0200 Even if the kernel keeps the value in something else for the API QINDEX needs to be a fixed size unsigned type like u32. --
Just to avoid duplicate work - I'm currently trying to put a patch together that presents multiqueue qdiscs to userspace as regular child qdiscs of a dummy qdisc, which also contains the aggregated statistics. So far it actually looks pretty sane :) --
Excellent ! I am cooking a patch to make vlan devices multiqueue aware (ie having same tx_queue numbers than their real_dev) --
vlan devices are currently not multi-queue capable.
We can do that with a new rtnl_link_ops method,
get_tx_queues(), called from rtnl_create_link()
This new method gets num_tx_queues/real_num_tx_queues
from real device.
register_vlan_device() is also handled.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/net/rtnetlink.h | 2 ++
net/8021q/vlan.c | 5 +++--
net/8021q/vlan_netlink.c | 20 ++++++++++++++++++++
net/core/rtnetlink.c | 10 +++++++++-
4 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 3c1895e..0525a1d 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -70,6 +70,8 @@ struct rtnl_link_ops {
size_t (*get_xstats_size)(const struct net_device *dev);
int (*fill_xstats)(struct sk_buff *skb,
const struct net_device *dev);
+ int (*get_tx_queues)(struct net*, struct nlattr *tb[],
+ int *tx_queues, int *real_tx_queues);
};
extern int __rtnl_link_register(struct rtnl_link_ops *ops);
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index e814794..8836575 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -330,12 +330,13 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
}
- new_dev = alloc_netdev(sizeof(struct vlan_dev_info), name,
- vlan_setup);
+ new_dev = alloc_netdev_mq(sizeof(struct vlan_dev_info), name,
+ vlan_setup, real_dev->num_tx_queues);
if (new_dev == NULL)
return -ENOBUFS;
+ new_dev->real_num_tx_queues = real_dev->real_num_tx_queues;
dev_net_set(new_dev, net);
/* need 4 bytes for extra VLAN header info,
* hope the underlying device can handle it.
diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index e9c91dc..8ce4122 100644
--- a/net/8021q/vlan_netlink.c
+++ b/net/8021q/vlan_netlink.c
@@ -100,6 +100,25 @@ static int vlan_changelink(struct net_device ...Looks great. This will also make it easier to test multiqueue scheduling without a multiqueue-capable device :) --
int (*get_tx_queues)(struct net *net, struct nlattr *tb[], Total nitpick, but tbird highlighted it because of the missing "net". -Brian --
Thats right, and I believe some gcc versions could complain
Thanks Brian
[PATCH net-next-2.6] vlan: multiqueue vlan device
vlan devices are currently not multi-queue capable.
We can do that with a new rtnl_link_ops method,
get_tx_queues(), called from rtnl_create_link()
This new method gets num_tx_queues/real_num_tx_queues
from real device.
register_vlan_device() is also handled.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/net/rtnetlink.h | 2 ++
net/8021q/vlan.c | 5 +++--
net/8021q/vlan_netlink.c | 20 ++++++++++++++++++++
net/core/rtnetlink.c | 10 +++++++++-
4 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 3c1895e..3f4ab54 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -70,6 +70,8 @@ struct rtnl_link_ops {
size_t (*get_xstats_size)(const struct net_device *dev);
int (*fill_xstats)(struct sk_buff *skb,
const struct net_device *dev);
+ int (*get_tx_queues)(struct net *net, struct nlattr *tb[],
+ int *tx_queues, int *real_tx_queues);
};
extern int __rtnl_link_register(struct rtnl_link_ops *ops);
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index e814794..8836575 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -330,12 +330,13 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
}
- new_dev = alloc_netdev(sizeof(struct vlan_dev_info), name,
- vlan_setup);
+ new_dev = alloc_netdev_mq(sizeof(struct vlan_dev_info), name,
+ vlan_setup, real_dev->num_tx_queues);
if (new_dev == NULL)
return -ENOBUFS;
+ new_dev->real_num_tx_queues = real_dev->real_num_tx_queues;
dev_net_set(new_dev, net);
/* need 4 bytes for extra VLAN header info,
* hope the underlying device can handle it.
diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index e9c91dc..8ce4122 ...On Wed, 02 Sep 2009 20:55:24 +0200 Nitpick again. use unsigned unless you have some reason to allow -1. --
No problem :)
Thanks Stephen
[PATCH net-next-2.6] vlan: multiqueue vlan device
vlan devices are currently not multi-queue capable.
We can do that with a new rtnl_link_ops method,
get_tx_queues(), called from rtnl_create_link()
This new method gets num_tx_queues/real_num_tx_queues
from real device.
register_vlan_device() is also handled.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/net/rtnetlink.h | 3 +++
net/8021q/vlan.c | 5 +++--
net/8021q/vlan_netlink.c | 20 ++++++++++++++++++++
net/core/rtnetlink.c | 10 +++++++++-
4 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 3c1895e..b630196 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -70,6 +70,9 @@ struct rtnl_link_ops {
size_t (*get_xstats_size)(const struct net_device *dev);
int (*fill_xstats)(struct sk_buff *skb,
const struct net_device *dev);
+ int (*get_tx_queues)(struct net *net, struct nlattr *tb[],
+ unsigned int *tx_queues,
+ unsigned int *real_tx_queues);
};
extern int __rtnl_link_register(struct rtnl_link_ops *ops);
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index e814794..8836575 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -330,12 +330,13 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
}
- new_dev = alloc_netdev(sizeof(struct vlan_dev_info), name,
- vlan_setup);
+ new_dev = alloc_netdev_mq(sizeof(struct vlan_dev_info), name,
+ vlan_setup, real_dev->num_tx_queues);
if (new_dev == NULL)
return -ENOBUFS;
+ new_dev->real_num_tx_queues = real_dev->real_num_tx_queues;
dev_net_set(new_dev, net);
/* need 4 bytes for extra VLAN header info,
* hope the underlying device can handle it.
diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index e9c91dc..cd5c2ae 100644
--- ...From: Eric Dumazet <eric.dumazet@gmail.com> Applied, but now I need you to do an audit :-) I believe that drivers will change the number of TX queues in use at times where we'll need to trigger an event or something so that vlan's can learn about the value changing. --
Yes, I know :) This is why I kept both num_tx_queues and real_num_tx_queues values, and not the later :) Thanks --
Here is a followup to enable 'true' multiqueue vlan xmit.
I missed this in my first patch.
[PATCH net-next-2.6] vlan: enable multiqueue xmits
vlan_dev_hard_start_xmit() & vlan_dev_hwaccel_hard_start_xmit()
select txqueue number 0, instead of using index provided by
skb_get_queue_mapping().
This is not correct after commit 2e59af3dcbdf11635c03f
[vlan: multiqueue vlan device] because
txq->tx_packets & txq->tx_bytes changes are performed on
a single location, and not the right locking.
Fix is to take the appropriate struct netdev_queue pointer
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 53f84c7..3938c3e 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -291,7 +291,8 @@ static int vlan_dev_hard_header(struct sk_buff *skb, struct net_device *dev,
static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
- struct netdev_queue *txq = netdev_get_tx_queue(dev, 0);
+ int i = skb_get_queue_mapping(skb);
+ struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
struct vlan_ethhdr *veth = (struct vlan_ethhdr *)(skb->data);
/* Handle non-VLAN frames if they are sent to us, for example by DHCP.
@@ -329,7 +330,8 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
- struct netdev_queue *txq = netdev_get_tx_queue(dev, 0);
+ int i = skb_get_queue_mapping(skb);
+ struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
u16 vlan_tci;
vlan_tci = vlan_dev_info(dev)->vlan_id;
--
From: Eric Dumazet <eric.dumazet@gmail.com> Details details :-) Looks good, applied! --
There is one thing that bothers me with "tc -s -d qdisc", maybe its the right time to speak qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 54823 bytes 490 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth1 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 468 bytes 6 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 The rate estimation is given by kernel even if no rate estimation is performed. User space doesnt have an indication saying this 0 numbers are real or fake (rate 0bit 0pps). Could we set a bit at Qdisc level when a gen_new_estimator() is really done one a Qdisc, so that we do not call gnet_stats_copy_rate_est() from tc_fill_qdisc() if this bit is not set ? -> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 54823 bytes 490 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 qdisc pfifo_fast 0: dev eth1 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 468 bytes 6 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 --
That should be possible, I'll look into it once I've finished my patch. --
This might confuse existing userspace since the handle is not unique anymore. libnl f.i. will treat all but the first root qdisc as an update and use it to update the state of the first one. There's also no combined view for applications unaware of multiqueue. Please have a look at the mail I just wrote for some possible ways around this. --
Hum, how can we combine infos on qdisc/class if in the future we allow each queue index to have its own qdisc/classes ? htb on queue index 0 cbq on queue index 1 Combining info would lock us and not allow for special configurations. Say macvlan device 0 mapped to queue index 0 macvlan device 1 mapped to queue index 1... For old apps, just give informations for queue 0 as we do now, and allow kernel to give more informations only if new application provided a TCA_INDEX attribute in its request ? (-1 : all queue indexes, >=0 for a given queue index) --
My suggestion was to only dump the statistics in the combined view and use a virtual qdisc, something like: qdisc multiqueue 0: dev eth0 root queues 8 Sent ... rate ... and show each real qdisc as child of this qdisc: qdisc pfifo_fast <unique handle> dev eth0 parent 0: bands 3 ... qdisc pfifo_fast <unique handle> dev eth0 parent 0: bands 3 ... Configuration would be symetrical to this: tc qdisc add dev eth0 handle 0: root multiqueue tc qdisc add dev eth0 handle x: parent 0: pfifo_fast ... without the virtual multiqueue qdisc, the root qdisc would simply If we don't combine the information, existing multiqueue unaware applications will get incorrect information. There's also the problem of non-unique handles, I think we should encode the queue index in the handle instead of using a new attribute. This needs a bit more thought though to avoid clashes with user-defined handles. --
Actually, I wonder why it can't be a real "virtual" qdisc with classes, similar to... multiq, doing such mappings inside, according to the current api: tc qdisc add dev eth0 handle 1: root multiqueue tc qdisc add dev eth0 handle x: parent 1:1 pfifo_fast Jarek P. PS: but I'd prefer (more) different name, even mq, mtq or something. --
Hmm... anything could break here something for somebody, so there is still a (Patrick's) question if not sum it all? Otherwise, I wonder about using the qdisc handle (tcm_handle>>16): there would be at least one "pfifo_fast 0:" looking like proper root for somebody... Jarek P. --
I meant "proper" for pfifo_fast. On the other hand, I wonder why these multiqueue qdisc handles can't be really given such unique per dev (instead of per queue) handles? Jarek P. --
In my opinion the best way would be to sum up the stats and display them for the "main" qdisc to avoid any compatibility problems and additionally dump each queue for informational purposes. This raises a few more questions however. First of all, there's no "main" qdisc, so if we just use the first one for the summed up stats, the question is whether the parameters of all root qdiscs are the same. Currently they should be since pfifo_fast doesn't support changing parameters, but this might change in the future, in which case we might be displaying "incorrect" parameters. The next question would be whether and how to support changing parameters of individual multiq qdiscs. Similar to dumps, when changing qdisc parameters we always pick queue number 0. It we want to support changing parameters of different queues, we could replicate changes to all queues, which would avoid the problem stated above, or we could add an optional identifier for the queue number, or we could use a combination of both which only replicates settings when no queue number is specified. In the second and third case, one possibility to get around the incorrect parameters for the "main" qdisc would be to display a virtual (non-existant) qdisc as the root, which only contains the stats. I don't think the default choice of root qdisc type should be counted as belonging to the API and userspace should be prepared to deal with this. And finally, the TC API used to be symetrical in the sense that you could replay a dump to the kernel to get the same configuration. Dumping the individual queues would break that property unless we also support creating and configuring them in a symetrical fashion. So here's something of a possible solution, assuming that at some point we do want to support changing parameters for individual root qdiscs and taking the above into account: - when creating a qdisc on a multiqueue-device through the TC API, we keep the current behaviour and share it between all ...
Remember that we already have a naming convention in /proc/interrupts eth0-tx-<number> Lets not introduce too many new once ;-) Cheers, Jesper Brouer -- ------------------------------------------------------------------- MSc. Master of Computer Science Dept. of Computer Science, University of Copenhagen Author of http://www.adsl-optimizer.dk ------------------------------------------------------------------- --
The approach I'm currently working on will present multiqueue root qdiscs as children of a dummy classful qdisc. This avoids handle clashes and the need for new identifiers and allows to address each qdisc seperately, similar to how it works with other classful qdiscs: qdisc mq 1: root refcnt 2 Sent 126 bytes 3 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 class mq 1:1 root leaf 8001: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 class mq 1:2 root leaf 8002: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 class mq 1:3 root leaf 8003: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 class mq 1:4 root leaf 8004: Sent 126 bytes 3 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Its a bit tricky though so I'm likely going to need a few more hours. --
Seems pretty cool, thanks Patrick. --
I like your approach. Its well suited for the qdiscs :-) I especially like the possibility to access each qdisc seperately. Does it then support having seperate qdisc per TX queue? (I'm toying with the idea of transmitting our multicast traffic into/via a seperate TX hardware queue, and making a special qdisc for IPTV MPEG2-TS shaping) Cheers, Jesper Brouer -- ------------------------------------------------------------------- MSc. Master of Computer Science Dept. of Computer Science, University of Copenhagen Author of http://www.adsl-optimizer.dk ------------------------------------------------------------------- --
Yes, you can attach qdiscs to the classes representing the queues. At least it should work :) It would probably also be possible to use TC classifiers for queue selection. --
From: Patrick McHardy <kaber@trash.net> You mean just like Intel's multiqueue scheduler and multiqueue classifiers which we already have in the tree :-) --
No, actually integrated in the current infrastructure :) The patches add a virtual root qdisc, so we still have a single root from userspace's POV. This qdisc is only used for sch_api purposes and doesn't necessarily match the real root qdiscs. When a "mq" qdisc is attached (done by default for multiqueue devices), the queues are exposed as child classes of this qdisc and can be individually grafted with different qdiscs. Since the virtual root is global per device, what we could do is allow to attach classifiers and execute them in dev_pick_tx() to select the queue. But currently this would require global locking, so this would have to be fixed previously. Probably not an easy thing to do. --
Since its getting late and I won't finish this tonight, here are the patches I'm currently working on in case someone already wants to have a look. Its working as intended and should be testable. Patch 1 is good shape IMO, patch 3 still needs some work. Userspace needs no changes.
From: Jesper Dangaard Brouer <hawk@diku.dk> Indeed it sounds interesting. But I want to warn everyone against adding support for anything that adds a way to do operations "on every device qdisc" at the kernel level. I tried to do that and the error unwinding complexity is unacceptable. We can make the tools support things like this. Keep the kernel simple and allow it to only operate on one queue at a time for a given individual config request. --
From: Jesper Dangaard Brouer <hawk@diku.dk> Yes, please :-) --
Ok. Can you come up with a patch? net/sched/sch_api.c can likely be patches with the loop logic that Eric suggested earlier. --
Yes, it's nice Eric and Patrick found time to fix it most properly. Jarek P. --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/ |
