Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Herbert Xu
Date: Saturday, September 20, 2008 - 11:38 pm

On Sat, Sep 20, 2008 at 10:50:33PM -0700, David Miller wrote:

Looks good to me.

However, I've changed my mind about letting this thread die :)

Let's go back to the basic requirements.  I claim that there
are exactly two different ways for which multiple TX queues are
useful:  SMP scaling and QOS.  In the first case we stuff different
flows into different queues to reduce contention between CPUs.
In the latter case we put packets of different priorities into
different queues in order to prevent a storm of lower priority
packets from starving higher priority ones that arrive later.

Despite the different motivations, these two scenarios have
one thing in common, we can structure it such that there is
a one-to-one correspondence between the qdisc/software queues
and the hardware queues.  I know that this isn't currently the
case for prio but I'll get to that later in the message.

What I'm trying to say is that we shouldn't ever support cases
where a single qdisc empties into multiple TX queues.  It just
doesn't make sense.

For example, if you were using a qdisc like TBF, multiple queues
buy you absolutely nothing.  All it does is give you a longer queue
to stuff packets into but you can already get that in software.

Why am I saying all this? It's because a lot of the complexity
in the current code comes from supporting the case of one qdisc
queue mapping onto n hardware queues.  If we didn't do that then
handling stopped queues becomes trivial (or at least as easy as
it was before).

Put it another way, it makes absolutely no sense to map packets
onto different queues after you've dequeued them from a single
qdisc queue.  The mapping by hashing is for SMP scalability only
and if you've already gone through a single qdisc queue you can
stop worrying about it because it will suck on SMP :)

Going back to the case of prio, I think what we should do is to
create a separate qdisc queue for each band.  The qdisc selection
should be done before the packet is queued, just as we do in the
TX hashing case.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH] pkt_sched: Destroy gen estimators under rtnl_lock()., Jarek Poplawski, (Mon Aug 11, 1:53 pm)
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_l ..., Denys Fedoryshchenko, (Sun Aug 17, 11:08 pm)
[PATCH] pkt_sched: Fix qdisc list locking, Jarek Poplawski, (Fri Aug 22, 1:41 am)
[PATCH take 2] pkt_sched: Fix qdisc list locking, Jarek Poplawski, (Fri Aug 22, 2:27 am)
Re: [PATCH] pkt_sched: Fix qdisc list locking, Herbert Xu, (Fri Aug 22, 3:14 am)
Re: [PATCH take 2] pkt_sched: Fix qdisc list locking, Herbert Xu, (Fri Aug 22, 3:15 am)
Re: [PATCH take 2] pkt_sched: Fix qdisc list locking, David Miller, (Fri Aug 22, 3:23 am)
Re: [PATCH take 2] pkt_sched: Fix qdisc list locking, David Miller, (Fri Aug 22, 3:28 am)
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_l ..., Stephen Hemminger, (Sun Aug 24, 4:26 pm)
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_l ..., Stephen Hemminger, (Sun Aug 24, 5:29 pm)
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_l ..., Stephen Hemminger, (Tue Aug 26, 5:24 am)
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_l ..., Stephen Hemminger, (Tue Aug 26, 5:50 am)
RE: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev ..., Duyck, Alexander H, (Mon Sep 15, 4:44 pm)
Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev ..., Herbert Xu, (Sat Sep 20, 11:38 pm)