Re: Kernel WARNING: at net/core/dev.c:1330 __netif_schedule+0x2c/0x98()

From: David Miller
Date: Saturday, July 26, 2008 - 2:18 am

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Fri, 25 Jul 2008 22:01:37 +0200


I think there might be an easier way, but we may have
to modify the state bits a little.

Every call into ->hard_start_xmit() is made like this:

1. lock TX queue
2. check TX queue stopped
3. call ->hard_start_xmit() if not stopped

This means that we can in fact do something like:

	unsigned int i;

	for (i = 0; i < dev->num_tx_queues; i++) {
		struct netdev_queue *txq;

		txq = netdev_get_tx_queue(dev, i);
		spin_lock_bh(&txq->_xmit_lock);
		netif_tx_freeze_queue(txq);
		spin_unlock_bh(&txq->_xmit_lock);
	}

netif_tx_freeze_queue() just sets a new bit we add.

Then we go to the ->hard_start_xmit() call sites and check this new
"frozen" bit as well as the existing "stopped" bit.

When we unfreeze each queue later, we see if it is stopped, and if not
we schedule its qdisc for packet processing.
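
To make the freeze/guard/unfreeze sequence concrete, here is a rough
userspace model of it. This is only a sketch, not kernel code: a
pthread mutex stands in for txq->_xmit_lock, plain bit operations
stand in for set_bit()/clear_bit()/test_bit(), and every model_* name
is hypothetical. Taking the lock in the unfreeze step is an assumption
of the model (the bit operations here are not atomic); the description
above does not say whether unfreeze needs it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum model_queue_state {
	MODEL_STATE_XOFF,	/* stands in for __QUEUE_STATE_XOFF */
	MODEL_STATE_FROZEN,	/* stands in for the proposed frozen bit */
};

struct model_txq {
	pthread_mutex_t xmit_lock;	/* stands in for txq->_xmit_lock */
	unsigned long state;
	unsigned int xmit_calls;	/* times the "driver" was entered */
	unsigned int sched_calls;	/* times the "qdisc" was scheduled */
};

static void model_txq_init(struct model_txq *q)
{
	pthread_mutex_init(&q->xmit_lock, NULL);
	q->state = 0;
	q->xmit_calls = 0;
	q->sched_calls = 0;
}

static bool model_stopped(const struct model_txq *q)
{
	return (q->state & (1UL << MODEL_STATE_XOFF)) != 0;
}

static bool model_frozen(const struct model_txq *q)
{
	return (q->state & (1UL << MODEL_STATE_FROZEN)) != 0;
}

/* Freeze: once the lock has been taken and dropped, any transmit that
 * was in flight has finished, and later ones will see the bit. */
static void model_freeze(struct model_txq *q)
{
	pthread_mutex_lock(&q->xmit_lock);
	q->state |= 1UL << MODEL_STATE_FROZEN;
	pthread_mutex_unlock(&q->xmit_lock);
}

/* Unfreeze: clear the bit, and reschedule only if the driver has not
 * stopped the queue in the meantime. */
static void model_unfreeze(struct model_txq *q)
{
	pthread_mutex_lock(&q->xmit_lock);
	q->state &= ~(1UL << MODEL_STATE_FROZEN);
	if (!model_stopped(q))
		q->sched_calls++;
	pthread_mutex_unlock(&q->xmit_lock);
}

/* Guarded call site: lock, check both bits, then "transmit". */
static bool model_xmit(struct model_txq *q)
{
	bool sent = false;

	pthread_mutex_lock(&q->xmit_lock);
	if (!model_stopped(q) && !model_frozen(q)) {
		q->xmit_calls++;
		sent = true;
	}
	pthread_mutex_unlock(&q->xmit_lock);
	return sent;
}
```

In this model, model_xmit() succeeds before model_freeze(), is refused
while the queue is frozen, and succeeds again after model_unfreeze(),
which also bumps sched_calls once because the queue was not stopped.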

A patch below shows how the guarding would work.  It doesn't implement
the actual freeze/unfreeze.

We need to use a side-state bit to do this because we don't
want this operation to get all mixed up with the queue waking
operations that the driver TX reclaim code will be doing
asynchronously.
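
A small illustration of why the side-state bit matters, again as a
userspace sketch with hypothetical names (the state word and the
drv_*/freeze_* helpers are assumptions made for this example only).
If the freeze reused the "stopped" bit, a wake from the driver's TX
reclaim path would silently undo the freeze; with its own bit, the
freeze survives the wake.

```c
#include <stdbool.h>

#define STATE_XOFF	(1u << 0)	/* driver-owned "stopped" bit */
#define STATE_FROZEN	(1u << 1)	/* separate bit owned by the freezer */

/* Driver-side operations: they only ever touch XOFF. */
static void drv_stop(unsigned int *state) { *state |= STATE_XOFF; }
static void drv_wake(unsigned int *state) { *state &= ~STATE_XOFF; }

/* Freeze that (wrongly) shares the driver's bit. */
static void freeze_shared(unsigned int *state) { *state |= STATE_XOFF; }

/* Freeze that uses its own side-state bit. */
static void freeze_side(unsigned int *state) { *state |= STATE_FROZEN; }

/* The check a guarded ->hard_start_xmit() call site would make. */
static bool may_xmit(unsigned int state)
{
	return !(state & (STATE_XOFF | STATE_FROZEN));
}
```

With the shared bit, freeze_shared() followed by drv_wake() leaves
may_xmit() true, so the freeze is lost. With freeze_side(), the same
drv_wake() clears only XOFF and may_xmit() stays false.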

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b4d056c..cba98fb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -440,6 +440,7 @@ static inline void napi_synchronize(const struct napi_struct *n)
 enum netdev_queue_state_t
 {
 	__QUEUE_STATE_XOFF,
+	__QUEUE_STATE_FROZEN,
 };
 
 struct netdev_queue {
@@ -1099,6 +1100,11 @@ static inline int netif_queue_stopped(const struct net_device *dev)
 	return netif_tx_queue_stopped(netdev_get_tx_queue(dev, 0));
 }
 
+static inline int netif_tx_queue_frozen(const struct netdev_queue *dev_queue)
+{
+	return test_bit(__QUEUE_STATE_FROZEN, &dev_queue->state);
+}
+
 /**
  *	netif_running - test if up
  *	@dev: network device
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index c127208..6c7af39 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -70,6 +70,7 @@ static void queue_process(struct work_struct *work)
 		local_irq_save(flags);
 		__netif_tx_lock(txq, smp_processor_id());
 		if (netif_tx_queue_stopped(txq) ||
+		    netif_tx_queue_frozen(txq) ||
 		    dev->hard_start_xmit(skb, dev) != NETDEV_TX_OK) {
 			skb_queue_head(&npinfo->txq, skb);
 			__netif_tx_unlock(txq);
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index c7d484f..3284605 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3305,6 +3305,7 @@ static __inline__ void pktgen_xmit(struct pktgen_dev *pkt_dev)
 
 	txq = netdev_get_tx_queue(odev, queue_map);
 	if (netif_tx_queue_stopped(txq) ||
+	    netif_tx_queue_frozen(txq) ||
 	    need_resched()) {
 		idle_start = getCurUs();
 
@@ -3320,7 +3321,8 @@ static __inline__ void pktgen_xmit(struct pktgen_dev *pkt_dev)
 
 		pkt_dev->idle_acc += getCurUs() - idle_start;
 
-		if (netif_tx_queue_stopped(txq)) {
+		if (netif_tx_queue_stopped(txq) ||
+		    netif_tx_queue_frozen(txq)) {
 			pkt_dev->next_tx_us = getCurUs();	/* TODO */
 			pkt_dev->next_tx_ns = 0;
 			goto out;	/* Try the next interface */
@@ -3352,7 +3354,8 @@ static __inline__ void pktgen_xmit(struct pktgen_dev *pkt_dev)
 	txq = netdev_get_tx_queue(odev, queue_map);
 
 	__netif_tx_lock_bh(txq);
-	if (!netif_tx_queue_stopped(txq)) {
+	if (!netif_tx_queue_stopped(txq) &&
+	    !netif_tx_queue_frozen(txq)) {
 
 		atomic_inc(&(pkt_dev->skb->users));
 	      retry_now:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index fd2a6ca..f17551a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -135,7 +135,8 @@ static inline int qdisc_restart(struct Qdisc *q)
 	txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
 
 	HARD_TX_LOCK(dev, txq, smp_processor_id());
-	if (!netif_subqueue_stopped(dev, skb))
+	if (!netif_tx_queue_stopped(txq) &&
+	    !netif_tx_queue_frozen(txq))
 		ret = dev_hard_start_xmit(skb, dev, txq);
 	HARD_TX_UNLOCK(dev, txq);
 
@@ -162,7 +163,8 @@ static inline int qdisc_restart(struct Qdisc *q)
 		break;
 	}
 
-	if (ret && netif_tx_queue_stopped(txq))
+	if (ret && (netif_tx_queue_stopped(txq) ||
+		    netif_tx_queue_frozen(txq)))
 		ret = 0;
 
 	return ret;