Kernel behavior when returning NETDEV_TX_BUSY from hard_start_xmit

Previous thread: [PATCH 0/8] can: CAN network device driver interface and drivers by Wolfgang Grandegger on Thursday, February 19, 2009 - 12:01 pm. (21 messages)

Next thread: [PATCH 0/7] Add support to FCoE offload through net_device by Yi Zou on Thursday, February 19, 2009 - 12:49 pm. (1 message)
From: Tony Asleson
Date: Thursday, February 19, 2009 - 1:19 pm

I am running into the following scenario and I wanted to verify that it is
expected.

A user application does a single 8k UDP send.  While the Ethernet device
driver is accepting each skb via hard_start_xmit it temporarily runs out
of resources and calls netif_stop_queue and returns NETDEV_TX_BUSY.
When the driver catches up it does a netif_wake_queue to resume skb
transmission.  When this occurs during the UDP send, the Ethernet driver
does not get any of the remaining skb buffers and the UDP send is
incomplete.  However, sendto still returns success to the user
application.

Because UDP does not guarantee delivery, this seems acceptable.
However, if a device driver limits the number of skb buffers it can
have in flight to fewer than the number needed to complete a send,
that send will never succeed.

For example, to support a single 64k UDP send, the device driver must be
able to accept ~45 skb buffers (1500 MTU) without returning NETDEV_TX_BUSY.
If the Ethernet device is unable to transmit faster than the arrival rate
of skb buffers the driver then needs to queue them up.
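
The ~45 figure can be checked with a short calculation.  The sketch
below is only illustrative: it assumes IPv4 with a 20-byte header and
no options, an 8-byte UDP header carried in the first fragment, and
one skb per fragment.  The function name udp_fragment_count is
hypothetical, and "8k" is taken to mean 8192 bytes of payload.

```c
#include <assert.h>

#define IPV4_HEADER_LEN 20u   /* assumption: no IP options */
#define UDP_HEADER_LEN   8u

/*
 * Estimate how many IPv4 fragments (one skb each, by assumption) a
 * single UDP send of payload_len bytes needs at the given MTU.
 */
unsigned int udp_fragment_count(unsigned int payload_len, unsigned int mtu)
{
    unsigned int per_fragment;
    unsigned int total;

    assert(mtu >= IPV4_HEADER_LEN + 8u);

    /* Every fragment except the last carries a multiple of 8 bytes. */
    per_fragment = ((mtu - IPV4_HEADER_LEN) / 8u) * 8u;

    /* The UDP header is part of the data that gets fragmented. */
    total = payload_len + UDP_HEADER_LEN;

    /* Round up: a partial final fragment still needs its own buffer. */
    return (total + per_fragment - 1u) / per_fragment;
}
```

Under these assumptions, a 65507-byte payload (the largest that fits
in one IPv4 UDP datagram) at a 1500-byte MTU gives 45 fragments, and
an 8192-byte payload gives 6.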

Is this expected behavior or am I missing something in my driver?  How
much should a device driver be expected to queue before calling
netif_stop_queue while still ensuring normal packet transmission?  Why
doesn't the network stack continue after the queue has been stopped?

Notes:
- Linux 2.6.16.21 kernel (SLES 10), embedded PPC platform.
- Pseudo Ethernet device in question has ~2.8 MiB bandwidth.

Many thanks!
-Tony
--

From: Stephen Hemminger
Date: Thursday, February 19, 2009 - 2:26 pm

On Thu, 19 Feb 2009 14:19:10 -0600, Tony Asleson wrote:

Well-written drivers never return NETDEV_TX_BUSY. They manage the
queue such that it is stopped when there is no space and only woken up
when there is space.
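
The pattern described above can be modelled in plain userspace C.
This is a minimal sketch, not kernel code: struct mock_dev,
TX_RING_SIZE and the mock_* functions are hypothetical stand-ins, and
the queue_stopped flag only imitates what netif_stop_queue and
netif_wake_queue do to the real queue.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 4u            /* hypothetical ring depth */

struct mock_dev {
    unsigned int in_flight;        /* buffers handed to the "hardware" */
    bool queue_stopped;            /* stands in for the stopped state */
};

/*
 * Stands in for hard_start_xmit.  It is only called while the queue is
 * running, so a free slot is always available and it never has to
 * report "busy".  Returns 0 to model a successful transmit.
 */
int mock_start_xmit(struct mock_dev *dev)
{
    assert(!dev->queue_stopped);
    assert(dev->in_flight < TX_RING_SIZE);

    dev->in_flight++;

    /* Stop as soon as the last slot is used, before any overflow. */
    if (dev->in_flight == TX_RING_SIZE)
        dev->queue_stopped = true;

    return 0;
}

/* Stands in for the TX-completion path: free a slot, then wake. */
void mock_tx_complete(struct mock_dev *dev)
{
    assert(dev->in_flight > 0u);

    dev->in_flight--;
    if (dev->queue_stopped)
        dev->queue_stopped = false;
}

/*
 * Stands in for the stack: hand over up to n packets while the queue
 * is running, and return how many the driver accepted.
 */
unsigned int mock_stack_send(struct mock_dev *dev, unsigned int n)
{
    unsigned int sent = 0u;

    while (sent < n && !dev->queue_stopped) {
        mock_start_xmit(dev);
        sent++;
    }
    return sent;
}
```

In this model, offering six packets to an empty four-slot ring hands
over four and leaves the queue stopped; one completion wakes it again
and one more packet is accepted.  In a real driver the packets not yet
handed over stay queued in the stack until the queue is woken.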
--
