State of TCP segmentation offload to NIC in Linux

From: prodyut hazarika
Date: Thursday, May 1, 2008 - 10:56 am

Hi all,
I am trying to understand the current state of TCP segmentation
offload to network cards in Linux. I have read posts that say the
Intel e1000 driver supports TSO (TCP segmentation offload). This link
says that TSO is supported in Linux:
http://lwn.net/Articles/9129/

I see the define NETIF_F_TSO in the Linux kernel, and I see that the
e1000 driver sets this flag in its capabilities list. But for
TSO to work, the TCP code should send the big segment (say 64K) directly
to the NIC, bypassing the IP layer. Otherwise, the IP layer would fragment
the packets, and the whole purpose of TSO would be lost. I cannot see
any code in the TCP layer which bypasses the IP layer and goes directly to
the NIC. It seems to me that the TCP/IP stack in Linux always does the
segmentation, since there is no code under the NETIF_F_TSO define
which bypasses the IP layer. Or am I missing something?

Can anyone please tell me whether the Linux TCP/IP stack supports TSO,
so that TCP segments bigger than the MSS are segmented by the NIC, not by
software? If not, are there plans to support this, since this is a
stateless offload?

Any pointers would be greatly appreciated.

Regards,
Prodyut

From: Waskiewicz Jr, Peter P
Date: Thursday, May 1, 2008 - 12:39 pm

The IP layer does not need to be bypassed.  The TCP layer will set the
proper headers for TSO to work properly and continue sending the skb
down the stack.  IP will not fragment a packet on transmit.  The only
way an skb will be fragmented into smaller chunks is if gso_segment() is
called, which will call the proper upper-layer segmenter routine.  This
case only happens when a packet was coalesced by the TCP layer and sent
to the core stack, which then ran GSO (generic segmentation offload) in
net/core/dev.c, dev_hard_start_xmit(), which would be the case if the
underlying NIC didn't have TSO enabled, but still wanted to have the TCP

Yes, TSO is fully supported.  The default maximum is 64k frames, and in the
upstream tree a driver can change that limit with
netif_set_gso_max_size().  I know most 1 GbE and faster parts and their
Linux drivers fully support TSO.
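
The transmit-side decision described above can be sketched as a small userspace model. This is not kernel code: every struct, macro and function name below is a hypothetical stand-in, and the real logic in net/core/dev.c handles many more cases. The sketch only shows that one large buffer travels down the stack unchanged, and that it is cut into MSS-sized pieces in software only when the device does not advertise TSO.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NETIF_F_TSO bit in a device feature mask. */
#define MODEL_F_TSO 0x1u

struct model_dev {
    unsigned int features;   /* capability bits advertised by the driver */
    size_t packets_seen;     /* buffers handed to the driver so far */
};

struct model_skb {
    size_t len;              /* TCP payload bytes in this buffer */
    size_t gso_size;         /* MSS to cut at; 0 means "not a GSO packet" */
};

/* True when the stack must segment in software before calling the driver. */
static bool model_needs_sw_gso(const struct model_dev *dev,
                               const struct model_skb *skb)
{
    return skb->gso_size != 0 &&
           skb->len > skb->gso_size &&
           !(dev->features & MODEL_F_TSO);
}

/* Returns how many buffers the driver receives for this one skb. */
size_t model_xmit(struct model_dev *dev, const struct model_skb *skb)
{
    size_t handed = 1;       /* TSO case: the big buffer goes down intact */

    if (model_needs_sw_gso(dev, skb))
        handed = (skb->len + skb->gso_size - 1) / skb->gso_size;

    dev->packets_seen += handed;
    return handed;
}
```

In this model a 65536-byte buffer with gso_size 1448 reaches a TSO-capable device as a single buffer, while a device without the feature bit receives 46 separate ones. The IP layer plays no part in the split in either case.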

I hope this helps.

Cheers,
-PJ Waskiewicz
--

From: prodyut hazarika
Date: Wednesday, May 7, 2008 - 2:26 pm

Thanks for your reply, Peter. I would appreciate if you could clarify

Which flag/header does the TCP layer set to indicate to IP that it
should not fragment a 64K segment, but rather offload the whole
segment to the NIC? I tried to look at the TCP code in the Linux stack, but
could not figure out the flag, or the logic that sets the flag/header you
mentioned. The NIC driver must indicate that it is capable of doing
TSO by setting the NETIF_F_TSO flag, but the NETIF_F_TSO capability is not
checked in the IP/TCP layers.

I would really appreciate it if you could point me to the relevant
function in the Linux TCP code.

Thanks for your time,
Prodyut
--

From: Ben Hutchings
Date: Wednesday, May 7, 2008 - 2:51 pm

It's checked using sk_can_gso() (defined in <net/sock.h>) which

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
--

From: Waskiewicz Jr, Peter P
Date: Wednesday, May 7, 2008 - 3:23 pm

To add: the flag that sk_can_gso() checks is set at socket creation,
based on the underlying netdevice's TSO capabilities.  There are a bunch
of flags in the sk struct that get set on TCP socket creation that are
then used elsewhere in the TCP stack.
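
As a rough illustration of that pattern, here is a simplified userspace model. Only sk_can_gso() and NETIF_F_TSO are real names from this thread; the structs, macros and functions below are hypothetical, and the real check in <net/sock.h> looks at more than a single bit. The idea is that the device's capability bits are copied into the socket once, and the TCP code later tests the cached copy instead of going back to the net device.

```c
#include <stdbool.h>

/* Hypothetical feature bits; the kernel's NETIF_F_* values differ. */
#define MODEL_F_SG  0x1u     /* scatter/gather */
#define MODEL_F_TSO 0x2u     /* TCP segmentation offload */

struct model_netdev {
    unsigned int features;   /* what the driver advertises */
};

struct model_sock {
    unsigned int route_caps; /* capabilities cached on the socket */
};

/* Called once when the socket is set up: cache the device capabilities. */
void model_sock_setup_caps(struct model_sock *sk,
                           const struct model_netdev *dev)
{
    sk->route_caps = dev->features;
}

/* Models the kind of test sk_can_gso() performs: may TCP build
 * buffers larger than one MSS for this socket? */
bool model_sk_can_gso(const struct model_sock *sk)
{
    return (sk->route_caps & MODEL_F_TSO) != 0;
}
```

This would explain why a search for NETIF_F_TSO in the TCP code turns up so little: in this model the TCP path only ever consults the socket's cached copy.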

Cheers,
-PJ
--
