data received but not detected

Previous thread: [PATCH] udp: sk_drops handling by Eric Dumazet on Tuesday, June 17, 2008 - 2:31 pm. (4 messages)

Next thread: none
From: Travis Stratman
Date: Tuesday, June 17, 2008 - 3:08 pm

Hello,

(I sent this earlier today but it doesn't look like it made it, I
apologize if it gets through multiple times)

I am working on an application that uses a fairly simple UDP protocol to
send data between two embedded devices. I'm noticing an issue with an
initial test that was written where datagrams are received but not seen
by the recvfrom() call until more data arrives after it. As of right now
the test case does not implement any type of lost packet protection or
other flow control, which is what makes the issue so noticeable.

The target for this code is a board using the Atmel AT91SAM9260 ARM
processor. I have tested with 2.6.20 and 2.6.25 on this board.

The test consists of two applications with the following pseudocode
(msg_size = 127, 9003/9005 are the UDP ports used):

"client app"
while(1) {
    sendto(9003, &msg_size, 4bytes);
    sendto(9003, buffer, msg_size);
    recvfrom(9005, &msg_size, 4bytes);
    recvfrom(9005, buffer, msg_size);
}

"server app"
while(1) {
    recvfrom(9003, &msg_size, 4bytes);
    recvfrom(9003, buffer, msg_size);
    sendto(9005, &msg_size, 4bytes);
    sendto(9005, buffer, msg_size);
}

As long as the server is started first and no packets are lost or out of
order, the client and server should continue indefinitely. When run
between two boards on a local gigabit switch, the application will run
smoothly most of the time, but I periodically see delays of 30 seconds
or more where one of the applications is waiting for the second datagram
to arrive before sending the next packet. Wireshark shows that the data
was sent very shortly after the first datagram, and no packets are ever
lost; ifconfig reports no collisions, overruns, or errors.

When I run the application between two identical devices on a cross-over
cable, data is transferred for a few seconds after which everything
freezes until I send a ping between the two boards in the background.
This forces the communication to start up again for a few seconds ...

--

From: Stephen Hemminger
Date: Tuesday, June 17, 2008 - 3:27 pm

On Tue, 17 Jun 2008 17:08:58 -0500

I am unfamiliar with interrupts on the ARM. Are IRQs level or edge triggered?
NAPI won't work if interrupts are edge-triggered.
--

From: Travis Stratman
Date: Tuesday, June 17, 2008 - 3:40 pm

Interrupts in this case are set to be level triggered. It has an
interrupt controller that allows them to be configured several ways. The
EMAC driver for the at91sam9260 is in drivers/net/macb.[ch]. Also note
that the 133 MHz x86 that I tested on was an STPC Elite (it also
displayed the same behavior).

Thanks,

Travis


--

From: Ben Greear
Date: Tuesday, June 17, 2008 - 3:31 pm

UDP packets can be lost anywhere, including in the receive buffer
after it has been received by the NIC.

You probably just need to write your code smarter to use non-blocking
IO and deal with packet loss.
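
A minimal sketch of that suggestion, assuming a POSIX system: wait for a datagram with poll() and a timeout instead of blocking forever in recvfrom(). The helper name recv_datagram_timeout() and the self-check demo_timeout_then_data() are hypothetical and are not taken from the application discussed in this thread.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for one datagram on fd.
 * Returns 1 and stores the length in *got when a datagram was read,
 * 0 on timeout, and -1 on error.
 */
int recv_datagram_timeout(int fd, void *buf, size_t len,
                          int timeout_ms, ssize_t *got)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(buf != NULL && got != NULL);

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;      /* poll() failed, errno is set */
    if (ready == 0)
        return 0;       /* timed out: caller may request a resend */

    /* MSG_DONTWAIT keeps this call from blocking under any circumstances */
    *got = recvfrom(fd, buf, len, MSG_DONTWAIT, NULL, NULL);
    return (*got < 0) ? -1 : 1;
}

/* Self-check on a local datagram socket pair (no network involved). */
int demo_timeout_then_data(void)
{
    int sv[2];
    char buf[16];
    ssize_t got = 0;
    int rc = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    /* Nothing has been sent yet, so the first wait should time out. */
    if (recv_datagram_timeout(sv[0], buf, sizeof buf, 50, &got) != 0)
        rc = -2;
    else if (send(sv[1], "ping", 4, 0) != 4)
        rc = -3;
    else if (recv_datagram_timeout(sv[0], buf, sizeof buf, 50, &got) != 1
             || got != 4)
        rc = -4;

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

A timeout here only tells the application that nothing became readable in time; it cannot distinguish a lost datagram from one that is delayed somewhere below the socket.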

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

--

From: Travis Stratman
Date: Tuesday, June 17, 2008 - 3:58 pm

Thanks Ben.

I understand that there is no guarantee of anything with UDP, but it
seems to me that if there is a packet in the buffer (it shows up after
another packet comes in behind it) the system should know about it,
right?

The code will eventually deal with packet loss / retransmission (it is
actually a customer's application, not my own). Development was only
stopped at this point because this behavior was discovered. However, if
the final application behaves in the same way that things are going now,
the application would need to timeout on read, request retransmission,
receive the original packet (that was just stuck in the buffer
somewhere) and the retransmitted packet and decide which to toss every
couple of seconds. This is a whole lot more retransmissions than I would
expect to see on a cross-over cable, especially from receiving and
processing only two small packets at one pass.

If this is what's required I will relay that to the customer or
implement some type of workaround to force a poll or flush. However, if
there is possibly a bug or race condition that is not getting handled
properly it would be better to try and find it.

Thanks,

--

From: Ben Greear
Date: Tuesday, June 17, 2008 - 4:45 pm

Ahh, I see what you mean.

I'm afraid I don't know anything about your NIC driver, and it would
seem to be implicated.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

--

From: Travis Stratman
Date: Thursday, June 19, 2008 - 3:53 pm

I agree, but it also troubles me that the x86 board that I noticed the
same issue on uses the realtek (8139too) driver, so I'm not completely
convinced that the issue is at the NIC level.

I was able to do some more extensive testing today with the macb (Atmel
Ethernet MAC controller) driver and noticed that the
netif_rx_schedule_prep function is returning false at times in the
interrupt handler. In the code below, the printk shows up during heavy
traffic, though it only happens a handful of times. (The else block is
code that I have added to the driver while debugging).

if (status & MACB_RX_INT_FLAGS) {
    if (netif_rx_schedule_prep(dev)) {
        /*
         * There's no point taking any more interrupts
         * until we have processed the buffers
         */
        macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
        dev_dbg(&bp->pdev->dev, "scheduling RX softirq\n");
        __netif_rx_schedule(dev);
    } else {
        /* debugging code added to the driver */
        printk(KERN_ERR "%s: Driver bug: interrupt while in polling mode\n", dev->name);
        /* disable interrupts */
        macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
    }
}

polling is already enabled for the interface (though I haven't looked
much deeper than the inline for netif_rx_schedule_prep()).

I went through the poll function, and actually rewrote the whole thing
according to the guidelines in the NAPI documentation, and I can't see
any way for it to get out of poll with interrupts enabled without first
removing itself from the polling list.

Can someone who knows more about this give me some more insight into
what might be happening here? I can post the poll function or a patch to
macb.c if it would be helpful.

Thanks,


--

From: Ben Greear
Date: Thursday, June 19, 2008 - 4:08 pm

If you run a sniffer on the machine that is dropping/delaying receiving
the pkt, you can probably determine whether it is a driver issue or some
other stack issue:

If you see the pkt in the sniffer, but not in the application, then
it's probably a udp stack issue or at least not the driver.
Otherwise, the driver must be holding onto the packet.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

--

From: James Chapman
Date: Sunday, June 22, 2008 - 2:16 am

I looked at macb.c and can see that it uses napi only for rx work, 
leaving tx interrupts enabled at all times. The interrupt handler reads 
the device interrupt status when a tx interrupt happens and may find rx 
bits also set. As a result, your netif_rx_schedule_prep() will sometimes 
return false because napi might be already scheduled. The code you have 
above (i.e. the "driver bug" case) is wrong.

The napi code in the in-tree version looks suspect because it seems to 
enable rx interrupts unconditionally regardless of whether napi rx 
processing is complete.
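
The rule being described can be sketched as a small user-space model. This is not the macb code: struct toy_nic, toy_rx_interrupt() and toy_poll() are hypothetical names, and the model only illustrates that receive interrupts are unmasked when a poll pass finishes under budget, never unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a NIC; every name here is hypothetical. */
struct toy_nic {
    int  rx_pending;      /* frames waiting in the receive ring */
    bool rx_irq_enabled;  /* receive interrupt unmasked? */
    bool napi_scheduled;  /* device currently on the poll list? */
};

/* Interrupt side: mask rx interrupts and hand the work to the poller. */
void toy_rx_interrupt(struct toy_nic *nic)
{
    if (!nic->napi_scheduled) {
        nic->rx_irq_enabled = false;
        nic->napi_scheduled = true;
    }
    /* If already scheduled, the poller will pick the frames up anyway. */
}

/* Poll side: process at most budget frames and return the work done. */
int toy_poll(struct toy_nic *nic, int budget)
{
    int done = 0;

    assert(budget > 0);

    while (done < budget && nic->rx_pending > 0) {
        nic->rx_pending--;   /* "receive" one frame */
        done++;
    }

    if (done < budget) {
        /* Ring drained: leave polled mode first, then unmask. */
        nic->napi_scheduled = false;
        nic->rx_irq_enabled = true;
    }
    /* Otherwise stay scheduled with rx interrupts still masked. */
    return done;
}
```

With 10 pending frames and a budget of 4, the first two passes leave the interrupt masked and the device scheduled; only the third pass, which finds just 2 frames, unmasks it.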

It might help to post a patch here showing all of your changes.


-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

--

From: Travis Stratman
Date: Monday, July 7, 2008 - 2:56 pm

Thanks for the reply James.

That is somewhat confusing to me because once an rx interrupt is
detected and the rx interrupts are disabled the rx bits should not be
set in the interrupt status register until they are re-enabled again
the ISR is read and when the rx bits are tested and rx ints are disabled
for it to be there the next time around in the while(status) loop.

Correct, this is one of the reasons that I rewrote the driver poll

Did this earlier today, I should get a patch against 2.6.25 up tomorrow
which will be a little more useful.

Thanks!

Travis

--

From: James Chapman
Date: Tuesday, July 8, 2008 - 2:37 am

The rx and tx status are flagged in the same status register. The bits 
are set regardless of whether rx or tx interrupts are enabled in the 
device. So when you handle a tx interrupt, the interrupt routine will 
read the status register and may see rx bits also set.

You could mask the status register value that you read to ignore rx bits 
if rx interrupts are disabled (NAPI polled mode). But to be honest, I 
think it is simpler to handle rx _and_ tx work in the NAPI poll handler 
so you only get interrupts when not in NAPI polled mode. See tg3.c or 
e100.c for example.
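
The first option (masking the value read from the status register) might look like the sketch below. The bit layout is invented for illustration and is not the macb register map; TOY_RX_FLAGS, TOY_TX_FLAGS and toy_effective_status() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define TOY_RX_FLAGS 0x0000000fu
#define TOY_TX_FLAGS 0x000000f0u

/*
 * Return the status bits the interrupt handler should act on.
 * While rx is being handled by the poller, rx bits are ignored
 * even though the hardware still reports them.
 */
uint32_t toy_effective_status(uint32_t raw_status, bool rx_polled)
{
    /* The two groups must not overlap for the mask to be safe. */
    assert((TOY_RX_FLAGS & TOY_TX_FLAGS) == 0);

    if (rx_polled)
        return raw_status & ~TOY_RX_FLAGS;
    return raw_status;
}
```

For example, a raw status of 0x33 taken during a tx interrupt would be treated as 0x30 while in polled mode, so the handler would do tx work only.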


-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

--

From: Travis Stratman
Date: Tuesday, July 15, 2008 - 1:46 pm

I will take a look at modifying the driver to use NAPI for tx.

Thanks,

Travis

--

From: Evgeniy Polyakov
Date: Tuesday, June 17, 2008 - 11:28 pm

Hi.


Did you run wireshark on receiver or sender?
Check the MIB stats to see whether the packet was dropped because of low
memory, an incorrect checksum, or some other problematic field in the UDP
header. The sending side can see the packet as perfectly correct even when
that is not the case on the receiver. If the packet was delivered to the
receiving host, the UDP input path is rather simple, so there are no
places that can race with something and thus lose the packet.
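
On Linux these counters are exposed in /proc/net/snmp as a pair of lines per protocol: a header line of names and a line of values. A minimal sketch of looking one counter up by name follows; snmp_counter() is a hypothetical helper, and reading the two "Udp:" lines from the file is left to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter by name in a header/value line pair such as
 *   "Udp: InDatagrams NoPorts InErrors OutDatagrams"
 *   "Udp: 1200 0 3 1180"
 * Returns 0 and stores the value in *out, or -1 if the name is absent.
 */
int snmp_counter(const char *header, const char *values,
                 const char *name, unsigned long long *out)
{
    const char *sep = " \t\n";
    size_t name_len;

    assert(header && values && name && out);
    name_len = strlen(name);

    for (;;) {
        size_t hlen, vlen;

        /* Skip separators, then measure the next token on each line. */
        header += strspn(header, sep);
        values += strspn(values, sep);
        hlen = strcspn(header, sep);
        vlen = strcspn(values, sep);

        if (hlen == 0 || vlen == 0)
            return -1;               /* ran out of columns */

        if (hlen == name_len && strncmp(header, name, hlen) == 0) {
            *out = strtoull(values, NULL, 10);
            return 0;
        }
        header += hlen;
        values += vlen;
    }
}
```

Sampling a counter such as InErrors before and after a stall and comparing the two values shows whether the stack discarded anything during that interval.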

-- 
	Evgeniy Polyakov
--

From: Travis Stratman
Date: Thursday, June 19, 2008 - 4:10 pm

Initially, I had run wireshark on my PC and connected it to one of the
embedded boards (the issue still shows up in this case). I did some more
testing today where I ran tcpdump on both of the boards connected with a
cross-over cable until the application froze. What I was able to find
was that the first 1 or 2 hangups are corrected after 4 or 5 seconds
because the boards send an ARP request when data communication stops.
This causes communication to start up again. No packets are ever lost or
corrupted, they just don't appear to the application until something
else happens on the network.

Here is a snippet of the packet trace surrounding the hangup (these are
from the same session, but the clocks on the two boards were not set to
the same time):
(On the "server" -- sbc41):
22:53:57.763656 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:53:57.764000 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:53:57.764229 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:53:57.764387 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.034522 arp who-has sbc041.emacinc.com tell sbc042.emacinc.com
22:54:01.034642 arp reply sbc041.emacinc.com is-at 00:50:c2:0d:6e:00 (oui Unknown)
22:54:01.035585 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:54:01.035736 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:54:01.036095 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:54:01.036263 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.036793 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
--
22:54:01.803384 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.803773 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:54:01.803916 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:54:01.804274 IP sbc041.emacinc.com.3072 > ...