The n-tuple filtering facility is half-baked at present. There is an interface to add filters but none to remove them! And ETHTOOL_GRXNTUPLE is not at all symmetric with ETHTOOL_SRXNTUPLE (which I complained about at the time it was added, to no avail).

An ETHTOOL_RESET command with flag ETH_RESET_FILTER set could be defined to clear all the filters, but that's a big hammer to use, and I think that in general drivers should push the same configuration back to the hardware after resetting it for whatever reason.

So far as I can work out, ixgbe clears all the filters when the filter table fills up. Is that true? Is this really the intended behaviour of manually set filters?

I also see this in the ixgbe implementation:

	/*
	 * Program the relevant mask registers.  If src/dst_port or src/dst_addr
	 * are zero, then assume a full mask for that field.  Also assume that
	 * a VLAN of 0 is unspecified, so mask that out as well.  L4type
	 * cannot be masked out in this implementation.
	 *
	 * This also assumes IPv4 only.  IPv6 masking isn't supported at this
	 * point in time.
	 */

An IPv4 address of 0 is certainly valid, so this isn't a good rule. And in any case, such a rule should be specified *with the interface*, in <linux/ethtool.h>, not the implementation.

This also implies that 'mask' specifies bits to be ignored, not bits to be matched. That also was not specified.

Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
--
It's a bit worse than that. Currently one can only append filters, not insert at a given position, as ethtool_rx_ntuple doesn't have an index field. For devices that use TCAMs, where position matters, it's quite an obstacle. It also means one cannot modify an existing filter by specifying --
It looks like drivers for devices that use TCAMs should implement the RXNFC interface instead.

Ben.
--
Ben, from the ethtool man page it sounds like the RXNFC option defines the way the RSS hash should be calculated, while SRXNTUPLE is meant to control the destination Rx queue for a stream specified by a filter/filters. The semantics for specifying the stream are also quite different. For instance, how do u define a rule to drop all packets with source IP address 192.168.10.200 by means of RXNFC? While with SRXNTUPLE it's straightforward. So, if I understood the semantics of both interfaces correctly, there is a very limited range of functionality where they may replace one another. Pls., correct me if I'm wrong.

I also agree with Dimitris: what we have here is an offload of some Netfilter functionality to HW. Regardless of the HW implementation (TCAM or not), if it's allowed to configure more than one rule for the same protocol then the ordering of filtering rules is important: for instance, if u change the order of applying the rules in the example below, the result of the filtering for traffic with both VLAN 4 and destination port 3000 will be different.

	ethtool -U ethX flow-type tcp4 vlan 4 action 0
	ethtool -U ethX flow-type tcp4 dst-port 3000 action 3

By the way, it's also unclear from the ethtool man page whether it's allowed to configure more than one rule for the same protocol. If it's not then the above example is void... ;) However, if we want to define a proper filtering interface I think we shouldn't prevent a driver from supporting a set of rules for the same protocol, though a driver should also be allowed not to support it.

So, I think that attaching an index to each rule could be a good idea - this would allow us both to insert rules at the desired positions in the filtering rule table and to edit the existing rules.

It's also unclear what the relation between RXNFC and SRXNTUPLE is. The latter in general may override the decision made based on the hash result.
So, it sounds like applying rules of SRXNTUPLE should come before applying the RSS logic and only if there was no match RSS ...
From: "Vladislav Zolotarov" <vladz@broadcom.com>

It's not the same, this whole ordering thing you expect in netfilter land is simply not present in these hardware implementations.

The hardware does a parallel TCAM match lookup, and whatever matches is used. Some hardware does link-level protocol lookups first, then L3/L4 later in the RX path right before computing the hash and selecting an RX queue.

There really is no ordering available, so let's not pretend it can be used "just like" netfilter rules.

As for the difference between the various ethtool facilities, this just represents the fact that what's available to offload differs per device. The best we can do is encapsulate commonality as best as we can, but each interface essentially represents what one major chipset provides.
--
I think the match with the lowest index wins, which is why it's possible to specify the rule's index (location) with ETHTOOL_SRXCLSRLINS and why I think the interfaces are actually somewhat more flexible than any of the current implementations.

Ben.
--
From: Ben Hutchings <bhutchings@solarflare.com> Yeah you're probably right. --
Ben, practically, with the current ethtool userspace implementation it seems like there is no way to specify the CAM index of the rule in the n-tuple interface, is there? So, the decision on the index is up to the vendor, which creates uncertainty. And I guess it's exactly what Dimitris meant when talking about the index - he said "a rule index", u say "a CAM index" while generally we are talking about the same thing. U r referring to ETHTOOL_SRXCLSRLINS but it has no user space interface yet and it's unclear when it will, while n-tuple is already there. We can't remove the existing user space interfaces - I agree. Then let's not add interfaces that interfere with the existing ones. This immediately implies that ETHTOOL_SRXCLSRLINS shall never see the light of day in userland as a separate interface and the n-tuple user interface should be properly extended to implement the missing ETHTOOL_SRXCLSRLINS functionality. Pls., comment. thanks, vlad --
So, u say that within the scope of a single protocol all rules form a set whose ordering is vendor specific, and the same configuration of n-tuple rules may generate different results for the same traffic on NICs from different vendors? Don't u think it's confusing from the user's point of view? ;) --
By 'RXNFC interface' I mean ETHTOOL_{G,S}RXCLS* and not
Something like this, I think:
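	struct ethtool_rxnfc insert_rule = {
		.cmd = ETHTOOL_SRXCLSRLINS,
		.flow_type = IP_USER_FLOW,
		.fs = {
			.flow_type = IP_USER_FLOW,
			/* Values to match: IPv4, source 192.168.10.200 */
			.h_u.usr_ip4_spec = {
				.ip4src = inet_addr("192.168.10.200"),
				.ip_ver = ETH_RX_NFC_IP4
			},
			/*
			 * Mask: fields set to all-ones here are ignored
			 * (the convention assumed in this example), so
			 * only ip4src is compared.
			 */
			.m_u.usr_ip4_spec = {
				.ip4dst = 0xffffffff,
				.l4_4_bytes = 0xffffffff,
				.tos = 0xff,
				.proto = 0xff
			},
			/* Drop matching packets */
			.ring_cookie = RX_CLS_FLOW_DISC,
			/* Highest-priority position in the rule table */
			.location = 0,
		}
	};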
Our hardware (and, I suspect, the ixgbe hardware) has hash tables for
specific types of matching. There is some control of precedence between
That's right.
Ben.
--
Aha. Ok. From the remarks in the upstream ethtool.h I see now that ethtool_rxnfc has quite wide configuration possibilities (including the above). I missed it before. ;) Ben, could u, pls., explain to me then what's the difference between defining the rule as u wrote above on top of the -N option (nfc) and defining a rule doing the same thing on top of the -U (n-tuple) option, and when I as a user should prefer one option to the other? Are they expected to be implemented differently from the FW/HW perspective? thanks, vlad P.S. I see that ethtool.h from the 2.6.36 tree already has the ethtool_rxnfc that would allow such a filtering definition; however, from the man page of the 2.6.36 version of the ethtool package it's unclear what the command line for such a configuration should be. Is it supported with the current ethtool version or maybe I'm missing something in the man page? --
The -N option modifies the hash function for all flows of a specific type (using ETHTOOL_SRXFH) whereas the -U option steers a specific flow or set of flows (using ETHTOOL_SRXNTUPLE). The implementation of the -U option could potentially be made to fall back to ETHTOOL_SRXCLSRLINS if it's not supported.

Santwona Behera implemented the kernel side of this but so far as I know he never sent any patches for ethtool.

Ben.
--
Having said that, don't u think that it could be more user friendly to extend the ETHTOOL_SRXCLSRLINS interface to handle the vlan_tag and user_def fields and drop the n-tuple interface altogether? thanks, vlad --
No, we can't remove userland interfaces.

Ben.
--
Having said that, this particular interface is fairly broken...
$ cat test.c
#include <stddef.h>
#include <stdio.h>
#include <linux/ethtool.h>

int main(void)
{
	/* offsetof() yields a size_t, hence %zu */
	printf("%zu\n", offsetof(struct ethtool_rx_flow_spec, ring_cookie));
	return 0;
}
$ cc -m64 -Wall test.c
$ ./a.out
152
$ cc -m32 -Wall test.c
$ ./a.out
148
Ben.
--
I think the mask would be 0 for don't care fields and 1 for care, so

	.m_u.usr_ip4_spec.ip4src = htonl(0xffffffff)
	.m_u.usr_ip4_spec.ip4dst = htonl(0)

etc.

There's a lot of overlap between SRXCLSRLINS and SRXNTUPLE and neither is a superset. SRXCLSRLINS has the advantage of specifying position but SRXNTUPLE includes vlan and a device-specific field that are handy. Also for reporting rules GRXNTUPLE is more flexible than GRXCLSRULE as it lets the driver specify the information it reports. In fact I've been thinking of using SRXCLSRLINS and GRXNTUPLE for cxgb4 but haven't gotten

It can be more involved than this. Our HW allows a rule to select a different part of the RSS table so you get a filter hit and still do RSS afterwards if you want. Current ethtool interfaces do not support this, --
That is definitely the opposite of what ixgbe and sfc do for ethtool_ntuple_rx_flow_spec, and I believe it is the opposite of what niu does for ethtool_rx_flow_spec.

So does the rule specify an offset added to the output of the RSS hash and indirection table, or can it also select a different indirection table? Our current hardware also has a filter flag for the former behaviour...

There are still plenty of bits to spare in 'action' and 'ring_cookie' so perhaps we could define a flag for this?

Ben.
--
These are the values as our HW at least wants them. The care bits are 1 in

You can partition the indirection table and then a rule can specify that matching packets should consult region X of the table. The hash value is --
