Re: [PATCH iptables] extension: add xt_cpu match

Previous thread: Re: [PATCH for-2.6.35] tun: avoid BUG, dump packet on GSO errors by Herbert Xu on Thursday, July 22, 2010 - 6:05 am. (2 messages)

Next thread: [RFC 0/1] netfilter: xtables: xt_condition inclusion with namespace fix by Luciano Coelho on Thursday, July 22, 2010 - 7:09 am. (8 messages)
From: Eric Dumazet
Date: Thursday, July 22, 2010 - 7:03 am

This match is a bit strange, being packet content agnostic...

Still, in some situations a CPU match permits a better spreading of
connections, or selecting targets only for a given cpu.

With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
affinities, we can distribute traffic on available cpus, per session.
(all RX packets for a given flow are handled by a given cpu)

Some legacy applications are not SMP friendly, so one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that the softirq handler for a whole instance runs on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four-way machine might run four copies of a
server application, using a separate listening port for each instance,
but still presenting a unique external port:

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
	-j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
	-j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
	-j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
	-j REDIRECT --to-port 8083
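
The four rules above amount to a fixed mapping from the cpu handling the packet to a backend port. A minimal userspace sketch of that mapping follows; port_for_cpu, BASE_PORT and NR_INSTANCES are hypothetical names used only for illustration and are not part of the patch.

```c
/* Hypothetical constants mirroring the example rules: port 80 is
 * redirected to 8080 + cpu for cpus 0..3. */
enum { BASE_PORT = 8080, NR_INSTANCES = 4 };

/* Returns the listening port of the instance tied to `cpu`,
 * or -1 when no rule covers that cpu (the packet is not redirected). */
int port_for_cpu(unsigned int cpu)
{
	if (cpu >= NR_INSTANCES)
		return -1;
	return BASE_PORT + (int)cpu;
}
```

A cpu outside 0..3 matches none of the rules, so on a larger machine more rules (or a fallback rule) would be needed.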


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/netfilter/Kbuild   |    1 
 include/linux/netfilter/xt_cpu.h |    8 +++
 net/netfilter/Kconfig            |    9 ++++
 net/netfilter/Makefile           |    1 
 net/netfilter/xt_cpu.c           |   65 +++++++++++++++++++++++++++++
 5 files changed, 84 insertions(+)

diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index bb103f4..5c39a56 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -34,6 +34,7 @@ header-y += xt_helper.h
 header-y += xt_length.h
 header-y += xt_limit.h
 header-y += xt_mac.h
+header-y += xt_cpu.h
 header-y += xt_mark.h
 header-y += xt_multiport.h
 header-y += xt_osf.h
diff --git ...
From: Jan Engelhardt
Date: Thursday, July 22, 2010 - 7:19 am

That is not so strange after all, we have many packet agnostic matches: 
xt_time, xt_condition, xt_IDLETIMER, xt_iface.
So this little comment looks a bit redundant.

Or it seems that academia can't come up with enough new protocols in time that

Please have a read of the "Writing Netfilter Modules" e-book :-)


Well the commands you already have presented in the commit log, and the 

Looks simple enough that it could be done in a single line,

	return (info->cpu == smp_processor_id()) ^ !!info->invert;
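
To see what that one-liner computes, here is a userspace sketch. The struct layout (two 32-bit fields) and the names cpu_match_info and cpu_match are assumptions made for illustration; current_cpu stands in for the kernel's smp_processor_id().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, not taken from the patch. */
struct cpu_match_info {
	uint32_t cpu;
	uint32_t invert;	/* any non-zero value means "negate the match" */
};

/* current_cpu replaces smp_processor_id(), which only exists in the kernel. */
bool cpu_match(const struct cpu_match_info *info, uint32_t current_cpu)
{
	/* !! folds any non-zero invert to 1, so the XOR flips exactly one bit. */
	return (info->cpu == current_cpu) ^ !!info->invert;
}
```

With invert set, the rule matches every cpu except the one named.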
--

From: Eric Dumazet
Date: Thursday, July 22, 2010 - 8:18 am

Ok, let's do that, but I doubt sizeof(int) can be different than 4 on a
Linux 2.6 host right now.

I prefer to avoid the !!info->invert, and do the check only once.
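
One way to read "do the check only once" is to validate the flag when the rule is loaded, so the per-packet path can XOR it directly. The sketch below works under that assumption; cpu_rule, cpu_rule_check and cpu_rule_match are hypothetical names, and the real patch body is not shown in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
struct cpu_rule {
	uint32_t cpu;
	uint32_t invert;
};

/* Run once when the rule is loaded: accept only 0 or 1 for invert. */
int cpu_rule_check(const struct cpu_rule *r)
{
	return (r->invert & ~1u) ? -EINVAL : 0;
}

/* Run per packet: invert is known to be 0 or 1, so no !! is needed. */
bool cpu_rule_match(const struct cpu_rule *r, uint32_t current_cpu)
{
	return (r->cpu == current_cpu) ^ r->invert;
}
```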

Thanks

[PATCH nf-next-2.6] netfilter: add xt_cpu match

In some situations a CPU match permits a better spreading of
connections, or selecting targets only for a given cpu.

With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
affinities, we can distribute traffic on available cpus, per session.
(all RX packets for a given flow are handled by a given cpu)

Some legacy applications are not SMP friendly, so one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that the softirq handler for a whole instance runs on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four-way machine might run four copies of a
server application, using a separate listening port for each instance,
but still presenting a unique external port:

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
        -j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
        -j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
        -j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
        -j REDIRECT --to-port 8083


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/netfilter/Kbuild   |    3 -
 include/linux/netfilter/xt_cpu.h |   11 +++++
 net/netfilter/Kconfig            |    9 ++++
 net/netfilter/Makefile           |    1 
 net/netfilter/xt_cpu.c           |   63 +++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index bb103f4..1041a1d 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -19,12 +19,13 ...
From: Jan Engelhardt
Date: Thursday, July 22, 2010 - 8:39 am

Never say never. "long" already bit people in the past, and now we
have that CONFIG_COMPAT stuff.

If invert is the only flag, perhaps it makes sense to use __u8 

That works nicely indeed. Do you anticipate any future flags?

--

From: Eric Dumazet
Date: Thursday, July 22, 2010 - 9:24 am

I know the "long" problem pretty well: I received one of the first Alpha
machines ever built in the world (DEC 3000 AXP, with a fast 133 MHz

Quite frankly it brings more problems than plain u32

- Possible security problems (padding bytes). Not applicable to
iptables.

- Some arches have slow byte/short accesses (21064 for example :) )

"int" is the natural type, fast on all arches.

- Given alignment requirements of iptables rules, using fewer than 32 bits
here saves no RAM.
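
The alignment point can be checked with a small sketch. It assumes rule data is padded to an 8-byte boundary; the struct names and align_up are hypothetical and used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Two candidate layouts for the match data (illustrative names). */
struct info_u32 { uint32_t cpu; uint32_t invert; };
struct info_u8  { uint32_t cpu; uint8_t  invert; };

/* Round size up to a multiple of align (align must be a power of two). */
size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

Under that assumption, align_up(sizeof(struct info_u8), 8) and align_up(sizeof(struct info_u32), 8) both come out as 8, so the narrower field buys nothing once the data is padded.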

But I don't care that much.

I even see the compiler doesn't want to use an XOR instruction:

00000018 <cpu_mt>:
  18:	55                   	push   %ebp
  19:	8b 42 04             	mov    0x4(%edx),%eax
  1c:	64 8b 15 00 00 00 00 	mov    %fs:0x0,%edx
  23:	89 e5                	mov    %esp,%ebp
  25:	5d                   	pop    %ebp
  26:	39 10                	cmp    %edx,(%eax)
  28:	0f 94 c2             	sete   %dl
  2b:	0f b6 d2             	movzbl %dl,%edx
  2e:	3b 50 04             	cmp    0x4(%eax),%edx
  31:	0f 95 c0             	setne  %al
  34:	c3                   	ret    




--

From: Patrick McHardy
Date: Friday, July 23, 2010 - 4:00 am

Applied, thanks Eric.
--

From: Eric Dumazet
Date: Friday, July 23, 2010 - 6:43 am

Patrick,

Here is the iptables extension for the xt_cpu match.

I used the same changelog as the kernel one; tell me if it's OK or not ;)

Thanks

[PATCH iptables] extension: add xt_cpu match

Kernel 2.6.36 supports the xt_cpu match.

In some situations a CPU match permits a better spreading of
connections, or selecting targets only for a given cpu.

With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
affinities, we can distribute traffic on available cpus, per session.
(all RX packets for a given flow are handled by a given cpu)

Some legacy applications are not SMP friendly, so one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that the softirq handler for a whole instance runs on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four-way machine might run four copies of a
server application, using a separate listening port for each instance,
but still presenting a unique external port:

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
        -j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
        -j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
        -j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
        -j REDIRECT --to-port 8083

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 extensions/libxt_cpu.c           |   98 +++++++++++++++++++++++++++++
 extensions/libxt_cpu.man         |   16 ++++
 include/linux/netfilter/xt_cpu.h |   11 +++
 3 files changed, 125 insertions(+)

diff --git a/extensions/libxt_cpu.c b/extensions/libxt_cpu.c
index e69de29..869998d 100644
--- a/extensions/libxt_cpu.c
+++ b/extensions/libxt_cpu.c
@@ -0,0 +1,98 @@
+/* Shared library add-on to iptables to add CPU match support. */
+#include <stdio.h>
+#include <netdb.h>
+#include <string.h>
+#include ...
From: Patrick McHardy
Date: Friday, July 23, 2010 - 7:13 am

Applied to the iptables-next branch, thanks Eric.
--

From: Jan Engelhardt
Date: Friday, July 23, 2010 - 9:46 am

I will never understand that sort of style mix logic. Why the
C99 initializer only on the sentinel?

{
	{.name = "cpu", .has_arg = true, .val = '1'},
	{NULL},
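
For reference, a self-contained sketch of an option table in which every entry, the sentinel included, uses designated initializers, parsed with getopt_long(). This is not the libxt_cpu.c code, which is elided above; cpu_opts and parse_cpu are hypothetical names.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Same initializer style on every line, sentinel included. */
static const struct option cpu_opts[] = {
	{ .name = "cpu", .has_arg = true, .val = '1' },
	{ .name = NULL },
};

/* Returns the value given to --cpu, or -1 if it is absent or malformed. */
long parse_cpu(int argc, char **argv)
{
	long cpu = -1;
	int c;

	optind = 1;	/* restart scanning so the function can be called again */
	while ((c = getopt_long(argc, argv, "", cpu_opts, NULL)) != -1) {
		char *end;

		if (c != '1')
			return -1;	/* unknown option */
		cpu = strtol(optarg, &end, 10);
		if (end == optarg || *end != '\0' || cpu < 0)
			return -1;	/* not a non-negative decimal number */
	}
	return cpu;
}
```

Calling parse_cpu() with the argument vector { "prog", "--cpu", "3" } returns 3.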




Linux.
--

From: Eric Dumazet
Date: Friday, July 23, 2010 - 10:30 am

Not sure what you mean. You want to save an empty string (1 byte long),


OK ;)

I'll provide a cleanup patch, not only for xt_cpu but for all other iptables
modules that don't meet your coding style requirements ;)

Thanks


--

From: Jan Engelhardt
Date: Friday, July 23, 2010 - 10:53 am

Well nah I'm already on it myself, given Patrick has already imported the
patches.
--
