Has Objective-C ever been considered for kernel development? regards, BPC -
Why not Haskell or Erlang instead? :-D -
I heard of a bash compiler. That would enable development time rationalization and maximize the collaborative convergence of a community-oriented synergy. -
It used to be written in BCPL; or was that Multics? -
> BCPL was typeless, as was the successor B (between Bell Labs and GE we
B isn't quite typeless. It has minimal inbuilt support for concepts like strings (although you can of course multiply a string by an array pointer ;)). It also had some elegances that C lost, notably case ranges (case 1..5:), the ability to do non-zero-biased arrays (x[40]; x-=10;) and the ability to reassign function names (printk = wombat;), as well as stuff like free(function);
Alan (who learned B before C, and is still waiting for P) -
Hey, the language we use, gcC has this too 8-)
[acme@doppio net-2.6.25]$ find . -name "*.c" | xargs grep 'case.\+\.\.' | wc -l
400
[acme@doppio net-2.6.25]$ find . -name "*.c" | xargs grep 'case.\+\.\.' | head
./kernel/signal.c: default: /* this is just in case for now ... */
./kernel/audit.c: case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
./kernel/audit.c: case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
./kernel/audit.c: case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
./kernel/audit.c: case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
./kernel/timer.c: * well, in that case 2.2.x was broken anyways...
./arch/frv/kernel/traps.c: case TBR_TT_TRAP2 ... TBR_TT_TRAP126:
./arch/frv/kernel/ptrace.c: case 0 ... PT__END - 1:
./arch/frv/kernel/ptrace.c: case 0 ... PT__END-1:
./arch/frv/kernel/gdb-stub.c: case GDB_REG_GR(1) ... GDB_REG_GR(63):
[acme@doppio net-2.6.25]$
- Arnaldo -
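The "case LOW ... HIGH:" lines in the grep output use case ranges, which are a GNU C extension rather than ISO C. A minimal sketch, assuming gcc or clang; classify_msg() and its numeric ranges are made up for illustration and are not taken from the kernel sources:

```c
/* Case ranges are a GNU C extension (accepted by gcc and clang).
 * classify_msg() and the numeric ranges are hypothetical. */
int classify_msg(int type)
{
    switch (type) {
    case 1100 ... 1199:   /* keep spaces around "..." with integer bounds */
        return 1;
    case 2100 ... 2999:
        return 2;
    default:
        return 0;
    }
}
```

Both bounds are inclusive, so classify_msg(1199) falls in the first range and classify_msg(1200) falls through to the default.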
Well, original C allowed you to do what you wanted with pointers (I used to teach that back when K&R was "the" C manual). Now people complain about having pointers outside the array, which is a crock in practice, as long
I had forgotten that, the function name was actually a variable with the entry point, say so in section 3.11. And as I recall the code, arrays were the same thing, a length ten vector was actually the vector and variable with the address of the start.
I was more familiar with the B stuff, I wrote both the interpreter and the code generator+library for
I had the BCPL book still on the reference shelf in the office, along with goodies like the four candidates to be Ada, and a TRAC manual. I too expected the next language to be "P".
-- Bill Davidsen <davidsen@tmr.com> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot --
Actually the standards had good reasons to bar this use, because many runtime environments used segmentation and unsigned segment offsets. On a B on Honeywell L66, so that may well have been a relative of your code generator ? --
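In ISO C, computing a pointer that lies before the start of an array is undefined behavior even if it is never dereferenced, so the B idiom "x[40]; x-=10;" has no portable C equivalent. One way to get the same effect is to keep the bias as a plain integer. This is only a sketch; struct biased_vec and biased_at() are hypothetical names:

```c
#include <stddef.h>

/* A view over storage whose first valid index is `base` rather than 0.
 * struct biased_vec and biased_at() are hypothetical names. */
struct biased_vec {
    int *data;    /* points at the real first element */
    long base;    /* index of the first element as seen by callers */
    size_t len;   /* number of elements */
};

/* Return a pointer to element `i`, or NULL if `i` is out of range.
 * The pointer itself is never moved outside the array. */
int *biased_at(struct biased_vec *v, long i)
{
    if (i < v->base || (size_t)(i - v->base) >= v->len)
        return NULL;
    return &v->data[i - v->base];
}
```

With a 40-element array and base 10, indices 10 through 49 are valid, which mirrors what "x-=10" achieved in B.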
Probably the Bell Labs one. I did an optimizer on the Pcode which caught jumps to jumps, then had separate 8080 and L66 code generators into GMAP on the GE and the CP/M assembler or the Intel (ISIS) assembler for 8080. There was also an 8085 code generator using the "ten undocumented instructions" from the Dr Dobbs article. GE actually had a contract with Intel to provide CPUs with those instructions, and we used them in the Terminet(r) printers. Those were the days ;-) -- Bill Davidsen <davidsen@tmr.com> "Woe unto the statesman who makes war without a reason that will still be valid when the war is over..." Otto von Bismarck --
Flash http://www.lagmonster.info/humor/windowsrg.html -- J.A. Magallon <jamagallon()ono!com> \ Software is like sex: \ It's better when it's free Mandriva Linux release 2008.1 (Cooker) for i586 Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 -
No, it has not. Any language that looks remotely like an OO language
has never been considered for (Linux) kernel development, nor for
most, if not all, other operating system kernels.
Various problems occur in an object oriented language. One of them
is garbage collection: it provokes asynchronous delays and, during
an interrupt or a system call for a real time task, the kernel cannot
wait. Another is memory overhead: all the magic that OO languages
provide takes space in memory, and the Linux kernel is used in embedded
systems with very tight memory requirements.
Lots of people will think of better reasons why ObjC is not used...
Loïc Grenié
-
But are embedded systems not rapidly moving on? Turning to stare at the
Which I'm looking forward to hearing :) Thank you for your appropriate response. -- Regards BPC -
Here are a few reasons off the top of my head:
1. Adding extra unneeded complexity. Debugging would be harder.
2. Not many people can code ObjC when compared to the number of C coders.
3. If it ain't broken... Why fix it. The kernel works, right? Good.
You can find a great explanation somewhere out there, I'm not sure who wrote it and the thing was explaining why C++ is not a great choice for the Linux kernel. Some things going against C++ will also go against ObjC. I cannot find it, but it is out there somewhere. I'm a newbie and I might be wrong, but the above is what I believe to be true. Karol Swietlicki -
Some embedded systems run on batteries, so the less RAM they have to power the better, and the fewer CPU cycles they have to spend executing code the less power they consume. An ADSL modem on your desk doesn't have any of those worries, it just has to work and if doubling the RAM cuts the development problems by a lot, then that might have been a worthwhile trade off. -- Len Sorensen -
I have tried it in a toy kernel. Oskit style. The code reuse is very high, especially with string ops and driver interfaces. It's also very easy to do unit testing with. My main problem was the quality of the compiler optimization. It's just not good enough. I think if the compiler can do the right kind of optimizations correctly then a low overhead OO language like Objective-C can be used in a kernel. On the other hand it's the automated testing part that really matters for me. Imagine adding features to Linux week after week without ever getting a serious panic or two. And then getting a big performance boost. It's all about optimizations. -- Democracy is about two wolves and a sheep deciding what to eat for dinner.
Well, I really would like to learn some things here, could we
keep this off-topic thread alive just a bit, please?
(I know, I'm going to gain a troll's fame because I can't avoid this
I think BeOS was C++ and OSX is C+ObjectiveC (and runs on an iPhone).
Original MacOS (from 6 to 9) was Pascal (and a Mac SE was very near
to embedded hardware :) ).
I do not advocate rewriting Linux in C++, but don't say a kernel written
C++ (and, from what I read in another answer, ObjectiveC too) has no garbage
collection. It does not do anything you did not tell it to do. It just allows
you to change this
struct buffer *x;
x = kmalloc(...);
x->sz = 128;
x->buff = kmalloc(...);
...
kfree(x->buff);
kfree(x);
to
struct buffer *x;
x = new buffer(128);    (that itself allocates x->buff,
                         because _you_ programmed it,
                         so you, poor programmer, don't forget)
...
delete x;               (that also was programmed to deallocate
                         x->buff itself, so you have one less
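The same pairing of allocation and cleanup can also be written by hand in plain C, which is roughly what kernel code does today. A minimal user-space sketch, assuming malloc()/free() as stand-ins for kmalloc()/kfree(); buffer_new() and buffer_delete() are hypothetical names:

```c
#include <stdlib.h>

struct buffer {
    size_t sz;
    char *buff;
};

/* "Constructor": allocates the object and its payload in one place,
 * and undoes the first allocation if the second one fails. */
struct buffer *buffer_new(size_t sz)
{
    struct buffer *x = malloc(sizeof(*x));

    if (!x)
        return NULL;
    x->buff = malloc(sz);
    if (!x->buff) {
        free(x);
        return NULL;
    }
    x->sz = sz;
    return x;
}

/* "Destructor": frees the payload, then the object. NULL is accepted,
 * in the same way free() accepts it. */
void buffer_delete(struct buffer *x)
{
    if (!x)
        return;
    free(x->buff);
    free(x);
}
```

The difference from the C++ version is that nothing forces callers to go through these two functions; that discipline is by convention only.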
A vtable in C++ takes exactly the same space as the function
table pointer present in every driver nowadays... and probably
the virtual method call that C++ does itself with
thing->do_something(with,this)
like
push thing
push with
push this
call THING_vtable+indexof(do_something) // constants at compile time
is much more efficient than what gcc can manage to do with
thing->do_something(with,this,thing)
push with
push this
push thing
get thing+offsetof(do_something) // not constant at compile time
dereference it
call it
(that is, get a generic field on a structure and use it as jump address)
In short, the kernel is object oriented, implements OO programming by
hand, but the compiler lacks the knowledge that it is object oriented
People usually complain about RTTI or exceptions, but benefits versus
memory space should be seriously considered (sure there is something
in current drivers to ask 'are ...
struct test;
struct testVtbl
{
    int (*fn1)(struct test *t, int x, int y);
    int (*fn2)(struct test *t, int x, int y);
};
struct test
{
    struct testVtbl *vtbl;
    int x, y;
};
void testCall(struct test *t, int x, int y)
{
    t->vtbl->fn1(t, x, y);
    t->vtbl->fn2(t, x, y);
}
and
struct test
{
    virtual int fn1(int x, int y);
    virtual int fn2(int x, int y);
    int x, y;
};
void testCall(struct test *t, int x, int y)
{
    t->fn1(x, y);
    t->fn2(x, y);
}
generate instruction-for-instruction identical code.
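To make the C variant above concrete, here is a sketch that fills in one table and calls through it. The add_fn1()/add_fn2() functions and add_vtbl are hypothetical, and the table pointer is declared const here, which is a small departure from the snippet above:

```c
struct test;

struct testVtbl {
    int (*fn1)(struct test *t, int x, int y);
    int (*fn2)(struct test *t, int x, int y);
};

struct test {
    const struct testVtbl *vtbl;
    int x, y;
};

/* Hypothetical "methods" for one concrete kind of struct test. */
static int add_fn1(struct test *t, int x, int y)
{
    (void)y;
    t->x += x;
    return t->x;
}

static int add_fn2(struct test *t, int x, int y)
{
    (void)x;
    t->y += y;
    return t->y;
}

/* One shared, read-only table per "class"; every instance points at it. */
const struct testVtbl add_vtbl = { add_fn1, add_fn2 };

void testCall(struct test *t, int x, int y)
{
    t->vtbl->fn1(t, x, y);
    t->vtbl->fn2(t, x, y);
}
```

What C++ adds over this is that the compiler sets the table pointer and passes the object argument for you; the calls themselves have the same shape.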
--
Nicholas Miell <nmiell@comcast.net>
-
This is not what vtables are. Think for a minute - all codepaths arriving to that point in your code will pick the address to call from the same location. Either the contents of that location is constant (in which case you could bloody well call it directly in the first place) *or* it has to somehow be reassigned back and forth, according to the value of this. The former is dumb, the latter - outright insane. The contents of vtables is constant. The whole point of that thing is to deal with the situations where we _can't_ tell which derived class this ->do_something() is from; if we could tell which vtable it is at compile time, we wouldn't need to bother at all. It's a tradeoff - we pay the extra memory access (fetch vtable pointer, then fetch method from vtable) for not having to store a slew of method pointers in each instance of base class. But the extra memory access is very much there. It can be further optimized away if you have several method calls for the same object next to each other (then vtable can be picked once), but it's still done at runtime. -
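The tradeoff described above (one extra memory access per call in exchange for not storing every method pointer in every instance) can be seen in a small C sketch. All names here are hypothetical, and the exact sizes depend on the platform:

```c
#include <stddef.h>

/* Shared method table: one copy per "class". */
struct ops {
    int (*open)(void *self);
    int (*read)(void *self);
    int (*write)(void *self);
    int (*close)(void *self);
};

/* Layout A: every instance carries all four method pointers. */
struct fat_obj {
    int (*open)(void *self);
    int (*read)(void *self);
    int (*write)(void *self);
    int (*close)(void *self);
    int state;
};

/* Layout B: every instance carries one pointer to the shared table. */
struct thin_obj {
    const struct ops *ops;
    int state;
};

/* A: load the method pointer from the object, then call. */
int fat_read(struct fat_obj *o)
{
    return o->read(o);
}

/* B: load the table pointer, then the method pointer, then call.
 * This is the extra, runtime memory access. */
int thin_read(struct thin_obj *o)
{
    return o->ops->read(o);
}

/* Bytes saved per instance by sharing the table. */
size_t bytes_saved_per_instance(void)
{
    return sizeof(struct fat_obj) - sizeof(struct thin_obj);
}
```

With many instances and many methods the saving in layout B grows, while the cost stays at one additional dependent load per call.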
Yup, my mistake (that's why I said I would learn something). I was thinking
of non-virtual methods. For virtual ones you have to fetch the vtable
--
J.A. Magallon <jamagallon()ono!com>
-
True. C++ vtables have no performance advantage over C ->ops->function() calls. But they have no disadvantage either and they do offer many syntactic advantages (such as automatically casting the object type to the *correct* derived class). --
Well I am pretty sure the micro kernel of OS X is in C, and certainly the BSD layer is as well. So the only ObjC part would be the nextstep framework and other parts of the Mac GUI and other Mac APIs they provide, which all at some point probably end up calling down into the C
But kmalloc is implemented by the kernel. Who implements 'new'? -- Len Sorensen --
Help yourself... as kmalloc() is a replacement for userspace glibc's
malloc, you can write your replacements for functions/operators in
libstdc++ (operators are just cosmetic, as many other features in C++)
In fact, for someone who dared to write a kernel C++ framework, the
very first function he has to write could be something like:
void* operator new(size_t sz)
{
    return kmalloc(sz, GFP_KERNEL);
}
And could write alternatives like
operator new(size_t sz, int flags)   -> x = new(GFP_ATOMIC) X;
operator new(size_t sz, MemPool& pl) -> x = new(pool) X;
If you are curious, this page http://www.osdev.org/wiki/C_PlusPlus
has some clues about what you should implement to get rid of
libstdc++.
--
J.A. Magallon <jamagallon()ono!com>
--
To my recall: Never.
Some limited subset of C++ was tried, but was soon abandoned.
Overall the kernel data structures are done in objectish-manner,
although there are no strong type mechanisms being used.
Could the kernel be written in a limited subset[*] of ObjC ? Very likely.
Would it be worth the job ? Radical decrease in number of available
programmers...
*) Subset as enforcing the rule of not even indirectly using dynamic
memory allocation, when operating in interrupt state.
/Matti Aarnio
-
Objective-C is actually a pretty minimal wrapper around C; it was originally implemented as a C preprocessor. It generally does not have any kind of memory management, garbage collection, or anything else (although typically a "runtime" will provide those features). There are no first-class exceptions, so there would be nothing to worry about there (the exceptions used in GUI programs are built around the setjmp/longjmp primitives). Objective-C is also almost completely backwards-compatible with C, much more so than C++ ever was. As far as the runtime goes the kernel would be expected to write its own, the same way that it implements "kmalloc()" as part of a "C runtime". Since the runtime itself never does any implicit memory allocation, I think it would conceivably even be relatively safe for kernel usage. With that said, there is a significant performance penalty as all Objective-C method calls are looked up symbolically at runtime for every single call. For GUI programs where large chunks of the code are event-loops and not performance-sensitive that provides a huge amount of extra flexibility. In the kernel though, there are many codepaths where *every* *single* instruction counts; that could be a serious performance hit. Cheers, Kyle Moffett -
GACK! At least C++ has vtables. -hpa -
In a tight loop there is a way to do a single symbolic lookup and just call directly through a function pointer, but typically it isn't necessary for GUI programs and the like. The flexibility of being able to dynamically add new methods to an existing class (at least for desktop user interfaces) significantly outweighs the performance cost. Any performance-sensitive code is typically written in straight C anyways. Cheers, Kyle Moffett -
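The "look up once, then call through a function pointer" trick can be illustrated with a toy dispatcher in C. This is not the Objective-C runtime, which uses selectors and its own caching; the string-keyed table and all names below are assumptions made purely for illustration:

```c
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(int);

struct method {
    const char *selector;
    method_fn fn;
};

static int twice(int v)  { return 2 * v; }
static int negate(int v) { return -v; }

/* Toy "class": a table mapping method names to implementations. */
static const struct method methods[] = {
    { "twice",  twice  },
    { "negate", negate },
};

/* Symbolic lookup: walk the table comparing names. */
method_fn lookup(const char *selector)
{
    size_t i;

    for (i = 0; i < sizeof(methods) / sizeof(methods[0]); i++)
        if (strcmp(methods[i].selector, selector) == 0)
            return methods[i].fn;
    return NULL;
}

/* Naive loop: pays for a lookup on every iteration. */
int sum_lookup_each_time(int n)
{
    int i, sum = 0;

    for (i = 0; i < n; i++)
        sum += lookup("twice")(i);
    return sum;
}

/* Tight loop: one lookup, then direct calls through the cached pointer. */
int sum_cached(int n)
{
    method_fn fn = lookup("twice");
    int i, sum = 0;

    if (!fn)
        return 0;
    for (i = 0; i < n; i++)
        sum += fn(i);
    return sum;
}
```

Both loops compute the same result; the cached version gives up the ability to notice a method being replaced while the loop is running, which is the flexibility the dynamic lookup buys.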
Write *those* *codepaths* in *C* or *assembly*. But only after you manage to measure a difference compared to the object-oriented systems language. [I really doubt there are that many of these; syscall entry/dispatch/exit, interrupt dispatch, context switch, what else?] --
Networking, block IO, page fault, ... But only the fast paths in these cases. A lot of the kernel is slow path code and could probably be written even in an interpreted language without much trouble. -Andi --
Even these (with the exception of the page fault path) are hardly "we care about a single instruction" material suggested above. Even with a million packets per second per core (does such a setup actually exist?) You have a few thousand cycles per packet. For block you'd need around 5,000 disks per core to reach such rates. The real benefits aren't in keeping close to the metal, but in high level optimizations. Ironically, these are easier when the code is a little more abstracted. You can add quite a lot of instructions if it allows you not to do some of the I/O at all. --
With 10Gbit/s ethernet working you start to care about every cycle. Similar with highend routing or in some latency sensitive network applications (e.g. in HPC). Another simple noticeable case is Unix sockets and your X server communication. And there are some special cases where block IO is also pretty critical. A popular one is TPC-* benchmarking, but there are also others and it looks likely in the future that this will become more critical
While that's partly true -- cache misses are good for a lot of cycles -- it is not the whole truth and at some point raw code efficiency matters too. For example there are some CPUs that are relatively slow at indirect function calls and there are actually cases where this can be measured. -Andi --
If you have 10M packets/sec no amount of cycle-saving will help you. You need high level optimizations like TSO. I'm not saying we should
True. And here, the hardware can cut hundreds of cycles by avoiding the
Your reflexes are *much* better than mine if you can measure half a nanosecond on X. Here, it's scheduling that matters, avoiding large transfers, and avoiding ping-pongs, not some cycles on the unix domain socket. You already paid 150 cycles or so by issuing the syscall and thousands for
And again the key is batching, improving cpu affinity, and caching, not
That is true. But any self-respecting systems language will let you choose between direct and indirect calls. If adding an indirect call allows you to avoid even 1% of I/O, you save much more than you lose, so again the high level optimizations win. Nanooptimizations are fun (I do them myself, I admit) but that's not where performance as measured by the end user lies.
-- error compiling committee.c: too many arguments to function --
A lot of applications don't and the user space networking schemes
That's not about mouse/keyboard input, but about all X protocol communication between X clients and X server. The key is not large copies here
That's not the whole story no. Batching etc are needed, but the
It depends. Often high level (and then caching) optimizations are better bang for the buck, but completely disregarding the fast path work is a bad thing too. As an example see Christoph's recent work on the slub fastpath which makes a quite measurable difference on benchmarks. -Andi --
Huh? At 4 GHz, you have 400 cycles to process each packet. If you need to
route those packets, those cycles may just be what you need to lookup a
forwarding table and perform a few MMIO on an accelerated chip which will
take care of the transfer. But you need those cycles. If you start to waste
It just depends how many times a second it happens. For instance, consider
this trivial loop (fct is an array of two functions which just return 1 and 2) :
i = 0;
for (j = 0; j < (1 << 28); j++) {
    k = (j >> 8) & 1;
    i += fct[k]();
}
It takes 1.6 seconds to execute on my athlon-xp 1.5 GHz. If, instead of
changing the function once every 256 calls, you change it to every call :
i = 0;
for (j = 0; j < (1 << 28); j++) {
    k = (j >> 0) & 1;
    i += fct[k]();
}
Then it takes 4.3 seconds, which is about 3 times slower. The number
of calls per function remains the same (128M calls each), it's just the
branch prediction which is wrong every time. The very few nanoseconds added
at each call are enough to slow down a program from 1.6 to 4.3 seconds while
it executes the exact same code (it may even save one shift). If you have
such stupid code, say, to compute the color or alpha of each pixel in an
image, you will certainly notice the difference.
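For anyone who wants to repeat the experiment, the two loops above can be folded into one self-contained function. This is only a sketch: one(), two() and run_loop() are hypothetical names, timing is left to the caller, and an optimizing compiler may transform the indirect calls, so results will vary by compiler and CPU:

```c
static int one(void) { return 1; }
static int two(void) { return 2; }

/* The two-function array from the message above. */
static int (*const fct[2])(void) = { one, two };

/* shift = 8: the call target changes once every 256 calls.
 * shift = 0: the call target changes on every call.
 * The returned sum is the same either way; only the pattern of
 * indirect branch targets differs. */
long run_loop(unsigned long iterations, unsigned int shift)
{
    unsigned long j;
    long i = 0;

    for (j = 0; j < iterations; j++) {
        unsigned int k = (j >> shift) & 1;

        i += fct[k]();
    }
    return i;
}
```

Calling run_loop(1UL << 28, 8) and run_loop(1UL << 28, 0) under a timer reproduces the comparison; both do the same number of calls to each function.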
And such inefficient code may happen very often when you blindly rely
You are forgetting something very important : once you start stacking
functions to perform the dirty work for you, you end up with so much
abstraction that even new stupid code cannot be written at all without
relying on them, and it's where the problem takes its roots, because
when you need to write a fast function and you notice that you cannot
touch a variable without passing through a slow pinhole, your fast
function will remain slow whatever you do, and the worst of all is that
you will think that it is normally fast and that it cannot be written
Every ...
On Mon, 3 Dec 2007 22:13:53 +0100, Willy Tarreau <w@1wt.eu> wrote:
But don't forget that OOP is just another way to organize your code,
and let the language/compiler do some things you shouldn't be doing,
like filling in a vtable pointer, which is error prone.
And of course everything depends on what language you choose and how
you use it.
You could write an equally efficient kernel in languages like C++,
using C++ abstractions as a high level organization, where
the fast paths could be coded the right way; we are not talking about
C# or Java, where even a sum is a call to an overloaded method.
It's the difference between doing school-book pushes and pops on lists,
and suddenly inventing the splice operator...
--
J.A. Magallon <jamagallon()ono!com>
--
It's very very hard to generate good C++ code because of the numerous ways objects get temporarily created, and the weak aliasing rules (as with C). There are reasons that Fortran lives on (and no I'm not suggesting one should rewrite the kernel in Fortran ;)) and the fact it's not really got pointer aliasing or "address of" operators and all the resulting optimisation problems is one of the big ones. Alan --
That is what I like about C++, with good placement of high level features
like const's and & (references) one can gain fine control over what
gets copied or not.
Try to write a Vector class that does ops with SSE without storing
temporaries on the stack. It's a good example of how one can get low
level control, and gcc is pretty good at simplifying things like u=v+2*w
and not putting anything on the stack, all in xmm registers.
The advantage is you only have to be careful one time, when you write
--
J.A. Magallon <jamagallon()ono!com>
--
But...if there's some way Linux can get "language improvements", it is with new C standards, gcc extensions, etc. It'd be nice if people tried to add (useful) C extensions to gcc, instead of proposing some random language :) --
I really doubt Linux spends 400 cycles routing a packet. Look what an skbuff looks like. A flood ping to localhost on a 2GHz system takes 8 microseconds, that's 16,000 cycles. Sure it involves userspace, but you're about two orders of magnitude off. And the localhost interface is nicely cached in L1
This happens very often in HPC, and when it does, it is often worthwhile to invest in manual optimizations or even assembly coding. Unfortunately it is very rare in the kernel (memcmp, raid xor, what else?). Loops with high iteration counts are very rare, so any attention you give to the loop body is not amortized over a large number
Using an indirect call where a direct call is sufficient will also reduce the compiler's optimization opportunities. However, I don't see anyone recommending it in the context of systems programming. It is not true that the number of indirect calls necessarily increases if you use a language other than C.
I don't understand. Can you give an example?
There are two cases where abstraction hurts performance: the first is where the mechanisms used to achieve the abstraction (functions instead of direct access to variables, function pointers instead of duplicating the caller) introduce performance overhead. I don't think C has any advantage here -- actually a disadvantage as it lacks templates and is forced to use function pointers for nontrivial cases. Usually the abstraction penalty is nil with modern compilers. The second case is where too much abstraction clouds the programmer's
A 100 byte program will print "hello world" on a UART and stop. A modern program will load a vector description of a font, scale it to the desired size, render it using anti aliasing and sub-pixel positioning, lay it out according to the language rules of wherever you live, and place it on a multi-megabyte frame buffer. Yes it needs hundreds of
That is true, that is why we see a lot more microoptimizations than algorithmic ...
Hi Avi,
That's not what I wrote. I just wrote about doing forwarding table lookup
and MMIO so that dedicated hardware NICs can process the recv/send to the
correct ends. If you just need to scan a list of DMAed packets, look at
their destination IP address, look up that IP in a table to find the output
NIC and destination MAC address, link them into an output list and wake
the output NIC up, there's nothing which requires more than 400 cycles
here. I never said that it was a requirement to pass through the existing
I don't see where you see a userspace (or I don't understand your test).
On traffic generation I often do from user space, I can send 630 k raw
ethernet packets per second from userspace on a 1.8 GHz opteron and PCI-e
NICs. That's 2857 cycles per packet, including the (small amount of)
Well, in my example above, everything in the path of the send() syscall down
to the bare metal NIC is under high pressure in a fast loop. 30 cycles
already represent 1% of the performance! In fact, to modulate speed, I
Yes, the most common examples found today involve applications reading
data from databases. For instance, let's say that one function in your
program must count the number of unique people with the name starting
with an "A". It is very common to see "low-level" primitives to abstract
the database for portability purposes. One such primitive will
generally consist of retrieving a list of people with their names,
age and sex in one well-formatted 3-column array. Many lazy people will
not see any problem in calling this one from the function described
above. Basically, what they would do is :
count_people_with_name_starting_with_a()
-> array[name,age,sex] = get_list_of_people()
-> while read_one_people_entry() {
alloc(one_line_of_3_columns)
read then parse the 3 fields
format_them_appropriately
}
-> create a new array "name2" by duplicating the "name" column
-> name3 = ...
If you're writing a single-purpose program then there is justification
to micro-optimize it to death. Write it in VHDL, even. But that
Having an interface to send multiple packets in one syscall would cut
Your optimized version is wrong. It counts duplicated names, while you
stated you needed unique names. Otherwise the sort_unique step is
completely redundant.
Databases are good examples of where the abstraction helps. If you had
hundreds of millions of records in your example, you'd connect to a
database, present it with an ASCII string describing what you want, upon
which it would parse it, compile it into an internal language against
the schema, optimize that and then execute it. Despite all that
abstraction it would win against your example because it would implement
the inner loop as
open index (by name)
seek to 'A'
while (current starts with 'A')
++count (taking care of the uniqueness requirement if
needed)
close index
Thus it would never see people whose name begins with 'W'. If the
database had a materialized view feature, and this particular query was
deemed important enough, it would optimize it to
open materialized view
read count
close materialized view
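The index scan described above ("seek to 'A', count while the current entry starts with 'A'") can be sketched in C over a sorted array standing in for the database index. The function names and the in-memory array are assumptions for illustration only; a real database index is a disk-backed structure:

```c
#include <stddef.h>
#include <string.h>

/* "Seek": first position in sorted `names` whose entry is >= key. */
size_t lower_bound(const char *const *names, size_t n, const char *key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(names[mid], key) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Count distinct names starting with `prefix`. Only the matching range
 * is visited; entries sorting after it are never looked at. */
size_t count_unique_with_prefix(const char *const *names, size_t n,
                                const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t start = lower_bound(names, n, prefix);
    size_t i, count = 0;

    for (i = start; i < n && strncmp(names[i], prefix, plen) == 0; i++) {
        /* Sorted input puts duplicates next to each other. */
        if (i == start || strcmp(names[i], names[i - 1]) != 0)
            count++;
    }
    return count;
}
```

The work is proportional to the logarithm of the table size plus the number of matching rows, rather than to the whole table as in the fetch-everything version.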
The database does all this while allowing concurrent reads and writes
and keeping your data in case someone trips on the power cord. You
If the abstraction is badly written, and further you cannot change it,
then of course it hurts. But if the abstraction is well written, or if
it can be fixed, then all is well. The problem here is not that
abstractions exist, but that you persist in using a broken API instead
That's life. The fact is that users demand features, and programmers
cater to them. If you can find a way to provide all those features
without the bloat, more power to you. The abstractions here are not the
cause of the bloat, they are the tool used to provide the features while
You don't need ...
Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per second. (theoretical peak at 1514bytes/frame) Granted, installing such a device on a single CPU/single core machine is absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD Barcelona) it can still generate ~1M packets/s per core. Now assuming you're doing low-level (passive) filtering of some sort (frame/packet routing, traffic interception and/or packet analysis) using hardware assistance (TSO, complete TCP offloading, etc) is off the table and each and every cycle within netif_receive_skb (and friends) -counts-. I don't suggest that the kernel should be (re)designed for such (niche) applications but on the other hand, if it works... - Gilboa --
Sigh... Sorry. Please ignore the broken math on my part. Make that 1.8M frames/second per card and ~100K packets/second per core. - Gilboa --
I was involved in a 10GbE project like you're describing not too long ago. Only the driver, and only a tight, lean, special purpose driver at that, was able to deal with line rate volumes. This was in a real appliance, where faster CPUs were not an option. In fact, no hardware changes were possible due to the issues with squeezing in the 10GbE NICs. This project would have been impossible without the speed and deterministic behavior of the kernel C environment. Casey Schaufler casey@schaufler-ca.com --
10GbE can't do 14M packets per second if the packets are 1514 bytes. At 10M packets per second you have less than 1000 bits per packet, which is far from 1514bytes. 10Gbps gives you at most 1.25GBps, which at 1514 bytes per packet works out to 825627 packets per second. You could reach ~14M packets per second with only the smallest packet size, which is rather unusual for high throughput traffic, since you waste almost all the bytes on overhead in that case. But you do want to be able to handle at least a million or two packets per second to do 10GbE. -- Len Sorensen --
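The arithmetic in this part of the thread is easy to check with two integer helpers. The function names are hypothetical. The 1514-byte case follows the message's own calculation, which ignores per-frame overhead; the 84-byte case assumes a minimum 64-byte Ethernet frame plus 8 bytes of preamble and 12 bytes of inter-frame gap on the wire:

```c
#include <stdint.h>

/* Frames per second a link can carry, given the number of bytes each
 * frame occupies on the wire. Integer division rounds down. */
uint64_t frames_per_second(uint64_t link_bits_per_sec,
                           uint64_t wire_bytes_per_frame)
{
    return link_bits_per_sec / 8 / wire_bytes_per_frame;
}

/* CPU cycles available per packet at a given clock and packet rate. */
uint64_t cycles_per_packet(uint64_t clock_hz, uint64_t packets_per_sec)
{
    return clock_hz / packets_per_sec;
}
```

For 10 Gbit/s, frames_per_second() gives 825627 for 1514-byte frames and 14880952 for 84 wire bytes per frame, matching the "~14M with only the smallest packet size" remark. The cycle budgets quoted elsewhere in the thread come out the same way: 4 GHz at 10M packets/s is 400 cycles, and 1.8 GHz at 630k packets/s is 2857 cycles.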
... I corrected my math in the second email. [1] Nevertheless, a VoIP network (e.g. G729 and friends) can generate the maximum number of frames allowed on 10GbE Ethernet which is, AFAIR, just below 15M -per- port. (~29M on a dual port card) While I doubt that any non-NPU based NIC can handle such a load, on mixed networks we're already seeing well above 1M frames per port. - Gilboa [1] http://lkml.org/lkml/2007/12/3/69 --
C also requires a (very minimal) runtime. And I don't see how having a runtime disqualifies a language from being usable in a kernel; the runtime is just one more library, either supplied by the compiler or by
Object orientation in C leaves much to be desired; see the huge number of void pointers and container_of()s in the kernel. --
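For readers who have not met the container_of() idiom mentioned above: it recovers a pointer to an enclosing structure from a pointer to one of its members, which is how C code emulates "downcasting" from an embedded base part. The macro below is a simplified local stand-in built on offsetof(), not the kernel's own definition, and the struct names are hypothetical:

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(); it does no
 * type checking on `ptr`. */
#define my_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic part that list-handling code knows about. */
struct list_node {
    struct list_node *next;
};

/* Specific object embedding the generic part. */
struct device {
    int id;
    struct list_node node;
};

/* Given a pointer to the embedded node, recover the enclosing device. */
struct device *device_from_node(struct list_node *n)
{
    return my_container_of(n, struct device, node);
}
```

Nothing stops a caller from passing a node that is not actually embedded in a struct device; the compiler cannot check it, which is the weakness the message above is pointing at.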
While true, this isn't such a bad problem. A language really sucks when it tries to disallow something useful. Back in university I was forced to write system software in pascal. Simple pointer arithmetic became a 5-line piece of code. Imo the main advantage of C is simply that it doesn't get in the way. Jörn -- But this is not to say that the main benefit of Linux and other GPL software is lower-cost. Control is the main benefit--cost is secondary. -- Bruce Perens --
Well the majority of C syntax requires no runtime library. There are some system call like things that you often want that need a library (like malloc and such), but those aren't really part of C itself. Of course without malloc and printf and file i/o calls the program would probably be a bit boring. I have written some small C programs without a runtime, where the few things I needed were implemented in assembly
As a programming language, C leaves much to be desired. -- Len Sorensen --
No. Kernel programming requires what is essentially assembly language with a lot of syntactic sugar, which C provides. Higher-level languages abstract away too much detail to be suitable for the sort of bit-perfect control you need when you're directly controlling bare metal. You can still use object-oriented programming techniques in C, and we do this all the time in the kernel, but we do so with more fine-grained explicit control than a language like Objective-C would give us. More to the point, if we tried to use Objective-C, we'd find ourselves needing to fall back to C-style explicitness so often that it wouldn't be worth the trouble. In other news, I hear Hurd boots again! -- Chris -
I somewhat disagree. Kernel programming requires and deserves the same care, rigor and eye to details as all other serious systems. Whilst performance is always a consideration, high-level languages give a reward in ease of expression and improved reliability, such that a notional performance cost is easily justified. Occasionally, precise bit-diddling or tight timing requirements might necessitate use of assembly; even so, a lot of bit-diddling can be expressed in high-level languages. Kernel programming might require a scintilla of assembly language, but the very vast majority of it should be written in a high-level language. There's an old joke that claims, "real programmers can write FORTRAN in any language." It's true. Object orientation is a style of programming, not a language, and while certain languages have intrinsic support for this style, objects, methods, properties and inheritance can probably be written in any language. It's an issue of putting in care and eye to detail. Linux could be written in Objective-C, it could be written in Pascal, but it is written in plain C, with a smattering of assembler. Does it need to be more complicated than that? --
