"Can we please finish up this merge a little more before we freeze 2.6.24? The way we currently have leftovers of arch/i386/ and arch/x86_64/ is quite a nightmare and not how the other architectures were merged," Christoph Hellwig asked, leading to an insightful reply by Ingo Molnar. Ingo began by noting, "to answer that question one should first be aware of the fundamental code quality problems that the unified x86 architecture has inherited from the split i386 and x86_64 architectures." He then utilized the checkpatch script to generate a table of "coding style errors per one thousand lines of source code". In his table, arch/i386/ rated 77.3 errors per thousand lines of source, with arch/x86_64/ rating 96.0. The new unified arch/x86/ rated a lower but still very high 74.1. He summarized, "it is plainly obvious that the x86_64 and i386 architectures were in a dreadful state of code quality before the unification. Their code quality was almost an order of magnitude worse than that of the core kernel (!) - and their code quality was significantly worse than that of a couple of other, comparable architectures." Ingo continued:
"So to answer your question: full unification is no easy task and it is not automatic at all. The x86_64 tree has diverged from the i386 tree in the past 5 years due to their illogical, forced separation and a resulting bitrot. The two architectures have grown different sets of cleanliness problems and different sets of functions with arbitrary differences that often cover the same functionality. It's all compounded by the fact that the 64-bit code is in worse shape than the 32-bit - so it's not like we could just pick the 64-bit code and use that as the unified code. The 32-bit code is also used about 8-10 times more frequently than the 64-bit code. So there is no easy 'just unify it' path."
"The new maintainers of the x86 architecture (Thomas, Peter and me), and many other x86 developers are highly motivated to improve the x86 architecture's code quality and unify the heck out of it, and there are some real improvements in 2.6.24-rc1 already, but we _must_ be (and are) working on this carefully. So we do unifications on a case by case basis, with the highest priority being to not introduce 'unification regressions'. The x86 architecture is the most common Linux architecture after all - and users care much more about having a working kernel than they care about cleanups and unifications. So yes, we agree with you, but please be patient! :-) This cannot be realistically finished in v2.6.24, without upsetting the codebase."
From: Christoph Hellwig
Subject: Re: Linux v2.6.24-rc1
Date: Oct 24, 1:04 am 2007
On Tue, Oct 23, 2007 at 09:19:16PM -0700, Linus Torvalds wrote:
> In short, we just had an unusually large amount of not just x86 merges,
Btw, can we please finish up this merge a little more before we freeze
2.6.24? The way we currently have leftovers of arch/i386/ and arch/x86_64/
is quite a nightmare and not how the other architectures were merged.
Thomas, what again prevents us from just killing these leftovers?
-
From: Ingo Molnar
Subject: Re: Linux v2.6.24-rc1, x86 arch code quality, unifications
Date: Oct 24, 3:17 am 2007
* Christoph Hellwig <hch@infradead.org> wrote:
> On Tue, Oct 23, 2007 at 09:19:16PM -0700, Linus Torvalds wrote:
> > In short, we just had an unusually large amount of not just x86 merges,
>
> Btw, can we please finish up this merge a little more before we freeze
> 2.6.24? The way we currently have leftovers of arch/i386/ and
> arch/x86_64/ is quite a nightmare and not how the other architectures
> were merged.
>
> Thomas, what again prevents us from just killing these leftovers?
to answer that question one should first be aware of the fundamental
code quality problems that the unified x86 architecture has inherited
from the split i386 and x86_64 architectures.
To get objective and automated metrics about code quality, i've
constructed a table of "coding style errors per one thousand lines of
source code" numbers with the help of the latest checkpatch.pl. The
codebases i measured are the pre-merge i386 and x86_64 tree, the
post-merge arch/x86 unified architecture, and i've also added a handful
of other architectures and selected core subsystems, as comparison:
--------------|------------|----------------|--------------------------------
codebase      |     errors |  lines of code | errors/KLOC (smaller is better)
--------------|------------|----------------|--------------------------------
arch/i386/    |       5717 |          73865 |  77.3
arch/x86_64/  |       2993 |          31155 |  96.0
arch/x86/     |       8504 |         114654 |  74.1
..............|............|................|................................
arch/ia64/    |       1779 |          64022 |  27.7
arch/mips/    |       2110 |          94692 |  22.2
arch/sparc64/ |       1387 |          49253 |  28.1
..............|............|................|................................
kernel/       |        762 |          83540 |   9.1
kernel/time/  |         15 |           4191 |   3.5
kernel/irq/   |          1 |           2317 |   0.4
mm/           |        464 |          46324 |  10.0
net/core      |        176 |          24413 |   7.2
..............|............|................|................................
a couple of observations. Firstly, it is plainly obvious that the x86_64
and i386 architectures were in a dreadful state of code quality before
the unification. Their code quality was almost an order of magnitude
worse than that of the core kernel (!) - and their code quality was
significantly worse than that of a couple of other, comparable
architectures. (we knew this when we started the x86 unification effort
- but i suspect it's even more apparent via the hard numbers in this
table.)
( Note: code metrics should be taken with a grain of salt, as they
often over-simplify the picture, but in this particular situation the
trends are clear and the numbers match my personal impressions of
code quality and robustness of these codebases. )
paradoxically, it was the x86_64 architecture that had a _worse_ code quality
than the "legacy" 32-bit code - so much for the "newer code must be
better" misconception. The first, mechanical round of unifications thus
brought a net degradation in quality - but we've reversed that trend in
2.6.24-rc1 already, via unifications and cleanups, as it can be seen
from the table. (and we did that while adding new features like
high-resolution timers and dynticks to the x86-64bit architecture in
v2.6.24-rc1 - or the new IOMMU code. So the x86 architecture is not
standing still at all while the unification is going on.)
so to answer your question: full unification is no easy task and it is
not automatic at all. The x86_64 tree has diverged from the i386 tree in
the past 5 years due to their illogical, forced separation and a
resulting bitrot. The two architectures have grown different sets of
cleanliness problems and different sets of functions with arbitrary
differences that often cover the same functionality. It's all compounded
by the fact that the 64-bit code is in worse shape than the 32-bit - so
it's not like we could just pick the 64-bit code and use that as the
unified code. The 32-bit code is also used about 8-10 times more
frequently than the 64-bit code. So there is no easy "just unify it"
path.
The new maintainers of the x86 architecture (Thomas, Peter and me), and
many other x86 developers are highly motivated to improve the x86
architecture's code quality and unify the heck out of it, and there are
some real improvements in 2.6.24-rc1 already, but we _must_ be (and are)
working on this carefully. So we do unifications on a case by case
basis, with the highest priority being to not introduce "unification
regressions". The x86 architecture is the most common Linux architecture
after all - and users care much more about having a working kernel than
they care about cleanups and unifications. So yes, we agree with you,
but please be patient! :-) This cannot be realistically finished in
v2.6.24, without upsetting the codebase.
Ingo
-
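The errors/KLOC column in Ingo's table can be recomputed from the two raw columns. The sketch below is only an illustration: the counts are copied from the mail, the function name `errors_per_kloc` is made up here, and truncating (rather than rounding) to one decimal place is an assumption that happens to reproduce every figure in the table.

```python
# Error and line counts copied from the table in Ingo's mail.
COUNTS = {
    "arch/i386/": (5717, 73865),
    "arch/x86_64/": (2993, 31155),
    "arch/x86/": (8504, 114654),
    "arch/ia64/": (1779, 64022),
    "arch/mips/": (2110, 94692),
    "arch/sparc64/": (1387, 49253),
    "kernel/": (762, 83540),
    "kernel/time/": (15, 4191),
    "kernel/irq/": (1, 2317),
    "mm/": (464, 46324),
    "net/core": (176, 24413),
}


def errors_per_kloc(errors: int, lines: int) -> float:
    """Return errors per 1000 lines, truncated to one decimal place.

    Truncation is an assumption made for this sketch; it matches the
    published column, whereas rounding would give 77.4 for arch/i386/.
    """
    tenths = errors * 10_000 // lines  # integer count of 0.1-error units
    return tenths / 10


if __name__ == "__main__":
    for path, (errors, lines) in COUNTS.items():
        print(f"{path:<14} {errors_per_kloc(errors, lines):5.1f}")
```

Dividing the arch/x86_64/ figure by the kernel/ figure gives a ratio of roughly ten, which is the "almost an order of magnitude" gap the mail describes.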
Ingo
Don't you think he's done too much? :-)
- And he receives a lot of flak for it
Yep, he's doing A LOT of the important kernel development, and a lot of what he gets back is bitching from people who know almost nothing about kernel development.
I, for one, would like to extend a BIG thanks to Ingo, for keeping our kernel fast, lean, properly working and clean-looking!!!!
People that know "almost
People that know "almost nothing" about kernel development - like Christoph Hellwig, you mean?
WTF? In which way has
WTF? In which way has Hellwig been doing the "CFS sucks", "Ingo sucks", "It's just because Linus likes him" crap?
I'm not talking about Hellwig's wish for a quick merge of the arches (I'm pretty sure everyone wants that, including Ingo; there are just issues that make it hard to rush). I'm talking about the tons of crap being dumped by Con's army of useful idiots, and the like.
: I'm talking about the tons
: I'm talking about the tons of crap being dumped by Con's army of useful idiots, and the like.
Wow, the elites of Ingo's army strike back!
*sigh*
How tedious, predictable, and boring.
I'd be curious to see, if Jeremy blocked out names for a week or two and just assigned them generic names like "Person 1", "Person 2", "Person 3", etc., how many people could line up the names with the actual people without going back to LKML archives and lining up the quotes. Or, if he randomized it per article, whether people could actually figure out who it was within the context of that article.
The kernel programmers are bright, opinionated, articulate and argumentative. I'd say they're also extremely good at what they do. It's certainly entertaining to watch, and rewarding to participate in when it happens to happen.
Down here in the threads, I sometimes feel like I'm watching monkeys fling poo at the zoo. Especially when the "Oh no! It's Ingo!" crowd shows up along with their counterparts.
--
Program Intellivision and play Space Patrol!
What are you talking about?
You're the only one to mention Christoph Hellwig in this sense - the post you replied to was referring to responses on CFS, not the original e-mail. There really isn't any disagreement I can see there, plus the excerpts are out of context from the conversation anyway...
Not acceptable
That it has so many coding style errors is unacceptable.
x86 is the most used architecture. It should be the cleanest and best implemented.
An esoteric architecture
An esoteric architecture merits an esoteric implementation ...
32-bit Linux on 64-bit processors
Ingo Molnar: "The 32-bit code is also used about 8-10 times more
frequently than the 64-bit code."
I wonder for how long? Running a 32-bit OS on the newer processors from Intel and AMD is starting to be a bit like running 16-bit MS-DOS code on 386 and 486, back in the days when Linux was born. I agree it was probably painful in the beginning, but in the latest major distros the 64-bit mode is ok. In Mandriva, whose 64-bit edition I use, they have also managed to make using legacy 32-bit binaries quite seamless. Even proprietary 32-bit browser plugins Just Work, thanks to nspluginwrapper. I predict that in a few years the kernel maintainers will find keeping the 32-bit variant around to be an unwanted legacy, and will want to split the architectures again...
I wonder for how
That's one of the reasons why this unification is important IMHO. According to most usage stats, x86_64 is primarily used on servers, with spotty desktop use. The i386 arch is mostly used on desktops with lots of server use as well. As a result of that the i386 arch has acquired more code robustness (via a more varied user-base and hardware-base) and has acquired more desktop features.
Treating i386 as a 'legacy step-child of x86_64' was IMHO a mistake - a more tested and more utilized code-base is always valuable because it's easier to add features to it. (as bugs get found faster)
If the new x86 architecture goes down in flames and becomes a major disaster then Andi should pick up the old split architectures and should continue maintaining them. If it works out fine (which is my guess) then Andi should admit his mistake quickly (nobody is perfect) and should join the effort. For Linux it's a win-win situation, as the interest in the x86 architecture and activities around it are more intense than ever before - so there's enough manpower available for all the possible eventualities.
One thing must be said, the kernel devs and Linus in particular certainly have balls and don't seem to be afraid of making difficult decisions and don't seem to be afraid of making complex code changes to back up those decisions. If only Microsoft were this flexible ;)
It will be very interesting to watch how this new architecture works out. Popcorn anyone?
The sum is bigger than the two separate trees...
How is it possible?
If I merge 73865 lines of code from i386 with 31155 lines of code from x86_64 (73865+31155=105020) how can I get 114654 lines of code? The merge has created over 9500 lines of code :(
Just a thought...
Bye
Piero
New features have been added
New features have been added in 2.6.24-rc1 to the 64-bit code: IOMMU, high resolution timers, dynticks, hpet enhancements.
Could it be....
...around 3200 instances of
#ifdef x86_64
...
#else /* 32-bit code */
...
#endif
perhaps? ;-)
(I kid! I kid!)
--
Program Intellivision and play Space Patrol!
Haha
I'm almost suspicious of that myself...
I've posted before.
I've previously posted a good strategy for minimizing problems, here:
http://kerneltrap.org/Linux/Discussing_the_x86_Merge
I'm looking forward to a good i386/x86-64 merge :)
* The i386 assembly/architecture is very good for simple, small 32-bit processes.
[ Bandwidth of its bus is 1 * X MegaWord32/s ].
* The x86_64 assembly/architecture is very good for complex, multithreaded, big 64-bit processes.
It is also good for letting the kernel address more than 4 or 8 GiB of RAM.
[ Bandwidth of its bus is 0.5 * X MegaWord64/s. ]
Merging the two is very good for this ILP32/LP64 (or possibly ILP64) architecture.
My idea is to have one 64-bit kernel, many 32-bit processes (ELF32) and a few big 64-bit processes (ELF64).
This needs one GCC cross-compiler for both 32-bit and 64-bit (-march=athlon for applications and -march=k8 for the 64-bit kernel and applications).
The address maps of many devices (PCI, PCI-E, USB 2.0, AGP, sound, Ethernet, etc.) are still 32-bit.
The merged 32-bit (ignoring the upper 32 bits) and 64-bit address layout would have to be:
1) 0x0000'0000'0000'0000 .. 0x0000'0000'0000'0FFF is a reserved 4 KiB page for NULL exceptions.
2) 0x0000'0000'0000'1000 .. 0x0000'0000'BFFF'FFFF is userspace for pure 32-bit tasks (3G/1G) (and, for 64-bit tasks, reserved for 32-bit emulation/translation).
3) 0x0000'0000'C000'0000 .. 0x0000'0000'FFFF'FFFF is kernelspace for pure 32-bit tasks (3G/1G) (and, for 64-bit tasks, reserved for 32-bit emulation/translation using its recovered 32-64-bit trampoline, although the running kernel is 64-bit).
4) 0x0000'0001'0000'0000 .. 0x0000'7FFF'FFFF'FFFF is userspace for pure 64-bit tasks.
5) 0xFFFF'8000'0000'0000 .. 0xFFFF'FFFF'FFFF'FFFF is kernelspace for pure 64-bit tasks.
http://en.wikipedia.org/wiki/X86-64
Paint it with colors: Green for userspace and Red for kernelspace :)
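The five ranges proposed above can be written down as a small lookup table, which makes it easy to see which region an address would fall into and where the gap sits. This is only a sketch of the layout suggested in this comment, not the layout Linux actually uses; the names `REGIONS`, `classify` and `is_canonical_48` are invented for the illustration, and the canonical-address check assumes 48-bit virtual addresses.

```python
# The five ranges proposed in the comment: (first address, last address, label).
REGIONS = [
    (0x0000_0000_0000_0000, 0x0000_0000_0000_0FFF, "reserved NULL page"),
    (0x0000_0000_0000_1000, 0x0000_0000_BFFF_FFFF, "32-bit userspace"),
    (0x0000_0000_C000_0000, 0x0000_0000_FFFF_FFFF, "32-bit kernelspace"),
    (0x0000_0001_0000_0000, 0x0000_7FFF_FFFF_FFFF, "64-bit userspace"),
    (0xFFFF_8000_0000_0000, 0xFFFF_FFFF_FFFF_FFFF, "64-bit kernelspace"),
]


def classify(addr: int) -> str:
    """Return the label of the proposed region that contains addr."""
    for start, end, label in REGIONS:
        if start <= addr <= end:
            return label
    return "outside the proposed ranges"


def is_canonical_48(addr: int) -> bool:
    """True if bits 63..47 of a 64-bit address are all equal.

    Assumes 48-bit virtual addressing, where only such addresses are valid.
    """
    top = addr >> 47  # the 17 most significant bits
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    for addr in (0x0, 0x0804_8000, 0xC010_0000, 0x7FFF_0000_0000,
                 0x0000_8000_0000_0000, 0xFFFF_8000_0000_0000):
        print(f"{addr:#018x}  {classify(addr):<28} "
              f"canonical={is_canonical_48(addr)}")
```

The only addresses not covered by the five ranges are those between 0x0000'8000'0000'0000 and 0xFFFF'7FFF'FFFF'FFFF, which are exactly the non-canonical ones under the 48-bit assumption.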
Athlon64, Opteron, etc. and the old i486 are good chips, not yet the best.
EFFORTS in merging are important!
This table below is a bad report:
It requires many efforts to obtain the merging.
See this idea of this below exhaustive table to prevent errors of merging:
* "is merged" means "the 32-bit and 64-bit files are modified and merged resulting one unique file for both architectures".
* "is merged" is OK if it works for both machines i386 and x86-64.
Possible values for the report are: OK, Failed, Not-tested, No-such-file, ...
J.C. Pizarro
An interesting discussion of LP64/ILP64.
There is also an interesting discussion of LP64/ILP64 for the merge of 32-bit i386 and 64-bit x86-64:
http://kerneltrap.org/node/14159#comments
J.C. Pizarro
Hmmm...
(a) What are you smoking? I had a hard time making heads/tails of your post. It looked like an amalgam of other articles, with random bits changed.
(b) You are aware you can run a full 32-bit userspace under a 64-bit x86-64 kernel, right? That works today.
At any rate, an ILP64 version of Linux seems unlikely as long as there are ILP32 implementations around, as code that assumes 32-bit int is widespread. Code that assumes sizeof(int)==sizeof(void*) is thankfully much less prevalent due to the fact people moved to ANSI C some time ago. 32-bit int has a couple of orders of magnitude greater installed base in mainstream computing than did 16-bit int when mainstream started to shift to 32-bit.
(Note that I said "mainstream." There are billions of little embedded toys with 16-bit int.)
--
Program Intellivision and play Space Patrol!
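The data models named in this exchange differ only in the widths of int, long and pointers. As a rough sketch, the snippet below maps those three sizes to the usual model names and asks the local C ABI (via the standard ctypes module) which one it uses; the helper names `model_name` and `native_model` are made up for the example.

```python
import ctypes

# (sizeof(int), sizeof(long), sizeof(void *)) in bytes -> conventional name.
MODELS = {
    (4, 4, 4): "ILP32",  # int, long and pointers are all 32-bit
    (4, 8, 8): "LP64",   # long and pointers are 64-bit, int stays 32-bit
    (4, 4, 8): "LLP64",  # only pointers (and long long) are 64-bit
    (8, 8, 8): "ILP64",  # int, long and pointers are all 64-bit
}


def model_name(int_size: int, long_size: int, ptr_size: int) -> str:
    """Map the three type sizes to a data-model name, or 'other'."""
    return MODELS.get((int_size, long_size, ptr_size), "other")


def native_model() -> str:
    """Report the data model of the C ABI this interpreter was built for."""
    return model_name(
        ctypes.sizeof(ctypes.c_int),
        ctypes.sizeof(ctypes.c_long),
        ctypes.sizeof(ctypes.c_void_p),
    )


if __name__ == "__main__":
    print("native data model:", native_model())
    # Code that stores a pointer in an int only works when the sizes match.
    same = ctypes.sizeof(ctypes.c_int) == ctypes.sizeof(ctypes.c_void_p)
    print("sizeof(int) == sizeof(void *):", same)
```

Under LP64 the second line prints False, which is why code that assumes sizeof(int)==sizeof(void*) breaks there while code that merely assumes a 32-bit int keeps working.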