Compiler Misoptimizations

Submitted by Jeremy
on October 25, 2007 - 12:23pm

"Basically, what the gcc developers are saying is that gcc is free to load and store to any memory location, so long as it behaves as if the instructions were executed in sequence," Nick Piggin noted, describing a linked discussion on the GCC development mailing list. He explained his concerns, "for x86, obviously the example above shows it can be miscompiled, but it is probably relatively hard to make it happen for a non trivial sequence. For an ISA with lots of predicated instructions like ia64, it would seem to be much more likely. But of course we don't want even the possibility of failures. The gcc guys seem to be saying to mark everything volatile that could be touched in a critical section. This is insane for Linux." Linus Torvalds reflected:

"Are you surprised? The gcc developers seem to have had a total disregard for what people want or need, and every time some code generation issue comes up, there's a lot of people on the list that do language-lawyering, rather than admit that there might be a problem.

"It's happened before, it will happen again. I don't think it's true of all gcc developers (or even most, I hope), but it's common enough. For some reason, compiler developers seem to be far enough removed from 'real life' that they have a tendency to talk in terms of 'this is what the spec says' rather than 'this is a problem'."


From: Nick Piggin
Subject: Is gcc thread-unsafe?
Date: Oct 24, 8:24 pm 2007

Hi,

Andi spotted this exchange on the gcc list. I don't think he's
brought it up here yet, but it worries me enough that I'd like
to discuss it.

Starts here
http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html

Concrete example here
http://gcc.gnu.org/ml/gcc/2007-10/msg00275.html

Basically, what the gcc developers are saying is that gcc is
free to load and store to any memory location, so long as it
behaves as if the instructions were executed in sequence.

I guess that dynamically allocated memory and computed pointers
are more difficult for gcc to do anything unsafe with, because
it is harder to tell if a given function has deallocated the
memory. However even that could theoretically happen in future
if the compiler can work out the address comes from a global
variable or is not changed intermediately.

Linux makes extensive use of both trylocks and interruptible
locks (ie. which automatically result in divergent code paths,
one of which holds the lock, the other doesn't). However there
are also other code paths which will either hold a particular
lock or will not hold it, depending on context or some flags
etc. barrier() doesn't help.

For x86, obviously the example above shows it can be miscompiled,
but it is probably relatively hard to make it happen for a non
trivial sequence. For an ISA with lots of predicated instructions
like ia64, it would seem to be much more likely. But of course
we don't want even the possibility of failures.

The gcc guys seem to be saying to mark everything volatile that
could be touched in a critical section. This is insane for Linux.

Any thoughts?
-
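
A minimal sketch of the pattern under discussion may help. The first function is the kind of source code Nick describes: a counter that is only touched when a trylock succeeds. The second is a hand-written C rendering of what a compiler may produce when it turns the conditional increment into an unconditional load and store. It is an illustration, not actual gcc output, and the names (`acquires_count`, `count_as_written`, `count_as_transformed`) are assumptions rather than code taken from the linked messages. To keep the example self-contained, the outcome of the lock attempt is passed in as the flag `got_lock`.

```c
/* Hypothetical shared counter, meant to be modified only while a lock is held. */
static int acquires_count;

/* As written by the programmer: the counter is stored to only on success. */
void count_as_written(int got_lock)
{
    if (got_lock)
        ++acquires_count;
}

/* As a compiler might transform it (illustration only): the load and the
 * store now happen on every call, even when the lock was NOT acquired. */
void count_as_transformed(int got_lock)
{
    int tmp = acquires_count;      /* unconditional load */
    tmp += (got_lock != 0);        /* add 1 or 0, no branch needed */
    acquires_count = tmp;          /* unconditional store */
}

int get_acquires_count(void) { return acquires_count; }
void reset_acquires_count(void) { acquires_count = 0; }
```

In a single thread both versions leave the same value, which is why the transformation passes the "as if executed in sequence" test. With two threads, the failed-trylock path of the second version can read the counter, lose the CPU, and later write a stale value back over an increment made by the real lock holder.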

From: Linus Torvalds
Subject: Re: Is gcc thread-unsafe?
Date: Oct 25, 7:55 am 2007

On Thu, 25 Oct 2007, Nick Piggin wrote:
>
> Andi spotted this exchange on the gcc list. I don't think he's
> brought it up here yet, but it worries me enough that I'd like
> to discuss it.

Are you surprised?

The gcc developers seem to have had a total disregard for what people want
or need, and every time some code generation issue comes up, there's a lot
of people on the list that do language-lawyering, rather than admit that
there might be a problem.

It's happened before, it will happen again. I don't think it's true of all
gcc developers (or even most, I hope), but it's common enough. For some
reason, compiler developers seem to be far enough removed from "real life"
that they have a tendency to talk in terms of "this is what the spec says"
rather than "this is a problem".

Happily, at least in this kind of situation, threading is a real issue for
other projects than just the kernel, so maybe it gets solved properly.

But I have to admit that for the last five years or so, I've really wanted
some other compiler team to come up with a good open-source compiler.
Exactly due to issues like this (Q: "Gcc creates bogus code that doesn't
work!" A: "It's not bogus, it's technically allowed by the language specs
that don't talk about xyz, the fact that it doesn't work isn't our
problem").

I think the OpenBSD people decided to actually do something about this,
and I suspect it had *nothing* to do with license issues, and everything
to do with these kinds of problems. I wish them all the luck, although
personally I think LLVM is a much more interesting project.

Linus
-

From: Pekka Enberg
Subject: Re: Is gcc thread-unsafe?
Date: Oct 25, 8:12 am 2007

Hi,

On 10/25/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> I think the OpenBSD people decided to actually do something about this,
> and I suspect it had *nothing* to do with license issues, and everything
> to do with these kinds of problems. I wish them all the luck, although
> personally I think LLVM is a much more interesting project.

The BSD people are adopting pcc [1] which is a rewritten version of
some C compiler originally developed in the late 70s. And yeah, it's
basically because they think gcc is becoming too painful to live with
[2].

Pekka

1. http://pcc.ludd.ltu.se/
2. http://www.thejemreport.com/mambo/content/view/369/
-

From: Andi Kleen
Subject: Re: Is gcc thread-unsafe?
Date: Oct 25, 12:15 am 2007

On Thursday 25 October 2007 05:24, Nick Piggin wrote:

> Basically, what the gcc developers are saying is that gcc is
> free to load and store to any memory location, so long as it
> behaves as if the instructions were executed in sequence.

This case is clearly a bug, a very likely code pessimization.
I guess it wasn't intentional, just an optimization that is useful
for local register values doing too much.

> I guess that dynamically allocated memory and computed pointers
> are more difficult for gcc to do anything unsafe with, because
> it is harder to tell if a given function has deallocated the
> memory. 

Often accesses happen without function calls in between.
Also I think newer gcc (not 3.x) can determine if a pointer
"escapes" or not so that might not protect against it.

> Any thoughts?

We don't have much choice: If such a case is found it has to be marked
volatile or that particular compiler version be unsupported.

It might be useful to come up with some kind of assembler pattern
matcher to check if any such code is generated for the kernel
and try it with different compiler versions.

-Andi
-

From: linux-os (Dick Johnson)
Subject: Re: Is gcc thread-unsafe?
Date: Oct 25, 4:58 am 2007

On Thu, 25 Oct 2007, Andi Kleen wrote:

> On Thursday 25 October 2007 05:24, Nick Piggin wrote:
>
>> Basically, what the gcc developers are saying is that gcc is
>> free to load and store to any memory location, so long as it
>> behaves as if the instructions were executed in sequence.
>
> This case is clearly a bug, a very likely code pessimization.
> I guess it wasn't intentional, just an optimization that is useful
> for local register values doing too much.

I don't think it is a BUG, but one should certainly be able to
turn it off. Gcc is correct in that the 'C' language allows
a lot of implementation details that are not covered by the
language. In other words, 'C' is not assembly-language.

>
>> I guess that dynamically allocated memory and computed pointers
>> are more difficult for gcc to do anything unsafe with, because
>> it is harder to tell if a given function has deallocated the
>> memory.
>
> Often accesses happen without function calls in between.
> Also I think newer gcc (not 3.x) can determine if a pointer
> "escapes" or not so that might not protect against it.
>
>> Any thoughts?
>
> We don't have much choice: If such a case is found it has to be marked
> volatile or that particular compiler version be unsupported.
>
> It might be useful to come up with some kind of assembler pattern
> matcher to check if any such code is generated for the kernel
> and try it with different compiler versions.
>
> -Andi

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.59 BogoMips).
My book : http://www.AbominableFirebug.com/
_

****************************************************************
The information transmitted in this message is confidential and may be
privileged. Any review, retransmission, dissemination, or other use of
this information by persons or entities other than the intended recipient
is prohibited. If you are not the intended recipient, please notify
Analogic Corporation immediately - by replying to this message or by
sending an email to DeliveryErrors@analogic.com - and destroy all copies
of this information, including any attachments, without reading or
disclosing them.

Thank you.
-

From: Andi Kleen
Subject: Re: Is gcc thread-unsafe?
Date: Oct 25, 5:16 am 2007

On Thursday 25 October 2007 13:58:56 linux-os (Dick Johnson) wrote:
>
> On Thu, 25 Oct 2007, Andi Kleen wrote:
>
> > On Thursday 25 October 2007 05:24, Nick Piggin wrote:
> >
> >> Basically, what the gcc developers are saying is that gcc is
> >> free to load and store to any memory location, so long as it
> >> behaves as if the instructions were executed in sequence.
> >
> > This case is clearly a bug, a very likely code pessimization.
> > I guess it wasn't intentional, just an optimization that is useful
> > for local register values doing too much.
>
>
> I don't think it is a BUG,

Bug as in an optimization that makes the code slower than it was before.
That is clearly a bug in a compiler.

-Andi
-

From: Arjan van de Ven
Subject: Re: Is gcc thread-unsafe?
Date: Oct 24, 8:46 pm 2007

On Thu, 25 Oct 2007 13:24:49 +1000
Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> Hi,
> 
> Andi spotted this exchange on the gcc list. I don't think he's
> brought it up here yet, but it worries me enough that I'd like
> to discuss it.
> 
> Starts here
> http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
> 
> Concrete example here
> http://gcc.gnu.org/ml/gcc/2007-10/msg00275.html
> 
> Basically, what the gcc developers are saying is that gcc is
> free to load and store to any memory location, so long as it
> behaves as if the instructions were executed in sequence.
> 


this optimization btw is a serious mis-optimization, it makes memory
more dirty and causes cachelines to become unshared.... I'm sure it
works great on microbenchmarks but it sucks bigtime for anything real
-

From: Nick Piggin
Subject: Re: Is gcc thread-unsafe?
Date: Oct 24, 8:58 pm 2007

On Thursday 25 October 2007 13:46, Arjan van de Ven wrote:
> On Thu, 25 Oct 2007 13:24:49 +1000
>
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > Hi,
> >
> > Andi spotted this exchange on the gcc list. I don't think he's
> > brought it up here yet, but it worries me enough that I'd like
> > to discuss it.
> >
> > Starts here
> > http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
> >
> > Concrete example here
> > http://gcc.gnu.org/ml/gcc/2007-10/msg00275.html
> >
> > Basically, what the gcc developers are saying is that gcc is
> > free to load and store to any memory location, so long as it
> > behaves as if the instructions were executed in sequence.
>
> this optimization btw is a serious mis-optimization, it makes memory
> more dirty and causes cachelines to become unshared.... I'm sure it
> works great on microbenchmarks but it sucks bigtime for anything real

Well that's exactly right. For threaded programs (and maybe even
real-world non-threaded ones in general), you don't want to be
even _reading_ global variables if you don't need to. Cache misses
and cacheline bouncing could easily cause performance to completely
tank in some cases while only gaining a cycle or two in
microbenchmarks for doing these funny x86 predication things.

I'm not sure about ia64 -- I _hope_ that for most of their
predication stuff, they also predicate the stores, rather than
just store unconditionally and rely on the source operand not
changing in the case they didn't intend the memory to change.
-

Another gcc gotcha that

Anonymous (not verified)
on
October 25, 2007 - 2:03pm

Another gcc gotcha that violates Principles of Least Surprise:

for (int i(0); i>=0; ++i) { }

This is now an infinite loop under the latest gcc with -O2 by design.

Of course if you compile

NullNix (not verified)
on
October 25, 2007 - 3:13pm

Of course if you compile with -Wall it will *tell* you that, and it was also mentioned in the release notes, oh and it does speed up some codebases significantly.

That's standard C for you

Mr_Z
on
October 25, 2007 - 6:32pm

C permits this all day long. Some devices implement saturating arithmetic for signed types. Others can signal a trap for signed overflow.
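
For readers who want a loop like the one above to terminate without relying on signed overflow, here is a minimal sketch in plain C. It leaves the loop explicitly at `INT_MAX` instead of waiting for `i` to wrap negative. The function name `count_nonnegative_from` is made up for this example. For code that cannot be changed, GCC also offers the `-fwrapv` option, which tells the compiler to treat signed overflow as wrapping.

```c
#include <limits.h>

/* Counts how many values i in [start, INT_MAX] the loop visits, without ever
 * evaluating i + 1 when i == INT_MAX (which would be undefined behaviour). */
long count_nonnegative_from(int start)
{
    long visited = 0;
    for (int i = start; i >= 0; ++i) {
        ++visited;
        if (i == INT_MAX)   /* explicit exit replaces the "wrap to negative" test */
            break;
    }
    return visited;
}
```

Because the `break` is taken before `++i` runs on the last iteration, the increment never overflows, so the compiler has no undefined behaviour to reason from.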

As for the specific thing being complained about... it's clear to me that C / C++ need some set of highly specific memory access intrinsics so that programmers can make clear precisely which accesses need to be ordered with respect to which, and where concrete accesses are expected.

Furthermore, those intrinsics need to be well defined in very specific use-cases, and scream bloody murder otherwise. For example, only one memory access intrinsic per sequence point.

Forget "abstract model." It's clear for certain problems, one needs the ability to concretely specify the sequence of accesses, and in a multiprocessor situation, the required ordering as viewed by the entire system. Maybe something like this (variants of which I've proposed before):

#include <stdint.h>  /* C99 size-specific integer types */

uint8_t  _memrd_u8 (const uint8_t  *loc,       memacc_dep_t dep, ...);
uint16_t _memrd_u16(const uint16_t *loc,       memacc_dep_t dep, ...);
uint32_t _memrd_u32(const uint32_t *loc,       memacc_dep_t dep, ...);
uint64_t _memrd_u64(const uint64_t *loc,       memacc_dep_t dep, ...);
void*    _memrd_ptr(const void **restrict loc, memacc_dep_t dep, ...);

void     _memwr_u8 (uint8_t  val, uint8_t  *loc,          memacc_dep_t dep, ...);
void     _memwr_u16(uint16_t val, uint16_t *loc,          memacc_dep_t dep, ...);
void     _memwr_u32(uint32_t val, uint32_t *loc,          memacc_dep_t dep, ...);
void     _memwr_u64(uint64_t val, uint64_t *loc,          memacc_dep_t dep, ...);
void     _memwr_ptr(const void *val, void **restrict loc, memacc_dep_t dep, ...);

void     _mem_barrier(memacc_flags_t flags, memacc_dep_t dep, ...);

(And you can easily imagine signed variants of these.)

These would always generate the requested access when encountered, and would never be scheduled speculatively by the compiler. The "flags" expression should be a compile-time constant that indicates to the compiler the required memory ordering through the memory system (e.g. the required SMP memory barriers associated with the access), such as "after prior reads", "after prior writes", "before subsequent reads", "before subsequent writes", "do not cache", "write through", etc.

The "dep" IDs must also be compile-time constants, for instance 32-bit ints. Within a compilation unit (or compilation units, if an implementation allows making multiple units available to the compiler at one time), they indicate specific dependencies among memXX operations. These are scheduling constraints on the compiler for a series of accesses. The first "dep" argument would indicate the ID associated with a specific access. The full list of "dep" IDs would indicate accesses this access must be ordered relative to. The ID 0 would indicate "any", thus replicating the current "volatile" semantics somewhat.

For example, suppose I have a memory mapped peripheral for which I can set up its registers in any order, but the last access initiates the activity. I might write:

     _memwr_u32(value1, reg[1], 1);
     _memwr_u32(value2, reg[2], 2);
     _memwr_u32(value3, reg[3], 3);
     _memwr_u32(value4, reg[4], 4);
     _mem_barrier(_MEM_WR_BEFORE|_MEM_WR_AFTER, 1, 2, 3, 4);
     _memwr_u32(value5, reg[5], 1, 2, 3, 4);
     _mem_barrier(_MEM_WR_BEFORE|_MEM_WR_AFTER, 1);

This code could even appear in a loop and it would still do the right thing. Values 1 through 4 would get committed to the peripheral before value 5, and on a future loop iteration, none of those writes would float above the write to 5.

Such a paradigm makes it next to impossible to have a disconnect between the programmer's intent and the compiler's interpretation. It also makes all the assumptions and dependencies explicit.

The benefit is that the compiler now can actually generate much better code for these accesses and it has a much greater likelihood of being correct. Furthermore, it gives even more explicit license for the compiler to use performance enhancing optimizations on more typical memory accesses in the hypothetical virtual C machine with impunity.

As the world goes increasingly multithreaded, the need for mechanisms like this will only grow.

Here's another example, Dekker's Algorithm:

uint8_t  f0 = 0, f1 = 0, turn = 0;

/* CPU 0 */
_memwr_u8(1, &f0, 0);
_mem_barrier(_MEM_WR_BEFORE|_MEM_RD_AFTER, 0);
while (_memrd_u8(&f1, 0) != 0)
{
    _mem_barrier(_MEM_RD_BEFORE|_MEM_RD_AFTER, 0);
    if (_memrd_u8(&turn, 0) != 0)
    {
        _mem_barrier(_MEM_RD_BEFORE|_MEM_WR_AFTER, 0);
        _memwr_u8(0, &f0, 0);
        _mem_barrier(_MEM_WR_BEFORE|_MEM_RD_AFTER, 0);
        do {
        } while (_memrd_u8(&turn, 0) != 0);
        _mem_barrier(_MEM_RD_BEFORE|_MEM_WR_AFTER, 0);
        _memwr_u8(1, &f0, 0);
    }
    _mem_barrier(_MEM_WR_BEFORE|_MEM_RD_AFTER, 0);
}
/* following barrier, plus previous "RD_AFTER" barrier(s) fence in crit-section RD, WR at the top. */
_mem_barrier(_MEM_WR_AFTER, 0);  

/* critical section */
/* critical section */
/* critical section */

/* This barrier fences in all crit-section RD, WR at the bottom. */
_mem_barrier(_MEM_RD_BEFORE|_MEM_WR_BEFORE|_MEM_WR_AFTER, 0);
_memwr_u8(1, &turn, 0);
_mem_barrier(_MEM_WR_BEFORE|_MEM_WR_AFTER, 0);
_memwr_u8(0, &f0, 0);
_mem_barrier(_MEM_WR_BEFORE|_MEM_WR_AFTER, 0);

Note that I picked on Dekker's algorithm as it is an example of code that relies very heavily on strong memory ordering in order to execute correctly. In this example, I am pretty sure I used the minimally strong barriers necessary. For instance, the memory system can and should be allowed to float non-critical reads above the bottom (such as stack accesses), and non-critical writes above the top-most _mem_barrier (again, likely to be stack accesses from spills).

For example, allowing the compiler and the memory system to float non-crit-section reads up at the bottom allows those to schedule earlier and be serviced earlier, hiding cache miss latencies, etc. Very useful.

A downside of this approach is that all of the shared context protected by Dekker's Algorithm also must be accessed with _memrd/_memwr/_mem_barrier. In all likelihood, those accesses will have far fewer constraints on them, and so will be easier to read since there should be fewer (if any) _mem_barrier calls. The wrapper around all of the critical accesses provided by Dekker's Algorithm will fence them in at top and bottom.

Another aspect of this sort of proposal is that all bets are theoretically off if one of the variables gets written (or less likely, read) via normal means (e.g. ordinary variable accesses). So perhaps this also calls for a special storage class modifier, such as "shared" or "explicit" to enforce this constraint?
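
For comparison, here is a rough sketch of Dekker's algorithm using C11 `<stdatomic.h>`, which was standardized after this discussion. It is not the `_memrd`/`_memwr` proposal: it uses the default sequentially consistent atomics, which are stronger than the minimal barriers aimed for above. All names (`wants_in`, `dekker_lock`, `run_dekker_demo`) are invented for the example, and the `sched_yield()` calls are only there to keep the spin loops polite.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_bool wants_in[2];   /* plays the role of f0 and f1 */
static atomic_int  turn;          /* which thread may insist on entering */
static long        shared_count;  /* plain variable protected by the lock */

static void dekker_lock(int me)
{
    int other = 1 - me;
    atomic_store(&wants_in[me], true);
    while (atomic_load(&wants_in[other])) {
        if (atomic_load(&turn) != me) {
            atomic_store(&wants_in[me], false);   /* back off */
            while (atomic_load(&turn) != me)
                sched_yield();
            atomic_store(&wants_in[me], true);    /* try again */
        } else {
            sched_yield();                        /* our turn: wait for the other to back off */
        }
    }
}

static void dekker_unlock(int me)
{
    atomic_store(&turn, 1 - me);
    atomic_store(&wants_in[me], false);
}

struct worker_arg { int id; long iterations; };

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (long n = 0; n < arg->iterations; ++n) {
        dekker_lock(arg->id);
        ++shared_count;                           /* critical section */
        dekker_unlock(arg->id);
    }
    return NULL;
}

/* Runs two threads and returns the final counter value.
 * Error handling for pthread calls is omitted for brevity. */
long run_dekker_demo(long iterations)
{
    struct worker_arg args[2] = { { 0, iterations }, { 1, iterations } };
    pthread_t threads[2];

    atomic_store(&wants_in[0], false);
    atomic_store(&wants_in[1], false);
    atomic_store(&turn, 0);
    shared_count = 0;

    for (int i = 0; i < 2; ++i)
        pthread_create(&threads[i], NULL, worker, &args[i]);
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);
    return shared_count;
}
```

Built with `-pthread`, `run_dekker_demo(n)` should return `2 * n` if mutual exclusion holds. Note that the data inside the critical section is an ordinary `long` here: the atomics on the flags provide the ordering, so the protected accesses need no special accessors in this variant.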

--
Program Intellivision and play Space Patrol!

already done

Anonymous (not verified)
on
October 26, 2007 - 12:07am

I fail to see why you need to introduce memory accessor functions when you can do the same thing with "volatile":

uint8_t _memrd_u8 (const uint8_t *loc, memacc_dep_t dep)
{return *(const volatile uint8_t *)loc;}

the memory barriers you will need nevertheless, but fortunately gcc people also realized the need for that:

__sync_synchronize();
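
Fleshed out a little, the suggestion above might look like the following sketch. The accessor names `memrd_u32` and `memwr_u32` are hypothetical (modelled on the earlier proposal, but without the leading underscore and without dep IDs), `program_device` is an invented helper that mirrors the memory-mapped peripheral example, and `__sync_synchronize()` is the GCC builtin that issues a full memory barrier.

```c
#include <stdint.h>

/* Hypothetical accessors: the volatile cast forces a real load or store. */
static inline uint32_t memrd_u32(const uint32_t *loc)
{
    return *(const volatile uint32_t *)loc;
}

static inline void memwr_u32(uint32_t val, uint32_t *loc)
{
    *(volatile uint32_t *)loc = val;
}

/* Invented helper: write four setup registers (reg[1]..reg[4]), issue a
 * full barrier, then perform the write to reg[5] that starts the device. */
void program_device(uint32_t *reg, const uint32_t setup[4], uint32_t go)
{
    for (int i = 0; i < 4; ++i)
        memwr_u32(setup[i], &reg[i + 1]);
    __sync_synchronize();          /* GCC builtin: full memory barrier */
    memwr_u32(go, &reg[5]);
    __sync_synchronize();
}
```

In this version the compiler must keep the four setup writes in program order relative to each other, since all volatile accesses are ordered against one another. The dep-ID scheme was meant to relax exactly that.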

Except that doesn't work

Mr_Z
on
October 26, 2007 - 11:09am

Volatile has too-strong an ordering guarantee for compiler scheduling (all volatile are strictly scheduled relative to all other volatile), and the standard still makes no statement about what constitutes a concrete access.

Just as BWK railed against "noalias," and "restrict" was adopted instead, I say we need something to replace volatile with more direct semantics, stricter rules, and more easily comprehended behavior. I don't want to "change volatile", and give people a false sense of security. I want accesses to "concrete, shared" objects to be policed, and I want concrete, easily-reasoned-about ways to describe my intent to the compiler, so that a discussion about a miscompilation doesn't hinge on a misplaced comma in section 4.3.2.1 in a spec somewhere.

Changing volatile in order to start policing things requires deprecating the existing volatile first. Also, it would put the implementation of _memrd, etc. outside the language. If they're required to be compiler intrinsics (or perhaps operators!), then it's clear it's a language extension.

In fact, I wouldn't mind if they added operators for this. Use the @ sign or the like. @ptr[offset] for concrete read/write, @ptr[offset](...) for read/write with dep IDs. Keep the _mem_barrier intrinsic unless you can think of something more concise. Something like:

    @ptr[0] = foo;     /* concrete write to ptr, scheduling dependence on "any" */
    foo     = @ptr[0];    /* concrete read from ptr, scheduling dependence on "any" */
    
    /* Four writes that must complete before a fifth write */
    @ptr[1](1) = data1;
    @ptr[2](2) = data2;
    @ptr[3](3) = data3;
    @ptr[4](4) = data4;
    _mem_barrier(_MEM_WR_BEFORE|_MEM_WR_AFTER, 1,2,3,4);
    @ptr[5](1,2,3,4) = data5;

Note that writes 1 thru 4 in that second example could schedule in any order and are not ordered by the memory system. The only ordering is between them and the fifth write.

There are significant advantages on some platforms for allowing the compiler the additional scheduling flexibility for those writes, especially when "data1" through "data5" are not simple expressions, and when there are "abstract" reads and writes around unrelated to the accesses with ordering constraints.

--
Program Intellivision and play Space Patrol!

Behaviour.

Leonardo L. (not verified)
on
October 25, 2007 - 10:23pm

This actually should be an infinite loop, if you think about integers as the mathematical set Z, and that's what the compiler is doing. BTW, expecting this code to DoTheRightThing(tm) by overflowing is at least as dumb as expecting it to be portable.

prefix and postfix suck

Anonymous (not verified)
on
October 26, 2007 - 3:21am

prefix and postfix suck anyway

use
i += 1

that is the best way.

Have you ever programmed anything serious?

Mr_Z
on
October 26, 2007 - 10:55am

I'm serious... have you? Do you have a link? I wanna see how such a silly rule worked out for you.

--
Program Intellivision and play Space Patrol!

That will get optimised into

Nony mouse (not verified)
on
October 26, 2007 - 12:03pm

That will get optimised into i++

Actually...

Mr_Z
on
October 26, 2007 - 7:18pm

It's closer to ++i.

Consider the difference between these two pieces of code:

a = (b += 1);

and

a = b++;

:-)
----

Edit: Never mind precedence.... *++a vs. *a += 1
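
Spelled out as a small check for the first pair of expressions above: the difference lies in the value of the expression, not in what happens to `b`. The function names are invented for illustration.

```c
/* Value of the expression (b += 1): the NEW value of b, like ++b. */
int value_of_add_assign(int b)
{
    int a = (b += 1);
    return a;
}

/* Value of the expression b++: the OLD value of b. */
int value_of_post_increment(int b)
{
    int a = b++;
    return a;
}
```

When the result of the expression is discarded, as in a bare `i += 1;` statement, all three forms have the same effect.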
--
Program Intellivision and play Space Patrol!

Just fork the goddamn thing!

Jack Ripoff (not verified)
on
October 25, 2007 - 2:38pm

The solution is as simple as: fork the project!

The OpenBSD project has already done that.

That's what free software is all about: if something is wrong I can fix it because I'm free to do it.

OpenBSD never forked the GCC

Anonymous (not verified)
on
October 25, 2007 - 2:49pm

OpenBSD never forked GCC and never will!
Put your effort into PCC if you really want a rocking compiler....

I ran into GCC troubles myself by using bcopy.
I had one prototype (from an include) and it bugged me that something seemed off. I added a second prototype (same application, for bcopy...) and everything was cool.. no warning, no error.. two prototypes for bcopy.. no problem.....
Just "omg"...

So you had a prototype (for

Anonymous (not verified)
on
October 25, 2007 - 6:54pm

So you had a prototype (for bcopy?) and you had a feeling something was wrong, so you added another prototype for bcopy. You received no errors or warnings, and everything was fine.

Was there even a problem in the first place? Having multiple prototypes for the same function is allowed by the C standard.

Also, bcopy is a deprecated function. Why are you using it?
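
A short sketch of both points: repeated compatible declarations are accepted, and `memmove` is the standard replacement for `bcopy` (note the swapped argument order). `copy_bytes` is a made-up wrapper name used only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Two compatible prototypes for the same function: legal C. */
void copy_bytes(const void *src, void *dst, size_t n);
void copy_bytes(const void *src, void *dst, size_t n);

/* bcopy-style argument order (source first), implemented with memmove,
 * which takes the destination first and also handles overlapping buffers. */
void copy_bytes(const void *src, void *dst, size_t n)
{
    memmove(dst, src, n);
}
```

A compiler only has to complain when the declarations are incompatible, for example if the second prototype returned `int` instead of `void`.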

Of course they did!

Jack Ripoff (not verified)
on
November 1, 2007 - 6:39am

Linus needs to get real

Anonymous (not verified)
on
October 25, 2007 - 11:09pm

And to start working with the compiler, rather than against the compiler. "Trust the compiler. The compiler is your friend."

Really, it's not like the Linux kernel defines a strong in-kernel memory model either. Complaining about the compiler doing the wrong thing when the right thing isn't defined either by standard or by convention seems rather lame.

"Complaining about the

Anonymous (not verified)
on
October 26, 2007 - 3:30am

"Complaining about the compiler doing the wrong thing when the right thing isn't defined either by standard or by convention seems rather lame."

a) Standards often help but sometimes they create problems, i.e. in this case when they leave something undecided, or when their enforced route leads to other, different problems (FHS standard comes to my mind, it creates less structure of a system by enforcing it. Look at the .net standard, it makes a lot more sense to group stuff like System.SomeStuff than in bogus directories)
The thing is the linux kernel project is a BIG project. The guys there ARE good at writing C.
Why is GCC fighting AGAINST them with new releases?

b) a convention does NOT help at all against a team that does NOT want to cater to your way at all!

c) I hope C will die one day. Not because it sucks but because it is sugar for Assembler code. :)

The FHS makes perfect sense for servers

Craig Ringer (not verified)
on
October 26, 2007 - 9:08am

The FHS makes perfect sense for servers, tiny systems, and basically everything but desktops.

You often want different parts of the root tree on different storage on a server, for example. /usr might well be read only, and can certainly be on very fast to read but very slow to write storage. /var needs to be on something with fast writes and reads. /tmp can be vm based, on volatile storage, etc.

On an embedded system, similar concerns exist. /usr can be straight off flash. /etc can be COW from flash. /tmp can be volatile. etc.

It's also useful for backups. For example, I know I can take daily incrementals of /etc on all my servers, and that'll get me a history of their configuration. I don't need to worry about this growing excessively, being full of temp files, etc. I know I don't need to bother backing /usr up at all except when I do a full bare-metal restore image ; changes there come straight from package management anyway.

So - I disagree in the strongest possible terms with your claim that the FHS is nonsensical. It's not designed for friendly logical groupings at an application level, it's designed for grouping at a storage level. You could group by application / functional unit as you describe if you could use consistent file system labels and some kind of indexing to keep track of which of the jumbles of files fell into which categories - but at present, no such service exists. The current groupings are also just easier.

I see no reason why programs should be grouped into directories in the file system. That's what the package management system is for - keeping track of all that - and it does so well. As such, even for desktops I see little point in worrying about the file system structure. Your data lives in /home, and you can drop stuff in /opt or /usr/local if you want. Beyond a symlink from /programs to /opt if you want to make that a bit more explicit, I see little to change.

--
Craig Ringer

Wait a tick, are you saying

Nony mouse (not verified)
on
October 26, 2007 - 12:05pm

Wait a tick, are you saying the GCC developers are not good at writing C ?

Language semantics

Mr_Z
on
October 27, 2007 - 9:38am

There are good reasons compiler writers ask for many of the weasel phrases that get into the language standard. They allow many classes of algorithm to be implemented for generating and optimizing the code (algorithms that would be impractical or impossible otherwise). These algorithms generate code with good overall performance. With stricter wording, the compiled result would often be much worse, resulting in a net loss.

Certain very useful applications need stricter semantics for what are generally well contained subsets of the program. These useful programs cannot be written in C as specified. It turns out that typical compiler code generation strategies usually end up doing pretty much what the programmer expected most of the time. This has two effects.

One: It becomes possible to write these useful applications, even though they're not strictly correct. Pragmatism rules the roost, though, and large systems get built in this way.

Two: Programmers get attached to this language they're programming (which isn't strictly C), while compiler writers continually improve and change their compiler. At some point things break horribly.

At the nexus of these two effects are all the platitudes about "quality of implementation," "principle of least surprises," and all the grumbling about "language lawyering" and "making something useful."

I'm still a firm believer in giving the compiler as much flexibility as possible, and enabling the programmer to specify his intent to the greatest precision necessary to ensure a correct result. Implicit rules make for great shortcuts in the general case and are quite practical and pragmatic. But, when the last decimal point of details matters, it's important to allow exact specification of intent without resorting to assembly language.

--
Program Intellivision and play Space Patrol!

Let me add

Mr_Z
on
October 27, 2007 - 10:05am

The recent move toward PCC in the BSD camp strikes me as a move towards a simpler compiler for the sake of getting a compiler that compiles this other language—that looks like, but isn't quite C—that the system developers want to program.

C started out as a systems programming language with rather vague semantics, but owing to its simple compiler, had adequate (if mediocre) performance and unsurprising behavior most of the time. As C grew more popular and more complex, performance-oriented application code got written in C, and pressure was on the compiler to do better. Compilers started sprouting all sorts of optimization passes. The ambiguities of the C language got replaced with very tricky turns of phrase regarding such things as sequence points, and interesting outs coded in phrases beginning with modifiers such as "as if."

The systems programming guys—Torvalds, de Raadt et al.—don't really want to program in this "new" C. Most of their code doesn't benefit from the performance optimizations anyway. They want a language dialect that looks more like what the simpler compilers provided, simply because it's simple and easy to understand, and doesn't do surprising transformations to the code. That said, I do think they want the general performance levels offered by modern compilers. They just don't need the very last drop everywhere.

I personally argue it should be possible to add stuff to the C we have today to provide the required semantics where they're needed. Since the whole world is moving toward multiprocessors and multithreading, many of these things will be needed by a larger audience anyway. It'll be interesting to see where this goes. I say this in part because I actually get to live through this at work, on an entirely different set of platforms, with an entirely different compiler. This isn't a GCC problem, this is a compiler vs. applications vs. systems programmers problem.

--
Program Intellivision and play Space Patrol!

What about using

Anonymous (not verified)
on
October 26, 2007 - 8:15am

GCC developers do care !

Matthew J Fletcher (not verified)
on
October 30, 2007 - 8:54am

The GCC folks have had a big discussion about this and Ian Lance Taylor has prepared a patch (for all active branches) that stops the optimiser passing memory access boundaries.

http://gcc.gnu.org/ml/gcc/2007-10/msg00554.html
