Linux: Releasing With Known Regressions

Submitted by Jeremy
on April 27, 2007 - 12:32pm

Following the release announcement of the 2.6.21 Linux kernel [story], Adrian Bunk noted that he no longer planned to track regressions [story]. He explained, "if we would take 'no regressions' seriously, it might take 4 or 5 months between releases due to the lack of developer manpower for handling regressions. But that should be considered OK if avoiding regressions was considered more important than getting as quick as possible to the next two week regression-merge window."

Linus Torvalds disagreed with Adrian's view that lengthening the release cycle would improve stability: "Regressions _increase_ with longer release cycles. They don't get fewer." He went on to add, "You are ignoring the reality of development. The reality is that you have to balance things. If you have a four-month release cycle, where three and a half months are just 'wait for reports to trickle in from testers', you simply won't get _anything_ done. People will throw their hands up in frustration and go somewhere else." He continued:

"Do you really think bugs get fixed faster just because there wasn't a release? Quite the reverse. Bugs get _found_ faster thanks to a release (simply because you tend to get more information thanks to more users), giving the stable people more information, causing the bugs to be able to be found and fixed _more_quickly_ in the stable release than if we had waited for four months to release 2.6.21."


From: Adrian Bunk [email blocked]
To: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 06:08:06 +0200

On Wed, Apr 25, 2007 at 08:29:28PM -0700, Linus Torvalds wrote:
>...
> So it's been over two and a half months, and while it's certainly not the 
> longest release cycle ever, it still dragged out a bit longer than I'd 
> have hoped for and it should have. As usual, I'd like to thank Adrian (and 
> the people who jumped on the entries Adrian had) for keeping everybody on 
> their toes with the regression list - there's a few entries there still, 
> but it got to the point where we didn't even know if they were real 
> regressions, and delaying things further just wasn't going to help.
>...


Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release:
14

Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release that were first reported in March or earlier:
8

Number of different known regressions compared to 2.6.20 at the time
of the 2.6.21 release with patches available at the time of the 2.6.21 
release [1]:
3

What I will NOT do:
Waste my time with tracking 2.6.22-rc regressions.


We have an astonishing amount of -rc testers, but obviously not the 
developer manpower for handling them.

If we would take "no regressions" seriously, it might take 4 or 5 months 
between releases due to the lack of developer manpower for handling 
regressions. But that should be considered OK if avoiding regressions 
was considered more important than getting as quick as possible to the 
next two week regression-merge window.

But releasing with so many known regressions is insulting for the many 
people who spent their time testing -rc kernels.


cu
Adrian

[1] http://lkml.org/lkml/2007/4/25/496

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Adrian Bunk [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 14:58:02 +0200

A clarification:

I am aware that my work had some effect, and I am aware that my work
gets appreciated - there's no need for everyone to repeat this.

The point is: I'm not satisfied with the result.

Linus said 2.6.20 was a stable kernel. My impression was that at least
two of the regressions from my 2.6.20 regressions list should have been
fixed before 2.6.20.

They have both been fixed through -stable, but I also remember a quite
experienced kernel maintainer running into one of them after 2.6.20 was
released and spending half a day tracking it down - and my answer was
"known unfixed regression, first reported more than a month ago".

There is a conflict between Linus trying to release kernels every
2 months and releasing with few regressions.

Trying to avoid regressions might in the worst case result in an -rc12
and 4 months between releases. If the focus is on avoiding regressions
this has to be accepted.

And a serious delay of the next regression-merge window due to unfixed
regressions might even have the positive side effect of more developers
becoming interested in fixing the current regressions for getting their
shiny new regressions^Wfeatures faster into Linus' tree.

0 regressions is never realistic (especially since many regressions
might not be reported during -rc), but IMHO we could do much better than
what happened in 2.6.20 and 2.6.21.

These are just my personal opinions, and other people consider the
resulting 2.6.20 and 2.6.21 kernels OK.

I'm not satisfied with the result, and the world won't stop turning when
I'm not tracking 2.6.22-rc regressions.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 08:47:26 -0700 (PDT)

On Thu, 26 Apr 2007, Adrian Bunk wrote:
>
> There is a conflict between Linus trying to release kernels every
> 2 months and releasing with few regressions.

No.

Regressions _increase_ with longer release cycles. They don't get fewer.

The fact is, we have a -stable series for a reason. The reason is that the
normal development kernel can work in three ways:

 (a) long release cycles, with two subcases:
     (a1) huge changes (ie a "long development series". This is what we
          used to have. There's no way to even track the regressions,
          because things just change too much.
     (a2) keep the development limited, just stretch out the
          "stabilization phase". This simply *does*not*work*. You might
          want it to work, but it's against human psychology. People
          get bored, and start wasting their time discussing esoteric
          scheduler issues which weren't regressions at all.
 (b) Short and staggered release cycle: keep changes limited (like a2),
     but recognize when it gets counter-productive, and cut a release so
     that the stable team can continue with it, while most developers (who
     wouldn't have worked on the stable kernel _anyway_) don't get
     frustrated.

And yes, we've gone for (b). With occasional "I'm not taking any half-way
scary things at _all_" releases, like 2.6.20 was.

> Trying to avoid regressions might in the worst case result in an -rc12
> and 4 months between releases. If the focus is on avoiding regressions
> this has to be accepted.

No. You are ignoring the reality of development. The reality is that you
have to balance things. If you have a four-month release cycle, where
three and a half months are just "wait for reports to trickle in from
testers", you simply won't get _anything_ done. People will throw their
hands up in frustration and go somewhere else.

> And a serious delay of the next regression-merge window due to unfixed
> regressions might even have the positive side effect of more developers
> becoming interested in fixing the current regressions for getting their
> shiny new regressions^Wfeatures faster into Linus' tree.

No. Quite the reverse. If we have a problem right now

> 0 regressions is never realistic (especially since many regressions
> might not be reported during -rc), but IMHO we could do much better than
> what happened in 2.6.20 and 2.6.21.

2.6.20 was actually really good. Yes, it had some regressions, but I do
believe that it was one of the least buggy releases we've had. The process
_worked_.

2.6.21 was much less pleasant, but the timer thing really was

> I'm not satisfied with the result, and the world won't stop turning when
> I'm not tracking 2.6.22-rc regressions.

True. However, it's sad that you feel like you can't bother to track them.
They were _very_ useful. The fact that you felt they weren't is just
because I think you had unrealistic expectations, and you think that the
stable people shouldn't have to have anything to do.

You're maintaining 2.6.16 yourself - do you not see what happens when you
decide that "zero regressions" is the target? You have to stop
development. And while that may sound like a good thing at any particular
time, it's a total *disaster* in the long run (not even very long,
actually: in the two-to-three release cycle kind of run), because while
you are in a "regression fix" mode, people still go on developing, and
you're just causing problems for the _next_ release by holding things up
too long.

That's the *real* reality: 5 to 7 _million_ lines of diffs in a release
every two to three months. Do you really think those changes stop just
because of a release process? No. If you drag out the releases to be 4+
months, you'll just have 10-15 million lines of changes instead (or, more
likely, you'll have developers who can't be bothered any more, and you may
have just 2 million lines, and three years later you have a kernel that
isn't relevant any more. Look at any of the other Unixes).

In other words, there's a _reason_ we have staggered development. We have
the "crazy development trees" (aka -mm and various other trees), we have
the "development tree" (aka Linus' tree), and we have the -stable tree. If
the stable tree has a dozen known issues that they'll have to sort out
over the next two months, that's *fine*. That's kind of the point of the
stable tree.

And you would help them with the 2.6.22-stable releases if you'd maintain
that list. Even if it is _designed_ not to go down to zero.

I suspect that you got overly optimistic from the fact that 2.6.20 really
_was_ an easy release. It was designed that way. You feel that it was bad
or average, but that's actually because of _your_ unrealistic
expectations, not because there was anything wrong with 2.6.20.

		Linus


From: Adrian Bunk [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 18:59:50 +0200

On Thu, Apr 26, 2007 at 08:47:26AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 26 Apr 2007, Adrian Bunk wrote:
> >
> > There is a conflict between Linus trying to release kernels every
> > 2 months and releasing with few regressions.
>
> No.
>
> Regressions _increase_ with longer release cycles. They don't get fewer.
>
> The fact is, we have a -stable series for a reason. The reason is that the
> normal development kernel can work in three ways:
>
>  (a) long release cycles, with two subcases:
>      (a1) huge changes (ie a "long development series". This is what we
>           used to have. There's no way to even track the regressions,
>           because things just change too much.
>      (a2) keep the development limited, just stretch out the
>           "stabilization phase". This simply *does*not*work*. You might
>           want it to work, but it's against human psychology. People
>           get bored, and start wasting their time discussing esoteric
>           scheduler issues which weren't regressions at all.
>  (b) Short and staggered release cycle: keep changes limited (like a2),
>      but recognize when it gets counter-productive, and cut a release so
>      that the stable team can continue with it, while most developers (who
>      wouldn't have worked on the stable kernel _anyway_) don't get
>      frustrated.

<SCNR>
They get frustrated because they focussed on developing new features
instead of fixing regressions, and now it takes longer until their new
features get merged because no one fixed the regressions...
</SCNR>

> And yes, we've gone for (b). With occasional "I'm not taking any half-way
> scary things at _all_" releases, like 2.6.20 was.
>
> > Trying to avoid regressions might in the worst case result in an -rc12
> > and 4 months between releases. If the focus is on avoiding regressions
> > this has to be accepted.
>
> No. You are ignoring the reality of development. The reality is that you
> have to balance things. If you have a four-month release cycle, where

I'm not saying it always have to be 4 months.

> three and a half months are just "wait for reports to trickle in from
> testers", you simply won't get _anything_ done. People will throw their
> hands up in frustration and go somewhere else.

"wait for reports to trickle in from testers" is exactly the opposite of
our problem.

I started the regression lists originally to prove the fairy tale
"no one tests -rc kernels" some kernel developers spread as wrong.

Look at the facts:
8 out of 14 regressions in my current list were reported in March or
earlier. And for many regressions fixed it took several weeks until
debugging by a kernel developer was started.

We do not lack testers for getting bug reports quickly.
We lack developer manpower for debugging the many regression reports.

>...
> > 0 regressions is never realistic (especially since many regressions
> > might not be reported during -rc), but IMHO we could do much better than
> > what happened in 2.6.20 and 2.6.21.
>
> 2.6.20 was actually really good. Yes, it had some regressions, but I do
> believe that it was one of the least buggy releases we've had. The process
> _worked_.

In the country of the blind the one-eyed man is king...

> 2.6.21 was much less pleasant, but the timer thing really was
>
> > I'm not satisfied with the result, and the world won't stop turning when
> > I'm not tracking 2.6.22-rc regressions.
>
> True. However, it's sad that you feel like you can't bother to track them.
> They were _very_ useful. The fact that you felt they weren't is just
> because I think you had unrealistic expectations, and you think that the
> stable people shouldn't have to have anything to do.
>
> You're maintaining 2.6.16 yourself - do you not see what happens when you
> decide that "zero regressions" is the target? You have to stop
> development. And while that may sound like a good thing at any particular
> time, it's a total *disaster* in the long run (not even very long,
> actually: in the two-to-three release cycle kind of run), because while
> you are in a "regression fix" mode, people still go on developing, and
> you're just causing problems for the _next_ release by holding things up
> too long.
>
> That's the *real* reality: 5 to 7 _million_ lines of diffs in a release
> every two to three months. Do you really think those changes stop just
> because of a release process? No. If you drag out the releases to be 4+
> months, you'll just have 10-15 million lines of changes instead (or, more
> likely, you'll have developers who can't be bothered any more, and you may
> have just 2 million lines, and three years later you have a kernel that
> isn't relevant any more. Look at any of the other Unixes).

There's not a realistic chance for 0 regressions, and 4 months was a
worst case, not the average case.

But I am not happy with the current state of released kernels.

> In other words, there's a _reason_ we have staggered development. We have
> the "crazy development trees" (aka -mm and various other trees), we have
> the "development tree" (aka Linus' tree), and we have the -stable tree. If
> the stable tree has a dozen known issues that they'll have to sort out
> over the next two months, that's *fine*. That's kind of the point of the
> stable tree.

And all the people who have to upgrade to 2.6.21 for getting an
important security fix run into a dozen known (and many unknown)
regressions. I don't think that's fine.

> And you would help them with the 2.6.22-stable releases if you'd maintain
> that list. Even if it is _designed_ not to go down to zero.
>
> I suspect that you got overly optimistic from the fact that 2.6.20 really
> _was_ an easy release. It was designed that way. You feel that it was bad
> or average, but that's actually because of _your_ unrealistic
> expectations, not because there was anything wrong with 2.6.20.

If we had the developer manpower to get each reported regression
debugged and fixed [1] within three weeks, 2.6.21 might be in the shape
I would have liked it to be today.

But there are the three interdependent variables time, developer
manpower and quality. And few developer manpower and few time results in
a lower quality of the release I'm not happy with.

Life has taught me that sometimes I'm right, sometimes I'm wrong, and
sometimes both sides have a possible solution.

We might agree to disagree, and you are the one whose opinion counts.

I can only say that I am not happy with the result, and that I do
therefore not spend my time on maintaining regression lists for 2.6.22 -
and maintaining such lists is not something special no one else could do
equally well.

> 		Linus

cu
Adrian

[1] "fixed" can also be e.g. "patch reverted" or "not a bug"

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Alan Cox [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 21:50:15 +0100

> They get frustrated because they focussed on developing new features
> instead of fixing regressions, and now it takes longer until their new
> features get merged because no one fixed the regressions...

I would disagree: They get frustrated because they are blocked on some
small regression which is stopping a ton of other fixes including
features people need (like new hardware support) from being released.

The "no regressions" model doesn't really work when you ask about the
greater good of the userbase. The goal of no regressions is great and the
regression lists for ATA were certainly very helpful but the greater good
comes first.

Alan


From: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 10:20:46 -0700 (PDT)

On Thu, 26 Apr 2007, Adrian Bunk wrote:
>
> They get frustrated because they focussed on developing new features
> instead of fixing regressions, and now it takes longer until their new
> features get merged because no one fixed the regressions...

I agree. That's part of it. But part of it is not just the "it's 2 months
until the next release", part of it is also very much a "nothing has
happened in the normal kernel for the last 8 weeks, this is boring, so
I'll do my own exciting stuff".

So one _fundamental_ issue is that all the people who aren't directly
involved with a particular regression are simply bored. And bored is not
good. You want people productive - and that means that you want an active
development kernel that they can work with, since they aren't going to
help with the regressions anyway.

This is why the -stable tree is so useful. It's not only that users want a
stable tree - it allows people who do *not* have regressions on their
plate to not be stuck twiddling their thumbs - they can be on the regular
kernel.

> I'm not saying it always have to be 4 months.

I'm saying that four months wouldn't even have *helped* in the case of
2.6.21.

Do you really think bugs get fixed faster just because there wasn't a
release? Quite the reverse. Bugs get _found_ faster thanks to a release
(simply because you tend to get more information thanks to more users),
giving the stable people more information, causing the bugs to be able to
be found and fixed _more_quickly_ in the stable release than if we had
waited for four months to release 2.6.21.

The two last weeks of 2.6.21-rc were almost entirely "wasted", apart from
getting the e1000 issue at least resolved (which was the reason for that
delay, so I'm not complaining - I'm just saying that not a lot of people
actually were able to _help_ with regressions during that time, and for
some of them, we might well be better off with more information about the
issue).

Did we fix other bugs? Yes. There was one long-time bug (since 2.6.15 or
something) that happened to come in during that time, and we had some
cleanups, we had MIPS bugs, we found some networking issues etc etc. But
the amount of combined effort people put on it was pretty weak.

> "wait for reports to trickle in from testers" is exactly the opposite of
> our problem.

I disagree. Quite often, having 5 people report the same thing is actually
more useful (because you see a pattern) than having one known regression
that you don't know _why_ that regression happened.

And that's the case we had for most of them. You have things like the
maintainer (see Oliver's reply, for example) simply unable to reproduce
it, and needing more information. It *does*not*matter* that the original
report may be old. If you need more information, you need more
information, and a two-month-old report isn't any better just because it's
two months old.

At some point, you need to say: we're not making progress, need to release
it, that might get us *off* this stuck situation.

That's the part you seem unable to accept. You think that "we have a
listed regression" means that you should be able to fix it. Not so. We
*often* need more information.

> But I am not happy with the current state of released kernels.

So you're going to help exactly how? By stopping to help? Or kvetching
about developers that can't figure out why something regressed.

Sure, that makes tons of sense, Adrian.

NOT.

		Linus


From: Adrian Bunk [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 19:48:48 +0200

On Thu, Apr 26, 2007 at 10:20:46AM -0700, Linus Torvalds wrote:
> On Thu, 26 Apr 2007, Adrian Bunk wrote:
>...
> > But I am not happy with the current state of released kernels.
>
> So you're going to help exactly how? By stopping to help? Or kvetching
> about developers that can't figure out why something regressed.
>
> Sure, that makes tons of sense, Adrian.
>
> NOT.

It is my time, and it's therefore my decision what I consider to make
sense spending it for.

Instead of continuing our discussion it makes more sense that we simply
accept that we disagree regarding when a kernel is ready for being
released instead of repeating the same arguments in a lengthy
discussion.

> 		Linus

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Diego Calleja [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 20:13:25 +0200

On Thu, 26 Apr 2007 13:02:28 -0400, Chuck Ebbert [email blocked] wrote:

> Problem is, not enough developers pay attention to the -stable
> series. Adrian, maybe you could shift your attention there and
> stop trying to track the bleeding edge?

From my humble POV, it's a problem that all this discussion was generated
on what Adrian does or stop doing. Apparently, unless Adrian posts his
list of known regressions, most of the people doesn't look at the bugzilla
at all. Maybe it'd be useful to create a per-release bug tracker in the
bugzilla or collect them into one of the kernel.org wikis, to make easier
to follow the current state of all the "important" regressions.


From: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 11:42:22 -0700 (PDT)

On Thu, 26 Apr 2007, Diego Calleja wrote:
>
> From my humble POV, it's a problem that all this discussion was generated
> on what Adrian does or stop doing. Apparently, unless Adrian posts his
> list of known regressions, most of the people doesn't look at the bugzilla
> at all. Maybe it'd be useful to create a per-release bug tracker in the
> bugzilla or collect them into one of the kernel.org wikis, to make easier
> to follow the current state of all the "important" regressions.

Any web-based interface is a no-no. It's one reason I don't use bugzilla a
lot. If I can't get it by email, it doesn't exist, as far as I'm
concerned.

I bet that's true even of a lot of people who are more "web oriented" than
I am. They may look at webpages, but getting notified by email is still
the wakeup call. There's a difference between "active and directed pushing
to the involved people" and "the resource exists, that people could look
at".

So it would have to be more than just a wiki or a bugzilla entry. It would
have to have that weekly email status thing, and I think that it needs to
have some human who tries to find messages on the kernel mailing list too,
and make a first-level judgement on the bugs.

Adrian was doing a good job. But it doesn't necessarily need somebody with
intimate knowledge of the kernel. In fact, almost everybody who *does*
have intimate knowledge tends to have so in a very specific area (nobody
knows everything - and that very much includes people like me and Andrew
too) and maybe be skewed in other ways too, so a "generalist" is probably
more useful than somebody who is a "deep coder" in some subsystem.

And it almost certainly doesn't have/prefer to be _one_ person. I suspect
that this is something where it actually might be better to have some
collection of people interested in it, and yes, perhaps editing a wiki is
part of the process, but with at least that "automated email" thing going
on in addition (and it needs to go to the people involved, not just the
kernel mailing list - so part of it is not just gathering the reports
themselves, but also gathering target addresses from MAINTAINERS files and
perhaps git logs etc).

And yes, it's quite possibly a good way to get into kernel development -
it definitely helps to know about programming, but as mentioned, I don't
think it is something where you even need to know specifically about
*kernel* programming per se.

For example, I don't think it was an accident that Adrian (who has been
involved in kernelnewbies, janitors and the trivial tree) was the one who
picked it up. That's exactly the kind of involvement that the regression
tracking is all about!

(In fact, I think regression tracking might be "easier" to get into than
actually getting into some of the janitorial projects, exactly because
it's less about coding. The fact that a person who tracks regressions
might then *also* indirectly get into the code itself would just be a big
additional bonus!)

So yes, some automation can almost certainly help (especially if there are
multiple people involved), but I think a lot of it is that "human
judgement" and ability to group things and communicate.

Are there any kernel janitors/newbies/mentors that can think of people who
would want to do something like this?

		Linus


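Editor's illustration: the "directed email" idea in the message above (gather the reports, look up who owns the affected code, and mail only those people) can be sketched roughly as follows. This is a minimal sketch, not a tool anyone in the thread used. It assumes a simplified, hypothetical MAINTAINERS-style text in which each blank-line-separated block has a title line, `M:` lines for contact addresses, and `F:` lines for path prefixes; the real kernel MAINTAINERS file is richer than this. The addresses, report summaries, and the function names `parse_entries`, `recipients_for`, and `group_reports` are all invented for the example.

```python
from collections import defaultdict

# Hypothetical, simplified MAINTAINERS-style text (an assumption for this
# sketch): blocks separated by blank lines, "M:" = contact address,
# "F:" = path prefix covered by the entry.
SAMPLE_MAINTAINERS = """\
E1000 NETWORK DRIVER
M: netdev-owner@example.org
F: drivers/net/e1000/

LIBATA SUBSYSTEM
M: ata-owner@example.org
F: drivers/ata/
"""


def parse_entries(text):
    """Parse the simplified format into a list of entry dicts."""
    entries = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        entry = {"name": lines[0].strip(), "mail": [], "paths": []}
        for line in lines[1:]:
            tag, _, value = line.partition(":")
            if tag == "M":
                entry["mail"].append(value.strip())
            elif tag == "F":
                entry["paths"].append(value.strip())
        entries.append(entry)
    return entries


def recipients_for(path, entries, fallback):
    """Return the addresses whose path prefixes cover `path`.

    Falls back to a general list address when no entry matches.
    """
    found = []
    for entry in entries:
        if any(path.startswith(prefix) for prefix in entry["paths"]):
            for address in entry["mail"]:
                if address not in found:
                    found.append(address)
    return found or [fallback]


def group_reports(reports, entries, fallback):
    """Group (summary, path) reports into one digest per recipient."""
    digest = defaultdict(list)
    for summary, path in reports:
        for address in recipients_for(path, entries, fallback):
            digest[address].append(summary)
    return dict(digest)
```

Each recipient then gets a single digest containing only the reports in their own area, which is the property the thread argues a plain "send everything to the list" notification lacks. The human judgement Linus describes (deciding which reports are real regressions and which file they belong to) is deliberately outside what this sketch automates.

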
From: Diego Calleja [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 22:41:48 +0200

On Thu, 26 Apr 2007 11:42:22 -0700 (PDT), Linus Torvalds wrote:

> I bet that's true even of a lot of people who are more "web oriented" than
> I am. They may look at webpages, but getting notified by email is still
> the wakeup call. There's a difference between "active and directed pushing

Bugzilla sucks quite a lot at email, but you can answer emails and they get
into the bugzilla database; and there're two mailing lists (listed in
Documentation/HOWTO) that send notifications about every new bug
added/modified- I know it's not the perfect email interface every hacker
wants, but it's better than nothing.

I suggested some time ago that it'd be useful to send every new bug
notification from bugme-new to the LKML (and/or other lists). The volume
should not be so high to make it so annoying that it makes it unuseful,
and at least it makes the bugzilla-haters aware of the bugs reported, and
since bugzilla tracks the answers to emails and the reporter email address
is in the email, it makes easier for bugzilla-haters to ask for more data
and try to fix the problem, without starting any browser.

I can understand Adrian's resign. Bugzilla is crap, but there're users
reporting bugs there and willing to cooperate to fix them, and they're
not getting listened. There're even a few description of patches (ie: "line
6 in foo.c is wrong and it breaks our testing, it should read like this:")
that have been sitting there for *years* and not getting merged. I guess
that Adrian tried to canalize the important regressions to the hackers, and
he got tired of apparently being the only one that cares about getting
them fixed.

So I, or anyone else, could try to do Adrian's job. But if Adrian (a guy
that sends patches to make global functions static 8) got tired
of doing that job, I suspect that I, or anyone else would also got
tired of it even sooner.

There're other big projects with probably more bug reports than linux,
they don't work this way, and they look more successful in their bug
handling. So in my humble opinion there's a problem, about how the whole
bug reporting/fixing process works. With the current linux development
model, a good bug reporting/fixing process doesn't looks optional, since
it's important to fix bugs ASAP to get the fixes into -stable. The fix may
go as further as "writing our own bug tracking software" in the same way
git fixed other development issues, or it may be a human issue, or a mix
of the two.


From: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date:	Thu, 26 Apr 2007 14:13:08 -0700 (PDT)

On Thu, 26 Apr 2007, Diego Calleja wrote:
>
> Bugzilla sucks quite a lot at email, but you can answer emails and they get
> into the bugzilla database; and there're two mailing lists (listed in
> Documentation/HOWTO) that send notifications about every new bug
> added/modified- I know it's not the perfect email interface every hacker
> wants, but it's better than nothing.

No, it's *not* better than nothing.

The thing is, these reports MUST NOT go to "everybody". If they do, that
is actually *worse* than nothing, because people will just ignore them
entirely, since they aren't "directed".

The emails need to be directed to the appropriate parties, not go to
everybody. There is nobody who is interested in seeing all regressions,
except perhaps me and Andrew. Most *real* developers (as opposed to people
like me, who are integrators, not "real developers") want to be notified
about problems in *their* area, and if it's just automation that sends out
everything, it just dilutes the value of the thing, to the point where
people will ignore it even for the cases when they happen to be related to
what they do.

> I suggested some time ago that it'd be useful to send every new bug
> notification from bugme-new to the LKML (and/or other lists).

I don't know a lot of developers who actually read LKML. I know a lot of
people who look for interesting subject lines and interesting people, but
read LKML in the sense of reading everything? Not likely.

That's why I think Adrian did a great job: he took the "noise" and made it
something worth looking at! And part of that is very much to make it
directed to only relevant parties (yes, they *also* got cc'd to
linux-kernel, but people would get them in their personal mailboxes and
*not* feel like it was just noise that didn't matter to them!)

> I can understand Adrian's resign. Bugzilla is crap, but there're users
> reporting bugs there and willing to cooperate to fix them, and they're
> not getting listened.

I personally refuse to have anything at all with bugzilla. The interface
is so horrible that it's just not worth my time. I know there are a few
people who use it productively, but I'm always amazed that they can do
that.

The *big* problem with bugzilla is that it's such a "detail-oriented"
thing. It's fine if you have *one* bug that you're tracking. But whenever
that's not the case, it's almost totally useless.

Let me put it another way: I would never use a source control system that
forces me to look at my 22,000 files one at a time. I think such a system
is fundamentally broken, because it makes it impossible to get the big
picture ("what changed in the last week" kind of thing).

The same is true of bugzilla: if you *know* which bug you're looking at,
it's good. For anything else, it's almost worse than useless, exactly
because there is no way to get an overview.

> There're even a few description of patches (ie: "line
> 6 in foo.c is wrong and it breaks our testing, it should read like this:")
> that have been sitting there for *years* and not getting merged.

.. and you claim that this shows that developers don't listen.

I'd say it shows the exact *opposite*: the users don't listen. There's a
lot more users than developers, and bugzilla is pretty much designed to
let the users "report and forget", which is exactly the *wrong* thing to
do, because it puts the onus on the developer.

(I've said this before, but I'll say it again: one thing that would
already make bugzilla better is to just always drop any bug reports that
are more than a week old and haven't been touched. It wouldn't need *much*
touching, but if a reporter cannot be bothered to say "still true with
current snapshot" once a week, then it shouldn't be seen as being somehow
up to those scarce resources we call "developers" to have to go through
it).

So there are probably things that bugzilla could do to become more useful,
but I don't see that happening. We'd need either a smarter/better
bugzilla, or somebody who actually turns noise into real information.
Adrian did that (although in fairness to others, other people definitely
do it too. Dave Jones, for example. Very useful).

> So I, or anyone else, could try to do Adrian's job. But if Adrian (a guy
> that sends patches to make global functions static 8) got tired
> of doing that job, I suspect that I, or anyone else would also got
> tired of it even sooner.

I do agree - one of the problems with the job is not that it's thankless
(I think we've had at least ten kernel developers, very much including me,
talking about how _useful_ it is), but there is definitely a lack of
glamour and probably interest.

I think it could be more interesting if part of the job was doing the
tools. Tools *are* important. Most of my actual _development_ for the last
couple of years has been on "git", not the kernel, but I think I was more
productive that way, so I don't think that's wasted time at all.

So yes, automation would be a good idea, but I don't think bugzilla is it.

> There're other big projects with probably more bug reports than linux,
> they don't work this way, and they look more successful in their bug
> handling.

Well, one thing to keep in mind is that the kernel really does have a
*lot* more development going on than most other projects. I don't think
you'll find another project that has about six megabytes of diffs every
release (every two months).

That's really one of the fundamental issues - things really *happen* in
the kernel. A *lot* of things. You can't take a breather - I can do
"stabilization releases" every once in a while, and Andrew can kick out
patches he decides aren't ready to be merged rather than maintain them in
his tree (and he does do that), but the kernel simply tends to have a
different *scale* than other projects.

And almost all hard bugs are about hardware interactions. Drivers. Big
iron. Things like that - ie unlike something like a compiler, you can
seldom say "this test-case crashes". Yes, that does happen for the kernel
too, but those are the *easy* bugs. Those generally get fixed in a day or
two.

So I really don't think you can compare to "other projects". They simply
don't have these issues.

		Linus


From: Jan Engelhardt [email blocked]
Subject: Re: Linux 2.6.21
Date: Thu, 26 Apr 2007 10:35:25 +0200 (MEST)

On Apr 25 2007 20:29, Linus Torvalds wrote:
>
>If the goal for 2.6.20 was to be a stable release (and it was), the goal
>for 2.6.21 is to have just survived the big timer-related changes and
>some of the other surprises [...] So it's been over two and a half
>months, and while it's certainly not the longest release cycle ever, it
>still dragged out a bit longer than I'd have hoped for and it should
>have.

I really appreciate the lot of -rcs, especially if there are so many intrusive changes/regressions. Like Andrew, I have a feeling that it gets buggier, but at least, it seems to be made up every ... two releases. 2.6.16 was a good one, .18, and .20. I suppose the bug/regression distribution between [2.6.16-2.6.17, 2.6.17-2.6.18] was biased like [70, 50].

About 2.6.21 - will see, rc has been to my liking.

Since a picture says more than a thousand words, have
http://jengelh.hopto.org/GFX0/kernel-rating-2.6.21.png
(The last kernel to have only 5 -rcs was 2.6.14 - interesting)

Happy hacking,
Jan

From: Linus Torvalds [email blocked]
Subject: Re: Linux 2.6.21
Date: Thu, 26 Apr 2007 09:40:26 -0700 (PDT)

On Thu, 26 Apr 2007, Jan Engelhardt wrote:
>
> I really appreciate the lot of -rcs, especially if there are so many
> intrusive changes/regressions. Like Andrew, I have a feeling that it
> gets buggier, but at least, it seems to be made up every ... two
> releases.

I wouldn't say that, but yes, there is at least *some* tendency to not merge scary stuff after a painful release.

For example, I can certainly say that after 2.6.21, I'm likely to be very unhappy merging something that isn't "obviously safe". I knew the timer changes were potentially painful, I just hadn't realized just *how* painful they would be (we had some SATA/IDE changes too, of course, it's not all just about the timers, those just ended up being more noticeable to me than some of the other things were).

> About 2.6.21 - will see, rc has been to my liking.

I actually hope that 2.6.21 isn't even all that bad, despite all the worries about it. And I may be complaining about the problems the timers caused, but it was definitely something that was not only worth it, it was overdue - and those NO_HZ issues had been brewing literally for years.

So considering issues like that, I think we're actually doing fairly well.

One of the bigger issues is that I think -mm (and I'm pretty sure Andrew will agree with me on this) has really had a rather spotty history. It's been unstable enough at times that I suspect people have largely stopped testing it, with just the most die-hard testers running -mm.

So -mm is still very useful just because *Andrew* tests it, and finds all kinds of issues with it, but I literally suspect that Andrew himself is personally a big part of that, which is kind of wasteful - we should be able to spread out the pain more.

Andrew is also too damn polite when something goes wrong ;)

So we should have somebody like Christoph running -mm, and when things break, we'll just sic Christoph on whoever broke it, and teach people proper fear and respect!

As it is, I think people tend to send things to -mm a bit *too* eagerly, because there is no downside - Andrew is a "cheap date" testing-wise, and always puts out ;)

		Linus

Related Links: