Following a recent merge request, Linus Torvalds stressed that he was serious about not merging any big changes after the merge window closes: "Get the changes in before -rc1, or just *wait*. If they aren't ready before the merge window opens, they simply shouldn't be merged at all." Jeff Garzik reiterated, "Once -rc1 is out there, that means the focus should be on stabilizing the existing codebase. Pushing a big driver update means that effort must restart from scratch. We just don't want to go down that road, which [is] a big reason for the merge window in general." Further, when it was noted that the recent changes had been heavily tested by the vendor, Jeff stressed the importance of community testing:
"Take a lesson from when I was on Linus's shit-list... twice: Twice, Intel submitted an e1000 update after the merge window closed. Twice, they claimed the driver passed their quite-exhaustive internal testing. And twice, the most popular network driver broke for large masses of users because I took a hardware vendor's word on testing rather than rely on the testing PROVEN to flush out problems: public linux kernel testing.
"I'm not singling out Intel, there are plenty of other hardware vendors that repeat the exact same pattern."
From: James Bottomley [email blocked]
Subject: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Sat, 04 Aug 2007 12:31:43 -0500
This is mainly bug fixes ... there's one or two features completions
that have been delayed pending ack and review to do with bsg (headers
and passthrough) but these are really required to complete already
upstream code.
The patch is available here:
master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git
The short changelog is:
Boaz Harrosh (6):
      aha152x: use data accessors and !use_sg cleanup
      aha152x: Fix check_condition code-path
      aha152x: Clean Reset path
      aha152x: preliminary fixes and some comments
      aha152x: use bounce buffer
      aha152x: fix debug mode symbol conflict
David Miller (1):
      ESP: Revert ESP_BUS_TIMEOUT back down to 250
FUJITA Tomonori (6):
      initialize shost_data to zero
      mptsas: add SMP passthrough support via bsg
      bsg: update sg_io_v4 structure
      ibmvscsi: use shost_priv
      ibmvscsi: remove unnecessary map_sg check
      zfcp: convert to use the data buffer accessors
James Bottomley (4):
      sd: disentangle barriers in SCSI
      aic7xxx: cap maxsync according to correct card limits
      mpt fusion: make logging a global sysfs parameter
      libsas: fix build dependencies on libata
James Smart (9):
      lpfc : scsi command accessor fix for 8.2.2
      lpfc 8.2.2 : Change version number to 8.2.2
      lpfc 8.2.2 : Style cleanups
      lpfc 8.2.2 : Miscellaneous Bug Fixes
      lpfc 8.2.2 : Miscellaneous management and logging mods
      lpfc 8.2.2 : Rework the lpfc_printf_log() macro
      lpfc 8.2.2 : Attribute and Parameter splits for vport and physical port
      lpfc 8.2.2 : Fix locking around HBA's port_list
      lpfc 8.2.2 : Error messages and debugfs updates
Jeff Garzik (1):
      gdth: remove redundant PCI stuff
Mark Fortescue (1):
      qlogicpti: Some cosmetic changes
Matthew Wilcox (1):
      dpt_i2o: convert to SCSI hotplug model
Matthias Kaehlcke (1):
      st: Use mutex instead of semaphore
Salyzyn, Mark (1):
      aacraid: prevent panic on adapter resource failure
Seokmann Ju (1):
      qla2xxx: fix panic caused by previous patch
and the diffstat:
block/bsg.c | 10
drivers/message/fusion/mptbase.c | 17
drivers/message/fusion/mptsas.c | 126 ++++++
drivers/s390/scsi/zfcp_fsf.c | 5
drivers/s390/scsi/zfcp_qdio.c | 41 --
drivers/scsi/aacraid/linit.c | 4
drivers/scsi/aha152x.c | 169 ++++----
drivers/scsi/aha152x.h | 2
drivers/scsi/aic7xxx/aic7xxx_core.c | 22 +
drivers/scsi/dpt_i2o.c | 132 +++---
drivers/scsi/dpti.h | 9
drivers/scsi/esp_scsi.h | 2
drivers/scsi/gdth.c | 48 +-
drivers/scsi/gdth.h | 6
drivers/scsi/hosts.c | 2
drivers/scsi/ibmvscsi/ibmvscsi.c | 39 --
drivers/scsi/libsas/Kconfig | 3
drivers/scsi/lpfc/lpfc.h | 72 ++-
drivers/scsi/lpfc/lpfc_attr.c | 423 +++++++++++++++-------
drivers/scsi/lpfc/lpfc_crtn.h | 28 -
drivers/scsi/lpfc/lpfc_ct.c | 243 ++++++------
drivers/scsi/lpfc/lpfc_debugfs.c | 595 ++++++++++++++++++++++++++++---
drivers/scsi/lpfc/lpfc_debugfs.h | 2
drivers/scsi/lpfc/lpfc_els.c | 679 ++++++++++++++++--------------------
drivers/scsi/lpfc/lpfc_hbadisc.c | 539 ++++++++++++----------------
drivers/scsi/lpfc/lpfc_hw.h | 14
drivers/scsi/lpfc/lpfc_init.c | 284 +++++++--------
drivers/scsi/lpfc/lpfc_logmsg.h | 10
drivers/scsi/lpfc/lpfc_mbox.c | 20 -
drivers/scsi/lpfc/lpfc_mem.c | 32 +
drivers/scsi/lpfc/lpfc_nportdisc.c | 162 +++-----
drivers/scsi/lpfc/lpfc_scsi.c | 413 ++++++++++-----------
drivers/scsi/lpfc/lpfc_sli.c | 423 +++++++++++-----------
drivers/scsi/lpfc/lpfc_sli.h | 10
drivers/scsi/lpfc/lpfc_version.h | 4
drivers/scsi/lpfc/lpfc_vport.c | 164 +++++---
drivers/scsi/lpfc/lpfc_vport.h | 2
drivers/scsi/qla2xxx/qla_os.c | 14
drivers/scsi/qlogicpti.c | 50 +-
drivers/scsi/scsi_lib.c | 17
drivers/scsi/sd.c | 14
drivers/scsi/st.c | 16
drivers/scsi/st.h | 3
include/linux/bsg.h | 13
include/scsi/scsi_driver.h | 2
include/scsi/sd.h | 2
46 files changed, 2837 insertions(+), 2050 deletions(-)
James
From: Linus Torvalds [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Mon, 6 Aug 2007 17:51:57 -0700 (PDT)
On Sat, 4 Aug 2007, James Bottomley wrote:
>
> This is mainly bug fixes ... there's one or two features completions
> that have been delayed pending ack and review to do with bsg (headers
> and passthrough) but these are really required to complete already
> upstream code.
James, this is the last time *ever* I apply patches from you after -rc1.
You used to have serious problems with the merge window, but for a few
releases you then seemed to "get it" and got on with the program.
But now it's back to "anything goes", apparently. And I'm going to take a
hard-line approach with you now.
For SCSI merges, if I don't get the first pull request in the FIRST week
of the merge window, don't bother sending one later, unless it's pure
fixes and regressions.
And after -rc1, I don't want to see crap like this:
46 files changed, 2837 insertions(+), 2050 deletions(-)
because that simply is *not* appropriate after -rc1, much less -rc2.
So I pulled, but I wanted to make it very clear that I'm very unhappy with
you right now, and you're on my shit-list for the next few releases. Get
the changes in before -rc1, or just *wait*. If they aren't ready before
the merge window opens, they simply shouldn't be merged at all.
Linus
From: James Bottomley [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Mon, 06 Aug 2007 22:55:41 -0500
On Mon, 2007-08-06 at 17:51 -0700, Linus Torvalds wrote:
>
> On Sat, 4 Aug 2007, James Bottomley wrote:
> >
> > This is mainly bug fixes ... there's one or two features completions
> > that have been delayed pending ack and review to do with bsg (headers
> > and passthrough) but these are really required to complete already
> > upstream code.
>
> James, this is the last time *ever* I apply patches from you after -rc1.
>
> You used to have serious problems with the merge window, but for a few
> releases you then seemed to "get it" and got on with the program.
>
> But now it's back to "anything goes", apparently. And I'm going to take a
> hard-line approach with you now.
>
> For SCSI merges, if I don't get the first pull request in the FIRST week
> of the merge window, don't bother sending one later, unless it's pure
> fixes and regressions.
Confused ... you did get the first pull request in the first week. That
was this:
Subject: [GIT PATCH] first SCSI merge for 2.6.22
Date: Sun, 15 Jul 2007 10:24:17 -0500
190 files changed, 21725 insertions(+), 26337 deletions(-)
Then there was the last piece before the merge window closed:
Subject: [GIT PATCH] final piece of the SCSI merge for 2.6.22
Date: Sun, 22 Jul 2007 13:28:53 -0500
74 files changed, 3649 insertions(+), 1295 deletions(-)
> And after -rc1, I don't want to see crap like this:
>
> 46 files changed, 2837 insertions(+), 2050 deletions(-)
>
> because that simply is *not* appropriate after -rc1, much less -rc2.
>
> So I pulled, but I wanted to make it very clear that I'm very unhappy with
> you right now, and you're on my shit-list for the next few releases. Get
> the changes in before -rc1, or just *wait*. If they aren't ready before
> the merge window opens, they simply shouldn't be merged at all.
OK ... that's arguable. This one is larger than I like because of the
lpfc bug fix patch ... I accept I need to do a better job getting these
into the merge window via the scsi-misc tree. So I will accept the "too
big" criticism and try to manage the driver maintainers better.
However, I won't accept the "not bug fixes only" criticism at -rc1. The
problem is that we're trying to stabilise a new feature: bsg.
Unfortunately, the closure of the merge window was really the first time
anyone got to play with all of these features together. The non-bug fix
changes around bsg have been trying to achieve stability. The problem
is that there were a few fairly problematic pieces: dependence on
non-modular SCSI; SG header layout and driver implementation. What we
really don't want is to have a problematic API baked in stone because we
can't do anything other than bug fix updates once the merge window
closes.
The real root cause of all of this is that there's no tree I can
persuade all the interested parties to test that includes all of these
features. In spite of the fact they've all been incubating in -mm for
at least 3 months, no-one apparently tested all the features together
until 2.6.23-rc1 was released, so then we're scrambling to address the
issues as they arise.
I really, *really* think we need a pre-release tree that consists of all
the upstream targetted features (i.e. all of the for the next merge
window git trees) and nothing else. -mm doesn't really satisfy this,
because it has so much other stuff that the people I need to get testing
this don't trust it. The lack of a tree like this that we could have
persuaded people to test for the last month is what's causing us to
scramble like this at the closure of the merge window.
James
From: Linus Torvalds [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Mon, 6 Aug 2007 21:01:46 -0700 (PDT)
On Mon, 6 Aug 2007, James Bottomley wrote:
>
> Confused ... you did get the first pull request in the first week.
Here's the problem. Let me repeat it again:
> > And after -rc1, I don't want to see crap like this:
> >
> > 46 files changed, 2837 insertions(+), 2050 deletions(-)
It DOES NOT MATTER if I get a first pull request in the first week, if
that pull request is purely cosmetic, and is followed by stuff that
*should* have been in the merge window four weeks afterwards.
> OK ... that's arguable.
There's nothing arguable at all about it.
If you have 5000 lines of changes, that's not a "bugfix" any more. That's
a big damn change, and it should have happened in the merge window. Or if
it doesn't make it in time, in the *next* merge window.
Linus
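The disagreement above turns on the diffstat summary line. As a rough editorial illustration (not something posted in the thread), the total churn of a pull request can be computed from such a line. The function name `churn` and the regular expression are assumptions made for this sketch, and the 5000-line figure Linus mentions is a round number from his reply, not a formal threshold.

```python
import re

# Matches a diffstat summary line such as
# "46 files changed, 2837 insertions(+), 2050 deletions(-)".
# The insertion and deletion parts are optional, since either may be absent.
SUMMARY_RE = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)

def churn(summary: str) -> int:
    """Return insertions + deletions from a diffstat summary line."""
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError(f"not a diffstat summary: {summary!r}")
    insertions = int(match.group(2) or 0)
    deletions = int(match.group(3) or 0)
    return insertions + deletions

if __name__ == "__main__":
    line = "46 files changed, 2837 insertions(+), 2050 deletions(-)"
    print(churn(line))  # 2837 + 2050 = 4887
```

For the pull request under discussion this gives 4887 changed lines, which is the "5000 lines of changes" Linus refers to.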
From: James Smart [email blocked]
To: Linus Torvalds [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 09:12:21 -0400
In defense of my maintainer, who was working on my behalf! ...
The lpfc mods were the bulk of the +/- counts. We batch our bug fixes
together and then push to James as a large lump. Unfortunately, we had
a change that changed logging from a base object to a subobject. Although
not risky, it did account for a lot of +/- changes. The way we pushed
to James, did not allow for him to easily segment one set of changes
from the other. Emulex will change this behavior, hopefully making this
easier on James to keep you happy.
However, I take issue with looking at line counts as the sole basis
for what's appropriate or not. It can be argued that some bug fixes may be
larger in scope than others, or patch batching so that the bug fix count is
higher will skew this perception. I also believe that more "lesser" bugfixes
should be allowed in an earlier -rc? than later, so a hard-and-fast rule for
line counts seem odd. Also - what's a bug fix ? There are many things
which are not "features" but are necessities for diagnosis or support of the
larger change. Some of these you simply don't find in time to make sure they
are in place for the -rc1 merge. Do you hold off on them, or do you make a
choice based risk/reward based on where the -rc is ? I vote for the latter.
I realize that the Linux kernel is such a beast overall that you must have
some simple guidelines, but basing it solely on numbers is a very bad pitfall.
-- james s
From: Jeff Garzik [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 12:13:36 -0400
James Smart wrote:
> However, I take issue with looking at line counts as the sole basis
> for what's appropriate or not. It can be argued that some bug fixes may be
> larger in scope than others, or patch batching so that the bug fix count is
> higher will skew this perception. I also believe that more "lesser" bugfixes
> should be allowed in an earlier -rc? than later, so a hard-and-fast rule for
> line counts seem odd. Also - what's a bug fix ? There are many things
> which are not "features" but are necessities for diagnosis or support of the
> larger change. Some of these you simply don't find in time to make sure they
> are in place for the -rc1 merge. Do you hold off on them, or do you make a
> choice based risk/reward based on where the -rc is ? I vote for the latter.
> I realize that the Linux kernel is such a beast overall that you must have
> some simple guidelines, but basing it solely on numbers is a very bad pitfall.
It's straightforward engineering math: the more LOC that changed, the
more important it is to /not/ stuff it into a stabilization release,
because of the greater potential for breaking stuff and negating all the
existing testing so far.
Once -rc1 is out there, that means the focus should be on stabilizing
the existing codebase. Pushing a big driver update means that effort
must restart from scratch. We just don't want to go down that road,
which a big reason for the merge window in general.
If you miss the merge window, tough cookies :) You gotta deal with it
just like I do, and everyone else does.
Remember -- the more disciplined we all are with the merge window, the
more likely it is that a release can be stabilized quickly, and thus,
the more quickly we will reach the next merge window.
In contrast, increasing violations of the merge window mean increasing
time between releases.
Jeff
From: Andrew Morton [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 7 Aug 2007 00:14:29 -0700
On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley wrote:
> The real root cause of all of this is that there's no tree I can
> persuade all the interested parties to test that includes all of these
> features. In spite of the fact they've all been incubating in -mm for
> at least 3 months, no-one apparently tested all the features together
> until 2.6.23-rc1 was released, so then we're scrambling to address the
> issues as they arise.
I pulled git-scsi-misc on July 19 and there was no bsg code in there at
all. I pulled again on July 20 and all the bsg code was in mainline. So
it appears that the bsg code went mailing-list -> mainline in less than 24
hours, so there wasn't a lot of opportunity for -mm testing there.
A lot of the stupid it-doesn't-compile stuff would have been fixed in -mm,
but more substantial problems might not have been picked up. But one can
say that about anything.
> I really, *really* think we need a pre-release tree that consists of all
> the upstream targetted features (i.e. all of the for the next merge
> window git trees) and nothing else.
That *is* -mm. The vast majority of -mm is the 75-odd subsystem trees.
What you're suggesting amounts to omitting some of those trees for test
purposes (I think). If so, which ones?
Now it could be argued that subsystem maintainers should run two trees in
the last 2.6.x-rcN phase: one tree for 2.6.x+1 and one tree for 2.6.x+2.
Then someone could pull all that together as the "Linus tree in a month,
minus insufficiently baked stuff" tree. But frankly, I don't expect that
people will want to do that, nor will they be able to do it reliably.
Plus, an *amazing* amount of stuff turns up in the git trees which was
committed just a few days prior to the merge window opening, or even after
it opening. eg, bsg which was, afaict, first committed to the scsi tree
eleven days after the 2.6.22 release.
> -mm doesn't really satisfy this,
> because it has so much other stuff that the people I need to get testing
> this don't trust it.
Right. 75-odd developers need to stop committing bugs to their devel
trees. Interesting project ;)
> The lack of a tree like this that we could have
> persuaded people to test for the last month is what's causing us to
> scramble like this at the closure of the merge window.
Nope. The scramble is caused by subsystem maintainers jamming stuff into
mainline at the last minute so they don't have to sit on it for the next
two months.
Look. If we're serious about this then the rule needs to be something like
If it wasn't committed to your tree *at least* two weeks prior to the
2.6.x merge window opening, it shouldn't go into 2.6.x.
People are not presently observing this sort of discipline by a metric
mile. And I'm not sure that we should, really.
I don't think it's terribly bad to whack half-baked things (bsg ;)) into
mainline during the merge window, as long as a) we're sure that we want the
feature in Linux and b) we're confident that we can get it fixed up within
a couple of months. Two months is a long time.
But that's just me, and it is not the approach which Linus wants taken.
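Andrew's proposed rule (committed to the subsystem tree at least two weeks before the merge window opens) is simple date arithmetic. The sketch below is an editorial illustration only: the function name `meets_lead_time` is hypothetical, and the merge window is assumed to open on the day of the 2.6.22 release, taken here to be 2007-07-08.

```python
from datetime import date, timedelta

# Lead time from the proposed rule: "at least two weeks".
MIN_LEAD = timedelta(weeks=2)

def meets_lead_time(committed: date, window_opens: date) -> bool:
    """True if the commit reached the subsystem tree early enough."""
    return window_opens - committed >= MIN_LEAD

if __name__ == "__main__":
    # Assumption: the window opens on the 2.6.22 release day (2007-07-08).
    window_opens = date(2007, 7, 8)
    # Per the thread, bsg reached the scsi tree eleven days after the release.
    bsg_committed = window_opens + timedelta(days=11)
    print(meets_lead_time(bsg_committed, window_opens))      # False
    print(meets_lead_time(date(2007, 6, 20), window_opens))  # True (18 days early)
```

Under these assumptions, a commit that lands after the window has already opened, as Andrew says bsg did, fails the check by definition.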
From: Jeff Garzik [email blocked]
To: Andrew Morton [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 11:24:47 -0400
Andrew Morton wrote:
> On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley wrote:
>> I really, *really* think we need a pre-release tree that consists of all
>> the upstream targetted features (i.e. all of the for the next merge
>> window git trees) and nothing else.
>
> That *is* -mm. The vast majority of -mm is the 75-odd subsystem trees.
Not quite.
-mm is git trees plus an amazing amount of random patches that turned up
on LKML, a lot of which is not destined for kernel release 2.6.(X+1) or
2.6.(X+2).
> Plus, an *amazing* amount of stuff turns up in the git trees which was
> committed just a few days prior to the merge window opening, or even after
> it opening.
Yes :( That's a tough problem to solve, too.
Deadlines always motivate people, and so -- as in almost every other
software project I've worked with -- everybody seems to submit their
work on the day of the deadline.
Realistically, for the merge window to work perfectly, each step down
the maintainership ladder needs to have time to review and integrate the
changes destined for that merge window. Ideally, people would do all
this work beforehand, so that each step up the ladder has time prior to
merge window for review and testing.
But that's just not software engineers as we know them ;-)
>> The lack of a tree like this that we could have
>> persuaded people to test for the last month is what's causing us to
>> scramble like this at the closure of the merge window.
>
> Nope. The scramble is caused by subsystem maintainers jamming stuff into
> mainline at the last minute so they don't have to sit on it for the next
> two months.
Indeed. Particularly in this case, where bsg didn't really grace -mm at
all.
> Look. If we're serious about this then the rule needs to be something like
>
> If it wasn't committed to your tree *at least* two weeks prior to the
> 2.6.x merge window opening, it shouldn't go into 2.6.x.
>
> People are not presently observing this sort of discipline by a metric
> mile. And I'm not sure that we should, really.
My goal AT A MINIMUM with netdev and libata is to get stuff in at least
one -mm release prior to merge window opening (though preferably a
longer lead time than that). Of course, reality intrudes, but that's my
goal.
And I think it's a reasonable goal to push upon others (but I'm biased:))
Jeff
From: Jeff Garzik [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 12:06:49 -0400
James Bottomley wrote:
> OK ... that's arguable. This one is larger than I like because of the
> lpfc bug fix patch ... I accept I need to do a better job getting these
> into the merge window via the scsi-misc tree. So I will accept the "too
> big" criticism and try to manage the driver maintainers better.
>
> However, I won't accept the "not bug fixes only" criticism at -rc1. The
> problem is that we're trying to stabilise a new feature: bsg.
Just so we don't lose the forest for the trees...
Not trying to put words in Linus's mouth, but it seems to me he wasn't
complaining specifically about bsg. "style cleanups", "cosmetic
cleanups", ancient ISA driver polishing (1542, my gdth patch) are
definitely not "bug fix only" material.
The lpfc update was probably the biggest thing, LOC-wise. And even
though that was mostly bug fixes -- and notably NOT 100% fixes -- it is
big enough to warrant integration testing and exposure prior to
mainline. Definitely merge-window-open material AFAICS.
Jeff
From: James Smart [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 12:27:35 -0400
Jeff Garzik wrote:
> The lpfc update was probably the biggest thing, LOC-wise. And even
> though that was mostly bug fixes -- and notably NOT 100% fixes -- it is
> big enough to warrant integration testing and exposure prior to
> mainline. Definitely merge-window-open material AFAICS.
FYI - it is integrated and tested prior to mainline, by Emulex (and who
else *really* tests it close to the degree we do ?). We do so, as a whole,
weeks ahead of the submit to the maintainer. Usually, there's only a couple
of small api changes that are picked up when we merge into the maintainers
pool. And most of these are caught by us prior anyway as we package the
patchsets and ensure the integration into the maintainers pool is smooth.
-- james s
From: Jeff Garzik [email blocked]
Subject: Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2
Date: Tue, 07 Aug 2007 12:34:54 -0400
James Smart wrote:
> Jeff Garzik wrote:
>> The lpfc update was probably the biggest thing, LOC-wise. And even
>> though that was mostly bug fixes -- and notably NOT 100% fixes -- it
>> is big enough to warrant integration testing and exposure prior to
>> mainline. Definitely merge-window-open material AFAICS.
>
> FYI - it is integrated and tested prior to mainline, by Emulex (and who
> else *really* tests it close to the degree we do ?). We do so, as a whole,
> weeks ahead of the submit to the maintainer. Usually, there's only a couple
> of small api changes that are picked up when we merge into the maintainers
> pool. And most of these are caught by us prior anyway as we package the
> patchsets and ensure the integration into the maintainers pool is smooth.
This is a highly common pattern, and unfortunately you get the highly
common Linux response:
In Linux we never ever assume a driver is working simply because the
hardware vendor tested it. A decade of real world experience PROVES
precisely the opposite -- getting code out into the world early and
often repeatedly turned up problems not seen in hardware vendor's testing.
Take a lesson from when I was on Linus's shit-list... twice: Twice,
Intel submitted an e1000 update after the merge window closed. Twice,
they claimed the driver passed their quite-exhaustive internal testing.
And twice, the most popular network driver broke for large masses of
users because I took a hardware vendor's word on testing rather than
rely on the testing PROVEN to flush out problems: public linux kernel
testing.
I'm not singling out Intel, there are plenty of other hardware vendors
that repeat the exact same pattern.
It's quite simply impossible for a hardware vendor to test all the weird
combinations in the field. Our test lab -- the Internet -- is the one
we trust.
Jeff
The most popular network driver?
Is e1000 really the most popular network driver? Of the more than a hundred computers here, I can remember only two or three with e1000. For servers, from my point of view, the most widely used seems to be tg3, not e1000.
-Yenya
pretty popular
The tg3 was pretty popular (do you have a lot of older Dells?), but the e1000 is definitely a better chipset (and thus more likely to be added on; most of the add-in NICs I've seen and used are Intels). And the e1000 was (and is) embedded in a lot of Intel motherboards. At any rate, systems that ship with a Broadcom chipset are far more likely to use the bnx2 driver than the tg3 driver these days.
The fact that Intel has kept a stable interface for so long is an advantage not to be underestimated, nor is their commitment to maintaining their driver. I've got some older SysKonnect cards (which actually outperform the Intels on small packets) that just don't work anymore due to driver rot. (The older sk98lin driver is essentially unmaintained and doesn't integrate well in 2.6 kernels, and the new skge driver is focused on newer SysKonnect embedded chipsets and has issues with the older cards.) Similarly, I've had major issues with the tg3 driver on older embedded NICs (slow, high-latency performance). Intel just does a better job of supporting their stuff, and they've been making cards that use that driver for what, a decade now? I wouldn't be surprised if it was the most popular NIC out there on Linux systems.