Re: [PATCH] I/O space boot parameter

Previous thread: [PATCH] - Altix: ioremap vga_console_iobase by John Keller on Tuesday, March 20, 2007 - 11:50 am. (1 message)

Next thread: EXT3 problem in 2.6.21-rc4 by Rafał Bilski on Tuesday, March 20, 2007 - 11:57 am. (2 messages)
From: Daniel Yeisley
Date: Tuesday, March 20, 2007 - 9:18 am

It has been mentioned before that large systems with a lot of PCI buses
have issues with the 64k I/O space limit.  The ES7000 has a BIOS option
to either assign I/O space to all adapters, or only to those that need
it.  A list of supported adapters that don't need it is kept in the
BIOS.  When this option is used, the kernel sees the BARs on the
adapters and still tries to assign I/O space (until it runs out).  I've
written a patch to implement a boot parameter that tells the kernel not
to assign I/O space if the BIOS hasn't.  

Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
---

diff -Naur linux-2.6.20-org/Documentation/kernel-parameters.txt linux-2.6.20-new/Documentation/kernel-parameters.txt
--- linux-2.6.20-org/Documentation/kernel-parameters.txt	2007-02-04 13:44:54.000000000 -0500
+++ linux-2.6.20-new/Documentation/kernel-parameters.txt	2007-03-05 21:35:15.000000000 -0500
@@ -1259,6 +1259,7 @@
 				This sorting is done to get a device
 				order compatible with older (<= 2.4) kernels.
 		nobfsort	Don't sort PCI devices into breadth-first order.
+		noiospace	Do not allocate I/O space unless the BIOS has done so.
 
 	pcmv=		[HW,PCMCIA] BadgePAD 4
 
diff -Naur linux-2.6.20-org/drivers/pci/pci.c linux-2.6.20-new/drivers/pci/pci.c
--- linux-2.6.20-org/drivers/pci/pci.c	2007-02-04 13:44:54.000000000 -0500
+++ linux-2.6.20-new/drivers/pci/pci.c	2007-03-06 00:58:52.000000000 -0500
@@ -20,6 +20,7 @@
 #include "pci.h"
 
 unsigned int pci_pm_d3_delay = 10;
+unsigned int noiospace = 0;
 
 /**
  * pci_bus_max_busnr - returns maximum PCI bus number of given bus' children
@@ -1168,6 +1169,8 @@
 		if (*str && (str = pcibios_setup(str)) && *str) {
 			if (!strcmp(str, "nomsi")) {
 				pci_no_msi();
+			} else if (!strcmp(str, "noiospace")) {
+				noiospace = 1;
 			} else {
 				printk(KERN_ERR "PCI: Unknown option `%s'\n",
 						str);
diff -Naur linux-2.6.20-org/drivers/pci/pci.h linux-2.6.20-new/drivers/pci/pci.h
--- linux-2.6.20-org/drivers/pci/pci.h	2007-02-04 ...
From: Greg KH
Date: Tuesday, March 20, 2007 - 11:00 am

How prevalent are machines like this?  And why are the BARs on these

pci_no_iospace perhaps?  "noiospace" isn't the best named global
variable...

thanks,

greg k-h
-

From: Daniel Yeisley
Date: Tuesday, March 20, 2007 - 10:25 am

I don't have any sales numbers, but I can tell you that our current
systems can have up to 64 PCI buses.  

I've been working with Emulex cards, and my understanding is that the
BARs on the devices aren't wrong, but we can't allocate 4k of I/O space
for each one.  So we maintain a list in the BIOS of devices that don't
actually need I/O space and then don't assign it.  I've tested on an
x86_64 system with 20+ adapters and saw all the disks attached without
any problems.

I've updated the patch with the suggested variable name change.

Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
---

diff -Naur linux-2.6.20-org/Documentation/kernel-parameters.txt linux-2.6.20-new/Documentation/kernel-parameters.txt
--- linux-2.6.20-org/Documentation/kernel-parameters.txt	2007-02-04 13:44:54.000000000 -0500
+++ linux-2.6.20-new/Documentation/kernel-parameters.txt	2007-03-05 21:35:15.000000000 -0500
@@ -1259,6 +1259,7 @@
 				This sorting is done to get a device
 				order compatible with older (<= 2.4) kernels.
 		nobfsort	Don't sort PCI devices into breadth-first order.
+		noiospace	Do not allocate I/O space unless the BIOS has done so.
 
 	pcmv=		[HW,PCMCIA] BadgePAD 4
 
diff -Naur linux-2.6.20-org/drivers/pci/pci.c linux-2.6.20-new/drivers/pci/pci.c
--- linux-2.6.20-org/drivers/pci/pci.c	2007-02-04 13:44:54.000000000 -0500
+++ linux-2.6.20-new/drivers/pci/pci.c	2007-03-20 13:38:43.000000000 -0400
@@ -20,6 +20,7 @@
 #include "pci.h"
 
 unsigned int pci_pm_d3_delay = 10;
+unsigned int pci_no_iospace = 0;
 
 /**
  * pci_bus_max_busnr - returns maximum PCI bus number of given bus' children
@@ -1168,6 +1169,8 @@
 		if (*str && (str = pcibios_setup(str)) && *str) {
 			if (!strcmp(str, "nomsi")) {
 				pci_no_msi();
+			} else if (!strcmp(str, "noiospace")) {
+				pci_no_iospace = 1;
 			} else {
 				printk(KERN_ERR "PCI: Unknown option `%s'\n",
 						str);
diff -Naur linux-2.6.20-org/drivers/pci/pci.h linux-2.6.20-new/drivers/pci/pci.h
--- ...
From: Greg KH
Date: Tuesday, March 20, 2007 - 1:26 pm

Ah.  Others are working on providing a fix for this too, but it is being
done in the drivers themselves, not in the pci core.  Look in the
linux-pci mailing list archives for those patches (I don't think they
ever went into mainline for some reason, but I might be wrong...)

I suggest you work with those developers, as they have the same issue
that you are trying to solve here.

thanks,

greg k-h
-

From: Daniel Yeisley
Date: Wednesday, March 21, 2007 - 6:37 am

I have seen some patches that make the drivers I/O-port-free here:
http://lkml.org/lkml/2006/2/26/261

I checked and they still aren't in the mainline.  

I don't know that it matters though because I see all the disks attached
to the system regardless of whether or not the adapters get I/O space.
The real issue I have is with all the error messages I get at boot.  I
see 40+ messages that say "PCI: Failed to allocate I/O
resource..." (from setup-res.c) when the kernel tries to allocate the
I/O space and can't.  The modules load fine.  I see all the disks just
fine.  But that many error messages tend to raise concerns and cause
support calls from customers.

Dan

-

From: Greg KH
Date: Wednesday, March 21, 2007 - 4:57 pm

If this isn't an issue for functionality, why not fix your BIOS then?

And doesn't the above linked patch set also solve your issue with the
noise in the syslog?

thanks,

greg k-h
-

From: Daniel Yeisley
Date: Thursday, March 22, 2007 - 8:08 am

I'm not sure what there is to fix in the BIOS.  It can't assign more

I did try the patches and the modules load and don't request I/O space,
although I only see 17 of 29 disks (I'll have to look into that more).
I still get the "Failed to allocate I/O..." messages (long before the

-

From: Eric W. Biederman
Date: Tuesday, March 20, 2007 - 1:05 pm

It is a machine scaling issue, and largely caused because we have
only one PCI-IO space on x86.  Since devices are increasingly moving
to memory-mapped I/O, it becomes less of an issue, but there are still legacy
bits and pieces.

The bridges should be able to correctly have no PCI I/O region,
by setting base greater than limit.  Having bridges that can map PCI I/O
but don't have anything behind them is common.

The concept of having devices that have I/O bars that we should not
assign I/O space to I find a little weird.  I guess we can detect
this case by simply looking to see if the bridge maps the address
assigned to the bar.

Ideally the way to handle this case is to note that the BAR is not
valid (but it could be) and not attempt to fix this until the driver
tries to use the BAR.

The approach where we don't allocate a bar if the BIOS doesn't sounds
like a hack, and we still need code in the kernel to detect that the
BAR has an invalid value and that we can't use it.

I have seen so many different APIs for mapping the region behind a
bar that I'm a little fuzzy.  Is my recollection correct that we
have enough flexibility in the current API that we can detect an
invalid address and allocate a valid one if needed?

I do know we have a semi common case that is related to this where
the BIOS does not allocate a resource because it knows we don't
need it, and the kernel being more generic decides to allocate it
just in case.  At which point the kernel reprograms the hardware
to allocate the resource and then we have problems because the
kernel didn't have enough knowledge to reprogram the root complex
(northbridge) correctly.

So if we can delay our fixups to when we really need them the chances
of Linux working on a machine where peculiar things are happening
are greater.

Eric
-
