ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/

- I found some time to look into some writeback problems in
  fs/fs-writeback.c. The results were ugly. There are a pile of fixes here
  but more work (mainly testing) needs to be done.

  There's some new debug code in there which could be very expensive if
  there are a lot of dirty inodes in the machine (quadratic behaviour). If
  the machine seems to be affected by this, the debugging may be disabled
  with

	echo 0 > /proc/sys/fs/inode_debug

- Added an i386 early-startup development tree, as git-newsetup.patch
  ("H. Peter Anvin" <hpa@zytor.com>)

- Brought back git-sas.patch (Darrick J. Wong <djwong@us.ibm.com>). It got
  lost quite some time ago.

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

	echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

	http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug. Some
  developers ...
On Tue, 15 May 2007 20:19:14 -0700

If CONFIG_SCSI=y && CONFIG_ATA=n, the build fails with the following errors:
==
ERROR: "ata_sas_slave_configure" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_port_disable" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_port_init" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_port_stop" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_port_start" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_port_alloc" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_noop_qc_prep" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_tf_to_fis" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_noop_dev_select" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_tf_from_fis" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_host_init" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_queuecmd" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_sas_port_destroy" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_scsi_ioctl" [drivers/scsi/libsas/libsas.ko] undefined!
ERROR: "ata_qc_complete" [drivers/scsi/libsas/libsas.ko] undefined!
make[1]: *** [__modpost] Error 1
==

-Kame
-
Yes, but it seems wrong to disable all of libsas if !ATA. Only sas_ata.o should depend on that. Darrick, is there any point in me carrying this tree? It doesn't appear to be a hotbed of activity... -
Nope. I haven't worked on those bits of code in quite a while, since a number of scsi/libata reorganizations were discussed at the storage summit that would make a fair amount of the sas_ata code unnecessary (or candidates for reworking). --D
On Tue, 15 May 2007 20:19:14 -0700,
Doesn't build on s390 when selecting the md menu:
drivers/built-in.o(.text+0x4423e): more undefined references to
`dma_map_page' follow
This is caused by the following in drivers/md/Kconfig:
menuconfig MD
bool "Multiple devices driver support (RAID and LVM)"
depends on BLOCK
select ASYNC_TX_DMA
help
Support multiple physical spindles through a single logical device.
Required for RAID and logical volume management.
ASYNC_TX_DMA is defined in drivers/dma/Kconfig, which has
menu "DMA Engine support"
depends on !S390
but unfortunately ASYNC_TX_DMA depends neither on the menu nor
on !S390. (I think it was just an unknown symbol on s390 before
Martin's Kconfig rework, so I could build older -mm kernels.)
Currently, the only md stuff depending on ASYNC_TX_DMA is MD_RAID456
(which means it doesn't work on s390 anymore, which is bad enough).
With the select statement, no md stuff can be built on s390 at all (and
I really don't see why ASYNC_TX_DMA should be forced upon all md
users)...
-
The rationale for the 'select' here was to attempt to prevent user
I agree it should not be forced on all users, I will push the following
change:
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 4a1b77e..fd29a54 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -8,7 +8,6 @@ menu "Multi-device support (RAID and LVM)"
config MD
bool "Multiple devices driver support (RAID and LVM)"
- select ASYNC_TX_DMA
help
Support multiple physical spindles through a single logical
device.
Required for RAID and logical volume management.
@@ -109,7 +108,8 @@ config MD_RAID10
config MD_RAID456
tristate "RAID-4/RAID-5/RAID-6 mode"
- depends on BLK_DEV_MD && ASYNC_TX_DMA
+ depends on BLK_DEV_MD
+ select ASYNC_TX_DMA
---help---
A RAID-5 set of N drives with a capacity of C MB per drive
provides
the capacity of C * (N - 1) MB, and protects against a failure
However this still will not allow s390 to build MD_RAID456. This
dependency is in place because the xor.o object has moved from
drivers/md to drivers/dma. The goal of the interface is to support
using offload engines when they are present, and use software routines
(like xor_block) when engines are not available. In other words, the
intent is that DMA_ENGINE=n && ASYNC_TX_DMA=y is a valid configuration.
Can we rework the !S390 change to the DMA_ENGINE menu? It seems to me
that S390 should follow the ARM example and only enable the driver menus
they want in arch/s390/Kconfig, no?
...
On a closer look, it seems async_tx should be its own directory like
crypto... I'll post the incremental changes to the md-accel git tree
for review.
Dan
-
Getting this on both x86 and x86_64 boxes; they are the older boxen, so
likely older compilers:

  CC      arch/x86_64/boot/memory.o
arch/i386/boot/memory.c: In function `detect_memory':
arch/i386/boot/memory.c:32: error: can't find a register in class `DREG' while reloading `asm'

Seems to come from git-newsetup, but that tree isn't pulled into your
git version of -mm so I can't be more specific.

-apw
-
With the patch, elm3b6 from test.kernel.org builds and boots. It's x86_64.
elm3b132 which is x86 fails with

  CC      arch/i386/boot/video-bios.o
  HOSTCC  arch/i386/boot/tools/build
  AS      arch/i386/boot/compressed/head.o
  CC      arch/i386/boot/compressed/misc.o
  OBJCOPY arch/i386/boot/compressed/vmlinux.bin
  LD      arch/i386/boot/setup.elf
ld:arch/i386/boot/setup.ld:47: syntax error
make[1]: *** [arch/i386/boot/setup.elf] Error 1
make[1]: *** Waiting for unfinished jobs....
  GZIP    arch/i386/boot/compressed/vmlinux.bin.gz
include/asm/processor.h: In function `native_get_debugreg':
include/asm/processor.h:531: warning: asm operand 0 probably doesn't match constraints
include/asm/processor.h: In function `native_set_debugreg':
include/asm/processor.h:558: warning: asm operand 0 probably doesn't match constraints
  LD      arch/i386/boot/compressed/piggy.o
  LD      arch/i386/boot/compressed/vmlinux
make: *** [bzImage] Error 2
05/16/07-17:27:44 Build the kernel. Failed rc = 2
05/16/07-17:27:44 build: kernel build Failed rc = 1
Failed and terminated the run

I haven't checked yet if that has anything to do with git-newsetup or not.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ... as well as binutils version number (it appears that your version of -hpa -
Sorry, had to wait for the machine to come idle:

elm3b132:~# gcc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.4 (Debian 1:3.3.4-3)
elm3b132:~# dpkg -l | grep binutil
ii  binutils       2.14.90.0.7-8  The GNU assembler, linker and binary utiliti

-apw
-
On Wed, 16 May 2007 18:40:47 +0100
ASSERT(_end <= 0x8000, "Setup too big!")
static inline unsigned long native_get_debugreg(int regno)
{
	unsigned long val = 0;	/* Damn you, gcc! */

	switch (regno) {
	case 0:
		asm("movl %%db0, %0" :"=r" (val)); break;
	case 1:
		asm("movl %%db1, %0" :"=r" (val)); break;
	case 2:
		asm("movl %%db2, %0" :"=r" (val)); break;
	case 3:
		asm("movl %%db3, %0" :"=r" (val)); break;
	case 6:
		asm("movl %%db6, %0" :"=r" (val)); break;
	case 7:
		asm("movl %%db7, %0" :"=r" (val)); break;
	default:
-->		BUG();
	}
	return val;
}
weird.
There are no significant changes in processor.h relative to 2.6.22-rc1.
If the file-n-line aren't screwed up, it's disliking
#define BUG() \
do { \
asm volatile("1:\tud2\n" \
".pushsection __bug_table,\"a\"\n" \
"2:\t.long 1b, %c0\n" \
"\t.word %c1, 0\n" \
"\t.org 2b+%c2\n" \
".popsection" \
: : "i" (__FILE__), "i" (__LINE__), \
"i" (sizeof(struct bug_entry))); \
for(;;) ; \
} while(0)
It built and ran 2.6.22-rc1-git4 happily.
-
With these two patches in combination, previously failing machines elm3b6
(x86_64 on test.kernel.org) and a modern x86 built a kernel and booted
correctly. elm3b132 and elm3b133 (x86 numaq on test.kernel.org) built with
these patches but silently fail on boot with no output via earlyprintk.
According to test.kernel.org, this failure occurs with git-newsetup
reverted, so it is a separate problem.

Thanks

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
Ok, I've been following up on this failure on elm3b132/3. I moved forward
to v2.6.22-rc2-mm1 and that also fails. I ran a bisection on the
git-newsetup patch as in -mm and basically it came down to the first patch,
i.e. any and all of this tree stops the boot.

I just tried reproducing git-newsetup boot failures with the latest version
of your tree (369f16fdd423d79640c4390915e6ab71189cb497) and that also fails.
"Fails" in this context means a hard boot failure after loading the kernel
and before anything is printed. I also added a printf to the top of main()
in boot/main.c and it doesn't come out, not that I really know if that means
it got there or not.

Any suggestions how to debug this puppy?

-apw
-
Thanks to Peter for all his encouragement off list. I cannot claim to have
sorted this one out; I do however understand why my experiences and Mel's
did not seem consistent. Basically I am getting inconsistent results with
different machines.

I started my debug on a machine where 2.6.22-rc2 worked and
2.6.22-rc2+newsetup did not. I debugged the latter and managed to prove
that it was in fact getting all the way to the kernel decompressor, and then
crashing hard. The gzip image in memory was intact and yet it did not
decompress correctly: roughly the first 60% of the output was intact, the
remainder was damaged. Suspecting that this was an "uncompress in place"
overlap problem, I moved the compressed kernel way up out of the way and
this then booted successfully. Experimenting, I was able to get it to boot
by increasing the overlap 'gap' from 32KB to 64KB. I was able to use the
same patch to boot 2.6.22-rc2-mm1 on the same problem machines.

However, this same overlap change did not fix another similar machine (the
one in the TKO grid). I think that my debugging says that newsetup got the
compressed kernel and decompressor into memory ok and execution passed to it
normally. But I cannot figure out where the corruption is coming from. I
tried annotating the gzip decompressor to see if the input and output
buffers were overlapping at any time and that debug said no (unsure how
reliable that is). And yet at some point the output image is munched up.

One last piece of information. The decompressor also always seems to get to
the end of the input stream in exactly the right place without reporting any
kind of error, that is with exactly 8 bytes left over for the length and crc
checks. Given the context-sensitive nature of the algorithm, that tends to
imply the input stream was ok for the whole duration of the decompress. Yet
the output stream is badly broken.

Anyone got any wacky suggestions ...

-apw
-
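[Editorial sketch] The "are the input and output buffers overlapping" check
described above can be illustrated in user-space C. This is only a sketch
under stated assumptions: `struct region`, `regions_overlap()` and
`output_clobbers_input()` are hypothetical names, not the annotation that was
actually added to the decompressor, and addresses are assumed not to wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open byte range [start, start + len). */
struct region {
	uintptr_t start;
	size_t len;
};

/* True if the two ranges share at least one byte (no address wrap assumed). */
static bool regions_overlap(struct region a, struct region b)
{
	if (a.len == 0 || b.len == 0)
		return false;
	return a.start < b.start + b.len && b.start < a.start + a.len;
}

/*
 * In-place decompression is safe only while the output written so far
 * has not reached input bytes that are still unread.
 */
static bool output_clobbers_input(uintptr_t out_start, size_t out_written,
				  uintptr_t in_start, size_t in_len,
				  size_t in_consumed)
{
	struct region written, unread;

	assert(in_consumed <= in_len);

	written.start = out_start;
	written.len = out_written;
	unread.start = in_start + in_consumed;
	unread.len = in_len - in_consumed;

	return regions_overlap(written, unread);
}
```

Calling such a check after every output flush would catch the write pointer
overtaking the read pointer; it would not catch a third party scribbling on
either buffer, which is the case the rest of this thread ends up suspecting.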
It definitely sounds like a memory clobber of some sort. Usual suspects, in addition to the input/output buffers you already looked at, would be the heap and the stack. Finding where the stack pointer lives would be my first, instinctive guess. -hpa -
The stack seems to be where it should be and seems to stay pretty much in the same place, as it should. The checks I added for the heap show that it also stays within bounds. I've tried making the stack and the heap 64k, to no effect. Moving the kernel to other places in memory seems to kill the decode completely during gunzip(), which may be a hint; I am not sure. This thing is trying to ruin my mind. -apw -
Yours and mine both. Seems like *something* is clobbering memory, but what and why is a mystery. The fact that putting the kernel in a higher point in memory is a good indication that this clobber is at a relatively high address. How much RAM does this machine have? -hpa -
This is a 12GB machine with 3 NUMA nodes. I checked out the location of the IDT and GDT and both seem sane, in the 9xxxx range below the kernel destination. I also note that on another machine of this type (one node only in that case) some of the "did work" cases do not work. Also, when I applied some of my patches on top, "working" cases stopped working. So whatever it is, it is definitely related to the shape of the kernel to be loaded. Very confusing. -apw -
Ok, in fact when the kernel is moved elsewhere in the address space it will
decode properly. There was a check in there for not loading at the right
address which was catching me out ... as errors do not show up as we have no
serial support. Doh.

Once I had gotten this decoding at other addresses I simply tried moving
the base address for the kernel elsewhere. I am able to successfully boot
the kernel at 16MB and 256MB. This seems like something outside the decoder
scribbling.

I would not normally recommend moving the base address of the kernel.
However, this problem at least so far has only shown up on the NUMA-Q
platform, which is at best described as a very small volume
sub-architecture. There are areas in which it differs from mainstream BIOS
and we are no longer able to get details of these differences.

We actually have no proof as yet that this is or is not a NUMA-Q specific
problem. For instance these machines tend to run fewer modules and more
builtin stuff than the average due to an owner dislike of modules. So we
could have a lurking kernel size issue or similar.

I am therefore proposing to change the base address for NUMA-Q only (patch
to follow this email), and that we remain aware of the issue and on the
lookout for similar breakage on mainstream x86 platforms. At least with this
patch we can get wider testing on the rest of the kernel.

-apw
-
Observed same problem with gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) and binutils-2.15.92.0.2-15 and the above patch fixes it. Regards, Bharata. -- "Men come and go but mountains remain" -- Ruskin Bond. -
Hi, After installing the new mm1 kernel, my system cannot boot; the rc1 kernel works OK. The cursor just blinks after the "Bios data check successful" message appears. What do you think about this? -
"Bios data check successful" is not a message that comes from Linux, nor from the boot loader. Since you have left absolutely zero details about your system or anything else, there isn't much anyone can do about it. -hpa -
It sounds vaguely similar to the silent failure on elm3b132. I'm still
bisecting this on the side. It's taking an age because the target machine is
so slow and using a faster machine with a different compiler does not
reproduce the problem. I don't think it's git-newsetup that is the problem
though, for what that's worth.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
Hi,

My cpu is Intel(R) Pentium(R) D CPU 2.80GHz; below are the lspci output and
kernel config.

-------lspci-----------
00:00.0 Host bridge: Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 945G/GZ/P/PL Express PCI Express Root Port (rev 02)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) Serial ATA Storage Controller IDE (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc RV380 [Radeon X600 (PCIE)]
01:00.1 Display controller: ATI Technologies Inc RV380 [Radeon X600]
03:08.0 Ethernet controller: Intel Corporation 82801G (ICH7 Family) LAN Controller (rev 01)

------------config-------------
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.22-rc1-mm1
# Fri May 18 15:55:20 ...
Could you please try booting with "vga=ask", and see if you get the video mode selection menu? -hpa -
Hi, I tried the vga option and the selection menu appeared. I then selected 0 (80x25) and nothing happened. -
OK. Could you put printf's in the setup code (especially arch/i386/boot/main.c) to see how far it runs before it dies? -hpa -
I added some debug output to main.c; the result is that the kernel stops in query_edd(). When I then boot with the kernel argument edd=off, the kernel boots happily. I will read edd.c to see what happens. Do you have any suggestions? -
I've got this in dmesg:

BUG: at /local/xslaby/xxx/mm/page-writeback.c:829 __set_page_dirty_nobuffers()
 [<c010531e>] dump_trace+0x1ce/0x200
 [<c010536a>] show_trace_log_lvl+0x1a/0x30
 [<c0106012>] show_trace+0x12/0x20
 [<c0106086>] dump_stack+0x16/0x20
 [<c015566d>] __set_page_dirty_nobuffers+0x11d/0x130
 [<c0155690>] redirty_page_for_writepage+0x10/0x20
 [<c01938fc>] __block_write_full_page+0x20c/0x330
 [<c0193b0a>] block_write_full_page+0xea/0x100
 [<c0196c82>] blkdev_writepage+0x12/0x20
 [<c015539e>] __writepage+0xe/0x30
 [<c01558c2>] write_cache_pages+0x222/0x340
 [<c0155a03>] generic_writepages+0x23/0x30
 [<c0155a3e>] do_writepages+0x2e/0x50
 [<c018decb>] __writeback_single_inode+0x8b/0x470
 [<c018e75b>] generic_sync_sb_inodes+0x24b/0x470
 [<c018e9a7>] sync_sb_inodes+0x27/0x30
 [<c018ec33>] writeback_inodes+0xb3/0xe0
 [<c01560f2>] wb_kupdate+0x82/0xf0
 [<c015660b>] pdflush+0xeb/0x1b0
 [<c0132e72>] kthread+0x42/0x70
 [<c0104d4b>] kernel_thread_helper+0x7/0x1c
 =======================
BUG: at /local/xslaby/xxx/mm/page-writeback.c:829 __set_page_dirty_nobuffers()
 [<c010531e>] dump_trace+0x1ce/0x200
 [<c010536a>] show_trace_log_lvl+0x1a/0x30
 [<c0106012>] show_trace+0x12/0x20
 [<c0106086>] dump_stack+0x16/0x20
 [<c015566d>] __set_page_dirty_nobuffers+0x11d/0x130
 [<f8b1fc5b>] nfs_updatepage+0x7b/0x200 [nfs]
 [<f8b156df>] nfs_commit_write+0x2f/0x50 [nfs]
 [<c0150911>] generic_file_buffered_write+0x2a1/0x660
 [<c0150f52>] __generic_file_aio_write_nolock+0x282/0x520
 [<c0151252>] generic_file_aio_write+0x62/0xd0
 [<f8b15def>] nfs_file_write+0xef/0x1c0 [nfs]
 [<c01715e0>] do_sync_write+0xd0/0x110
 [<c0171e04>] vfs_write+0x94/0x130
 [<c017248d>] sys_write+0x3d/0x70
 [<c01040e8>] syscall_call+0x7/0xb
 [<b7eb7b3e>] 0xb7eb7b3e
 =======================

regards,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 ...
Do you have any messages before this one? Seems like it is probably metadata, This one is NFS, setting the page dirty while it is not uptodate. Trond, is this because NFS keeps track of dirty regions of the page with private data? It might make sense to avoid this warning if PagePrivate is set... would that fix the NFS case? -- SUSE Labs, Novell Inc. -
No other messages before that. Bazillion through-nfs stacks after this... regards, -- http://www.fi.muni.cz/~xslaby/ Jiri Slaby faculty of informatics, masaryk university, brno, cz e-mail: jirislaby gmail com, gpg pubkey fingerprint: B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E -
Does this patch fix NFS? -- SUSE Labs, Novell Inc.
Ah... You put an extra WARN_ON(!PageUptodate(page)). err=-ENOCOFFEE, I missed that... So yes, in order to avoid having to read the page in when we just want to write data, NFS does this kind of tracking. I dunno if your fix to change it to !PagePrivate(page) && !PageUptodate(page) is right though. It will indeed fix the NFS case, but the block system uses PagePrivate() pretty extensively for its own nefarious ends (tracking page buffers). Trond -
I think that's OK: the block layer is similarly happy to mark a !uptodate page dirty if it has buffers, for similar reasons... Anyway, it won't use this particular path when buffers are attached, and I've put similar debugging stuff in the set_page_dirty_buffers part. -- SUSE Labs, Novell Inc. -
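[Editorial sketch] The two warning conditions being compared in this
sub-thread can be modelled in plain C. Nothing below is kernel code:
`struct fake_page`, `warn_strict()` and `warn_relaxed()` are hypothetical
stand-ins for the page flags and for the WARN_ON() conditions quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two page flags under discussion. */
struct fake_page {
	bool uptodate;		/* models PageUptodate(page) */
	bool has_private;	/* models PagePrivate(page) */
};

/* The debug check as added: complain about any !uptodate page being dirtied. */
static bool warn_strict(const struct fake_page *page)
{
	assert(page != NULL);
	return !page->uptodate;
}

/*
 * The proposed relaxation: stay quiet when private data is attached,
 * since the owner may be tracking dirty sub-page regions itself.
 */
static bool warn_relaxed(const struct fake_page *page)
{
	assert(page != NULL);
	return !page->has_private && !page->uptodate;
}
```

A page that is not uptodate but carries private data (the NFS case described
by Trond) trips the strict check and passes the relaxed one; a page with
neither flag set trips both.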
The first Oops is not NFS: it is some block file system, however the problem is the same. The crux of the matter would appear to be that some task is changing the page_mapping() of random pages while the page lock is held by another task. Do you see the same thing in mainline? Trond -
This might be related

[ 97.740021] BUG: at /home/devel/linux-mm/mm/page-writeback.c:829 __set_page_dirty_nobuffers()
[ 97.748632] [<c0105276>] dump_trace+0x63/0x1eb
[ 97.753275] [<c0105418>] show_trace_log_lvl+0x1a/0x30
[ 97.758521] [<c010605a>] show_trace+0x12/0x14
[ 97.763042] [<c01060f7>] dump_stack+0x16/0x18
[ 97.767590] [<c01677b3>] __set_page_dirty_nobuffers+0xfe/0x16e
[ 97.773598] [<c0167833>] redirty_page_for_writepage+0x10/0x12
[ 97.779491] [<c01a473a>] __block_write_full_page+0x1dc/0x335
[ 97.785328] [<c01a495c>] block_write_full_page+0xc9/0xd1
[ 97.790799] [<c01a781a>] blkdev_writepage+0x12/0x14
[ 97.795829] [<c01674ea>] __writepage+0xe/0x29
[ 97.800350] [<c01679b8>] write_cache_pages+0x183/0x29a
[ 97.805683] [<c0167af1>] generic_writepages+0x22/0x2a
[ 97.810929] [<c0167b1c>] do_writepages+0x23/0x34
[ 97.815702] [<c019f0a3>] __writeback_single_inode+0x245/0x472
[ 97.821632] [<c019f7e6>] generic_sync_sb_inodes+0x347/0x4cc
[ 97.827379] [<c019f98b>] sync_sb_inodes+0x20/0x24
[ 97.832247] [<c019fb93>] writeback_inodes+0x79/0xc2
[ 97.837296] [<c0168173>] wb_kupdate+0x7a/0xdb
[ 97.841833] [<c01686a0>] pdflush+0xf1/0x189
[ 97.846173] [<c0137d41>] kthread+0x3b/0x62
[ 97.850461] [<c0104e3f>] kernel_thread_helper+0x7/0x10

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc1-mm1/mm-dmesg
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc1-mm1/mm-config

Regards,
Michal

--
Michal K. K. Piotrowski
Kernel Monkeys
(http://kernel.wikidot.com/start)
-
This one as well, I guess:

[14649.407909] BUG: at mm/page-writeback.c:829 __set_page_dirty_nobuffers()
[14649.407945] [<c0156bb3>] __set_page_dirty_nobuffers+0x9a/0x104
[14649.407976] [<c018cfea>] __block_write_full_page+0x1b7/0x2f1
[14649.407999] [<e8ba89b4>] ext3_get_block+0x0/0xd0 [ext3]
[14649.408039] [<c018d1f2>] block_write_full_page+0xce/0xd6
[14649.408054] [<e8ba743a>] walk_page_buffers+0x4d/0x67 [ext3]
[14649.408072] [<e8ba89b4>] ext3_get_block+0x0/0xd0 [ext3]
[14649.408096] [<e8ba9f52>] ext3_ordered_writepage+0xdc/0x189 [ext3]
[14649.408115] [<e8ba7454>] bget_one+0x0/0x7 [ext3]
[14649.408142] [<c01569cf>] __writepage+0xb/0x26
[14649.408153] [<c0156d88>] write_cache_pages+0x161/0x274
[14649.408166] [<c01569c4>] __writepage+0x0/0x26
[14649.408187] [<e8beaa03>] rtl8139_interrupt+0x3cd/0x3d7 [8139too]
[14649.408217] [<c01c97c3>] __next_cpu+0x15/0x26
[14649.408229] [<c011b561>] find_busiest_group+0x1c9/0x54a
[14649.408251] [<c0156eba>] generic_writepages+0x1f/0x27
[14649.408263] [<c0156eee>] do_writepages+0x2c/0x34
[14649.408275] [<c018845d>] __writeback_single_inode+0x1c3/0x3aa
[14649.408295] [<c0188235>] __check_dirty_inode_list+0x21/0x86
[14649.408321] [<c0188a46>] generic_sync_sb_inodes+0x267/0x3a8
[14649.408347] [<c0188f49>] writeback_inodes+0x63/0xaa
[14649.408355] [<c0132db8>] autoremove_wake_function+0x0/0x35
[14649.408368] [<c015777a>] pdflush+0x0/0x1a3
[14649.408377] [<c01574c1>] wb_kupdate+0x7f/0xe3
[14649.408410] [<c0157887>] pdflush+0x10d/0x1a3
[14649.408425] [<c0157442>] wb_kupdate+0x0/0xe3
[14649.408440] [<c0132ce6>] kthread+0x3b/0x61
[14649.408447] [<c0132cab>] kthread+0x0/0x61
[14649.408455] [<c0104a27>] kernel_thread_helper+0x7/0x10
[14649.408473] =======================
[24270.804919] BUG: at mm/page-writeback.c:829 __set_page_dirty_nobuffers()
[24270.804955] [<c0156bb3>] __set_page_dirty_nobuffers+0x9a/0x104
[24270.804986] [<c018cfea>] __block_write_full_page+0x1b7/0x2f1
[24270.805014] [<e8ba89b4>] ...
Almost every time when I try to run this script I hit a bug. I'm wondering
why...

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc1-mm1/test_mount_fs.sh

[ 6666.713016] kernel BUG at /home/devel/linux-mm/include/linux/mm.h:288!
[ 6666.719690] invalid opcode: 0000 [#1]
[ 6666.723397] PREEMPT SMP
[ 6666.725999] Modules linked in: xfs loop pktgen ipt_MASQUERADE iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd soundcore intel_agp agpgart snd_page_alloc i2c_i801 ide_cd cdrom rtc unix
[ 6666.776026] CPU: 0
[ 6666.776027] EIP: 0060:[<c01693ec>] Not tainted VLI
[ 6666.776028] EFLAGS: 00010202 (2.6.22-rc1-mm1 #3)
[ 6666.788519] EIP is at put_page+0x44/0xee
[ 6666.792491] eax: 00000001 ebx: c549f728 ecx: c04b27e0 edx: 00000001
[ 6666.799345] esi: 00000000 edi: 00000080 ebp: d067e9e0 esp: d067e9c8
[ 6666.806208] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[ 6666.812104] Process mount (pid: 9419, ti=d067e000 task=d00a4070 task.ti=d067e000)
[ 6666.819486] Stack: d8980180 00000080 d067e9f0 d8980180 00000000 00000080 d067e9f0 fdc8eda3
[ 6666.828103] fffffffc d8980180 d067ea20 fdc8f7ff fdc9b425 fdc96e5c 00080000 00000000
[ 6666.836635] c549dfd0 00000200 ffffffff cd44b8e0 00002160 cd44b8e0 d067ea30 fdc78937
[ 6666.845253] Call Trace:
[ 6666.847939] [<fdc8eda3>] xfs_buf_free+0x41/0x61 [xfs]
[ 6666.853247] [<fdc8f7ff>] xfs_buf_get_noaddr+0x10c/0x118 [xfs]
[ 6666.859231] [<fdc78937>] xlog_get_bp+0x65/0x69 [xfs]
[ 6666.864412] [<fdc79e87>] xlog_write_log_records+0x73/0x20d [xfs]
[ 6666.870654] [<fdc7a174>] xlog_clear_stale_blocks+0x153/0x15b [xfs]
[ 6666.877075] [<fdc7a546>] ...
static inline int put_page_testzero(struct page *page)
{
	VM_BUG_ON(atomic_read(&page->_count) == 0);
	return atomic_dec_and_test(&page->_count);
}

Looks like XFS did a free of an already-freed page. There are a couple of
likely suspects in git-xfs.patch.
Does mainline do this?
-
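[Editorial sketch] Why a second free of the same page trips the check in
put_page_testzero() can be shown with a tiny user-space model. It is an
illustration only: `struct fake_page`, `put_would_bug()` and
`fake_put_testzero()` are hypothetical, and a plain int replaces the kernel's
atomic counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a page with a plain (non-atomic) reference count. */
struct fake_page {
	int count;
};

/* True if dropping a reference now would be a put on an already-free page. */
static bool put_would_bug(const struct fake_page *page)
{
	return page->count == 0;
}

/* Drop one reference; returns true when this was the last one. */
static bool fake_put_testzero(struct fake_page *page)
{
	/* Stands in for VM_BUG_ON(atomic_read(&page->_count) == 0). */
	assert(!put_would_bug(page));
	page->count--;
	return page->count == 0;
}
```

Once the last reference is gone the count is zero, so any further put is
exactly the condition the VM_BUG_ON() at mm.h:288 reports.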
I haven't seen that one. I expect that it will be the noaddr buffer allocation
Yeah - that trace implies a memory allocation failure when allocating
log buffer pages and the cleanup looks like it does a double free
of the pages that got allocated. Patch attached below that should fix
I assume that the thread doing the mount got killed by the BUG and so the
normal error handling path on log mount failure was not executed and hence the
uuid for the filesystem never got removed from the table used to detect
multiple mounts of the same filesystem....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
---
fs/xfs/linux-2.6/xfs_buf.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-05-11 16:03:26.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c 2007-05-17 11:53:40.293585132 +1000
@@ -323,9 +323,16 @@ xfs_buf_free(
for (i = 0; i < bp->b_page_count; i++) {
struct page *page = bp->b_pages[i];
- if (bp->b_flags & _XBF_PAGE_CACHE)
+ /* handle noaddr allocation failure case */
+ if (!page)
+ break;
+
+ if (bp->b_flags & _XBF_PAGE_CACHE) {
ASSERT(!PagePrivate(page));
- page_cache_release(page);
+ page_cache_release(page);
+ } else {
+ __free_page(page);
+ }
}
_xfs_buf_free_pages(bp);
}
@@ -766,6 +773,8 @@ xfs_buf_get_noaddr(
goto fail;
_xfs_buf_initialize(bp, target, 0, len, 0);
+ bp->b_flags |= _XBF_PAGES;
+
error = _xfs_buf_get_pages(bp, page_count, 0);
if (error)
goto fail_free_buf;
@@ -773,15 +782,14 @@ xfs_buf_get_noaddr(
for (i = 0; i < page_count; i++) {
bp->b_pages[i] = alloc_page(GFP_KERNEL);
if (!bp->b_pages[i])
- goto fail_free_mem;
+ goto fail_free_buf;
}
- bp->b_flags |= _XBF_PAGES;
error = _xfs_buf_map_pages(bp, ...
-
Yes. xfs_buf_get_noaddr calls xfs_buf_free to free a buffer when
something fails. But this is wrong - we want to call
xfs_buf_deallocate before we set up the page list, and if a page
allocation fails we want to do our own freeing of just the pages we
allocated and call _xfs_buf_free_pages. Currently we do our own
freeing _and_ call xfs_buf_free, which leads to this double free.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-05-17 09:34:44.000000000 +0200
+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c	2007-05-17 09:36:53.000000000 +0200
@@ -792,8 +792,9 @@ xfs_buf_get_noaddr(
  fail_free_mem:
 	while (--i >= 0)
 		__free_page(bp->b_pages[i]);
+	_xfs_buf_free_pages(bp);
  fail_free_buf:
-	xfs_buf_free(bp);
+	xfs_buf_deallocate(bp);
  fail:
 	return NULL;
---end quoted text---
-
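[Editorial sketch] The unwind rule both patches aim for (on a failed page
allocation, free exactly the pages already allocated, once, then release the
page array) can be shown in user-space C. This is not the XFS code:
`struct fake_buf`, `fake_buf_alloc_pages()` and the `flaky_*` helpers are
hypothetical and exist only to make the failure path testable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer that owns an array of pages. */
struct fake_buf {
	void **pages;
	int page_count;
};

/* Test allocator: fails on call number fail_at (0-based); -1 never fails. */
static int alloc_calls, live_pages, fail_at = -1;

static void *flaky_alloc(void)
{
	void *p;

	if (alloc_calls++ == fail_at)
		return NULL;
	p = malloc(64);
	if (p)
		live_pages++;
	return p;
}

static void flaky_free(void *p)
{
	free(p);
	live_pages--;
}

/* Returns 0 on success, -1 on failure with nothing left allocated. */
static int fake_buf_alloc_pages(struct fake_buf *bp, int page_count,
				void *(*alloc_page)(void),
				void (*free_page)(void *))
{
	int i;

	assert(bp != NULL && page_count > 0);

	bp->pages = calloc((size_t)page_count, sizeof(*bp->pages));
	if (!bp->pages)
		return -1;
	bp->page_count = page_count;

	for (i = 0; i < page_count; i++) {
		bp->pages[i] = alloc_page();
		if (!bp->pages[i])
			goto fail_free_mem;
	}
	return 0;

fail_free_mem:
	/* Free only pages 0..i-1, each exactly once, then the array. */
	while (--i >= 0)
		free_page(bp->pages[i]);
	free(bp->pages);
	bp->pages = NULL;
	bp->page_count = 0;
	return -1;
}
```

The point of the shape is that a single code path owns the cleanup; calling a
general-purpose free routine afterwards, as the original error path did, would
walk the same page array a second time.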
Hi Christoph,
I applied your patch and I get another oops
[ 261.491499] XFS mounting filesystem loop0
[ 261.501641] Ending clean XFS mount for filesystem: loop0
[ 261.507698] SELinux: initialized (dev loop0, type xfs), uses xattr
[ 261.567441] XFS mounting filesystem loop0
[ 261.573931] allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
[ 261.582935] xfs_buf_get_noaddr: failed to map pages
[ 261.592478] Ending clean XFS mount for filesystem: loop0
[ 261.618543] SELinux: initialized (dev loop0, type xfs), uses xattr
[ 261.691563] XFS mounting filesystem loop0
[ 261.698927] allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
^^^^^^^^^^^^^^^^^^^^
interesting
[ 261.724829] xfs_buf_get_noaddr: failed to map pages
[ 261.734049] Ending clean XFS mount for filesystem: loop0
[ 261.741069] SELinux: initialized (dev loop0, type xfs), uses xattr
[ 261.978728] XFS mounting filesystem loop0
[ 262.205863] xfs_buf_get_noaddr: failed to map pages
[ 262.212523] Ending clean XFS mount for filesystem: loop0
[ 262.218084] SELinux: initialized (dev loop0, type xfs), uses xattr
[..]
[ 265.842566] xfs_buf_get_noaddr: failed to map pages
[ 265.848267] xfs_buf_get_noaddr: failed to map pages
[ 265.856480] Ending clean XFS mount for filesystem: loop0
[ 265.862260] SELinux: initialized (dev loop0, type xfs), uses xattr
[ 265.921288] XFS mounting filesystem loop0
[ 265.927123] xfs_buf_get_noaddr: failed to map pages
[ 265.932575] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 265.942886] printing eip:
[ 265.945665] fdc8e82a
[ 265.948818] *pde = 00000000
[ 265.952378] Oops: 0002 [#1]
[ 265.955241] PREEMPT SMP
[ 265.957868] Modules linked in: xfs loop ipt_MASQUERADE iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ...

Yeah, looks like a vmalloc leak is occurring. I haven't noticed it before because:

VmallocTotal: 137427898368 kB
VmallocUsed: 3128272 kB
VmallocChunk: 137424770048 kB

It takes a long time to leak enough vmapped space to run out on ia64...

That tends to imply we have a mapped buffer being leaked somewhere. Interestingly, I don't see a memory leak so we must be freeing the memory associated with the buffer, just not unmapping it first. Not sure how that can happen yet.....

mount xfs
VmallocUsed: 177808 kB
unmount xfs
mount xfs
VmallocUsed: 178080 kB
unmount xfs
mount xfs
VmallocUsed: 178352 kB
unmount xfs
mount xfs
VmallocUsed: 178624 kB
unmount xfs
mount xfs
VmallocUsed: 178896 kB
unmount xfs
mount xfs
VmallocUsed: 179168 kB
unmount xfs
mount xfs
VmallocUsed: 179440 kB
unmount xfs
mount xfs
VmallocUsed: 179712 kB
unmount xfs
mount xfs
VmallocUsed: 179984 kB

Looks like we're leaking 272kB of vmalloc space on each mount/unmount.

Groan - ASSERT(0) is the error handling there for debug kernels. If we fail to allocate an iclogbuf on a non-debug kernel, it will panic like this. I'll deal with that later....

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
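The 272 kB figure follows directly from the VmallocUsed readings in the message above. A small sketch of that arithmetic, assuming nothing beyond the nine quoted samples (constant_leak_kb is a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>

/* VmallocUsed readings in kB, one per mount, copied from the message above. */
static const long vmalloc_used_kb[] = {
	177808, 178080, 178352, 178624, 178896,
	179168, 179440, 179712, 179984,
};

/*
 * Return the per-cycle growth in kB if every consecutive pair of samples
 * differs by the same amount, or -1 if the growth is not constant.
 */
static long constant_leak_kb(const long *samples, size_t n)
{
	long delta;
	size_t i;

	if (n < 2)
		return -1;
	delta = samples[1] - samples[0];
	for (i = 2; i < n; i++)
		if (samples[i] - samples[i - 1] != delta)
			return -1;
	return delta;
}
```

Every consecutive pair differs by exactly 272 kB, which is what points at a fixed-size mapping being leaked once per mount rather than at load-dependent growth. If one assumes a vmalloc area of 128 MB (an illustrative figure, not one stated in the thread), that is on the order of 480 mount/unmount cycles before the space runs out.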
I've found what is going on here - kmem_alloc() is decidedly more
forgiving than manually built page arrays and vmap/vunmap. Prior
to this change we wouldn't have even leaked memory....
Christoph - this is an interaction with xfs_buf_associate_memory();
I'm not sure that what it is doing is at all safe now that it never gets
passed kmem_alloc()d memory - it works for the log recovery case
because we use it in pairs - once to shorten the buffer and then once
to put it back the way it was.
But that doesn't work for the log buffers (we never return them to their
original state) and the log wrap case looks to work mostly by accident
now (and could possibly lead to double freeing pages)....
It seems that what we really need with the new code is a xfs_buf_clone()
operation followed by trimming the range to what the secondary I/O needs
to span. This would work for the log buffer case as well. Your thoughts?
In the meantime, the following patch appears to fix the leak.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
---
fs/xfs/xfs_log.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2007-05-21 19:51:18.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2007-05-21 19:57:30.960084657 +1000
@@ -1457,7 +1457,7 @@ xlog_sync(xlog_t *log,
} else {
iclog->ic_bwritecnt = 1;
}
- XFS_BUF_SET_PTR(bp, (xfs_caddr_t) &(iclog->ic_header), count);
+ XFS_BUF_SET_COUNT(bp, count);
XFS_BUF_SET_FSPRIVATE(bp, iclog); /* save for later */
XFS_BUF_ZEROFLAGS(bp);
XFS_BUF_BUSY(bp);
-
xfs_buf_associate_memory is a mess. My original plan was to get rid of it, but I kept that out to keep that patchset small and easily reviewable; it seems like that was a mistake. My plan is the following:

- xlog_bread, and thus the whole buffer I/O path, grows an iooffset parameter that specifies at which offset into the buffer we start the actual I/O. That gets rid of all the xfs_buf_associate_memory uses in the log recovery code.
- Add a buffer clone operation as suggested by you above, and use the offset in xlog_sync as well.

Until then your patch below looks fine.
-
Perhaps a new field in the xfs_buf structure - that way call paths don't need to grow extra parameters and potentially increase stack usage. The read path tends to be at the top of the stack.

I don't want to have to introduce a mempool just for one xfs_buf per filesystem, so this would need to be able to take a xfs_buf (log->l_xbuf) that it clones to....

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
I have some patches to unwind the buffer I/O path, it's a little

Yes. Note that we currently do a non-mempooled allocation for the page array, which this would cure as well.
-
That'd be unfortunate - there are very few iclog buffers relative to every other metadata buffer, so growing the struct for all of those too would not be ideal (I remember Steve going on pagebuf shrinking exercises in the distant past, to fit more of them in memory at once; I can't remember what benchmark in particular he was using though).

cheers.

--
Nathan
-
Hi David, After a few minutes of mount/umount cycle everything seems to be ok, problem fixed. Regards, Michal -- Michal K. K. Piotrowski Kernel Monkeys (http://kernel.wikidot.com/start) -
LZO build fails on allyesconfig:

lib/built-in.o: In function `lzo1x_1_compress':
lib/lzo/minilzo.c:724: multiple definition of `lzo1x_1_compress'
fs/built-in.o:fs/reiser4/plugin/compress/minilzo.c:1307: first defined here
ld: Warning: size of symbol `lzo1x_1_compress' changed from 1541 in fs/built-in.o to 244 in lib/built-in.o
lib/built-in.o: In function `lzo1x_decompress':
lib/lzo/minilzo.c:885: multiple definition of `lzo1x_decompress'
fs/built-in.o:fs/reiser4/plugin/compress/minilzo.c:1466: first defined here
ld: Warning: size of symbol `lzo1x_decompress' changed from 1047 in fs/built-in.o to 678 in lib/built-in.o
make: *** [.tmp_vmlinux1] Error 1
make: Target `all' not remade because of errors.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
Looks like reiser4 contains a copy of minilzo used as some kind of compression plugin. It can be dropped in favour of the version in lib/lzo/, they'll be compatible. Andrew: Do you want a patch to remove it from reiser4? Richard -
Convert Reiser4 to use lzo implementation in lib/lzo/ instead of
including its own copy of minilzo.
Signed-off-by: Richard Purdie <rpurdie@openedhand.com>
---
[I've removed the deletion of minilzo.* and lzoconf.h from the LKML
version of this mail since it's not very interesting]
fs/reiser4/Kconfig | 1
fs/reiser4/Makefile | 1
fs/reiser4/plugin/compress/Makefile | 1
fs/reiser4/plugin/compress/compress.c | 22
fs/reiser4/plugin/compress/lzoconf.h | 216 ---
fs/reiser4/plugin/compress/minilzo.c | 1967 ----------------------------------
fs/reiser4/plugin/compress/minilzo.h | 70 -
7 files changed, 10 insertions(+), 2268 deletions(-)
Index: linux-2.6.21/fs/reiser4/Kconfig
===================================================================
--- linux-2.6.21.orig/fs/reiser4/Kconfig 2007-05-16 18:46:01.000000000 +0100
+++ linux-2.6.21/fs/reiser4/Kconfig 2007-05-16 18:49:09.000000000 +0100
@@ -3,6 +3,7 @@ config REISER4_FS
depends on EXPERIMENTAL
select ZLIB_INFLATE
select ZLIB_DEFLATE
+ select LZO
select CRYPTO
help
Reiser4 is a filesystem that performs all filesystem operations
Index: linux-2.6.21/fs/reiser4/Makefile
===================================================================
--- linux-2.6.21.orig/fs/reiser4/Makefile 2007-05-16 18:46:01.000000000 +0100
+++ linux-2.6.21/fs/reiser4/Makefile 2007-05-16 20:35:48.000000000 +0100
@@ -70,7 +70,6 @@ reiser4-y := \
plugin/crypto/cipher.o \
plugin/crypto/digest.o \
\
- plugin/compress/minilzo.o \
plugin/compress/compress.o \
plugin/compress/compress_mode.o \
\
Index: linux-2.6.21/fs/reiser4/plugin/compress/Makefile
===================================================================
--- linux-2.6.21.orig/fs/reiser4/plugin/compress/Makefile 2007-05-16 18:46:01.000000000 +0100
+++ linux-2.6.21/fs/reiser4/plugin/compress/Makefile 2007-05-16 18:48:42.000000000 +0100
@@ -2,5 +2,4 @@ ...

Sent.

I also noticed that reiser4 is using lzo1x_decompress(), not lzo1x_decompress_safe(). The unsafe version is open to buffer overflows through malicious data since it performs no validation of where it writes output to. I'm not sure whether that's acceptable in filesystem code; I'd suspect not. Fixing it is a case of s/lzo1x_decompress(/lzo1x_decompress_safe(/ in fs/reiser4/plugin/compress/compress.c...

Richard
-
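The difference between a "safe" and an "unsafe" decompressor is whether each write is checked against the capacity of the output buffer. The sketch below shows that check on a toy run-length format. It is not the LZO algorithm and not the lib/lzo API; toy_rle_decode_safe and its encoding are made up for illustration, and only the idea of passing the output capacity in and the produced length out is borrowed from the safe variant discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy run-length decoder, NOT the LZO algorithm. The input is a sequence
 * of (count, value) byte pairs. On entry *dst_len is the capacity of dst;
 * on success it is set to the number of bytes produced.
 * Returns 0 on success, -1 on truncated input or output overrun.
 */
static int toy_rle_decode_safe(const unsigned char *src, size_t src_len,
			       unsigned char *dst, size_t *dst_len)
{
	size_t in = 0, out = 0, cap = *dst_len;

	while (in < src_len) {
		size_t run;

		if (src_len - in < 2)
			return -1;	/* truncated (count, value) pair */
		run = src[in];
		/* The check an unchecked decoder omits: never write past dst. */
		if (run > cap - out)
			return -1;
		memset(dst + out, src[in + 1], run);
		out += run;
		in += 2;
	}
	*dst_len = out;
	return 0;
}
```

A crafted input such as the single pair (200, 'x') decoded into a 4-byte buffer is rejected here. Without the capacity check the same input would write 196 bytes past the end of the buffer, which is the class of problem raised for data read from an untrusted disk image.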
OK, we will consider safe decompression; moreover, as I remember, it doesn't lead to a noticeable performance drop. Thanks for pointing this out. -
In 2.6.20.9 I can change trip points:

echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
echo 10 > /proc/acpi/thermal_zone/TZ0/polling_frequency

Then I got:

cat /proc/acpi/thermal_zone/TZ0/*
<setting not supported>
cooling mode: active
polling frequency: 10 seconds
state: active[2]
temperature: 45 C
critical (S5): 105 C
active[0]: 78 C: devices=0xdf415a40
active[1]: 70 C: devices=0xdf4159dc
active[2]: 40 C: devices=0xdf41598c
active[3]: 30 C: devices=0xdf41593c

cat /proc/acpi/fan/*/*
status: off
status: off
status: on
status: on

And the fan turns on.

In 2.6.22-rc1-mm1:

echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
bash: echo: write error: Błąd wejścia/wyjścia (input/output error)

rutek:/home/maciek# cat /proc/acpi/thermal_zone/TZ0/*
<setting not supported>
polling frequency: 10 seconds
state: ok
temperature: 45 C
critical (S5): 256 C
active[0]: 78 C: devices=0xc1827a40
active[1]: 70 C: devices=0xc18279dc
active[2]: 60 C: devices=0xc182798c
active[3]: 50 C: devices=0xc182793c

rutek:/home/maciek# cat /proc/acpi/fan/*/*
status: off
status: off
status: off
status: off

The fan turns on when the temperature is over 50 C (I want: 30).

I read this:
http://article.gmane.org/gmane.linux.acpi.devel/22750

But I don't have cooling_policy, only cooling_mode:

ls /proc/acpi/thermal_zone/TZ0/
cooling_mode polling_frequency state temperature trip_points

Is it a bug or a feature?

Config, acpidump, dmesg:
http://www.unixy.pl/maciek/download/kernel/2.6.22-rc1-mm1/

--
Maciej Rutecki
www.unixy.pl
Kernel ...
Committed to mainline May 10:

Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=11ccc0...
Commit: 11ccc0f249cb01a129f54760b8ff087f242935d4
Parent: de46c33745f5e2ad594c72f2cf5f490861b16ce1
Author: Len Brown <len.brown@intel.com>
AuthorDate: Mon Apr 30 22:36:01 2007 -0400
Committer: Len Brown <len.brown@intel.com>
CommitDate: Mon Apr 30 22:36:01 2007 -0400

ACPI: thermal trip points are read-only
-
Should one understand that it IS a wanted behaviour ? Isn't it the DSDT job (which is kernel-accessible, or isn't it ?) to communicate trip_points to ACPI thermal zone ? Isn't OSPM managing thermal zone ? (http://acpi.sourceforge.net/documentation/thermal.html) PS : Sorry for all these (maybe stupid) questions, but I think I remember that changing trip_points had an effect on a (DSDT-bugged) laptop I used to use, and I'd like to understand... PPS : Sorry also for the english mistakes or approximations... -- ~~ |Oo| La banquise fond !!! Adoptez un pingouin... /|\/|\ |__| => http://doc.ubuntu-fr.org/ ^__^ ~~~| |~~~ -
What was the rationale? Can we get this one reverted? Some machines (HP omnibook xe3) have broken trip points -- too high -- so machine will overheat and trigger hw shutdown before starting passive cooling. That's really broken, and write to trip points is reasonable way to 'fix' that. (I'd understand if you only ever let trip points to decrease... but otoh root should be able to shoot himself....) Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Many people need to change trip points, for example I have:

cat /proc/acpi/thermal_zone/TZ0/trip_points | grep critical
critical (S5): 256 C

I _must_ change it to below 105 C, or edit the DSDT table (too difficult for me). I cannot use this kernel when trip points are read-only.

--
Maciej Rutecki
www.unixy.pl
Kernel Monkeys (http://kernel.wikidot.com/)
What bad things happen if you leave the critical trip point at 256? Do you find that you can drive the temperature over 105 and the system fails to shut down? -Len -
It isn't a problem in this case (nx6310). But on the hp nc6220 the first trip point is at 30 C, so the fan is usually on (noise, power consumption).

--
Maciej Rutecki
www.unixy.pl
Kernel Monkeys (http://kernel.wikidot.com/)
Added to bugzilla (Bug 8496) http://bugzilla.kernel.org/show_bug.cgi?id=8496 -- Maciej Rutecki http://www.maciek.unixy.pl
Something similar happened to me on XE3, yes. (Actual values were different; BIOS specified critical temperature at cca 95C, but hw killed the power at cca 83C. Setting critical trip point at 80C made the problem go away.) Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Great, please file a bug and include the acpidump from the XE3 and we'll fix it, rather than supporting a bogus (manual) workaround for it. Of course if your system is running at 80*C and the hardware shuts off at 83*C, you may have a broken fan, or one clogged with dust... -Len -
It _did_ have broken fan. It also had broken trip points. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Thanks for clarifying this, Pavel. If you come upon an XE3 where Linux-2.6.22 doesn't work as well as Windows, please let me know.

Given that the justification for this ill-conceived workaround seems to have diminished to the memory of broken hardware, it is clear that we should stay the course of removing it so that it doesn't further confuse future users.

If SuSE violently disagrees with me, you are certainly empowered to restore the workaround in your distribution starting at 2.6.22 as part of your value add. However, given its history of confusing users, it seems that it might increase your support burden rather than decrease it.

-Len
-
"work as well as windows" is not good enough goal as far as I'm concerned. Please don't break working setups. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
No, writing trip-points is neither a fix, nor is it reasonable. It is a workaround at best, and it is a dangerous and mis-leading hack.

The OS has no capability to actually change the ACPI trip points that are used by the BIOS. Changing the OS copy of them to make the user think that trip events will actually happen when the temperature crosses the OS copy is crazy.

If there are systems with broken thermals and the ACPI thermal control needs an over-ride to turn on the fan, then that is fine -- but using fake trip-points and giving the user the impression that they are real is not viable.

-Len
-
Aha... wait. It seemed to work for me when I enabled thermal polling... Slowing cpu down / shutdown / turn the fan on is done in the os after all. Should we just start polling temperatures when user writes custom

They become real when we fake _TSP, too, ..?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
That's exactly the point. If you allow a user to think they over-rode a trip-point but that trip point never fires unless they enable polling mode, then they're not going to get what they asked for. Yes, SuSE enables polling mode by default, but that is just

I actually agree with you for passively cooled embedded systems. Indeed, that is the topic of one of my OLS papers. However, for an off-the-shelf laptop that the vendor ships with a specific active and passive cooling model, Linux is not currently set up to ignore what the vendor provided and go off on its own. Yes, it could be done, but for

We are mis-using _TSP today, and over-riding it is a hack on top of a bug... _TSP is only supposed to be for the passive cooling algorithm -- which by definition is polling based. It is not intended to be used for active cooling at all. That is what active trip points were invented for...

-Len
-
I will do that for openSUSE FACTORY. -- Stefan Seyfried QA / R&D Team Mobile Devices | "Any ideas, John?" SUSE LINUX Products GmbH, Nürnberg | "Well, surrounding them's out." This footer brought to you by insane German lawmakers: SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg) -
Well, I still believe right solution is to enable polling mode as soon as trip points are written (and ignoring bios updates from then on). That gets trip point writing into functional state. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Yes it is a workaround for critical ACPI bugs like that or similar:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/22336

It's also convenient to e.g. lower the passive trip point to avoid fan noise. Some people are used to it; I already wanted to write a little userspace prog to use them, as it is really easy to fake cooling_mode (trip points are modified by BIOS) and eliminate fan noise and other things by e.g. reducing the passive or whatever trip point.

This is at least a major sysfs interface change; has this been discussed somewhere before or declared deprecated? It's been there for a long time, why is this "a dangerous and mis-leading hack." now?

I'd suggest reverting this, and I can come up with something like an "only allow lower values than BIOS provides" patch if the current implementation is considered dangerous.
-
Thanks for pointing that out -- it is a great example of how powerful mis-information can be. The fact that the trip-points are writable has obscured, rather than clarified, the actual causes of the failures. No less than 4 people in that bug report declared that cleaning the dust out of their fan fixed the root cause. A bunch more said that the issues went away when they stopped using ubuntu's user-space power save daemon. There are a couple more with broken active fan control -- which also gets obscured rather than clarified by over-riding trip points. And finally, there are probably some with clean fans that are working properly, but are thermally challenged systems.

I'll venture that Windows is NOT modifying or disabling the critical trip point to work around this issue. I'll venture that their thermal throttling is working and ours may not be. perhaps it was the recently fixed mod_timer() bug in thermal.c,

nope, the OS can't reliably override the processor passive trip point. That is what _SCP and cooling_mode are for. The reason is that the BIOS can send us a trip-point changed event at any time, the kernel will evaluate _PSV, and wipe out the modified OS version.

if you want to change the state of the fans, then poke /proc/acpi/fan/ directly. This will have effect until the next trip point

It has been dangerous and misleading since the day it went in. If the user doesn't enable polling, then they are effectively writing random numbers that have absolutely no effect on the operation of the system, and hiding the numbers that

That simply will not address the issue. Indeed, all the entries in the ubuntu bug report are about hitting the critical temperature and having a critical shutdown when it isn't wanted. These people want to RAISE the critical shutdown trip-point. Their cooling problems must be fixed -- raising critical trip points causes them instead to be ignored.

For folks with the reverse problem -- active cooling where the fans ...
Whatever it was, it's in a final Ubuntu dist and the trip point
interface could help some people to still be able to use it.
ACPI is very machine specific. 100 machines may work well and QA might
overlook the hundred-and-first where critical shutdowns or whatever happen.
Such workarounds are really helpful then.
Same for ignore _PPC and thermal polling (the latter is always on in our
distro; I bet a lot of machines would break if it were disabled, and just
ripping out the ability to set it is really not a solution).
One big challenge in the ACPI subsystem (kernel or userspace) is to find
BIOS implementations that are at the limit of the specs or which violate
the specs, and to try to work around them.
We are not in the position of M$ (at least in the desktop/laptop
segment) yet.
BIOS developers won't follow our implementations and IMO we should go
the other way and provide more workarounds. If nobody needs them, the
Yes, it's not correct and those trip points might get overridden by BIOS
again on some machines. It still could help and doesn't hurt (OK, one
should not increase the critical trip point, but that can be implemented...).
Again, pls go for more workarounds.
The most annoying situation for the developer and the user is when, after
investing a lot of time finding and possibly fixing a bug, you need to
tell the guy:
- Got it, please wait for the next kernel release coming out in some
weeks/months
- Thanks for the work, but implementing it in the kernel of this distro
version is too dangerous. Other machines might break (especially with
ACPI bugs). Better you wait for the next distro version coming out in
half a year.
-
Heh, you suggest this? It is even less functional than current solution -- which works okay as long as you keep thermal polling

You are misstating the situation. With thermal polling, it is pretty much okay, and it is certainly better than "ride fans manually" hack

No. Manually turning off fans is even worse hack.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
As Len says, the system can force a reevaluation of the trip points at any time which will wipe out the local settings. Either you ignore the spec and the notifications (potentially risking misbehaving hardware) or

It's significantly more correct.

--
Matthew Garrett | mjg59@srcf.ucam.org
-
Significantly more correct? It forces you to do all the thermal management in userspace! Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Why's that a problem? Overriding the hardware policy has to be done somewhere, and doing it in userspace is no more dangerous than kernelspace. -- Matthew Garrett | mjg59@srcf.ucam.org -
Duplicating all the kernel logic in userspace, badly? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
So don't do it badly. The advantage of doing so is that you can make it work properly, which you can't by putting it in the kernel. -- Matthew Garrett | mjg59@srcf.ucam.org -
You want stuff like critical shutdowns to work even if userspace is dead. I do not think you can control passive cooling adequately from userspace, and you can certainly not prevent kernel from slowing machine down too soon. Plus, this is actually nasty user-visible change, and a regression from 2.6.21. I am not sure why we are even debating this; user-kernel interface changed without warning. Patch should be simply reverted. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
I don't think anyone suggested putting the critical shutdown control in

Given the choice between something impossible and something difficult,

In http://lkml.org/lkml/2007/1/27/93 you were more than happy to break an interface even though it could be fixed in an (ugly) way that made it work again. Here, there's no way to fix this properly - the platform will quite happily do things based on what it believes the trip points should be, and one of those things may be to alter the trip points. Imagine the following situation:

1) Platform sets critical shutdown trip point to 85C
2) Userspace sets critical shutdown trip point to 95C
3) Temperature reaches 90C
4) Platform forces reevaluation of trip points
5) Entire invasion fleet is lost

How do you avoid that? Disable the ability for the platform to set trip points? You're breaking the spec and potentially causing hardware damage. If you have specific hardware that requires specific spec breakage, then a better approach would probably be to quirk the kernel to rectify it. On the other hand, if it works with the Other Leading OS, we ought to be able to just fix the problem properly.

--
Matthew Garrett | mjg59@srcf.ucam.org
-
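The five-step scenario above can be replayed with a tiny model in which the OS holds its own copy of the firmware's critical trip point. This is only a sketch of the argument, not the ACPI thermal driver: struct toy_zone and the toy_* functions are hypothetical, and the "reevaluation" is reduced to copying the firmware value back over the OS copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the OS keeps a copy of the firmware's trip point. */
struct toy_zone {
	int fw_critical_c;	/* value the platform firmware reports */
	int os_critical_c;	/* value the OS compares temperatures against */
};

/* Userspace override: changes only the OS copy; the firmware never sees it. */
static void toy_user_override(struct toy_zone *z, int critical_c)
{
	z->os_critical_c = critical_c;
}

/* Firmware notification: the OS re-reads the trip point, dropping the override. */
static void toy_reevaluate(struct toy_zone *z)
{
	z->os_critical_c = z->fw_critical_c;
}

/* True when the OS would treat this temperature as critical. */
static bool toy_is_critical(const struct toy_zone *z, int temp_c)
{
	return temp_c >= z->os_critical_c;
}
```

Starting from 85C, overriding to 95C and then reaching 90C, the model reports "not critical" until the reevaluation, and "critical" immediately afterwards even though the temperature has not changed. That sudden flip is the behaviour the message warns about.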
No it does not. That is what this thread is about. (On old xe3, critical trip point set by BIOS is ~95C, but machine dies by hw safeguard at ~83C. Workaround is to lower critical trip point to

We need to ignore trip point updates from BIOS, and we need to poll thermals when the user overrides trip points. That's expected. Plus I've yet to see a platform actually updating the trip points.

Speaking about hw damage... The broken BIOS on xe3 definitely caused damage to its harddrive, so... we are preventing hw damage here.

(Plus, Len's patch broke user-kernel in stable series, without warning).

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
Try any recent HP bios. -- Matthew Garrett | mjg59@srcf.ucam.org -
man cron... ;-)
--
~~
|Oo| La banquise fond !!! Adoptez un pingouin...
/|\/|\
|__| => http://doc.ubuntu-fr.org/
^__^
~~~| |~~~
-
Matthew Garrett wrote:

Yes... hp nx 6310, BIOS versions:

F.06: cpufreq works, MCFG BIOS error in dmesg (PCI: BIOS Bug: MCFG area at f8000000 is not E820-reserved)
F.08: like above + cpufreq broken
F.09: removes these errors, but there is a problem with reboot (it takes too long; removing the psmouse module doesn't help) - some people report it (I didn't test it)
F.0B: suspend to RAM broken; after suspend to disk the keyboard doesn't work
F.0D: I don't have the heart to test it...

--
Maciej Rutecki
http://www.maciek.unixy.pl
Thinkpad 600, whenever a trip point is crossed, all trip points are updated. I think they implemented hysteresis that way. ISTR that hp nx5000 did something similar, but i might be wrong on this one. -- Stefan Seyfried QA / R&D Team Mobile Devices | "Any ideas, John?" SUSE LINUX Products GmbH, Nürnberg | "Well, surrounding them's out." This footer brought to you by insane German lawmakers: SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg) -
Stripping some CCs, acpi and kernel list should be enough this one goes to...

I doubt it is impossible, would you mind sharing your knowledge why you think it is impossible or point to some related discussion, pls.

Does this mean checking temperature against trip points and adjust fan and cpufreq should be done in a hal module? In which stage is this, rfc, development, already in some git tree?

Yes, trip points are overridden by BIOS on HPs and what is the problem? The workaround won't work for them, but it still does on others (mainly on ThinkPads which have passive tp at about 89 C and critical on 91 C).

I could imagine an implementation for this, that e.g. critical...active9 get module parameters. BIOS updates for trip points get ignored as soon as one is set and you can only decrease a value. Nothing bad can happen and it will make some people happy (yes it's hacky, violates the specs and so on..., but some more people have a working machine). Will this (or similar) get accepted?

It's even more impossible to get ACPI working correctly for all machines and all subsystems, these little workarounds can help some people to at least use their machine or get some parts working better.

Thomas
-
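The policy proposed in the message above (a user value is accepted only if it does not raise the trip point, and BIOS updates are ignored once a user value is in place) can be written down in a few lines. This is a sketch of the proposal only, with hypothetical names (struct toy_trip, toy_trip_set_user, toy_trip_bios_update); it is not an existing kernel interface, and it takes "decrease" to mean "not above the BIOS value".

```c
#include <assert.h>
#include <stdbool.h>

struct toy_trip {
	int bios_c;		/* last value accepted from the BIOS */
	int effective_c;	/* value actually used for comparisons */
	bool user_set;		/* true once the user has overridden the value */
};

/* Accept a user value only if it does not raise the trip point. Returns 0 or -1. */
static int toy_trip_set_user(struct toy_trip *t, int temp_c)
{
	if (temp_c > t->bios_c)
		return -1;
	t->effective_c = temp_c;
	t->user_set = true;
	return 0;
}

/* BIOS update: applied only while no user override is in place. */
static void toy_trip_bios_update(struct toy_trip *t, int temp_c)
{
	if (t->user_set)
		return;
	t->bios_c = temp_c;
	t->effective_c = temp_c;
}
```

With a BIOS value of 95, a request for 105 is refused, a request for 80 is accepted, and a later BIOS update back to 95 leaves the effective value at 80. Whether ignoring those BIOS updates is acceptable is exactly what the rest of the thread disputes, so the sketch shows the mechanics of the idea rather than endorsing it.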
Because, as Len has pointed out, you end up with two different ideas about what the trip points are - the kernel's and the hardware's. That works fine until some event in the firmware either forcibly resynchronises the two or makes assumptions about the spec-compliance of

You don't know whether the workaround will work or not until you've performed a full audit of the platform firmware, which is going to potentially change between BIOS versions. It's entirely legal for the firmware to behave in this way, and even beneficial under various

The interface would need to be more complicated than that if you wanted to be able to implement hysteresis, and there's the potential for hardware damage if parameters are set inappropriately. Even then, there's no easy way of programmatically determining whether it would work

It's fairly clearly not impossible, given that there exists at least one OS that these machines work with.

--
Matthew Garrett | mjg59@srcf.ucam.org
-
Not sure what exactly you'd like to do in userspace, maybe you can be a
bit more precise here:
a) Doing whole thermal management in userspace, reading temp, writing
fan and cpufreq_max_freq, shutting down machine,...
b) Workaround not switching on fans by double checking fan/temperature
by a userspace daemon and try to finally trigger the switch by
writing to /proc/acpi/fan/state (or corresponding /sys,..)
IMO we need some kind of fan watchdog like Henrique described
recently; maybe this could be put in userspace, not sure.
Currently the fan can run out of sync easily if the fan state is
Hmm, I don't get the point. If it works it's great, if not you have a
But that's exactly what all these workarounds are for. You pass them if
you have a buggy BIOS. You wait for new BIOSes and hope that you can get
The fact that 3 people complained rather fast for a patch in rc1-mm1,
looks like this is a workaround that is needed. I personally advised two
guys to use it with their ThinkPad in the summer and they are happy with
it.
I'd also like to have this a bit extended: be able to just modify
passive trip point.
IMO this is a very powerful feature allowing people a fanless system as
long as they have a cpufreq capable processor.
The idea having this in userspace is interesting. But as said rather
complicated to implement. The hysteresis implementation for passive
cooling works fine in kernel and is field tested, it should get used.
The problem with the ACPI spec is that it's rather complicated. This is
IMO mainly for a BIOS developer point of view for what I can say.
Therefore it's rather seldom picked up by BIOS vendors.
However for the kernel it's easy (to fake, to do) and it's working fine,
so why not making use of it?
IMO we should even provide a passive trip point (initially unused) when
there is no one defined by BIOS.
I agree that it's hard to find the temperature to not let the fan kick
in automatically. But it's really easy then for everyone ...

...and suggested workaround is to drive fans directly from userspace, which not only violates the specs and has all the problems with

Not sure why you try to scare people with 'hardware damage'. HP XE3 bios already _was_ damaging hardware (it cooked the hard drive using cpu as a heater), and no acpi magic can damage correctly working machine.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
I don't think that's obviously true. 11.3.2 of the 3.0 spec states: "A package consisting of references to all active cooling devices that should be engaged when the associated active cooling threshold (_ACx) is exceeded." Given that this presumably didn't occur under Windows, I think it would be significantly better to figure out why and then fix that. Alternatively, if the firmware tables are actually genuinely broken in a way that's impossible to repair, you can replace the table. That has the advantage that there's no risk of the platform and the OS becoming confused. -- Matthew Garrett | mjg59@srcf.ucam.org -
We'd need:
a) way to tell acpi not to control fans any more
b) in kernel watchdog so that acpi starts controlling fans after oom killer
c) way to control passive cooling from userspace.
It would happily occur under Windows. You just needed to load machine in a way that cpu stayed ~80C.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
So replace the DSDT. All the problems get solved that way. -- Matthew Garrett | mjg59@srcf.ucam.org -
We are in the middle of stable series, and Len's patch breaks existing setups without prior warning. That's "no-no". Of course I could replace DSDT. I also could throw that machine out of window. I'm not sure what we are arguing about here, that patch is broken. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
This breaks the compilation of the oldest of our IDE disk drivers:
<-- snip -->
...
LD .tmp_vmlinux1
drivers/built-in.o: In function `hd_init':
hd.c:(.init.text+0x44a7d): undefined reference to `drive_info'
hd.c:(.init.text+0x44a89): undefined reference to `drive_info'
hd.c:(.init.text+0x44a95): undefined reference to `drive_info'
hd.c:(.init.text+0x44aa1): undefined reference to `drive_info'
hd.c:(.init.text+0x44aad): undefined reference to `drive_info'
drivers/built-in.o:hd.c:(.init.text+0x44ab9): more undefined references to `drive_info' follow
make[1]: *** [.tmp_vmlinux1] Error 1
<-- snip -->
Considering the fact that we have two more recent drivers with the same
functionality, it might be an option to simply remove this driver...
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
Hi, Care to send a patch? Thanks, Bart -
hd.c can drive MFM and RLL disks and drivers/ide cannot. Although it really wants burying further down the config tree the ability to read MFM and RLL disks when recovering ancient data is useful and people do actually use this driver now and then rescuing stuff like twenty year old datasets. It thus needs fixing not removing. Alan -
Why is this driver parked in drivers/ide/legacy when the companion driver, xd.c, is in drivers/block (where hd.c used to be at one point, too)? Especially so since it's not really for IDE, but for ST-506.
HOWEVER, the code that fails above hard-assumes that the ST-506 disks that you have are your primary system drives, which is obviously a wrong assumption -- ST-506 drives were obsolete quite a while before Linux existed[1].
xd.c, on the other hand, seems to actually go out and query the hardware directly. I guess this is understandable, since this controller would never have been primary.
If hd.c is pure legacy, which it obviously is, should we remove the code to assume the BIOS settings are the MFM/RLL settings (i.e. the __i386__ clause), and instead do something more like the __arm__ clause which means that "if you really want to use this you have to specify the parameters manually"?
-hpa
[1] The 386-16 that I had access to at Northwestern, which with 0.59 BogoMIPS was the slowest Linux system in existence until Linux was ported to other architectures, might have had an ST-506 drive, but I'm not sure.
-
Opinions, anyone (especially Alan): http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-newsetup.git;a=commitdiff;h=36... -hpa -
I believe the technical description for the comment is "bullshit" 8) Almost all MFM controllers and RLL controllers will only run at the standard primary and secondary ATA address. Given the intended use of the driver today I don't see a big problem in requiring "hd=" although you have to question the point of this boot code rewrite when it seems primarily to be removing features Alan -
Yes, but that doesn't (necessarily) apply to the controller that is likely to be the primary controller in a modern system. The whole point is that what the BIOS considers primary isn't necessarily tied to the standard ATA addresses anymore, with SATA controllers being primary.
The question I'm asking is: do you think it's better to remove this from hd.c, or do you think it's better to add the boot code BIOS detection back (and take the risk of poking an ST-506 disk with legacy data with parameters which may belong to another disk -- keep in mind this
I've been trying to remove features that are obsolete and/or broken. I don't have access to this particular ancient hardware, nor any system that can even host it. It's very easy to add the stuff back in the boot code; it's a much more tricky/annoying question if one *should* do so. That's part of a rewrite/cleanup.
-hpa
-
To set it up the user will have to know the parameters and have typed them into the BIOS (if it even has an option for it). I see no problem -
Sorry, see no problem which way? My concern here is with getting incorrect data, not getting no data. The BIOS probe amounts to pulling data out of two tables (INT 0x41/0x46, corresponding to BIOS drives 0x80 and 0x81 -- the EDD 1.1 spec is quite specific that if implemented they follow the BIOS drive numbers, not the ATA port addresses), and hoping that they actually match the drives that hd.c uses. That scares me, since we're talking about old legacy data here. I'm not concerned with what's easy, I'm concerned with what's the right thing to do. -hpa -
Forcing the user to provide the geometry. Historically that driver dealt with the main disks the user had. Today its only use is specialist recovery work. Anyone recovering a disk has to get the geometry data into the BIOS (if the BIOS even allows it - many now don't) and will therefore know it for hd= arguments as well Alan -
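To make the "user supplies the geometry" option concrete, here is a minimal sketch of parsing a comma-separated cylinders,heads,sectors string of the kind one would pass after "hd=". This is not the hd.c parser: the struct and function names are made up for the example, and the cyl,head,sect ordering is an assumption about the argument format. The point is only that a malformed or incomplete geometry is rejected outright instead of falling back to whatever the BIOS tables happen to contain.

```c
#include <stdio.h>

/* Hypothetical illustration; not the kernel's hd= handling. */
struct chs_geometry {
    unsigned int cyl;
    unsigned int head;
    unsigned int sect;
};

/*
 * Parse "cyl,head,sect". Returns 0 on success, -1 if a field is
 * missing, zero, or followed by trailing characters. On failure the
 * contents of *g are unspecified.
 */
static int parse_chs(const char *arg, struct chs_geometry *g)
{
    char trailing;

    /* Exactly three conversions must succeed; a fourth means junk. */
    if (sscanf(arg, "%u,%u,%u%c", &g->cyl, &g->head, &g->sect, &trailing) != 3)
        return -1;
    if (g->cyl == 0 || g->head == 0 || g->sect == 0)
        return -1;
    return 0;
}

/* Total number of sectors addressable with this geometry. */
static unsigned long chs_sectors(const struct chs_geometry *g)
{
    return (unsigned long)g->cyl * g->head * g->sect;
}
```

For example, parse_chs("615,4,17", &g) succeeds and chs_sectors(&g) is 41820, while "615,4" and "615,0,17" are both rejected.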
I have just seen this on boot, with 2.6.22-rc2-mm1 on x86_64:
--
libata version 2.20 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
BUG: at include/linux/slub_def.h:88 kmalloc_index()
Call Trace:
 [<ffffffff8034f3f9>] pci_dev_put+0x12/0x14
 [<ffffffff80283f30>] get_slab+0xb5/0x265
 [<ffffffff802841bc>] __kmalloc+0x13/0xa3
 [<ffffffff8021a4aa>] cache_k8_northbridges+0x80/0x116
 [<ffffffff8063fed2>] gart_iommu_init+0x16/0x594
 [<ffffffff804562ac>] genl_rcv+0x0/0x68
 [<ffffffff804548ed>] netlink_kernel_create+0x15e/0x16b
 [<ffffffff804acc52>] mutex_unlock+0x9/0xb
 [<ffffffff80639fad>] pci_iommu_init+0x9/0x12
 [<ffffffff806306af>] kernel_init+0x152/0x322
 [<ffffffff80249c7c>] trace_hardirqs_on+0xc0/0x14e
 [<ffffffff804ae03d>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff80249c7c>] trace_hardirqs_on+0xc0/0x14e
 [<ffffffff8020a848>] child_rip+0xa/0x12
 [<ffffffff80209f5c>] restore_args+0x0/0x30
 [<ffffffff8063055d>] kernel_init+0x0/0x322
 [<ffffffff8020a83e>] child_rip+0x0/0x12
PCI-GART: No AMD northbridge found.
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
ACPI: RTC can wake from S4
pnp: 00:01: iomem range 0xf0000000-0xf3ffffff has been reserved
pnp: 00:01: iomem range 0xfed13000-0xfed13fff has been reserved
--
The full dmesg is at http://www.reub.net/files/kernel/2.6.22-rc1-mm1-dmesg and the config up at http://www.reub.net/files/kernel/2.6.22-rc1-mm1-config
The machine otherwise seems to run OK.
Reuben
-
This ( http://lkml.org/lkml/2007/5/16/350 ) patch by Ben Collins submitted yesterday should take care of this. Thanks, Satyam -
Hello, I tried it on iMac G3. I got a bunch of warnings and finally it failed to build.
WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_string_start (offset 0x8)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_string_end (offset 0xc)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from prom_entry (offset 0x10)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from prom (offset 0x3c)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from of_platform (offset 0x50)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from mem_reserve_cnt (offset 0x58)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from mem_reserve_map (offset 0x60)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from alloc_bottom (offset 0x64)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from ram_top (offset 0x68)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from alloc_top (offset 0x70)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from prom_scratch (offset 0x8c)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_header_start (offset 0xbc)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_struct_start (offset 0xc4)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_struct_end (offset 0xcc)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to ...
Most, but not all, of these warnings should be gone when Linus pulls kbuild-fix.git. When -rc3 is ready, can you then please post the result of a build? Then I can take a look at the remaining section mismatch warnings. Sam -
I've been getting this since 2.6.21-rc7-mm1:
[ 2.379310] BUG: unable to handle kernel paging request at virtual address 4400d340
[ 2.379491] printing eip:
[ 2.379573] c021c978
[ 2.379656] *pdpt = 000000000353c001
[ 2.379739] *pde = 0000000000000000
[ 2.379824] Oops: 0000 [#1]
[ 2.379906] PREEMPT SMP
[ 2.380059] Modules linked in: thermal processor dm_mod
[ 2.380288] CPU: 0
[ 2.380289] EIP: 0060:[<c021c978>] Not tainted VLI
[ 2.380291] EFLAGS: 00010297 (2.6.22-rc1-mm1 #2)
[ 2.380547] EIP is at vsnprintf+0x448/0x5d0
[ 2.380633] eax: 4400d340 ebx: c348f034 ecx: 4400d340 edx: fffffffe
[ 2.380721] esi: c03e0100 edi: 4400d340 ebp: c357ecc0 esp: c357ec68
[ 2.380810] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[ 2.380898] Process udevtrigger (pid: 686, ti=c357e000 task=c1876df0 task.ti=c357e000)
[ 2.380987] Stack: c348f014 00000fec c03e1c60 c03e3cec c357eccc c0499b88 c357ece0 c0282513
[ 2.381428] c348f014 00000fec 3cb70fcb c348f034 ffffffff 00000000 ffffffff ffffffff
[ 2.381867] ffffffff fffffffe c03e017c c357ed18 00000034 c0494a20 c357ece0 c021cb9f
[ 2.382305] Call Trace:
[ 2.382470] [<c021cb9f>] sprintf+0x1f/0x30
[ 2.382594] [<c02815ed>] show_uevent+0xed/0x130
[ 2.382720] [<c0281163>] dev_attr_show+0x23/0x30
[ 2.382843] [<c01dc077>] sysfs_read_file+0x97/0x140
[ 2.382968] [<c019502f>] vfs_read+0xaf/0x180
[ 2.383096] [<c0198c3a>] kernel_read+0x3a/0x50
[ 2.383221] [<c01f126c>] evm_calc_hash+0x11c/0x240
[ 2.383347] [<c01efd39>] evm_file_free+0xb9/0x330
[ 2.383470] [<c0195a3a>] __fput+0xba/0x180
[ 2.383593] [<c0195c32>] fput+0x22/0x40
[ 2.383715] [<c0192e07>] filp_close+0x47/0x70
[ 2.383839] [<c0194109>] sys_close+0x69/0xc0
[ 2.383965] [<c01043c8>] syscall_call+0x7/0xb
[ 2.384092] [<b7ebd0a7>] 0xb7ebd0a7
[ 2.384212] =======================
[ 2.384295] INFO: lockdep is turned off.
[ 2.384379] Code: 21 fd ff ff c6 ...
On Tue, 22 May 2007 03:25:48 -0400 OK, thanks. Does the crash go away if you disable IMA, SLIM, etc in .config? I think I'll drop all those patches, actually - they don't seem to be going anywhere. -
You are absolutely right, we have been stalled on EVM/IMA/SLIM, while trying to figure out the mtime and revocation issues. In retrospect we tried to submit too much complex code all at once. We will resubmit in small functional pieces as the technical issues are resolved, starting with the LIM API and hooks, which are independent of the mtime and revocation issues. Mimi Zohar -
