Re: [PATCH v2] Add suspend/resume for HPET

From: Linus Torvalds
Date: Friday, March 16, 2007 - 9:33 am

I pushed out the -git trees yesterday, but then got distracted, so the 
patches and tar-balls and the announcement got delayed until this morning. 
Oops. I'm a scatter-brain.

Anyway, the good news about -rc4 is that there's just lots of random 
fixes. I'm hoping that we've seriously cut down on the regression list, 
and I'd ask everybody who is on Adrian's list to please re-verify their 
regression, and in case it's one of the "patches available" ones but I 
haven't merged (maybe because it hasn't been sent to me!), make sure I do.

Shortlog appended. Nothing really stands out. 

		Linus
---
Adrian Bunk (2):
      [DLM] fs/dlm/user.c should #include "user.h"
      asus-laptop: make code static

Adrian Hunter (2):
      [MTD] Correct partition failed erase address
      [MTD] [OneNAND] add Nokia Copyright and a credit

Aji Srinivas (1):
      [BRIDGE]: adding new device to bridge should enable if up

Akiyama, Nobuyuki (1):
      [IA64] add missing syscall trace clear

Al Viro (31):
      fix deadlock in audit_log_task_context()
      sanitize security_getprocattr() API
      ibmtr probe is __devinit, not __init
      const file_operations fallout
      appldata build fix
      (uml) sparse flags for userland glue are missing $(CF)
      zatm __init abuse
      stacktrace doesn't work on uml
      fix ipath_dma_free_coherent() prototype
      m32r dma-mapping.h should simply include generic/dma-mapping-broken.h
      include of asm/pgtable.h in nfsfh is bogus
      BLK_DEV_IDE_CELLEB dependency fix
      sparc: have dma-mapping.h include generic/dma-mapping-broken in non-PCI case
      rtc-cmos needs RTC_ALWAYS_BCD known
      misc NULL noise
      fastcall still doesn't make sense in paravirt
      dmfe trivial endianness annotations
      constant should be long
      pasemi trivial iomem annotations
      sparc: nr_free_pages() is unsigned long
      trivial ATA iomem annotations
      cciss endian annotations
      qeth gfp_t annotations
      C99 ...
From: Randy Dunlap
Date: Friday, March 16, 2007 - 2:11 pm

allmodconfig on i386:

WARNING: "default_idle" [arch/i386/kernel/apm.ko] undefined!
WARNING: "machine_real_restart" [arch/i386/kernel/apm.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Randy Dunlap
Date: Friday, March 16, 2007 - 3:39 pm

Please ignore.

I think that this was the result of doing 'make allyesconfig && make all'
followed by 'make allmodconfig && make all' without doing a 'make clean'
between them.

---
~Randy
-

From: Sam Ravnborg
Date: Friday, March 16, 2007 - 11:43 pm

But then we have a dependency error somewhere we need to track down.
I will try to test here.

	Sam
-

From: Sam Ravnborg
Date: Sunday, March 18, 2007 - 5:39 am

So far no luck in reproducing this.
I will await additional reports before looking more into this one.

	Sam
-

From: Randy Dunlap
Date: Sunday, March 18, 2007 - 9:16 pm

Hi Sam,
It's reproducible for me.

(I'm on x86_64:)

make clean
make ARCH=i386 allyesconfig
make ARCH=i386 all
make ARCH=i386 allmodconfig
make ARCH=i386 all

What kind of debug info do you want/need on this?

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Chris Friesen
Date: Friday, March 16, 2007 - 4:13 pm

This would seem to be a bug in the build system then.  Or are you 
supposed to "make clean" after every config change?

Chris
-

From: Jan Engelhardt
Date: Friday, March 16, 2007 - 4:27 pm

No. When .config is changed, include/linux/config/ is updated, which
causes everything that depends on it, one way or another, to be rebuilt.
At least that is what I have observed for ages.


Jan
-- 
-

From: Rafael J. Wysocki
Date: Friday, March 16, 2007 - 1:34 pm

I'm afraid that if CONFIG_TICK_ONESHOT or CONFIG_NO_HZ is set, we still have a
problem with RCU synchronization while nonboot CPUs are being enabled during a
resume (http://lkml.org/lkml/2007/3/11/144, http://lkml.org/lkml/2007/3/4/88).

Can someone who had this problem with -rc3 check if it's present in -rc4?

Thanks,
Rafael
-

From: Thomas Gleixner
Date: Friday, March 16, 2007 - 1:47 pm

I finally found a box today, which shows this problem. I'm working on a
fix.

	tglx


-

From: Thomas Gleixner
Date: Friday, March 16, 2007 - 4:25 pm

I finally found a dual core box, which survives suspend/resume without
crashing in the middle of nowhere. Sigh, I never figured out from the
code and the bug reports what's going on.

The observed hangs are caused by a stale state transition of the clock
event devices, which keeps the RCU synchronization away from completion,
when the non boot CPU is brought back up.

The suspend/resume in oneshot mode needs similar care to the
periodic mode during suspend to RAM. My assumption that the state
transitions during the different shutdown/bringups of s2disk would go
through the periodic boot phase and then switch over to highres resp.
nohz mode was simply wrong.

Add the appropriate suspend / resume handling for the non periodic
modes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 5567745..eadfce2 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -307,12 +307,19 @@ int tick_resume_broadcast(void)
 	spin_lock_irqsave(&tick_broadcast_lock, flags);
 
 	bc = tick_broadcast_device.evtdev;
-	if (bc) {
-		if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC &&
-		    !cpus_empty(tick_broadcast_mask))
-			tick_broadcast_start_periodic(bc);
 
-		broadcast = cpu_isset(smp_processor_id(), tick_broadcast_mask);
+	if (bc) {
+		switch (tick_broadcast_device.mode) {
+		case TICKDEV_MODE_PERIODIC:
+			if(!cpus_empty(tick_broadcast_mask))
+				tick_broadcast_start_periodic(bc);
+			broadcast = cpu_isset(smp_processor_id(),
+					      tick_broadcast_mask);
+			break;
+		case TICKDEV_MODE_ONESHOT:
+			broadcast = tick_resume_broadcast_oneshot(bc);
+			break;
+		}
 	}
 	spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 
@@ -347,6 +354,16 @@ static int tick_broadcast_set_event(ktime_t expires, int force)
 	}
 }
 
+int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
+{
+	clockevents_set_mode(bc, ...
From: Milan Broz
Date: Saturday, March 17, 2007 - 2:35 am

Hi,
I can confirm that this patch fixed the problem on Thinkpad X60s.

Thanks !

Milan
--
mbroz@redhat.com
 
-

From: Thomas Meyer
Date: Saturday, March 17, 2007 - 3:07 am

Excellent work. Now suspend to disk is working again. But:

1.) The quirk added in commit a417a21e10831bca695b4ba9c74f4ddf5a95ac06
for the appletouch driver doesn't seem to work after resume.

2.) The first suspend to disk works with no problems, but the second
suspend to disk in a row results in an oops:
 ->resume_device ->
pci_device_resume->ata_host_resume->ahci_pci_device_resume->ata_pci_device_do_resume->pci_restore_state

-

From: Rafael J. Wysocki
Date: Saturday, March 17, 2007 - 2:47 pm

Can you please check whether this problem is already in Adrian's list of
known regressions?

Thanks,
Rafael
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 10:58 am

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Adrian Bunk
Date: Saturday, March 17, 2007 - 5:42 pm

Not a regression, but still a bug.
Appropriate people added to the Cc.

cu
Adrian

-

From: Jiri Kosina
Date: Sunday, March 18, 2007 - 11:45 am

(trimmed CC a little bit)

Thomas, could you please provide more information? Did it ever work for 
you after a suspend/resume cycle and then break at some point in the 
past, or are you not sure whether it ever worked?

Also, what exactly does it mean that it doesn't work? Are both 
interfaces bound to the usbhid driver after resume (you can check this in 
/sys)? When the quirk works correctly, only the keyboard interface should 
be bound to the usbhid driver. Please check the binding both before and 
after a suspend/resume cycle, and let me know.

Thanks,

-- 
Jiri Kosina
-

From: Thomas Meyer
Date: Sunday, March 18, 2007 - 12:01 pm

Appletouch is bound to the device:

T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  7 Spd=12  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs=  1
P:  Vendor=05ac ProdID=0218 Rev= 0.60
S:  Manufacturer=Apple Computer
S:  Product=Apple Internal Keyboard / Trackpad
C:* #Ifs= 3 Cfg#= 1 Atr=a0 MxPwr= 40mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=01 Prot=01 Driver=usbhid
E:  Ad=83(I) Atr=03(Int.) MxPS=   8 Ivl=8ms
I:* If#= 1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=01 Prot=02 Driver=appletouch
E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=8ms
I:* If#= 2 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
E:  Ad=84(I) Atr=03(Int.) MxPS=   1 Ivl=8ms


But the X server touchpad driver doesn't work anymore, which means I
can't emulate a right click by tapping with 3 fingers on the mouse pad.
After restarting X the mouse driver works again. So I think this is
maybe a problem in X?


-

From: Jiri Kosina
Date: Sunday, March 18, 2007 - 12:22 pm

... but there is something apparently wrong either with the appletouch 
driver or X. Could you test via evtest whether the events are properly 
generated by the kernel? If they are, I'd say it is almost certainly an X bug.

Thanks,

-- 
Jiri Kosina
-

From: Thomas Meyer
Date: Tuesday, March 27, 2007 - 2:02 pm

It seems that after the resume all USB devices get removed and plugged in
again (virtually!). This results in a new input device name:

before suspend and resume:

appletouch Geyser 3 inited.
input: appletouch as /class/input/input2


after resume:
Restarting tasks ... <6>usb 1-2: USB disconnect, address 3
PM: Removing info for No Bus:usbdev1.3_ep83
PM: Removing info for usb:1-2:1.0
PM: Removing info for No Bus:usbdev1.3_ep81
input: appletouch disconnected
[cut]
PM: Adding info for No Bus:usbdev1.4_ep83
PM: Adding info for usb:1-2:1.1
appletouch Geyser 3 inited.
input: appletouch as /class/input/input15

This change confuses the X synaptics driver:

Touchpad no synaptics event device found (checked 11 nodes)
Touchpad The /dev/input/event* device nodes seem to be missing
(EE) xf86OpenSerial: Cannot open device /dev/input/event2
        No such file or directory.
(WW) Touchpad: cannot open input device

And so X falls back to my second pointer device which is a UsbMouse
under /dev/input/mice

One could say that the synaptics driver rightly complains about the
missing event2 device!
So is this a bug in the X synaptics driver?

Comments are welcome.

-

From: Jiri Kosina
Date: Wednesday, March 28, 2007 - 5:26 am

Yes, this is what actually happens. JFYI see current thread on lkml which 

You can of course work around this by adding a udev rule such as

SUBSYSTEM=="input",KERNEL=="event*",SYSFS{name}=="appletouch",SYMLINK+="input/appletouchpad"

and then let Xorg use /dev/input/appletouchpad, which will always be a 
symlink to the correct device.

-- 
Jiri Kosina
-

From: Dmitry Torokhov
Date: Wednesday, March 28, 2007 - 6:24 am

I am not sure if this would help... According to the excerpt from the X
log, the synaptics driver attempted to scan evdev devices and locate the
touchpad. However, if this scan happens before udev has had a chance to
process the event and create the new /dev/input/eventX device node, it will
fail.

I wonder if we should adjust the X driver to spin for a couple of
seconds in EventAutoDevProbe if the touchpad was already seen once...
Peter?

-- 
Dmitry
-

From: Thomas Meyer
Date: Wednesday, March 28, 2007 - 9:51 am

This was my first idea, too. But then I found this entry in the
Changelog of the synaptics driver:

- In the DeviceOn() function, if opening the device node fails, try to
  auto-detect the correct event device again. This fixes some problems
  which occur after a suspend/resume cycle or after rmmod/insmod-ing
Okay. This strengthens above statement. And udev is too slow to create

-

From: Jiri Kosina
Date: Wednesday, March 28, 2007 - 10:06 am

Yes, it looks like this is the root cause.

However, I must admit that I don't like this behavior too much. We 
shouldn't rely on each driver's userland counterpart to wait for a 
reasonable time until udev settles down. This is not a nice API to provide. 
I will try to think of some solution which would have a reasonable 
nastiness/functionality ratio.

-- 
Jiri Kosina
-

From: Dmitry Torokhov
Date: Wednesday, March 28, 2007 - 10:35 am

The proper fix would be to teach X to recognize hotplug events. I've
heard they are working on it.

-- 
Dmitry
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author
of a patch that caused a breakage, or someone I consider in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : weird system hangs
References : http://lkml.org/lkml/2007/3/16/288
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
             Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Status     : unknown


Subject    : crashes in KDE
References : http://bugzilla.kernel.org/show_bug.cgi?id=8157
Submitter  : Oliver Pinter <oliver.pntr@gmail.com>
Status     : unknown


Subject    : kwin dies silently
References : http://lkml.org/lkml/2007/2/28/112
Submitter  : Sid Boyce <g3vbv@blueyonder.co.uk>
Status     : unknown

-

From: Tobias Diedrich
Date: Tuesday, March 20, 2007 - 3:24 am

Since I didn't see any mention of this:

I'm seeing an Oops when removing the ohci1394 module:

[   16.047275] ieee1394: Node removed: ID:BUS[158717321-38:0860]  GUID[c033ced600000000]
[   16.047287] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000094
[   16.047451]  printing eip:
[   16.047524] c02daf3d
[   16.047527] *pde = 00000000
[   16.047603] Oops: 0000 [#1]
[   16.047676] PREEMPT 
[   16.047788] Modules linked in: backlight ohci1394 parport_pc parport
[   16.048069] CPU:    0
[   16.048071] EIP:    0060:[<c02daf3d>]    Not tainted VLI
[   16.048074] EFLAGS: 00010246   (2.6.21-rc4 #35)
[   16.048298] EIP is at class_device_remove_attrs+0xa/0x30
[   16.048377] eax: dfd04338   ebx: 00000000   ecx: df655988   edx: 00000000
[   16.048456] esi: 00000000   edi: dfd04338   ebp: 00000000   esp: df506e38
[   16.048535] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[   16.048614] Process rmmod (pid: 1455, ti=df506000 task=df6cc0b0 task.ti=df506000)
[   16.048693] Stack: dfd04338 dfd04340 00000000 c02db02f 00000000 dfd04338 dfd041e4 c0331871 
[   16.049159]        00000000 c02db065 dfd041b0 c0331858 c055006d 0975d589 00000026 0000035c 
[   16.049626]        00000000 c033ced6 00000000 df24c000 c0331879 c02d859f df24c0bc df24c0bc 
[   16.050091] Call Trace:
[   16.050233]  [<c02db02f>] class_device_del+0xcc/0xfa
[   16.050352]  [<c0331871>] __nodemgr_remove_host_dev+0x0/0xb
[   16.050475]  [<c02db065>] class_device_unregister+0x8/0x10
[   16.050595]  [<c0331858>] nodemgr_remove_ne+0x61/0x7a
[   16.050714]  [<c033ced6>] ether1394_header_cache+0x0/0x43
[   16.050835]  [<c0331879>] __nodemgr_remove_host_dev+0x8/0xb
[   16.050954]  [<c02d859f>] device_for_each_child+0x1a/0x3c
[   16.051073]  [<c0331b98>] nodemgr_remove_host+0x30/0x90
[   16.051192]  [<c032f12c>] __unregister_host+0x1a/0xad
[   16.051311]  [<c032ee17>] hl_get_hostinfo+0x5b/0x76
[   16.051430]  [<c032f34a>] highlevel_remove_host+0x21/0x42
[   16.051549]  [<c032ed9d>] ...
From: Adrian Bunk
Date: Tuesday, March 20, 2007 - 4:14 am

You missed the following entry in my list [1]:

Subject    : Oops in __nodemgr_remove_host_dev
References : http://lkml.org/lkml/2007/3/14/4
             http://lkml.org/lkml/2007/3/18/87
Submitter  : Ismail Dönmez <ismail@pardus.org.tr>
             Stefan Richter <stefanr@s5r6.in-berlin.de>
             Thomas Meyer <thomas@m3y3r.de>
Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
             commit 43cb76d91ee85f579a69d42bc8efc08bac560278
             commit 40cf67c5fcc513406558c01b91129280208e57bf
Handled-By : Stefan Richter <stefanr@s5r6.in-berlin.de>
Status     : problem is being debugged


cu
Adrian

[1] not meant as an offence - there are so many items in the list
    that it's easy to miss one

-

From: Linus Torvalds
Date: Wednesday, March 21, 2007 - 8:45 pm

According to the console log, it seems to be hung because a lot of 
processes are stuck in D state in various variations of this:

	Call Trace:
	 [<c01ba134>] start_this_handle+0x2d7/0x355
	 [<c01ba265>] journal_start+0xb3/0xe1
	 [<c01b2837>] ext3_journal_start_sb+0x48/0x4a
	 [<c01b0924>] ext3_create+0x47/0xe2
	 [<c017820c>] vfs_create+0xcd/0x13e
	 [<c017ab6e>] open_namei+0x176/0x5b5
	 [<c0170026>] do_filp_open+0x26/0x3b
	 [<c017007e>] do_sys_open+0x43/0xc2
	 [<c0170135>] sys_open+0x1c/0x1e
	 [<c0104064>] syscall_call+0x7/0xb

and then you have "kget" (whatever that is) which is doing

	Call Trace:
	 [<c0318981>] schedule_timeout+0x70/0x8e
	 [<c03189b4>] schedule_timeout_uninterruptible+0x15/0x17
	 [<c01b964a>] journal_stop+0xe2/0x1e6
	 [<c01ba2b0>] journal_force_commit+0x1d/0x1f
	 [<c01b29fb>] ext3_force_commit+0x22/0x24
	 [<c01ad607>] ext3_write_inode+0x34/0x3a
	 [<c0189f74>] __writeback_single_inode+0x1c5/0x2cb
	 [<c018a096>] sync_inode+0x1c/0x2e
	 [<c01a9ff7>] ext3_sync_file+0xab/0xc0
	 [<c018c8c5>] do_fsync+0x4b/0x98
	 [<c018c932>] __do_fsync+0x20/0x2f
	 [<c018c951>] sys_fdatasync+0x10/0x12
	 [<c0104064>] syscall_call+0x7/0xb

with kjournald in D sleep at

	 [<c01bb7b2>] journal_commit_transaction+0x15d/0x11d3
	 [<c01bfcbe>] kjournald+0xab/0x1e8
	 [<c01333dd>] kthread+0xb5/0xe0
	 [<c0104cd3>] kernel_thread_helper+0x7/0x10

which certainly looks like something is waiting for an IO to finish.

In contrast, the hang reported by Mariusz Kozlowski has a slightly 
different feel to it, but there's a tantalizing pattern in there too:

  http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/1243.html

	Call Trace:
	[<c03ec87e>] io_schedule+0x42/0x59
	[<c0184915>] sleep_on_buffer+0x8/0xc
	[<c03ed217>] __wait_on_bit+0x47/0x6c
	[<c03ed297>] out_of_line_wait_on_bit+0x5b/0x64
	[<c01848a8>] __wait_on_buffer+0x27/0x2d
	[<c01b4228>] journal_commit_transaction+0x707/0x127f
	[<c01b868b>] kjournald+0xac/0x1ed
	[<c0126af5>] kthread+0xa2/0xc9
	[<c010422b>] ...
From: Nick Piggin
Date: Wednesday, March 21, 2007 - 9:18 pm

Nothing sleeps on PageUptodate, so I don't think that could explain it.

The fs: fix __block_write_full_page error case buffer submission patch
does change the locking, but I'd be really surprised if that was the
problem, because it changes locking to match the regular non-error path
submission.

It could be possible that ext3 is doing something weird and expecting
the old behaviour if it failed get_block, but that seems pretty weird
to do, and would need fixing.

fs: nobh data leak... again hard to see how it could cause an unlock/wakeup
to get lost. Is Mariusz using the nobh mount option?


I see what you mean. Could it be an ext3 or jbd change I wonder?

-- 
SUSE Labs, Novell Inc.
-

From: Linus Torvalds
Date: Thursday, March 22, 2007 - 8:21 am

Good point. I forget that we just test "uptodate", but then always sleep 


jbd hasn't changed since 2.6.20, and the ext3 changes are mostly 
things like const'ness fixes. And others were things like changing 
"journal_current_handle()" into "ext3_journal_current_handle()", which 
looked exciting considering that the hung processes were waiting for the 
journal, but the fact is, that's just an inline function that just calls 
the old function, so..

But interestingly, there *is* a "EA block reference count racing fix" 
that does move a lock_buffer()/unlock_buffer() to cover a bigger area. It 
looks "obviously correct", but maybe there's a deadlock possibility there 
with ext3_forget() or something?

		Linus
-

From: Mingming Cao
Date: Thursday, March 22, 2007 - 6:08 pm

I might have missed something, but so far I can't see a deadlock.
If there is a deadlock, I think we should see ext3_xattr_release_block()
and ext3_forget() on the stack. Is this the case?

Regards,

-

From: Linus Torvalds
Date: Thursday, March 22, 2007 - 6:40 pm

[ Ok, I think it's those timers again...

  Ingo: let me just state how *happy* I am that I told you off when you 
  wanted to merge the hires timers and NO_HZ before 2.6.20 because they 
  were "stable". You were wrong, and 2.6.20 is at least in reasonable 
  shape. Now we just need to make sure that 2.6.21 will be too.. ]


No. What's strange is that two (maybe more, I didn't check) processes seem 
to be stuck in

	 [<c0318981>] schedule_timeout+0x70/0x8e
	 [<c03189b4>] schedule_timeout_uninterruptible+0x15/0x17
	 [<c01b964a>] journal_stop+0xe2/0x1e6
	 [<c01ba2b0>] journal_force_commit+0x1d/0x1f
	 [<c01b29fb>] ext3_force_commit+0x22/0x24
	 [<c01ad607>] ext3_write_inode+0x34/0x3a
	 [<c0189f74>] __writeback_single_inode+0x1c5/0x2cb
	 [<c018a096>] sync_inode+0x1c/0x2e
	 [<c01a9ff7>] ext3_sync_file+0xab/0xc0
	 [<c018c8c5>] do_fsync+0x4b/0x98
	 [<c018c932>] __do_fsync+0x20/0x2f
	 [<c018c960>] sys_fsync+0xd/0xf
	 [<c0104064>] syscall_call+0x7/0xb

but that thing is literally:

		...
                do {
                        old_handle_count = transaction->t_handle_count;
                        schedule_timeout_uninterruptible(1);
                } while (old_handle_count != transaction->t_handle_count);
		...

and especially if nothing is happening, I'd not expect 
"transaction->t_handle_count" to keep changing, so it should stop very 
quickly.

Maybe it's CONFIG_NO_HZ again, and the problem is that timeout, and simply 
no timer tick happening?
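
To make that suspicion concrete, here is a minimal userspace sketch of the
loop quoted above. It is not the jbd code: struct txn_model, tick_fn and
model_batch_wait() are hypothetical names made up for illustration, and the
callback merely stands in for whether schedule_timeout_uninterruptible(1)
ever returns. With a working tick the loop ends after one sleep when the
handle count is stable; with no tick the sleeper never comes back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transaction; only the field the loop reads. */
struct txn_model {
	int t_handle_count;
};

/*
 * Models one call to schedule_timeout_uninterruptible(1):
 * returns true if a timer tick woke the sleeper, false if no tick ever came.
 */
typedef bool (*tick_fn)(void *ctx);

/*
 * Runs the batching loop for at most max_iters sleeps.
 * Returns the number of sleeps performed, or -1 if a sleep never returned
 * (the process would sit in D state forever).
 */
int model_batch_wait(struct txn_model *t, tick_fn tick, void *ctx, int max_iters)
{
	int old_handle_count;
	int sleeps = 0;

	assert(t != NULL && tick != NULL);

	do {
		old_handle_count = t->t_handle_count;
		if (!tick(ctx))
			return -1;	/* timeout never fires: stuck */
		sleeps++;
	} while (old_handle_count != t->t_handle_count && sleeps < max_iters);

	return sleeps;
}

/* Two trivial tick models: a healthy timer and a missing one. */
bool tick_always(void *ctx) { (void)ctx; return true; }
bool tick_never(void *ctx) { (void)ctx; return false; }
```

The point of the sketch is only that the exit condition is never even
evaluated if the one-jiffy timeout does not fire, which would match
processes stuck in journal_stop() while nothing else is happening.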

Bingo. I think that's it.

	active timers:
	 #0: hardirq_stack, tick_sched_timer, S:01
	 # expires at 9530893000000 nsecs [in -2567889 nsecs]
	 #1: hardirq_stack, hrtimer_wakeup, S:01
	 # expires at 10858649798503 nsecs [in 1327754230614 nsecs]
	  .expires_next   : 9530893000000 nsecs

See

	http://lkml.org/lkml/2007/3/16/288

and that in turn points to the kernel log:

	http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4/git-console.log
-

From: Nick Piggin
Date: Thursday, March 22, 2007 - 7:11 pm

Seems convincing. Michal, can you post your .config, and if you had
dynticks and hrtimers enabled, try reproducing without them?
-

From: Michal Piotrowski
Date: Friday, March 23, 2007 - 12:51 am

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4/git-config

I don't know how to reproduce this bug on 2.6.21-rc4. On 2.6.21-rc2-mm1
it was very simple: just run youtube, bash_shared_mapping etc. In fact
I haven't seen this bug for a week.

Unfortunately, I wasn't able to take a crash dump because of a sound
card driver bug (I've got a crash dump from 2.6.21-rc2-mm1).

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Nick Piggin
Date: Friday, March 23, 2007 - 2:37 am

OK... for some reason this is listed as a regression against 2.6.21-rc4.

You do have CONFIG_NO_HZ=y, and it is likely to be the cause of your
2.6.21-rc2-mm1 problems, but maybe there have been fixes since then? Ingo?

-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 10:19 am

Due to
   http://lkml.org/lkml/2007/3/16/288

cu
Adrian

-

From: Ingo Molnar
Date: Friday, March 23, 2007 - 5:01 am

hm. The sysrq-q info you provided shows a weird (=='impossible') 
high-res timers state, in that CPU#1's hrtimer_cpu_base->expires_next is 
at KTIME_MAX, and has been in that state for around 8-9 minutes. The 
timers there just stay pending and never expire. Thomas and I have just 
spent 4 hours reviewing the affected code inside out and upside down but 
we can find no credible way for this condition to trigger. But it 
obviously happened on your box, and persisted for many minutes.
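
As a rough illustration of why that state counts as 'impossible', here is a
small userspace sketch of the invariant, under the assumption that the
cached next-expiry value should never be later than the earliest pending
timer. MODEL_KTIME_MAX, model_next_expiry() and model_expires_next_sane()
are hypothetical names for this sketch only, not functions from hrtimer.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's KTIME_MAX ("no event programmed"). */
#define MODEL_KTIME_MAX INT64_MAX

/* Earliest expiry (in nsecs) among pending timers, or MODEL_KTIME_MAX if none. */
int64_t model_next_expiry(const int64_t *expires, size_t n)
{
	int64_t next = MODEL_KTIME_MAX;

	assert(expires != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (expires[i] < next)
			next = expires[i];
	return next;
}

/*
 * Assumed invariant: the cached value must not be later than the earliest
 * pending timer, otherwise that timer can never be serviced on time.
 */
bool model_expires_next_sane(int64_t cached, const int64_t *expires, size_t n)
{
	return cached <= model_next_expiry(expires, n);
}
```

Fed with the two expiry values from the timer list quoted earlier in this
thread, a cached value of MODEL_KTIME_MAX fails the check, which is the
condition the debug patch is trying to catch at the moment it is written.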

So to move this issue forward, i've written a hrtimers debug patch 
(attached) - which checks a couple of key assumptions in the hrtimers.c 
code, and which also extends the SysRq-Q debug info with this:

  .expires_next   : 88378000000 nsecs
  .exp_prev       : 88377000000 nsecs
 last expires_next stacktrace:
  update_cpu_base_expires_next+5f/63
  hrtimer_interrupt+177/1b3
  smp_apic_timer_interrupt+6e/80
  apic_timer_interrupt+33/38
  <ffffffff>

knowing which codepath updated expires_next to KTIME_MAX would be very 
helpful to us.

So could you please pick up latest git (12998096c or later), undo commit 
25496caec (which broke my laptop - it might break your system too), and 
apply the attached patch, and keep lockdep enabled (so that you get 
CONFIG_STACKTRACE=y, essential for the stacktrace output above)? It 
might take some time for you to trigger this bug again, but that's the 
best idea we have so far. If we are lucky then one of the WARN_ON()s 
triggers much sooner.

if the hang occurs again then please do a SysRq-Q again and send us the 
output.

	Ingo

---------------------------->
Subject: [patch] hrtimers debug patch
From: Ingo Molnar <mingo@elte.hu>

debugging helper for hrtimers. Keep a lookout for WARN_ON messages.
Saves a stacktrace on every expires_next update, and makes that
stack-trace available in SysRq-Q (or /proc/timer_list) output.

( make sure to run this on a lockdep-enabled kernel, so that
  CONFIG_STACKTRACE=y. )

NOT-Signed-off-by: Ingo Molnar ...
From: Ingo Molnar
Date: Friday, March 23, 2007 - 4:42 am

there's a new post-rc4 regression: my T60 hangs during early bootup. I 
bisected the hang down to this recent commit:

| commit 25496caec111481161e7f06bbfa12a533c43cc6f
| Author: Thomas Renninger <trenn@suse.de>
| Date:   Tue Feb 27 12:13:00 2007 -0500
|
|    ACPI: Only use IPI on known broken machines (AMD, Dothan/BaniasPentium M)

undoing this change fixes my T60 so it correctly boots again.

the commit has this confidence-raising comment:

|   However, I am not sure about the naming of the parameter and how it 
|   could/should get integrated into the dyntick part 
|   (CONFIG_GENERIC_CLOCKEVENTS). There, a more fine grained check (TSC 
|   still running?, ..) is needed?

could we please revert this commit until it's done correctly?

and did this end up being a 'fix'? The change weakens the scope of a 
hardware workaround, which IMO has no place so late in the cycle. At a 
minimum the clockevents maintainer (Thomas) should have been Cc:-ed on 
it.

	Ingo
-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 4:56 am

Ingo, 

I had seen it before, and I had no objections on the premise that it
does not break things and especially survives on Andrew's VAIO. I
expected that to come in via -mm so it gets enough testing.

We should revert that patch and add a "trust_lapic_timer_in_c2"
commandline option instead. So we are on the safe side.

	tglx


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 8:08 am

Here is a patch which applies after reverting 
25496caec111481161e7f06bbfa12a533c43cc6f

It turned out that it is almost impossible to trust ACPI, BIOS & Co.
regarding the C states. This was the reason to switch the local apic
timer off in C2 state already. OTOH there are sane and well behaving
systems, which get punished by that decision.

Allow the user to confirm that the local apic timer is trustworthy in C2
state. This keeps the default behaviour on the safe side.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index e39ab0c..09640a8 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -780,6 +780,9 @@ and is between 256 and 4096 characters. It is defined in the file
 	lapic		[IA-32,APIC] Enable the local APIC even if BIOS
 			disabled it.
 
+	lapic_timer_c2_ok	[IA-32,APIC] trust the local apic timer in
+			C2 power state.
+
 	lasi=		[HW,SCSI] PARISC LASI driver for the 53c700 chip
 			Format: addr:<io>,irq:<irq>
 
diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 244c3fe..e884152 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -64,6 +64,9 @@ static int enable_local_apic __initdata = 0;
 static int local_apic_timer_verify_ok;
 /* Disable local APIC timer from the kernel commandline or via dmi quirk */
 static int local_apic_timer_disabled;
+/* Local APIC timer works in C2 */
+int local_apic_timer_c2_ok;
+EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
 
 /*
  * Debug level, exported for io_apic.c
@@ -1232,6 +1235,13 @@ static int __init parse_disable_lapic_timer(char *arg)
 }
 early_param("nolapic_timer", parse_disable_lapic_timer);
 
+static int __init parse_lapic_timer_c2_ok(char *arg)
+{
+	local_apic_timer_c2_ok = 1;
+	return 0;
+}
+early_param("lapic_timer_c2_ok", parse_lapic_timer_c2_ok);
+
 static int __init apic_set_verbosity(char *str)
 ...
From: Pavel Machek
Date: Monday, March 26, 2007 - 5:31 am

Could you add a comment saying that this is always ok on non-broken
systems? That way perhaps it can be added to linux-firmware-test-cd,
etc.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Thomas Gleixner
Date: Monday, March 26, 2007 - 6:52 am

Yep, post .21. 

I am still racking my brain over how to autodetect that in a safe way,
which would make it really useful for the firmware tester.

	tglx


-

From: Len Brown
Date: Tuesday, March 27, 2007 - 2:19 pm

I think the only fool-proof way to do this automatically is to
1. do a boot-time calibration vs HPET, RTC, or 8254 to make sure it starts sane.
   ideally, this boot time test would enter the deepest available C-state --
   as that would catch 99% of the failures.
2. do a _continuous_ sanity check against the same timer to make sure it never gets off track.
   It has to be continuous because we apparently have no control over
   when the BIOS breaks it on some systems.  However, continuous really
   means "long term" here, because over the uptime of the system we'd
   probably notice the drift between different timers etc, so we'd
   have to reset the sanity check periodically to not get fooled by that.

If this worked, then we could delete the new DMI entry for the nx6325,
as that would get detected and disable the lapic timer use automatically.
We could also delete the check for C2 and the check for C3 to disable
the lapic timer.
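
A minimal userspace sketch of step 2, assuming both timers can be read as
nanosecond counters: compare how far each advanced since the last baseline,
flag the timer under test if the difference exceeds a tolerance, and
re-baseline after a fixed window so that ordinary long-term drift between
the two clocks does not accumulate into a false alarm. struct drift_check
and both functions are hypothetical names; the tolerance and window values
are placeholders, not tuned numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct drift_check {
	uint64_t base_timer_ns;	/* timer under test at the last baseline */
	uint64_t base_ref_ns;	/* reference timer at the last baseline */
	uint64_t window_ns;	/* re-baseline after this much reference time */
	uint64_t tolerance_ppm;	/* allowed deviation in parts per million */
};

/* Start a new comparison window at the given readings. */
void drift_check_reset(struct drift_check *dc, uint64_t timer_ns, uint64_t ref_ns)
{
	assert(dc != NULL);
	dc->base_timer_ns = timer_ns;
	dc->base_ref_ns = ref_ns;
}

/*
 * Returns false if the timer under test deviates from the reference by more
 * than tolerance_ppm since the last baseline. On a passing sample taken after
 * window_ns of reference time, the baseline is moved forward.
 * Note: the multiplication can overflow for very long windows; this sketch
 * assumes windows of at most a few hours.
 */
bool drift_check_sample(struct drift_check *dc, uint64_t timer_ns, uint64_t ref_ns)
{
	assert(dc != NULL);

	uint64_t ref_delta = ref_ns - dc->base_ref_ns;
	uint64_t timer_delta = timer_ns - dc->base_timer_ns;
	uint64_t diff = ref_delta > timer_delta ? ref_delta - timer_delta
						: timer_delta - ref_delta;

	/* diff / ref_delta <= ppm / 1e6, rearranged to avoid division */
	bool ok = diff * 1000000ULL <= ref_delta * dc->tolerance_ppm;

	if (ok && ref_delta >= dc->window_ns)
		drift_check_reset(dc, timer_ns, ref_ns);
	return ok;
}
```

A timer that stops in a deep C-state shows up as timer_delta falling far
behind ref_delta within a single window, while a slow relative drift is
absorbed by the periodic re-baselining.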

-Len
-

From: Linus Torvalds
Date: Tuesday, March 27, 2007 - 2:34 pm

Why not just take the known-good CPUID signature?

Screw firmware or ACPI tables. They're going to be occasionally wrong.

If we know that "Core 2, version X" has a good local APIC timer, we use 
it. Otherwise we don't.

That's generally how we handle other APIC bugs too (the read-after-write 
thing, for example, or the differences between integrated and off-chip 
APIC's). Sometimes we check the APIC version itself, sometimes we check 
the CPUID information, and sometimes we check both ("modern_apic()").

			Linus
-

From: Len Brown
Date: Tuesday, March 27, 2007 - 3:16 pm

Yep, this is what we tried to do last week.
It failed, and the patch was reverted.

I agree, the BIOS vendor can lie with ACPI tables.
In particular, they can map any hardware C-state
to any ACPI C-state. Our expectation that they
would not map hardware C3 to ACPI C2
appears at this point to have been invalid.

So, speaking for Intel parts, every single one that supports
HW C3 from the beginning of history through today has a broken
LAPIC timer.  (and a few listed in that patch are known to
be broken in HW C2)   If we can't guarantee that the BIOS vendor
will not map that broken HW C3 to ACPI C2 (or even C1 via SMM)
then we have to not use the LAPIC timer except for systems with
a "known-good" signature = "part supports only C1".

If we really care about using the LAPIC timer on systems with deeper
than C1 support, the only alternative seems to be to test
if it actually works or not at boot and run-time.
Otherwise, we wait until future hardware that is guaranteed
not to break under any (BIOS) conditions ships, and check for that.

Based on what I read of the HP nx6325 where the LAPIC timer
is breaking C1, AMD is in the same boat.

-Len
-

From: Len Brown
Date: Tuesday, March 27, 2007 - 7:18 pm

On an Intel processor, it seems that the safe and simple route
is if the system exports C2 or deeper, don't use the LAPIC timer.
(which is what 2.6.21-rc5 is doing as of this moment)

The nx6325 (Turion 64 X2) exports only C1.
I'm not sure how the conclusion was drawn that it has
a broken lapic timer as reflected in the "nolapic_timer" patch:

+               /*
+                * BIOS exports only C1 state, but uses deeper power
+                * modes behind the kernels back.
+                */
+                 .callback = lapic_check_broken_bios,
+                 .ident = "HP nx6325",
+                 .matches = {
+                       DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
+                 },
+        },

But if this is true, then I don't know how to determine on
an AMD system if the LAPIC timer is guaranteed to work --
even for systems with just C1.

Jordan, William,
can you clarify?

thanks,
-Len
-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 7:15 am

On AMD it is the same, except that there seems to be at least
one system that does C2 like things while only exporting C1.
That is why i proposed to check for a battery too -- if there is one
always disable it too.

-Andi
-

From: Langsdorf, Mark
Date: Thursday, March 29, 2007 - 7:53 am

If both cores goes into C1 at the same time, the chipset

For K7 and K8 through and including revision E, the LAPIC
timer is guaranteed to work in C1.

For K8 revisions F and G, and for upcoming family 0x10 and
0x11 parts, if either bit in MSRC001_0055[28:27] is set,
C1e is enabled and the LAPIC timer cannot be trusted in
C1.

AMD can craft a patch to sort this out as soon as we have
an idea what the framework is going to look like.

-Mark Langsdorf
Operating Systems Research Center
AMD, Inc.


-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 9:50 am

Just a snippet to detect it would be great. Then the dmi scan
could be removed and replaced with that. This would be a 2.6.21
candidate imho over the DMI hack.

-Andi
-

From: Mark Langsdorf
Date: Thursday, March 29, 2007 - 1:02 pm

Yes.  The APIC timer still runs, but no longer has an HT link

Reviewed but not tested.  Needs to be wrapped in an AMD specific
call.

#define ENABLE_C1E_MASK		0x18000000
#define MSR_ENABLE_C1E		0xc0010055	/* MSRC001_0055 */
#define CPUID_PROCESSOR_SIGNATURE	1
#define CPUID_XFAM		0x0ff00000
#define CPUID_XFAM_K8		0x00000000
#define CPUID_XFAM_10H		0x00100000
#define CPUID_XFAM_11H		0x00200000
#define CPUID_XMOD		0x000f0000
#define CPUID_XMOD_REV_F	0x00040000

	int safe_c1 = 1;
	u32 eax, lo, hi;
	eax = cpuid_eax(CPUID_PROCESSOR_SIGNATURE);
	switch (eax & CPUID_XFAM) {
	case CPUID_XFAM_K8:
		/* K8 before revision F: LAPIC timer is fine in C1 */
		if ((eax & CPUID_XMOD) < CPUID_XMOD_REV_F)
			break;
		/* fall through: revision F and later need the MSR check */
	case CPUID_XFAM_10H:
	case CPUID_XFAM_11H:
		rdmsr(MSR_ENABLE_C1E, lo, hi);
		if (lo & ENABLE_C1E_MASK)
			safe_c1 = 0;
		break;
	default:
		/* err on the side of caution */
		safe_c1 = 0;
	}

-Mark Langsdorf
Operating Systems Research Center
AMD, Inc.


-
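
For experimenting outside the kernel, the same decision can be written as a pure function that takes the CPUID signature and the low half of the MSR as arguments. This is only a sketch under that assumption: the function name `lapic_timer_safe_in_c1` is made up here, and the constants are copied from the snippet above rather than checked against hardware documentation.

```c
#include <stdint.h>

#define ENABLE_C1E_MASK   0x18000000u /* MSRC001_0055 bits 28:27 */
#define CPUID_XFAM        0x0ff00000u
#define CPUID_XFAM_K8     0x00000000u
#define CPUID_XFAM_10H    0x00100000u
#define CPUID_XFAM_11H    0x00200000u
#define CPUID_XMOD        0x000f0000u
#define CPUID_XMOD_REV_F  0x00040000u

/*
 * Hypothetical helper: returns 1 if the LAPIC timer can be trusted in C1,
 * 0 otherwise. cpuid_eax is EAX from CPUID leaf 1, msr_lo is the low
 * 32 bits of MSRC001_0055; both are passed in so no privileged
 * instructions are needed to run this.
 */
int lapic_timer_safe_in_c1(uint32_t cpuid_eax, uint32_t msr_lo)
{
	switch (cpuid_eax & CPUID_XFAM) {
	case CPUID_XFAM_K8:
		/* Before revision F the timer is guaranteed to work in C1. */
		if ((cpuid_eax & CPUID_XMOD) < CPUID_XMOD_REV_F)
			return 1;
		/* fall through */
	case CPUID_XFAM_10H:
	case CPUID_XFAM_11H:
		/* Either C1E enable bit set means the timer is not safe. */
		return (msr_lo & ENABLE_C1E_MASK) ? 0 : 1;
	default:
		/* Unknown family: err on the side of caution. */
		return 0;
	}
}
```

The mask 0x18000000 is exactly bits 27 and 28, matching the "either bit in MSRC001_0055[28:27]" wording in Mark's earlier mail.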

From: Andi Kleen
Date: Thursday, March 29, 2007 - 1:49 pm

Here's a patch. I don't have a system with C1E, so I only tested that
the apic timer still works on an older AMD box.

Would be good if someone with a Turion laptop, especially the HP nx6325
could test it with CONFIG_NO_HZ enabled.

-Andi

Disable local APIC timer use on AMD systems with C1E

AMD dual core laptops with C1E do not run the APIC timer correctly
when they go idle. Previously the code assumed this only happened
on C2 or deeper.  But not all of these systems report support for C2.

Use an AMD-supplied snippet to detect C1E being enabled and then disable
local apic timer use.

This supersedes an earlier workaround using DMI detection of specific systems.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -272,32 +272,6 @@ static void __devinit setup_APIC_timer(v
 }
 
 /*
- * Detect systems with known broken BIOS implementations
- */
-static int __init lapic_check_broken_bios(struct dmi_system_id *d)
-{
-	printk(KERN_NOTICE "%s detected: disabling lapic timer.\n",
-		       d->ident);
-	local_apic_timer_disabled = 1;
-	return 0;
-}
-
-static struct dmi_system_id __initdata broken_bios_dmi_table[] = {
-	{
-		/*
-		 * BIOS exports only C1 state, but uses deeper power
-		 * modes behind the kernels back.
-		 */
-		  .callback = lapic_check_broken_bios,
-		  .ident = "HP nx6325",
-		  .matches = {
-			DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
-		  },
-	 },
-	 {}
-};
-
-/*
  * In this functions we calibrate APIC bus clocks to the external timer.
  *
  * We want to do the calibration only once since we want to have local timer
@@ -357,6 +331,44 @@ static void __init lapic_cal_handler(str
 	}
 }
 
+#define ENABLE_C1E_MASK         0x18000000
+#define CPUID_PROCESSOR_SIGNATURE       1
+#define CPUID_XFAM              0x0ff00000
+#define CPUID_XFAM_K8           ...
From: Linus Torvalds
Date: Thursday, March 29, 2007 - 2:16 pm

I think this looks better than what we have now, but it would look even 
better if the core CPUID stuff was in arch/i386/kernel/cpu/amd.c, and we 
simply had X86_FEATURE_BROKEN_C1_LAPIC etc..

And then the apic.c code would just check

	if (boot_cpu_has(X86_FEATURE_BROKEN_C1_LAPIC))
		return -1;

or similar.

Doing the same for C2 and C3 gives us a clean way to have all these 
per-vendor things in their relevant places, rather than having various 
vendor-specific checks sprinkled in random places..

That's *especially* true for something like this that can hit both on 
x86-64 and i386, where the cpuid logic is shared, but the APIC logic is 
*not* shared. If I read your patch correctly, this only fixes it on 32-bit 
platforms, and I don't think the problem is in any way 32-bit specific, is 
it?

		Linus
-
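
The split Linus proposes (vendor code sets a flag, generic APIC code only tests it) can be modelled in a few lines of standalone C. Everything here is a stand-in: `struct cpu_model`, `cpu_set_cap`, `cpu_has`, `amd_init_model`, `setup_lapic_timer` and the bit number are hypothetical and only mirror the shape of the idea, not the kernel's actual cpufeature machinery.

```c
#include <stdint.h>

#define FEATURE_BROKEN_C1_LAPIC 0 /* hypothetical bit number */

/* Minimal stand-in for a per-CPU capability bitmap. */
struct cpu_model {
	uint32_t caps; /* one bit per feature flag */
};

void cpu_set_cap(struct cpu_model *c, unsigned int bit)
{
	c->caps |= 1u << bit;
}

int cpu_has(const struct cpu_model *c, unsigned int bit)
{
	return (c->caps >> bit) & 1u;
}

/* Vendor-specific init: the only place that knows how to detect C1E. */
void amd_init_model(struct cpu_model *c, int c1e_enabled)
{
	if (c1e_enabled)
		cpu_set_cap(c, FEATURE_BROKEN_C1_LAPIC);
}

/* Generic timer setup: tests the flag and needs no vendor knowledge. */
int setup_lapic_timer(const struct cpu_model *c)
{
	if (cpu_has(c, FEATURE_BROKEN_C1_LAPIC))
		return -1;
	return 0;
}
```

Because the detection lives with the CPU identification code, a 32-bit and a 64-bit APIC implementation could both consume the same flag without duplicating the CPUID and MSR logic.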

From: Andreas Mohr
Date: Thursday, March 29, 2007 - 2:45 pm

Hi,


Please don't, this would break the interface logically.
X86_FEATURE_xxx usually denotes a *feature*, not a "feature"
(Micro$oft speak for "bug" ;).
IOW most flags express a positive attribute, not a negative one.
An exception to this probably is X86_FEATURE_FXSAVE_LEAK and
X86_FEATURE_CMP_LEGACY, but all others seem to be positive, so we might
want to enforce this rule.

Thus, how about e.g. X86_FEATURE_LAPIC_C1_OK?
Or is this less precise from a "C1 working ok" detection point of view?
(i.e. we'd just assume by default that most machines are ok except the ones
where we have issue detection code for, which might be too fuzzy)

Andreas Mohr
-

From: Linus Torvalds
Date: Thursday, March 29, 2007 - 2:56 pm

Sure, we could  make it positive instead, and I agree it would make code 

I have no trouble at all with something like that instead. Anybody willing 
to cook up a patch?

		Linus
-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 3:06 pm

We already have several of those. And negative is much easier. 

-Andi
-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 3:05 pm

It is because 64bit doesn't have dyntick yet and doesn't try to use the lapic
timer only by default. It has a "apicmaintimer" option, but that is never set
automatically. When dyntick is ported over it will be needed there too.

-Andi

Disable local APIC timer use on AMD systems with C1E

AMD dual core laptops with C1E do not run the APIC timer correctly
when they go idle. Previously the code assumed this only happened
on C2 or deeper.  But not all of these systems report support for C2.

Use an AMD-supplied snippet to detect C1E being enabled and then disable
local apic timer use.

This supersedes an earlier workaround using DMI detection of specific systems.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -272,32 +272,6 @@ static void __devinit setup_APIC_timer(v
 }
 
 /*
- * Detect systems with known broken BIOS implementations
- */
-static int __init lapic_check_broken_bios(struct dmi_system_id *d)
-{
-	printk(KERN_NOTICE "%s detected: disabling lapic timer.\n",
-		       d->ident);
-	local_apic_timer_disabled = 1;
-	return 0;
-}
-
-static struct dmi_system_id __initdata broken_bios_dmi_table[] = {
-	{
-		/*
-		 * BIOS exports only C1 state, but uses deeper power
-		 * modes behind the kernels back.
-		 */
-		  .callback = lapic_check_broken_bios,
-		  .ident = "HP nx6325",
-		  .matches = {
-			DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"),
-		  },
-	 },
-	 {}
-};
-
-/*
  * In this functions we calibrate APIC bus clocks to the external timer.
  *
  * We want to do the calibration only once since we want to have local timer
@@ -372,12 +346,12 @@ void __init setup_boot_APIC_clock(void)
 	long delta, deltapm;
 	int pm_referenced = 0;
 
-	/* Detect know broken systems */
-	dmi_check_system(broken_bios_dmi_table);
+	if ...
From: Grzegorz Chwesewicz
Date: Friday, March 30, 2007 - 2:06 pm

This patch also works. If you want the boot log or the output of
/proc/interrupts, I can post them.

--
Greetings - CeHo - Grzegorz Chwesewicz
-

From: Grzegorz Chwesewicz
Date: Saturday, March 31, 2007 - 12:47 am

I've tested this patch a little bit more on my nx6325 and found a scenario in
which my box runs slowly. When I boot the HP with AC connected (it boots fast),
then unplug AC after boot and try to power it off, it works very slowly
(the power-off process takes a few minutes). On battery it always boots and
powers off fast. Can anybody with an nx6325 confirm this?

--
Greetings - CeHo - Grzegorz Chwesewicz
-

From: Grzegorz Chwesewicz
Date: Thursday, March 29, 2007 - 2:43 pm

I have nx6325 with Turion.

ensima-hp linux-2.6 # cat .config|grep NO_HZ
CONFIG_NO_HZ=y

After patching system works ok on battery and on AC.

On battery:

cat /proc/interrupts

           CPU0       CPU1
  0:     115771          0  local-APIC-edge-fasteoi   timer
  1:        508          1   IO-APIC-edge      i8042
  8:          1          1   IO-APIC-edge      rtc
 12:        147          2   IO-APIC-edge      i8042
 14:         36          1   IO-APIC-edge      ide0
 16:          0          1   IO-APIC-fasteoi   yenta, sdhci:slot0
 17:       4538        116   IO-APIC-fasteoi   eth0
 18:       1697          6   IO-APIC-fasteoi   libata, HDA Intel
 19:         50          1   IO-APIC-fasteoi   ohci_hcd:usb1, ohci_hcd:usb2,
ehci_hcd:usb3
 21:         30          9   IO-APIC-fasteoi   acpi
NMI:          0          0
LOC:          0     115601
ERR:          1
MIS:          0

sleep 10

           CPU0       CPU1
  0:     125777          0  local-APIC-edge-fasteoi   timer
  1:        509          1   IO-APIC-edge      i8042
  8:          1          1   IO-APIC-edge      rtc
 12:        147          2   IO-APIC-edge      i8042
 14:         36          1   IO-APIC-edge      ide0
 16:          0          1   IO-APIC-fasteoi   yenta, sdhci:slot0
 17:       4749        116   IO-APIC-fasteoi   eth0
 18:       1704          6   IO-APIC-fasteoi   libata, HDA Intel
 19:         50          1   IO-APIC-fasteoi   ohci_hcd:usb1, ohci_hcd:usb2,
ehci_hcd:usb3
 21:         30          9   IO-APIC-fasteoi   acpi
NMI:          0          0
LOC:          0     125607
ERR:          1
MIS:          0
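
The two battery samples can be turned into a rate. Between them the IRQ 0 count on CPU0 goes from 115771 to 125777 and LOC on CPU1 from 115601 to 125607, a difference of 10006 in each case. Assuming the samples are roughly 10 seconds apart, as the `sleep 10` suggests, that is about 1000 interrupts per second. The helper below is only an illustration of that arithmetic; `irq_rate` is a made-up name.

```c
#include <stdint.h>

/* Average interrupts per second between two /proc/interrupts samples. */
double irq_rate(uint64_t before, uint64_t after, double seconds)
{
	return (double)(after - before) / seconds;
}
```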

#######################################

On AC:

cat /proc/interrupts

           CPU0       CPU1
  0:        261          0  local-APIC-edge-fasteoi   timer
  1:        346          1   IO-APIC-edge      i8042
  8:          1          1   IO-APIC-edge      rtc
 12:        147          2   IO-APIC-edge      i8042
 14:         36          1   IO-APIC-edge      ide0
 16:       ...
From: Grzegorz Chwesewicz
Date: Thursday, March 29, 2007 - 2:55 pm

<cut>

Bootlogs on AC and battery can be found here:
http://bugzilla.kernel.org/show_bug.cgi?id=8235


--
Greetings - CeHo - Grzegorz Chwesewicz
-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 7:19 am

I didn't think we could reliably distinguish mobile CPUs with C2+
versus desktop CPUs without it. Or rather you would need a table
for each new CPU revision Intel/AMD put out. That would probably be
quite nightmarish to maintain.

-Andi
-

From: Linus Torvalds
Date: Friday, March 23, 2007 - 11:13 am

Damn. I applied your patch, but it breaks on x86-64:

   drivers/acpi/processor_idle.c:271: error: 'local_apic_timer_c2_ok' 
	undeclared (first use in this function)

I really wish we had an x86-64 maintainer that understood that it's 
confusing that files in arch/i386/ are also used for arch/x86-64.

		Linus
-

From: Linus Torvalds
Date: Friday, March 23, 2007 - 11:16 am

Sorry, that was unfair. The patch was simply buggy. It added the test to 
drivers/acpi/ *without* adding it to the architectures that used it, it 
wasn't an i386/x86-64 thing.

Thomas, please fix.

		Linus
-

From: Linus Torvalds
Date: Friday, March 23, 2007 - 11:28 am

Here's a possible fix. It compiles. And I still wish we had common files.

ia64 shouldn't be affected, because ia64 doesn't #define the 
ARCH_APICTIMER_STOPS_ON_C3 flag (and then we don't use the "c2_ok" thing 
either). But this is still pretty damn ugly.

Maybe a field in "struct acpi_processor" for C2/C3 problems?

		Linus

---
diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index 723417d..46acf4f 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -47,6 +47,10 @@ int apic_calibrate_pmtmr __initdata;
 
 int disable_apic_timer __initdata;
 
+/* Local APIC timer works in C2? */
+int local_apic_timer_c2_ok;
+EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
+
 static struct resource *ioapic_resources;
 static struct resource lapic_resource = {
 	.name = "Local APIC",
@@ -1192,6 +1196,13 @@ static __init int setup_nolapic(char *str)
 } 
 early_param("nolapic", setup_nolapic);
 
+static int __init parse_lapic_timer_c2_ok(char *arg)
+{
+	local_apic_timer_c2_ok = 1;
+	return 0;
+}
+early_param("lapic_timer_c2_ok", parse_lapic_timer_c2_ok);
+
 static __init int setup_noapictimer(char *str) 
 { 
 	if (str[0] != ' ' && str[0] != 0)
diff --git a/include/asm-x86_64/apic.h b/include/asm-x86_64/apic.h
index e81d0f2..7cfb39c 100644
--- a/include/asm-x86_64/apic.h
+++ b/include/asm-x86_64/apic.h
@@ -102,5 +102,6 @@ void switch_ipi_to_APIC_timer(void *cpumask);
 #define ARCH_APICTIMER_STOPS_ON_C3	1
 
 extern unsigned boot_cpu_id;
+extern int local_apic_timer_c2_ok;
 
 #endif /* __ASM_APIC_H */
-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 11:43 am

Hmm, the acpi processor stuff is modular.

	tglx


-

From: Ingo Molnar
Date: Friday, March 23, 2007 - 5:27 am

agreed - this seems to be a genuine CONFIG_HIGH_RES_TIMERS=y bug. (which 
has probably not been fixed since -rc4 either, we have no bugfix in this 
area that could explain the expires_next==KTIME_MAX timer state visible 
in SysRq-Q.)


yes - i was quite wrong pushing it so hard. (and doubly so given your 

yeah - we are working hard on it.

	Ingo
-

From: Mariusz
Date: Thursday, March 22, 2007 - 11:24 am

Just to make things clear. I didn't say I could reproduce it on 2.6.21-rc4.
In fact I'm running 2.6.21-rc4-mm1 with no problems so far. I just replied
to show my sysrq dumps of processes states with 2.6.21-rc2-mm1.

I could reproduce similar (but still each time slightly different) hangs 
on -mm series from 2.6.20-mm1 to 2.6.21-rc2-mm1. 2.6.21-rc3-mm1 worked well
for me so not sure If my report is still valid here.


As I previously noticed each time the system hang I/O activity to disk looked


No. He is not.

Regards,

	Mariusz Kozlowski
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : x86_64: boot hangs unless CONFIG_PCIEPORTBUS=n and acpi=off
References : http://bugzilla.kernel.org/show_bug.cgi?id=8162
Submitter  : Randy Dunlap <randy.dunlap@oracle.com>
Status     : unknown


Subject    : AMD Elan: Crash after "Allocating PCI resources"
References : http://bugzilla.kernel.org/show_bug.cgi?id=8161
Submitter  : Vladimir Brik <no.hope@gmail.com>
Handled-By : Andi Kleen <ak@muc.de>
Status     : patch available


Subject    : ACPI regression with noapic
References : http://lkml.org/lkml/2007/3/8/468
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Status     : unknown


Subject    : acpi_serialize locks system during boot
References : http://bugzilla.kernel.org/show_bug.cgi?id=8171
Submitter  : Colchao <colchaodemola@gmail.com>
Handled-By : Len Brown <len.brown@intel.com>
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=8171
Status     : patch available


Subject    : NCQ problem with ahci and Hitachi drive  (ACPI related)
References : http://lkml.org/lkml/2007/3/4/178
             http://lkml.org/lkml/2007/3/9/475
Submitter  : Mathieu Bérard <Mathieu.Berard@crans.org>
Handled-By : Tejun Heo <htejun@gmail.com>
Status     : problem is being debugged


Subject    : kernels fail to boot with drives on ATIIXP controller
             (ACPI/IRQ related)
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621
             http://lkml.org/lkml/2007/3/4/257
Submitter  : Michal Jaegermann <michal@ellpspace.math.ualberta.ca>
Status     : unknown


Subject    : libata: PATA UDMA/100 configured as UDMA/33
References : ...
From: Andi Kleen
Date: Sunday, March 18, 2007 - 12:25 pm

Patch is in Linus' tree now.

-Andi
-

From: Randy Dunlap
Date: Monday, March 19, 2007 - 9:06 am

Bug is rejected due to user error.
This seems to be a netconsole hang, not ACPI.

---
~Randy
-

From: Adrian Bunk
Date: Monday, March 19, 2007 - 9:15 am

Thanks for this information.


cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Randy Dunlap
Date: Monday, March 19, 2007 - 10:07 am

I have to do more debugging to find out what is going on.  It may be
PCIEPORTBUS-related, but 2.6.20 or later now boot for me without
netconsole and hang when using netconsole.  The ending lines in the console
log are:

[   17.946918] PCI: Setting latency timer of device 0000:00:02.0 to 64
[   17.953211] assign_interrupt_mode Found MSI capability
[   17.958388] Allocate Port Service[0000:00:02.0:pcie00]
[   17.963565] Allocate Port Service[0000:00:02.0:pcie01]
[   17.968736] Allocate Port Service[0000:00:02.0:pcie02]
[   17.973903] Allocate Port Service[0000:00:02.0:pcie03]
[   17.979134] PCI: Setting latency timer of device 0000:00:03.0 to 64
[   17.985430] assign_interrupt_mode Found MSI capability
[   17.990565] Losing some ticks... checking if CPU frequency changed.
[   17.996857] Allocate Port Service[0000:00:03.0:pcie00]
[   18.002013] Allocate Port Service[0000:00:03.0:pcie01]
[   18.007175] Allocate Port Service[0000:00:03.0:pcie02]
[   18.012336] Allocate Port Service[0000:00:03.0:pcie03]
[   18.017552] PCI: Setting latency timer of device 0000:00:03.1 to 64
[   18.023848] assign_interrupt_mode Found MSI capability
[   18.029012] Allocate Port Service[0000:00:03.1:pcie00]


-- 
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Ray Lee
Date: Tuesday, March 20, 2007 - 8:32 am

I finally have time to start bisecting this. I'll have something in the
next day.

Ray
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : ThinkPad X60: resume no longer works  (PCI related?)
References : http://lkml.org/lkml/2007/3/13/3
Submitter  : Dave Jones <davej@redhat.com>
             Jeremy Fitzhardinge <jeremy@goop.org>
Caused-By  : PCI merge
             commit 78149df6d565c36675463352d0bfe0000b02b7a7
Handled-By : Eric W. Biederman <ebiederm@xmission.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : ThinkPad doesn't resume from suspend to RAM
References : http://lkml.org/lkml/2007/2/27/80
             http://lkml.org/lkml/2007/2/28/348
Submitter  : Jens Axboe <jens.axboe@oracle.com>
             Jeff Chua <jeff.chua.linux@gmail.com>
Status     : unknown


Subject    : suspend to disk hangs
References : http://lkml.org/lkml/2007/3/6/142
Submitter  : Jeff Chua <jeff.chua.linux@gmail.com>
Status     : unknown


Subject    : suspend to disk hangs
References : http://lkml.org/lkml/2007/3/16/126
Submitter  : Maxim Levitsky <maximlevitsky@gmail.com>
Status     : unknown


Subject    : suspend to disk hangs
References : http://bugzilla.kernel.org/show_bug.cgi?id=8224
Submitter  : Mike Harris <atarimike@wavecable.com>
Status     : unknown


Subject    : ThinkPad R60: suspend broken
References : http://lkml.org/lkml/2007/3/16/57
Submitter  : Marcus Better <marcus@better.se>
Status     : unknown


Subject    : laptop immediately resumes after suspend
References : http://lkml.org/lkml/2007/3/8/469
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Caused-By  : Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
         ...
From: Jeff Chua
Date: Sunday, March 25, 2007 - 6:25 pm

The good news is on 2.6.21-rc5, suspend to disk, and resume from disk works.
But, it only works with CONFIG_NO_HZ unset.

Setting CONFIG_NO_HZ will cause suspend to disk to hang just before
saving memory to disk.

Resume from RAM (s2ram) is still broken (tried with and without
CONFIG_NO_HZ). Suspend to RAM seems ok, but upon resume, the screen
will only display "inu" and only after pressing the power button will
the system return to console. But "date" still doesn't advance.

Thanks,
Jeff.
-

From: Adrian Bunk
Date: Sunday, March 25, 2007 - 9:05 pm

This might be related to the following regression:

Subject    : first disk access after resume takes several minutes
             ('date' does not advance after resume from RAM, CONFIG_NO_HZ=n)
References : http://lkml.org/lkml/2007/3/8/117
             http://lkml.org/lkml/2007/3/25/20
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Jeff Chua
Date: Sunday, March 25, 2007 - 10:37 pm

Adrian,

It's related. I tested without CONFIG_HPET_TIMER, and now my X60 can
suspend and resume from RAM (s2ram). Even better, it works
with/without CONFIG_NO_HZ.

But suspend to disk is still broken with CONFIG_NO_HZ set.

Thanks,
Jeff.
-

From: Thomas Gleixner
Date: Monday, March 26, 2007 - 9:26 am

Does the patch below fix the HPET_TIMER=y case ?

	tglx

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index f3ab61e..76afea6 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -197,7 +197,7 @@ static int hpet_next_event(unsigned long delta,
 	cnt += delta;
 	hpet_writel(cnt, HPET_T0_CMP);
 
-	return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0);
+	return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0) ? -ETIME : 0;
 }
 
 /*



-
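
The one-line change above turns a boolean result into an error code: if the counter has already passed the comparator by the time it is re-read, the event was missed and the function reports -ETIME instead of a bare 1. A standalone model of that check is sketched below. `next_event_status` is a hypothetical name, and the 32-bit signed subtraction is an assumption that mirrors the `(long)` cast on i386, where `long` is 32 bits wide.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Model of the test at the end of hpet_next_event(): counter_now is the
 * counter value re-read after the comparator was programmed to cmp.
 * The unsigned subtraction wraps, and reinterpreting the result as a
 * signed 32-bit value keeps the comparison correct across counter
 * wraparound (this conversion behaves as expected on common compilers).
 */
int next_event_status(uint32_t counter_now, uint32_t cmp)
{
	return ((int32_t)(counter_now - cmp) > 0) ? -ETIME : 0;
}
```

With this shape a caller can tell "programmed successfully" (0) from "deadline already in the past" (negative errno) and retry with a larger delta in the second case.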

From: Jeff Chua
Date: Monday, March 26, 2007 - 10:46 am

Thomas, I tried, but it didn't help. Upon resume from ram, "date"
still didn't advance.

Thanks,
Jeff.
-

From: Thomas Gleixner
Date: Wednesday, March 28, 2007 - 12:04 am

Can you please issue a SysRq-Q in this situation and provide the dmesg
output ?

Thanks,

	tglx


-

From: Maxim
Date: Wednesday, March 28, 2007 - 6:43 am

Hi,
	I am almost sure I know why this happens.
		The problem is that neither the hpet clocksource nor the hpet clockevents device has a suspend/resume function.
		On resume we should enable the main counter _and_ enable legacy replacement mode.
		On my system the main counter is enabled, I think by the BIOS, but legacy replacement mode is not, so if
		a system doesn't use the lapic as a tick source, but uses hpet+broadcast, it will hang for sure on resume, and I tested it.

		The patch below is a temporary fix, until clockevents and clocksources get proper suspend/resume hooks:

		Regards,
			Maxim Levitsky

---

Add suspend/resume for HPET
Signed-off-by: Maxim Levitsky <maximlevisky@gmail.com>

---

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..a1ec79e 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -152,6 +152,16 @@ static void hpet_set_mode(enum clock_event_mode mode,
 	unsigned long cfg, cmp, now;
 	uint64_t delta;
 
+
+	if ( mode != CLOCK_EVT_MODE_UNUSED && mode != CLOCK_EVT_MODE_SHUTDOWN)
+	{
+		unsigned long cfg = hpet_readl(HPET_CFG);
+		cfg |= HPET_CFG_ENABLE | HPET_CFG_LEGACY;
+		hpet_writel(cfg, HPET_CFG);
+		
+	}
+		
+
 	switch(mode) {
 	case CLOCK_EVT_MODE_PERIODIC:
 		delta = ((uint64_t)(NSEC_PER_SEC/HZ)) * hpet_clockevent.mult;

-

From: Ingo Molnar
Date: Wednesday, March 28, 2007 - 7:41 am

cool! Your patch is a definite improvement on my t60 (where 
suspend/resume never worked with hpet enabled). But it does not fix 
everything - for example the timings are way off after resume. Thomas?

	Ingo
-

From: Maxim
Date: Wednesday, March 28, 2007 - 8:01 am

The problem is that the HPET consists of two parts:
	one is the main counter and the second is the timers.

	To enable the main counter one must set HPET_CFG_ENABLE.
	It is currently set only on boot, not on resume.

	But on my system I think the BIOS re-enables it.

	Second, to make the HPET act as a timer on IRQ0 and replace the RTC on IRQ8, one must set
	HPET_CFG_LEGACY.

	This bit is definitely not set on resume, so on my system I get a hang if I use the HPET as the clockevents device,
	and on other systems where the BIOS doesn't re-enable the HPET, even if the HPET is used only as a timing device,
	'date' won't advance.

	But I set those bits only in the initialization of the HPET clockevents device, and I set them always, whenever it is turned on.
	It is not that good, but it works.

	Now I don't have a clue how to set those bits if the HPET is used only as a clocksource, because clocksources currently
	don't have _any_ resume hook.

	The timing problem you mention is, I think, connected to the fact that the HPET is not enabled instantly after resume, but only some time later, when the clockevents
	core calls the HPET code to enable the event timer.

	Best regards,
		Maxim Levitsky
-
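
The read-modify-write that Maxim's patch performs on the configuration register can be tried against a simulated register. The sketch below is a model only: `struct fake_hpet` and `hpet_reenable` are invented names, and the bit values (0x001 for the main-counter enable, 0x002 for legacy replacement routing) are stated here as an assumption about the HPET general configuration register rather than taken from this thread.

```c
#include <stdint.h>

#define HPET_CFG_ENABLE 0x001u /* assumed: main counter runs */
#define HPET_CFG_LEGACY 0x002u /* assumed: legacy replacement (IRQ0/IRQ8) */

/* Simulated register; real code would go through hpet_readl()/hpet_writel(). */
struct fake_hpet {
	uint32_t cfg;
};

/* Set both bits while leaving every other configuration bit untouched. */
void hpet_reenable(struct fake_hpet *h)
{
	uint32_t cfg = h->cfg;

	cfg |= HPET_CFG_ENABLE | HPET_CFG_LEGACY;
	h->cfg = cfg;
}
```

Starting from a register value of 0x001 models the case described above, where the BIOS restarts the main counter on resume but leaves legacy replacement routing off.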

From: Linus Torvalds
Date: Wednesday, March 28, 2007 - 9:38 am

One thing that drives me wild about that "clocksource resume" thing is 
that it seems to think that clocksources are somehow different from any 
other system devices..

Why isn't the HPET considered a "device", and has it's own *device* 
"suspend" and "resume"? Why do we seem to think that only "set_mode()" 
etc should wake up clock sources?

It's a *device*, dammit. It should save and resume like one (probably as a 
system device). The "set_mode()" etc stuff is at a completely different 
(higher) conceptual level.

Thomas? It does seem like Maxim has hit the nail on the head (at least 
partly) on the HPET timer resume problems..

		Linus
-
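
Treating the HPET as an ordinary device, as Linus argues, means it saves its own state on suspend and restores it on resume, independent of any later set_mode() call. A toy model of that pattern follows. The type and function names are hypothetical, and the "firmware reset" helper only simulates hardware coming back from sleep with different register contents.

```c
#include <stdint.h>

struct fake_hpet_dev {
	uint32_t cfg;       /* simulated hardware configuration register */
	uint32_t saved_cfg; /* copy kept across suspend */
};

/* Suspend hook: remember what the hardware was programmed to. */
int fake_hpet_suspend(struct fake_hpet_dev *d)
{
	d->saved_cfg = d->cfg;
	return 0;
}

/* Resume hook: put the saved configuration back before anyone needs it. */
int fake_hpet_resume(struct fake_hpet_dev *d)
{
	d->cfg = d->saved_cfg;
	return 0;
}

/* Firmware may hand the hardware back with arbitrary settings. */
void fake_firmware_reset(struct fake_hpet_dev *d, uint32_t cfg)
{
	d->cfg = cfg;
}
```

The point of the pattern is ordering: the device is back in its pre-suspend state as soon as its resume hook runs, rather than whenever the clockevents core next happens to reprogram it.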

From: David Brownell
Date: Wednesday, March 28, 2007 - 12:38 pm

Agreed, except about "probably as a system device".

Last I checked, there was no good reason to use sysdev suspend()/resume()
rather than platform_device suspend_late()/early_resume().  Which more
or less means no good reason to use sysdev in new code...


Also, making HPET use the legacy mode seems like a step backwards.

- Dave
-

From: Maxim
Date: Wednesday, March 28, 2007 - 1:19 pm

Hi,
	It is not 'legacy' mode,
it is legacy replacement mode.
In this mode the HPET takes over IRQ0 and IRQ8 and in this way provides a replacement for the PIT and the RTC periodic function.

	Best regards,
		Maxim Levitsky

-

From: David Brownell
Date: Wednesday, March 28, 2007 - 1:59 pm

It's that RTC periodic thing that bothers me, I don't mind about
the PIT.  Remember that IRQ8 is also used for other RTC functions.

Now, if there were a way to tell rtc-cmos that HPET is active,
and arrange some kind of handshake ... that would be different.

- Dave
-

From: Maxim
Date: Wednesday, March 28, 2007 - 2:27 pm

Yes,
	When the HPET is active it eats the RTC IRQ,
	so the only way out is to emulate the RTC using the HPET.
	It is done this way in the old rtc driver; rtc-cmos should do the same.

	Of course suspend/resume is not supported at all by the old rtc driver.

	I already wrote complete suspend/resume support for the old rtc driver (I wrote it a long time ago).

	Now I have fixed it to support the HPET, and this is how I discovered that the HPET doesn't have suspend/resume functions.

	I will do the last checks now and send this patch very soon.

	I am also planning to add HPET and suspend/resume support to rtc-cmos, but I haven't started on this yet.

	Best regards,
		Maxim Levitsky




	
-

From: David Brownell
Date: Thursday, March 29, 2007 - 3:33 pm

Only when HPET timers 0 and 1 are set up for "Legacy Replacement Mode".

No.  Patches like

  http://marc.info/?l=linux-kernel&m=117219531503973&w=2

should be merged (I hope they're in the 2.6.22 queue!), making
HPET run in "Standard Mode" so that HPET can stop sticking its

It's already got suspend/resume support, and in the 2.6.22 queue
are RTC framework updates which will let the RTC framework replace
a lot more platform-specific RTC support.  (Platform changes can come
later, where they're needed.  ARM for example doesn't need any.)

Once HPET stops using "Legacy Replacement Mode" you won't need to
touch anything in the RTC stack (except maybe the legacy char/rtc.c
driver, removing HPET stuff).

The open issue with suspend/resume support in rtc-cmos relates to
how ACPI wakeup alarms should trigger.  I've not made time to test
those patches.

- Dave
-

From: Maxim Levitsky
Date: Thursday, March 29, 2007 - 4:29 pm

Hi,
	It is not that simple,

	Only in legacy replacement mode HPET can be put on IRQ0 (and sadly  IRQ8)
	At least this is true on some systems, on mine for example
	
	On my system first 2 hpet timers can only be assigned to IRQ21-23
	and the third to either IRQ11 or IRQ21-IRQ23

	Or in legacy replacement mode first is assigned IRQ0 and second IRQ8

	this will make it difficult to use it as a clockevents source

	Not to mention the fact that current code assumes that BIOS assigned IRQs to all timers which is not true on my system.

	I have brand new intel DG965 motherboard.

	What is wrong with relying on HPET to provide RTC IRQ ?

	Best regards,
		Maxim Levitsky

-

From: David Brownell
Date: Thursday, March 29, 2007 - 5:09 pm

Right, that's the entire point of legacy replacement mode.

But so what?  In standard mode, HPET just uses other IRQs.
Nothing would care about irq0, and irq8 would be used by

Why?  It's not like the clockevent logic cares what IRQ a given
programmable timer uses.  So long as the HPET driver can receive
its IRQ, it'll make a fine clockevent.  There's no reason to have
HPET prevent other drivers from working, by insisting it use that
nasty "prevent other hardware from issuing IRQs" mode.

The patches from Venkatesh sure seem to have behaved for him.
And while he might have complained about difficulty, I think
that'd be more likely due to the SMP issues he also addressed

Getting IRQ routing sorted out is a problem that's been solved
numerous times before.  And again, the patches referenced above

For starters, it's not an RTC.  Why in the world would you want to
make the OS think it's an RTC ... unless you're Microsoft, and are
desperate to get another periodic timer (and don't much care about
the other RTC functionality)?


... that's all off-topic for 2.6.21 regressions though; it's too
late to merge x86_64 clockevent support, or fix HPET issues like
not using "standard mode".

- Dave

-

From: Maxim Levitsky
Date: Thursday, March 29, 2007 - 5:48 pm

No they don't, 

First patch does that:

	hd.hd_irq[i] = (timer->hpet_config &
			Tn_INT_ROUTE_CNF_MASK) >>
			Tn_INT_ROUTE_CNF_SHIFT;
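
For readers following the routing argument: per the HPET specification, each timer's configuration register carries a 32-bit routing capability mask in its upper half and the currently selected route in bits 9-13. Below is a minimal userspace sketch of decoding both. The mask and shift values are assumed to mirror include/linux/hpet.h; the helper names and the sample values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the HPET "Timer N Configuration and Capability"
 * register. Values are assumed to match include/linux/hpet.h. */
#define Tn_INT_ROUTE_CNF_SHIFT 9
#define Tn_INT_ROUTE_CNF_MASK  0x3e00ULL
#define Tn_INT_ROUTE_CAP_SHIFT 32

/* Hypothetical helper: IRQ currently selected for the timer. */
static unsigned int hpet_timer_route(uint64_t config)
{
	return (unsigned int)((config & Tn_INT_ROUTE_CNF_MASK) >>
			      Tn_INT_ROUTE_CNF_SHIFT);
}

/* Hypothetical helper: nonzero if the timer can be routed to irq. */
static int hpet_timer_can_route(uint64_t config, unsigned int irq)
{
	uint32_t cap = (uint32_t)(config >> Tn_INT_ROUTE_CAP_SHIFT);

	assert(irq < 32);
	return (int)((cap >> irq) & 1u);
}

/* Hypothetical helper: capability bits for IRQs lo..hi inclusive,
 * already shifted into the upper half of the register. */
static uint64_t hpet_cap_range(unsigned int lo, unsigned int hi)
{
	uint64_t cap = 0;
	unsigned int irq;

	assert(lo <= hi && hi < 32);
	for (irq = lo; irq <= hi; irq++)
		cap |= 1ULL << irq;
	return cap << Tn_INT_ROUTE_CAP_SHIFT;
}
```

With a capability mask covering only IRQ21-23, as reported for the DG965 board, hpet_timer_can_route() rejects IRQ0 and IRQ8, which is why those two lines are reachable only through legacy replacement mode on such hardware.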


Hi,
	I feel that you are right,

	You meant that one of HPET timers will be used as clock source and will be assigned some IRQ (not IRQ0)
	
	Seems fine,

	Only one thing: the kernel must assign IRQs to the HPET timers; relying on the BIOS is not good.
	Also the IRQ for the clocksource should not be shared, but maybe I am wrong here
	(I am afraid that latencies might be a problem here)

	By the way I never thought about the fact that legacy replacement mode  is a 'virtual legacy'

	I mean that it is intended to simulate RTCs and PITs on systems that don't have them, am I right here ?
	HPET spec says that RTC is still required to provide all its usual functions except periodic freq.

	Best regards,
		Maxim Levitsky
-

From: Linus Torvalds
Date: Wednesday, March 28, 2007 - 1:42 pm

I won't disagree - it might well be much nicer to just show it in the 
"real" device tree. I'm not 100% sure where in the tree it would go, 
though. It should probably be "inside" the root entry, before any of the 
PCI buses. It's generally what we've used those "system device" things 
for, but I agree that it would be better to just make system devices show 
up early on the regular device list than it is to have them be special 
cases.

But I think that's a separate (and fairly small) issue compared to the 
"don't use the clocksource infrastructure as a make-believe suspend/resume 
mechanism" problem that Maxim's patch had.

(Maxim, don't take that the wrong way - I think your analysis and patch 

I don't think that's actually "legacy" in any sense but the interrupt 
delivery, where the "legacy mode" bit is not so much that the HPET itself 
is "legacy" but that it *replaces* legacy devices.

But I may have misunderstood the thing. I'm an old fart, so I know the old 
timers much better than I know the new ones ;). Somebody feel free to hit 
me with the clue-2x4.

			Linus
-

From: David Brownell
Date: Wednesday, March 28, 2007 - 2:17 pm

Mixing "inside" and "before" is a small linguistic clue about
one of the issues with driver model PM.  Off topic here; and
in terms of suspend/resume callback sequencing that answer

Yes -- where "platform_device" is a regular Joe-Sixpack kind of

Agreed -- although isn't it the "clockevent" change which is at issue?

A "clockevent" thingie wraps various kinds of timer IRQs; the clocksource
is conceptually just a free run counter.  Clocksources have been around
for a while, with no particular problems.

It's clockevent sources that have been the problem with dynamic tick solutions
all along, since they mask such chaos inside x86 hardware and interact
with so many different parts of the kernel.  ;)

-

From: Maxim
Date: Wednesday, March 28, 2007 - 3:26 pm

Exactly, I agree completely
I said that my patch was a  temporary fix, and I agree that the best way is to create a new system device

Best regards,
	Maxim Levitsky
-

From: Maxim
Date: Wednesday, March 28, 2007 - 9:41 pm

Hi,
	I am sending here a patch that, as was discussed here, adds hpet to the list of system devices
	and adds suspend/resume hooks this way.
	I tested it and it works fine.

---
Add suspend/resume support for HPET
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   64 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..ac41476 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -524,3 +526,65 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2

-

From: Linus Torvalds
Date: Wednesday, March 28, 2007 - 10:08 pm

Ok, it certainly looks better, but it *also* looks like it just assumes 
the HPET is there. Which would work in testing _with_ a HPET, but would 
likely break on hardware without one, no?

Shouldn't there be at least something like a

	if (!is_hpet_capable())
		return 0;

at the top of that init routine? I'd also expect that you'd need to check 
that "hpet_virt_address" is valid or something?

(Or, better yet, shouldn't we set "boot_hpet_disable" when we decide not 
to use the HPET, and set hpet_virt_address to NULL?)

		Linus
-

From: Maxim
Date: Wednesday, March 28, 2007 - 10:47 pm

This is done here

out_nohpet:
	iounmap(hpet_virt_address);

Hi, 
	Of course, I forgot.

	I was planning to put sysdev code in hpet_enable()
	but it is not possible because this function is called too early.

	Thus I put sysdev initialization  in separate function but forgot to test for HPET

	Thanks a lot.

	Best regards
		Maxim Levitsky

---
This adds support of suspend/resume on i386 for HPET
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..7c67780 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -310,6 +312,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -524,3 +527,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int ...
From: Sergei Shtylyov
Date: Thursday, March 29, 2007 - 6:20 am

Hello.


   The part after "---" usually gets cut off, the patch description and 

    This doesn't make sense, err will always be 0.  Perhaps you actually 

WBR, Sergei
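
The remark that err will always be 0 can be reproduced in a small userspace model. This is a sketch only: fake_class_register(), fake_device_register() and fake_class_unregister() are hypothetical stand-ins for the sysdev calls, and the device step can be made to fail on demand.

```c
#include <assert.h>
#include <errno.h>

static int class_registered;   /* 1 while the fake class is registered */
static int device_should_fail; /* knob: make device registration fail */

static int fake_class_register(void)
{
	class_registered = 1;
	return 0;
}

static void fake_class_unregister(void)
{
	class_registered = 0;
}

static int fake_device_register(void)
{
	return device_should_fail ? -ENODEV : 0;
}

/* The device step's return value is dropped, so err is still 0 when
 * it is tested and the rollback can never run. */
static int register_unchecked(void)
{
	int err = fake_class_register();

	if (!err) {
		fake_device_register();
		if (err)
			fake_class_unregister();
	}
	return err;
}

/* The return value is stored in err, so a failed device step
 * unregisters the class and is reported to the caller. */
static int register_checked(void)
{
	int err = fake_class_register();

	if (!err) {
		err = fake_device_register();
		if (err)
			fake_class_unregister();
	}
	return err;
}
```

With device_should_fail set, register_unchecked() returns 0 and leaves the class registered, while register_checked() returns the error and rolls the class back.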
-

From: Maxim
Date: Thursday, March 29, 2007 - 6:31 am

Hi,
	Big thanks for pointing this out,
		I will resend that updated patch.

	Best regards,
		Maxim Levitsky

-

From: Maxim Levitsky
Date: Thursday, March 29, 2007 - 6:46 am

Subject: Add suspend/resume for HPET
This adds support of suspend/resume on i386 for HPET
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..7c67780 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -310,6 +312,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -524,3 +527,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	if (!is_hpet_capable())
+		return 0;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		err = sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2
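
As a reading aid for the two callbacks in the patch above, here is a userspace model of what they do to the HPET general configuration register. Assumptions: the bit values follow the HPET specification (bit 0 enables the main counter, bit 1 selects legacy replacement routing, bit 15 of the ID register advertises legacy routing support); fake_cfg and fake_id stand in for the memory-mapped registers; and the model ignores any counter reset the real hpet_start_counter() may perform.

```c
#include <assert.h>
#include <stdint.h>

#define HPET_CFG_ENABLE 0x001u  /* main counter runs */
#define HPET_CFG_LEGACY 0x002u  /* legacy replacement routing (IRQ0/IRQ8) */
#define HPET_ID_LEGSUP  0x8000u /* hardware supports legacy replacement */

static uint32_t fake_cfg; /* stands in for the HPET_CFG register */
static uint32_t fake_id;  /* stands in for the HPET_ID register */

/* Model of hpet_suspend(): stop the counter and drop legacy routing. */
static void model_suspend(void)
{
	fake_cfg &= ~(HPET_CFG_ENABLE | HPET_CFG_LEGACY);
}

/* Model of hpet_resume(): restart the counter, then re-enable legacy
 * routing only if the hardware advertises support for it. */
static void model_resume(void)
{
	fake_cfg |= HPET_CFG_ENABLE;
	if (fake_id & HPET_ID_LEGSUP)
		fake_cfg |= HPET_CFG_LEGACY;
}
```

A suspend followed by a resume returns the register to ENABLE|LEGACY on legacy-capable hardware, and to ENABLE alone otherwise.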

-

From: Linus Torvalds
Date: Thursday, March 29, 2007 - 9:53 am

Btw, what about arch/x86_64/kernel/hpet.c?

That thing seems totally broken. Lookie here:

  arch/x86_64/kernel/hpet.c:irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
  drivers/char/rtc.c:extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);

anybody see a problem? The x86-64 version doesn't seem to be very well 
maintained. Is there some fundamental reason why this file isn't shared 
across architectures?

			Linus
-

From: Maxim Levitsky
Date: Thursday, March 29, 2007 - 10:28 am

Hi,
	I agree with that, there seems to be a lot of code duplication between i386 and x86_64.
	By the way, x86_64 does take care of suspend/resume for hpet, it is done by 

	linux-2.6/arch/x86_64/kernel/time.c:timer_resume(struct sys_device *dev):
		hpet_reenable()


	On i386 the PIT driver gets out of the way when HPET is detected.
	So it seems that there is a lot of work to do to remove redundant code.


	Best regards,
		Maxim Levitsky
-

From: Ingo Molnar
Date: Thursday, March 29, 2007 - 10:51 am

at least wrt. suspend/resume it should be fine, because in 
arch/x86_64/kernel/time.c it does this upon resume:

 static int timer_resume(struct sys_device *dev)
 {
         if (hpet_address)
                 hpet_reenable();
         else
                 i8254_timer_resume();

[ barring the issue that mixing two pieces of hardware like this in a 
  single resume function is wrong - all timer hardware should be 
  separated like we did it for i386. I've got 64-bit clockevents code in 

there's no fundamental reason. x86_64 COW-ed hpet_timer.c and 
time_hpet.c years ago and drifted off into different areas.
They should be unified: more power to arch/x86/ ;-)

	Ingo
-

From: Andi Kleen
Date: Thursday, March 29, 2007 - 1:46 pm

Not quite -- x86-64 did HPET long before i386; the only stuff cowed
was the character driver support code. But the core HPET code
was always totally different code streams. We never did the complicated 
pluggable clock code i386 did though -- i never quite saw the point of that
because there aren't that many timers there.
Of course it is already obsolete with clocksources now.

-Andi
-

From: Jeff Chua
Date: Thursday, March 29, 2007 - 11:11 am

Confirmed that suspend/resume disk/ram works on X60s with
CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.

But suspend to disk still hang with CONFIG_NO_HZ unset.

Thanks,
Jeff.
-

From: Thomas Gleixner
Date: Saturday, March 31, 2007 - 8:51 am

Sorry for being unresponsive. I was travelling and unexpectedly cut off
from the internet for some days.

While I agree in principle with the patch, I'm a bit uncomfortable. The
sys device suspend / resume ordering is not guaranteed and relies on the
registering order.

Jeff still seems to have problems with CONFIG_NO_HZ=n and it might be
caused by time keeping / tick management resume happening before the
HPET resume.

The required resume order is:

clocksources
timekeeping
clockevents
tick management

I'm not sure how to do this properly with the sys device facilities, but
I'll look into it.

/me goes off to understand the sys device magic.

	tglx


-

From: Jeff Chua
Date: Saturday, March 31, 2007 - 9:01 am

me>Confirmed that suspend/resume disk/ram works on X60s with
me>CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.
me> But suspend to disk still hang with CONFIG_NO_HZ unset.


Oops, sorry. Typo (as a result copy/paste using mouse)
    ... I actually meant CONFIG_NO_HZ "set".

Just to be clear, suspend to disk still hangs with CONFIG_NO_HZ=y. It
hangs just before you see the percent saving %.


Jeff.
-

From: Thomas Gleixner
Date: Saturday, March 31, 2007 - 9:09 am

Ah, that's a different one then. In that path the timers should be
alive, but who knows.

	tglx


-

From: Linus Torvalds
Date: Saturday, March 31, 2007 - 9:09 am

Well, this is why we probably should try to get away from the "system 
device" issue, exactly because system devices are totally outside the 
normal ordering and only have a random linear order.

If the clocksources were actually in the device tree, you'd get all the 
normal guarantees about hierarchical ordering..

		Linus
-

From: Thomas Gleixner
Date: Saturday, March 31, 2007 - 9:33 am

Right, but clock sources/events need to be suspended extremely late and
resumed early. How can we ensure this?

	tglx


-

From: Greg KH
Date: Saturday, March 31, 2007 - 9:41 am

Put it at the top of the device tree.

thanks,

greg k-h
-

From: Linus Torvalds
Date: Saturday, March 31, 2007 - 9:53 am

Make them be at the top of the device tree by adding them early. That's 
the whole point of the device tree after all - we have an ordering that is 
enforced by its topology, and that we sort by when things were added.

Right now the way things work (iirc - somebody like Greg should 
double-check me) is that we add all devices to the power list at 
device_add() time by traversing the devices from the root all the way out, 
and doing a device_add() which does a device_pm_add(), which in turn adds 
it to the power-management list - so that the list is always topologically 
sorted.

So the only thing that needs to be done is to make sure that we add the 
timer devices early during bootup - something we have to do *anyway*. If a 
device is added early in bootup, that automatically means that it will be 
suspended late, and resumed early - because we maintain that order all the 
way through..

		Linus
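
The ordering rule described here fits in a few lines. This is a toy model, not the driver core: devices are appended to an array in registration order, suspend walks it backwards and resume walks it forwards, so whatever registered first is suspended last and resumed first. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 8

enum toy_dev { TOY_HPET, TOY_PCI_BUS, TOY_DISK };

static enum toy_dev pm_list[MAX_DEVS]; /* devices in registration order */
static size_t pm_count;

/* Append a device; earlier callers end up nearer the front. */
static void toy_device_add(enum toy_dev dev)
{
	assert(pm_count < MAX_DEVS);
	pm_list[pm_count++] = dev;
}

/* Device handled at the given step of a suspend pass (list walked
 * from the back). */
static enum toy_dev toy_suspend_at(size_t step)
{
	assert(step < pm_count);
	return pm_list[pm_count - 1 - step];
}

/* Device handled at the given step of a resume pass (list walked
 * from the front). */
static enum toy_dev toy_resume_at(size_t step)
{
	assert(step < pm_count);
	return pm_list[step];
}
```

If the timer is added first, then a bus, then a disk, the suspend pass visits disk, bus, timer and the resume pass visits timer, bus, disk.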
-

From: Ingo Molnar
Date: Saturday, March 31, 2007 - 10:02 am

IIRC hpet is particularly hard to initialize early on in the bootup 
sequence. So the way the clockevents code works is that it will always 
try to make the best out of all available devices, and dynamically 
adapts things as devices 'arrive' or 'depart' - no matter how late that 
happens. (That way there's no dependency on how late a device gets 
registered - it will only delay the switch to high-res mode for 
example.) A given time device might 'depart' because for example the 
watchdog mechanism finds that its quality is not good enough, or because 
someone initiated cpufreq which breaks the TSC clocksource.

i dont think there's any particular problem here because suspend/resume 
wont be done during bootup - but we might need a way to move a device to 
earlier spots in the device tree, even if they got registered later on - 
instead of forcing the time devices to be registered very early?

	Ingo
-

From: David Brownell
Date: Saturday, March 31, 2007 - 11:18 am

( please remove obsolete linux-pm@lists.osdl.org  from further messages!! )


I'm about ready to test the appended patch... a "move one device" call
might be safest at this point in the release cycle though.

- Dave

========================	SNIP!
Change how the PM list is constructed, so that devices are added right
after their parents (when they have one) rather than at the end of the
list.  This preserves sequencing guarantees, but enables sequencing of
suspend/resume operations by more important characteristics than "when
device happened to enumerate" ... e.g. clocksources and clockevents
at a clearly defined point during suspend and resume.

This patch has a potential downside for devices that have multiple
power dependencies and which "just happened to work" before.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>

--- g26.orig/drivers/base/power/main.c	2006-07-02 12:30:30.000000000 -0700
+++ g26/drivers/base/power/main.c	2007-03-31 11:02:28.000000000 -0700
@@ -52,12 +52,17 @@ EXPORT_SYMBOL_GPL(device_pm_set_parent);
 int device_pm_add(struct device * dev)
 {
 	int error;
+	struct device *parent = dev->parent;
 
-	pr_debug("PM: Adding info for %s:%s\n",
-		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name);
+	pr_debug("PM: Adding info for %s:%s, after %s\n",
+		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name,
+		 parent ? parent->bus_id : "(no parent)");
 	down(&dpm_list_sem);
-	list_add_tail(&dev->power.entry, &dpm_active);
-	device_pm_set_parent(dev, dev->parent);
+	if (parent)
+		list_add(&dev->power.entry, &parent->power.entry);
+	else
+		list_add_tail(&dev->power.entry, &dpm_active);
+	device_pm_set_parent(dev, parent);
 	if ((error = dpm_sysfs_add(dev)))
 		list_del(&dev->power.entry);
 	up(&dpm_list_sem);
-

From: David Brownell
Date: Saturday, March 31, 2007 - 12:32 pm

As expected, this behaved OK on an x86 laptop.  I'll see if it breaks
some of the ARM boards I have handy.

To be clear, what this means is that if "clockevent" and "clocksource"
devices get registered "very early", and the various clockevent and
clock source devices get registered, then the suspend/resume methods
for those will all get grouped together ... suspended "very late" and
resumed "very early", regardless of when they get registered.  Pretty
much the driver model parts of what Linus was suggesting; clockevent
bits would still be needed.
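
A toy model of the list change makes the grouping concrete. It is a sketch under stated assumptions, not the driver-core code: a fixed array replaces the kernel's linked list, and toy_pm_add() and toy_pm_find() are hypothetical names. A device with a parent is placed directly after that parent; a device without one goes to the tail, as before.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEVS 8

static const char *pm_order[MAX_DEVS]; /* resume order, front to back */
static int pm_len;

/* Index of name in the list, or -1 if absent. */
static int toy_pm_find(const char *name)
{
	int i;

	for (i = 0; i < pm_len; i++)
		if (strcmp(pm_order[i], name) == 0)
			return i;
	return -1;
}

/* Add name right after parent, or at the tail when parent is NULL
 * or not on the list. */
static void toy_pm_add(const char *name, const char *parent)
{
	int pos = parent ? toy_pm_find(parent) : -1;
	int at = (pos >= 0) ? pos + 1 : pm_len;

	assert(pm_len < MAX_DEVS);
	/* Shift the tail one slot to the right to open a gap at 'at'. */
	memmove(&pm_order[at + 1], &pm_order[at],
		(size_t)(pm_len - at) * sizeof(pm_order[0]));
	pm_order[at] = name;
	pm_len++;
}
```

Registering "clocksource" early, then "pci0" and "disk", and only later "hpet" as a child of "clocksource" yields the order clocksource, hpet, pci0, disk: the late-arriving timer still sits next to its parent at the front of the list.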

-

From: Jeff Chua
Date: Saturday, March 31, 2007 - 8:13 pm

I tested your patch with CONFIG_NO_HZ=y, but it still hangs while
suspending to disk (before the percent saving).

But one discovery. I get tired of all these hangs, so I decided to
press some keys and the power button. Accidentally, the suspend
proceeded and successfully suspended!

I tried many more times, and discovered that to proceed with the
suspend, hit any 4 keys slowly. (e.g. "F1 F2 F3 F4", or "1 2 3 4").

My .config has CONFIG_NO_HZ=y and CONFIG_HIGH_RES_TIMERS=y, but I
suppose CONFIG_HIGH_RES_TIMERS=y has nothing to do with the hang.

I went back with my vanilla 2.6.21-rc5 with Maxim's patch, and with
hitting the keys, I could suspend to disk with CONFIG_NO_HZ=y and
CONFIG_HIGH_RES_TIMERS=y.

Jeff.
-

From: David Brownell
Date: Saturday, March 31, 2007 - 9:13 pm

Of course.  No clockevent updates...

-

From: Greg KH
Date: Saturday, March 31, 2007 - 10:08 am

Yes, this is how it works (or if not, then there's a bug that needs to
be fixed, as that is how it _should_ work...)

thanks,

greg k-h
-

From: David Brownell
Date: Saturday, March 31, 2007 - 10:55 am

Right, but "when things get added" doesn't correlate well to
"when they should get suspended/resumed".  It's also in basic
conflict with runtime PM models, where devices may be suspended
at essentially any time.  And sysdevs are even stranger.


One way out:  rather than constructing that list as devices get
enumerated, it could be constructed by a (linear-time, non-recursive)
walk of the device tree(s) before they get suspended.

(Or equivalently:  construct lists at enumeration time, but just
adding them *right after their parent* rather than at the end of
the list.)

Would that solve the problem here?  Potentially ... if the tree is

If each of those were a device node, with "clocksource" suspend/resume
methods handling Thomas' "timekeeping" item, and similarly for "clockevent"
devices ... I could see that all working neatly.

- Dave

-

From: Maxim Levitsky
Date: Saturday, March 31, 2007 - 9:56 am

Hi,

So maybe I was right after all,
Maybe it is better to add a suspend/resume hook to each clock source and call 
it from timekeeping_resume() ?

Or maybe even unite clocksources with clockevents, don't know 

By the way I want to report maybe a bug / maybe a feature :-)  : 
(sorry for long explanation)

Basically I have two clockevent sources : PIT(HPET) and APIC
(Actually it seems that in the next version of the kernel HPET will be switched out 
of 'legacy replacement mode', so PIT, HPET and RTC could coexist on the same system,
but HPET won't be able to generate IRQ0, and it will be assigned some IRQ, possibly shared with other devices)

APIC timer  is chosen by default and works fine, 
since I don't have C2/C2 states on my system (ICH8 doesn't support them :-( )

But if I force it off (nolapic_timer) HPET or PIC is chosen and strangely they are
 put in _periodic_ mode although they are capable of one-shot mode
Is this a bug ?

Secondly, I am getting a very strange behavior if I use CONFIG_NOHZ + !CONFIG_HIGH_RES_TIMERS
and try to suspend to ram:

System resumes, but gets crazy:
'top' shows that  ksoftirqd consumes 9999 % of cpu time (this is not a typo)
And other 'normal' programs that are running show same 9999 too.
System slows to crawl.

Also I found that one of the APICs is in periodic mode, and the second is in one-shot mode.
And I tested this with or without my patch (thank goodness it is not my fault)

CONFIG_NOHZ + CONFIG_HIGH_RES_TIMERS work just fine.

Best regards,
	Maxim Levitsky
-

From: Linus Torvalds
Date: Saturday, March 31, 2007 - 10:09 am

Umm.. Why not make the device tree look like this:

	    -- "clocksource" -- +-- HPET
				|
				+-- TSC
				|
				+-- i8259
				|
				+-- lapic timer
				|
				.. whatever else

and use the "struct device" that we *have* for this? The whole "struct 
device" is literally designed to do this, and to be embedded into whatever 
bigger structures you have that describes higher-level behaviour. Ie you'd 
put a "struct device" inside the "struct clocksource".

That thing already *has* the suspend/resume hooks, and it will mean that 
people will see the clocks in the device tree rather than have a new 
notion.


			Linus
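
A userspace sketch of the embedding idea, using the usual container_of idiom. The toy_ structures are hypothetical, cut-down stand-ins rather than the kernel's struct device and struct clocksource; only the shape matters here: generic code sees a device and its hooks, and each hook recovers the enclosing clocksource.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_device {
	const char *name;
	int (*suspend)(struct toy_device *dev);
	int (*resume)(struct toy_device *dev);
};

struct toy_clocksource {
	int running;           /* clocksource-specific state */
	struct toy_device dev; /* embedded generic device */
};

static int toy_cs_suspend(struct toy_device *dev)
{
	struct toy_clocksource *cs =
		container_of(dev, struct toy_clocksource, dev);

	cs->running = 0;
	return 0;
}

static int toy_cs_resume(struct toy_device *dev)
{
	struct toy_clocksource *cs =
		container_of(dev, struct toy_clocksource, dev);

	cs->running = 1;
	return 0;
}

/* Initialise a clocksource and wire up its device hooks. */
static void toy_cs_init(struct toy_clocksource *cs, const char *name)
{
	assert(cs && name);
	cs->running = 1;
	cs->dev.name = name;
	cs->dev.suspend = toy_cs_suspend;
	cs->dev.resume = toy_cs_resume;
}
```

Code that only holds a struct toy_device pointer can call dev->suspend(dev) without knowing it is dealing with a clocksource, which is the property that lets a generic suspend/resume walk cover the clocks too.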
-

From: Ingo Molnar
Date: Saturday, March 31, 2007 - 10:17 am

yeah. There's some practical problems that need to be sorted out: much 
of the current GTOD code is irq-driven (and all GTOD locks are 
irq-safe), while the sysfs code needs to run in process-context level. 

Clocksources 'arrive' and 'depart' in hardirq context (which is the 
primary place where we notice their breakage, determine that they are 
now verified to be usable, etc.). This came partly from legacy: the 
gradual conversion of the monolithic time code, and the need to preserve 
GTOD and non-GTOD architectures without too much duplication. It also 
came partly because there's also a fundamental need to have accurate 
time, which is better served from irq context.

i very much agree that this should and must be cleaned up, but it needs 
quite a bit more logistics than it might appear at first sight. 
Clockevents basically just followed (and had to follow) the direction of 
clocksources in this regard.

	Ingo
-

From: Daniel Walker
Date: Saturday, March 31, 2007 - 10:58 am

Is this in reference to the irq-context clocksource polling stuff? I
don't see a dire reason to keep that code, and I agree removing that is
certainly a worthwhile cleanup .. I added this cleanup to one of my
trees when you first suggested it , and there is some infrastructure
that really should be added to facilitate it.

Daniel

-

From: Linus Torvalds
Date: Thursday, March 29, 2007 - 9:35 am

No, that only clears hpet_virt_address, and thus makes all subsequent 
"hpet_readl()" and "hpet_writel()" calls oops.

But it doesn't actually *tell* anybody that the HPET is disabled, so if 
you later on do

	if (is_hpet_capable()) {
		time = hpet_readl(..);
		..

you will just Oops!

So as far as I can see, even with your latest patch, if hpet_enable() 
fails (and triggers the "goto out_nohpet" cases), you'll just oops 
immediately when you try to suspend/resume the HPET.

THAT was what I meant - when we clear hpet_virt_address, we should also 
tell all potential subsequent users that the HPET is not there!

		Linus
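
The invariant being asked for can be modelled in userspace. This is a sketch only: the variable names are borrowed from the thread, but fake_regs, model_hpet_fail() and model_hpet_read() are hypothetical, the address and register values are made-up examples, and the is_hpet_capable() body shown is an assumption about what that helper tests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t fake_regs[4] = { 0x8086a201u };   /* pretend MMIO window */
static uint32_t *hpet_virt_address = fake_regs;
static unsigned long hpet_address = 0xfed00000ul; /* example: HPET reported */
static int boot_hpet_disable;

/* Assumed shape of the capability test: not disabled, address known. */
static int is_hpet_capable(void)
{
	return !boot_hpet_disable && hpet_address;
}

/* Model of the out_nohpet path: drop the mapping AND record that the
 * HPET is unusable, so later callers do not dereference NULL. */
static void model_hpet_fail(void)
{
	hpet_virt_address = NULL;
	boot_hpet_disable = 1;
}

/* Guarded register read; returns 0 when the HPET is not usable. */
static uint32_t model_hpet_read(size_t idx)
{
	if (!is_hpet_capable())
		return 0;
	assert(hpet_virt_address != NULL && idx < 4);
	return hpet_virt_address[idx];
}
```

If model_hpet_fail() cleared only the pointer, is_hpet_capable() would keep returning true and the guarded read would go on to dereference NULL; setting the flag in the same place keeps the two pieces of state consistent.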
-

From: Maxim Levitsky
Date: Thursday, March 29, 2007 - 9:51 am

Hi,
	I agree, and as you said I did exactly that:

out_nohpet:
	iounmap(hpet_virt_address);
	hpet_virt_address = NULL;
	boot_hpet_disable = 1; <<<--- this ensures that is_hpet_capable() will never return positive value


	I also sent an updated version of my patch with subject line "[PATCH v2] Add suspend/resume for HPET"

	I forgot (a typo) to check the error code in hpet_register_sysfs
	Thanks to Sergei Shtylyov for pointing that out to me.

	This patch should be ok.

	Best regards,
		Maxim Levitsky
-

From: Linus Torvalds
Date: Thursday, March 29, 2007 - 10:22 am

Gaah. I'm blind. Sorry. Your patch did indeed do exactly that, I somehow 
overlooked it.

		Linus
-

From: Ingo Molnar
Date: Thursday, March 29, 2007 - 10:47 am

update: i've tested Maxim's v2 patch on both a hpet-capable and a 
hpet-less system, and it works fine here.

on a dual-core hpet-capable system, running a NO_HZ+!HIGH_RES_TIMERS 
kernel:

  europe:~> grep Clock /proc/timer_list
  Clock Event Device: hpet
  Clock Event Device: lapic
  Clock Event Device: lapic

s2ram works fine now - it hung on resume before.

on a dual-core non-hpet system, with a NO_HZ+!HIGH_RES_TIMERS kernel:

  neptune:~> grep Clock /proc/timer_list
  Clock Event Device: pit
  Clock Event Device: lapic
  Clock Event Device: lapic

s2ram worked fine before - and it still works now.

(The combination of NO_HZ+!HIGH_RES_TIMERS was the most fragile wrt. 
suspend because in the !HIGH_RES_TIMERS there's just a single instance 
after resume that we touch the timer hardware, and we very much rely on 
the periodic interrupt being set to the precise value.)

So this is a go on my systems - good work Maxim! (I've reproduced 
Maxim's patch below with minor patch-metadata updates.)

	Ingo

---------------------------->
Subject: [patch] add suspend/resume for HPET
From: Maxim Levitsky <maximlevitsky@gmail.com>

This adds support for suspend/resume on i386 for HPET.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

Index: linux/arch/i386/kernel/hpet.c
===================================================================
--- linux.orig/arch/i386/kernel/hpet.c
+++ linux/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -307,6 +309,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -523,3 +526,68 @@ irqreturn_t ...
From: Michael S. Tsirkin
Date: Wednesday, March 28, 2007 - 11:04 am

Bingo!

The patch below fixes the two problems (listed above) with
resume from RAM that I have observed on my T60 with
2.6.21-rc5: with this patch applied, and with CONFIG_NO_HZ
unset, date advances correctly, X functions properly and
there is no delay on first disk access.

Thanks very much.


Maxim, do you plan to send this upstream?

Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>

---

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..a1ec79e 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -152,6 +152,16 @@ static void hpet_set_mode(enum clock_event_mode mode,
 	unsigned long cfg, cmp, now;
 	uint64_t delta;
 
+
+	if ( mode != CLOCK_EVT_MODE_UNUSED && mode != CLOCK_EVT_MODE_SHUTDOWN)
+	{
+		unsigned long cfg = hpet_readl(HPET_CFG);
+		cfg |= HPET_CFG_ENABLE | HPET_CFG_LEGACY;
+		hpet_writel(cfg, HPET_CFG);
+		
+	}
+		
+
 	switch(mode) {
 	case CLOCK_EVT_MODE_PERIODIC:
 		delta = ((uint64_t)(NSEC_PER_SEC/HZ)) * hpet_clockevent.mult;

-- 
MST
-

From: Ingo Molnar
Date: Wednesday, March 28, 2007 - 11:32 am

we'll fix this the right way tomorrow, by adding a proper suspend/resume 
sysdev handler - the lapic clockevents driver already has that. Maxim 
definitely deserves a lot of kudos for having found this bug!

	Ingo
-

From: Randy Dunlap
Date: Wednesday, March 28, 2007 - 11:35 am

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-

From: Jeff Chua
Date: Thursday, March 29, 2007 - 7:24 am

I confirmed that suspend/resume to disk/ram all works on my X60s with
CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.

But suspend to disk still hangs with CONFIG_NO_HZ=y no matter what
CONFIG_HPET_TIMER is set to.

Thanks,
Jeff.
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : suspend/resume hangs (until keypress)  (CONFIG_NO_HZ)
References : http://bugzilla.kernel.org/show_bug.cgi?id=8181
             http://lkml.org/lkml/2007/3/11/112
Submitter  : Tomas Janousek <tomi@nomi.cz>
             Thomas Meyer <thomas@m3y3r.de>
             Milan Broz <mbroz@redhat.com>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Patch      : http://lkml.org/lkml/2007/3/16/406
Status     : patch available


Subject    : system doesn't come out of suspend  (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/22/391
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
             Soeren Sonnenburg <kernel@nn7.de>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
             Tejun Heo <htejun@gmail.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : first disk access after resume takes several minutes
             ('date' does not advance after resume from RAM, CONFIG_NO_HZ=n)
References : http://lkml.org/lkml/2007/3/8/117
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
Status     : problem is being debugged


Subject    : after resume: X hangs after drawing a couple of windows
References : http://lkml.org/lkml/2007/3/8/117
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
Status     : unknown

-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : Dynticks and High resolution Timer hanging the system
             workaround: clocksource=acpi_pm
References : http://lkml.org/lkml/2007/3/7/504
Submitter  : Stephane Casset <sept@logidee.com>
Caused-By  : Thomas Gleixner <tglx@linutronix.de>
Status     : unknown


Subject    : soft lockup detected on CPU#0
References : http://lkml.org/lkml/2007/3/3/152
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
Status     : unknown


Subject    : dynticks makes ksoftirqd1 use unreasonable amount of cpu time
References : http://bugzilla.kernel.org/show_bug.cgi?id=8100
Submitter  : Emil Karlson <jkarlson@cc.hut.fi>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Status     : problem is being debugged


Subject    : i386: APIC timer disabled due to verification failure
             (once in three boots or so)
References : http://lkml.org/lkml/2007/3/16/126
Submitter  : Maxim Levitsky <maximlevitsky@gmail.com>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Patch      : http://lkml.org/lkml/2007/3/16/420
Status     : patch available


-

From: Maxim
Date: Sunday, March 18, 2007 - 12:07 pm

Hello,

This one is a small issue; I have bigger ones: as I said, the change in code ordering in the suspend code broke both suspend to RAM and suspend to disk.

These are the commits:
        e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c (breaks  S3)
        ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c (breaks swsusp, I don't use it, but I tested it)
        259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c (breaks uswsusp, that I use)

The system freezes before suspend with those commits, even before suspend to disk.

I think that it is not such a good idea to report all bugs in a single letter ;-)

Regards,
	Maxim Levitsky
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 12:22 pm

This was in my list as

Subject    : suspend to disk hangs
References : http://lkml.org/lkml/2007/3/16/126
Submitter  : Maxim Levitsky <maximlevitsky@gmail.com>
Status     : unknown


These were 6 emails.

And they should give an overview as complete as possible of all 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Maxim
Date: Sunday, March 18, 2007 - 12:59 pm

You didn't understand me; I wanted to say that I sent an email with the subject line
"[BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far"

And I now think it was wrong of me to put all the regressions and bugs I found into that one email rather than into separate emails.
Anyway, it is not a problem.

And by the way, suspend to RAM also hangs, and it hangs before suspending (the disk powers down, the screen too, but the fans stay on, etc.)


-

From: Maxim
Date: Sunday, March 18, 2007 - 1:03 pm

oops, ;-)

Regards,
	Maxim Levitsky
-

From: Adrian Bunk
Date: Sunday, March 18, 2007 - 11:49 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : Oops when changing DVB-T adapter
References : http://lkml.org/lkml/2007/3/9/212
Submitter  : CIJOML <cijoml@volny.cz>
Status     : unknown


Subject    : ipv6 crash
References : http://lkml.org/lkml/2007/3/10/2
Submitter  : Len Brown <lenb@kernel.org>
Status     : unknown


Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <kernel@marduk.letterboxes.org>
Handled-By : Ayaz Abdulla <aabdulla@nvidia.com>
Status     : submitter was asked to test a patch


Subject    : snd_intel8x0: divide error: 0000
References : http://lkml.org/lkml/2007/3/5/252
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Handled-By : Takashi Iwai <tiwai@suse.de>
Status     : problem is being debugged


Subject    : snd-intel8x0: no 3d surround sound
References : http://lkml.org/lkml/2007/3/5/164
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Caused-By  : Randy Cushman <rcushman_linux@earthlink.net>
             commit 831466f4ad2b5fe23dff77edbe6a7c244435e973
Handled-By : Takashi Iwai <tiwai@suse.de>
Status     : patch available


Subject    : Oops in __nodemgr_remove_host_dev
References : http://lkml.org/lkml/2007/3/14/4
             http://lkml.org/lkml/2007/3/18/87
Submitter  : Ismail Dönmez <ismail@pardus.org.tr>
             Stefan Richter <stefanr@s5r6.in-berlin.de>
             Thomas Meyer <thomas@m3y3r.de>
Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
             commit 43cb76d91ee85f579a69d42bc8efc08bac560278
             commit ...
From: David Miller
Date: Monday, March 19, 2007 - 7:38 pm

From: Adrian Bunk <bunk@stusta.de>

This is caused by some problem in the router round-robin code in
net/ipv6/route.c:rt6_select()

Somehow it NULLs out fn->leaf, and then fib6_add_1() crashes
dereferencing that NULL pointer as is seen in the report.

Deleting the router round-robin list mangling code in rt6_select()
makes the crash go away, but such a change causes regressions in the
ipv6 conformance tests.

Thomas Graf discovered this bug some time ago, but we still
haven't come up with a fix suitable for upstream :-/

This bug has been there for a very long time and is not a regression
of 2.6.21

I'll see if I can come up with something to fix this properly.
-

From: David Miller
Date: Saturday, March 24, 2007 - 12:50 pm

From: David Miller <davem@davemloft.net>

Here is the fix I came up with and just posted to netdev for
a quick review, I'll push this to the appropriate places soon
if nobody spots any problems in it.

commit 4c68db63b8314df3cf30b7fe595a1b8935bb2cb0
Author: David S. Miller <davem@sunset.davemloft.net>
Date:   Sat Mar 24 12:06:32 2007 -0700

    [IPV6]: Fix routing round-robin locking.
    
    As per RFC2461, section 6.3.6, item #2, when no routers on the
    matching list are known to be reachable or probably reachable we
    do round robin on those available routes so that we make sure
    to probe as many of them as possible to detect when one becomes
    reachable faster.
    
    Each routing table has a rwlock protecting the tree and the linked
    list of routes at each leaf.  The round robin code executes during
    lookup and thus with the rwlock taken as a reader.  A small local
    spinlock tries to provide protection but this does not work at all
    for two reasons:
    
    1) The round-robin list manipulation, as coded, goes like this (with
       read lock held):
    
    	walk routes finding head and tail
    
    	spin_lock();
    	rotate list using head and tail
    	spin_unlock();
    
       While one thread is rotating the list, another thread can
       end up with stale values of head and tail and then proceed
       to corrupt the list when it gets the lock.  This ends up causing
       the OOPS in fib6_add() later on that many people have been hitting.
    
    2) All the other code paths that run with the rwlock held as
       a reader do not expect the list to change on them, they
       expect it to remain completely fixed while they hold the
       lock in that way.
    
    So, simply stated, it is impossible to implement this correctly using
    a manipulation of the list without violating the rwlock locking
    semantics.
    
    Reimplement using a per-fib6_node round-robin pointer.  This way we
    don't need to manipulate the ...
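
[Editor's sketch] The idea of a per-node round-robin pointer can be shown
in a few lines of userspace C. This is a minimal single-threaded model,
not the code from the commit: struct route_sketch, struct node_sketch and
select_round_robin() are hypothetical names. Selection only advances a
pointer stored in the node, so the order of the route list itself never
changes.

```c
#include <assert.h>
#include <stddef.h>

struct route_sketch {
	int id;
	struct route_sketch *next;	/* list order is never modified */
};

struct node_sketch {
	struct route_sketch *leaf;	/* head of the route list */
	struct route_sketch *rr_ptr;	/* where the next pick starts */
};

/*
 * Return the current round-robin candidate and advance the pointer,
 * wrapping to the head of the list after the last route.
 */
static struct route_sketch *select_round_robin(struct node_sketch *fn)
{
	struct route_sketch *rt = fn->rr_ptr ? fn->rr_ptr : fn->leaf;

	if (rt)
		fn->rr_ptr = rt->next ? rt->next : fn->leaf;
	return rt;
}
```

Readers that walk the list see it unchanged whichever route is picked.
A real implementation still has to serialize updates to the pointer
itself; that part is left out here.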
From: Andrew Morton
Date: Friday, March 16, 2007 - 11:26 am

fsck indeed.  I don't even understand what's happening with that one - it
seems like the kernel schedules a user process, but never deschedules it
again.
-

From: Michal Piotrowski
Date: Friday, March 16, 2007 - 11:55 am

Tomorrow, I'll try to find out how to reproduce this bug.

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Jan Engelhardt
Date: Friday, March 16, 2007 - 4:23 pm

"I have noticed some strange system behavior. When i try to build a
 kernel (medium load) - X, keyboard, mouse and sound hangs."

Note that ping is handled in interrupt or softirq context. So something has
locked up. Try without X? Or perhaps attach a serial console/netconsole, and
when it hangs, use Sysrq to dump the processes' states.


Jan
-- 
-

From: Michal Piotrowski
Date: Friday, March 16, 2007 - 4:31 pm

I already did this

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Mariusz Kozlowski
Date: Saturday, March 17, 2007 - 1:19 am

Me too.

http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/1243.html

Although I can't reproduce it with 2.6.21-rc3-mm1.

Regards,

	Mariusz Kozlowski
-

From: Takashi Iwai
Date: Friday, March 16, 2007 - 11:54 am

At Fri, 16 Mar 2007 18:44:26 +0100,

The patch wasn't merged to rc4.
Or, do you mean the bug is still present even with the patch?


Takashi
-

From: Michal Piotrowski
Date: Friday, March 16, 2007 - 12:03 pm

This patch doesn't fix the problem.
http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/3080.html


Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Adrian Bunk
Date: Saturday, March 17, 2007 - 4:46 pm

You said at [1] that the patch from [2] (not yet merged into Linus' tree) 
fixed this problem for you.


cu
Adrian

[1] http://lkml.org/lkml/2007/3/13/276
[2] http://lkml.org/lkml/2007/3/7/540

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Michal Piotrowski
Date: Sunday, March 18, 2007 - 6:04 am

Sorry, it's my fault.


Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Takashi Iwai
Date: Friday, March 16, 2007 - 10:01 am

At Fri, 16 Mar 2007 09:33:54 -0700 (PDT),

alsa-git merge seems missing in rc4:
	http://lkml.org/lkml/2007/3/14/43

Linus, could you pull from below (linus branch)?

  master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa.git linus

It includes regression fixes.


Thanks,

Takashi
-

From: Adrian Bunk
Date: Monday, March 19, 2007 - 1:39 pm

Let's not bore people running -rc kernels with regressions that have 
patches available - let's get these patches into the tree to give 
them the pure exciting experience of the many unfixed regressions.


This email lists some known regressions in Linus' tree compared to 2.6.20
with patches available.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : acpi_serialize locks system during boot
References : http://bugzilla.kernel.org/show_bug.cgi?id=8171
Submitter  : Colchao <colchaodemola@gmail.com>
Handled-By : Len Brown <len.brown@intel.com>
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=8171
Status     : patch available


Subject    : laptop immediately resumes after suspend
References : http://lkml.org/lkml/2007/3/8/469
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Caused-By  : Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
             commit ed41dab90eb40ac4911e60406bc653661f0e4ce1
Handled-By : Len Brown <lenb@kernel.org>
Patch      : http://lkml.org/lkml/2007/3/12/228
Status     : patch available


Subject    : suspend/resume hangs (until keypress)  (CONFIG_NO_HZ)
References : http://bugzilla.kernel.org/show_bug.cgi?id=8181
             http://lkml.org/lkml/2007/3/11/112
Submitter  : Tomas Janousek <tomi@nomi.cz>
             Thomas Meyer <thomas@m3y3r.de>
             Milan Broz <mbroz@redhat.com>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Patch      : http://lkml.org/lkml/2007/3/16/406
Status     : patch available


Subject    : snd-intel8x0: no 3d surround sound
References : http://lkml.org/lkml/2007/3/5/164
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Caused-By  : Randy Cushman <rcushman_linux@earthlink.net>
   ...
From: Takashi Iwai
Date: Tuesday, March 20, 2007 - 4:02 am

At Mon, 19 Mar 2007 21:39:43 +0100,

The patch was already merged after rc4.


thanks,

Takashi
-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 11:48 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : problem with sockets
References : http://lkml.org/lkml/2007/3/21/248
Submitter  : Jose Alberto Reguero <jareguero@telefonica.net>
Status     : unknown


Subject    : crashes in KDE
References : http://bugzilla.kernel.org/show_bug.cgi?id=8157
Submitter  : Oliver Pinter <oliver.pntr@gmail.com>
Status     : unknown


Subject    : kwin dies silently  (sysctl related?)
References : http://lkml.org/lkml/2007/2/28/112
Submitter  : Sid Boyce <g3vbv@blueyonder.co.uk>
Status     : submitter was asked to bisect further


-

From: David Miller
Date: Saturday, March 24, 2007 - 9:45 pm

From: Adrian Bunk <bunk@stusta.de>

Not enough information in his report, for example for the
case he says does not work he fails to indicate what kernel
or system type the Client runs on.

Furthermore, his scripts don't even execute properly when
I try to run them myself, for example server.py gives me
this syntax error when python tries to parse the script:

davem@sunset:~/src/GIT/net-2.6$ /usr/bin/python server.py
  File "server.py", line 9
    struct sockaddr_in ServerAddr;
                     ^
SyntaxError: invalid syntax


Can someone help de-crapify this bug report?
-

From: Paul Collins
Date: Saturday, March 24, 2007 - 10:08 pm

The reporter's client and server code seem to be C, but if he thinks
they should be Python, I guess they're not the correct ones anyway.

-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood
-

From: Adrian Bunk
Date: Sunday, March 25, 2007 - 5:22 am

He described one Python and one C example, and attached 2+2 programs.

The mailing list archive I quoted only contains the C files; in place of 
the Python files it shows the error message
"unhandled content-type:application/x-python" - but that's not the fault 
of the submitter, the bug report itself is OK.

I've attached all 4 files to this email.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

From: Adrian Bunk
Date: Friday, March 23, 2007 - 11:48 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : kernels fail to boot with drives on ATIIXP controller
             (ACPI/IRQ related)
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621
             http://lkml.org/lkml/2007/3/4/257
Submitter  : Michal Jaegermann <michal@ellpspace.math.ualberta.ca>
Status     : unknown


Subject    : x86_64: ACPI regression with noapic  (APICTIMER_STOPS_ON_C3?)
References : http://lkml.org/lkml/2007/3/8/468
             http://lkml.org/lkml/2007/3/22/156
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Status     : problem is being debugged


Subject    : NCQ problem with ahci and Hitachi drive  (ACPI related)
References : http://lkml.org/lkml/2007/3/4/178
             http://lkml.org/lkml/2007/3/9/475
Submitter  : Mathieu Bérard <Mathieu.Berard@crans.org>
Handled-By : Tejun Heo <htejun@gmail.com>
Status     : problem is being debugged


Subject    : libata: PATA UDMA/100 configured as UDMA/33
References : http://lkml.org/lkml/2007/2/20/294
             http://www.mail-archive.com/linux-ide@vger.kernel.org/msg04115.html
             http://bugzilla.kernel.org/show_bug.cgi?id=8133
             http://bugzilla.kernel.org/show_bug.cgi?id=8164
             http://lkml.org/lkml/2007/3/21/330
Submitter  : Fabio Comolli <fabio.comolli@gmail.com>
             Plamen Petrov <plamen.petrov@tk.ru.acad.bg>
             Laurent Riffard <laurent.riffard@free.fr>
             Lukas Hejtmanek <xhejtman@mail.muni.cz>
Handled-By : Tejun Heo <htejun@gmail.com>
             Alan Cox <alan@lxorguk.ukuu.org.uk>
Status ...
From: Thomas Gleixner
Date: Friday, March 23, 2007 - 2:08 pm

Ray,

can you please test the patch below ?

Thanks,

	tglx

------------------>
Subject: [PATCH] x86_64: avoid sending LOCAL_TIMER_VECTOR IPI to itself

Ray Lee reported, that on an UP kernel with "noapic" commandline option
set, the box locks hard during boot.

Adding some debug printks revieled, that the last action on the box
before stalling was "Send IPI" - a debug printk which was put into
smp_send_timer_broadcast_ipi().

It seems that send_IPI_mask(mask, LOCAL_TIMER_VECTOR) fails when
"noapic" is set on the commandline on an UP kernel.

Aside of that it does not make much sense to trigger an interrupt
instead of calling the function directly on the CPU which gets the
PIT/HPET interrupt in case of broadcasting.

Reported-by: Ray Lee <ray-lk@madrabbit.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index 723417d..83328e1 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -930,9 +930,17 @@ EXPORT_SYMBOL(switch_APIC_timer_to_ipi);
 
 void smp_send_timer_broadcast_ipi(void)
 {
+	int cpu = smp_processor_id();
 	cpumask_t mask;
 
 	cpus_and(mask, cpu_online_map, timer_interrupt_broadcast_ipi_mask);
+
+	if (cpu_isset(cpu, mask)) {
+		cpu_clear(cpu, mask);
+		add_pda(apic_timer_irqs, 1);
+		smp_local_timer_interrupt();
+	}
+
 	if (!cpus_empty(mask)) {
 		send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
 	}
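
[Editor's sketch] The logic of the hunk above, modelled in userspace with
a plain bitmask in place of cpumask_t. All names here are hypothetical
stand-ins (local_timer_handler_sim() for smp_local_timer_interrupt(),
send_ipi_mask_sim() for send_IPI_mask()), and the per-CPU interrupt
counter is left out. If the current CPU is in the broadcast mask, it is
removed from the mask and the handler is called directly; only the
remaining CPUs get an IPI.

```c
#include <assert.h>

static int local_calls;		/* times the handler ran directly */
static unsigned int ipi_sent;	/* CPUs that were sent a simulated IPI */

static void local_timer_handler_sim(void)
{
	local_calls++;
}

static void send_ipi_mask_sim(unsigned int mask)
{
	ipi_sent |= mask;
}

static void broadcast_timer_sketch(int this_cpu, unsigned int online,
				   unsigned int wants_broadcast)
{
	unsigned int mask = online & wants_broadcast;
	unsigned int self = 1u << this_cpu;

	/* Never send the IPI to ourselves: run the handler in place. */
	if (mask & self) {
		mask &= ~self;
		local_timer_handler_sim();
	}

	/* Only interrupt other CPUs if any are left in the mask. */
	if (mask)
		send_ipi_mask_sim(mask);
}
```

On a UP system the mask is empty after the current CPU is removed, so no
IPI is sent at all.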


-

From: Ray Lee
Date: Friday, March 23, 2007 - 5:14 pm

(I wondered about the IPI on a UP system, seemed a bit weird :-).)

Works great, booting both with NOAPIC and without. *Much* thanks for
debugging this while you're also handling a bunch of other issues at
the same time.

Patch reproduced below, with an acked-by (and, uhm, a couple of spelling
fixes in the description -- don't hate me, 'kay?). Please apply before
2.6.21 final.

------------------>
Subject: [PATCH] x86_64: avoid sending LOCAL_TIMER_VECTOR IPI to itself

Ray Lee reported, that on an UP kernel with "noapic" command line option
set, the box locks hard during boot.

Adding some debug printks revealed, that the last action on the box
before stalling was "Send IPI" - a debug printk which was put into
smp_send_timer_broadcast_ipi().

It seems that send_IPI_mask(mask, LOCAL_TIMER_VECTOR) fails when
"noapic" is set on the command line on an UP kernel.

Aside of that it does not make much sense to trigger an interrupt
instead of calling the function directly on the CPU which gets the
PIT/HPET interrupt in case of broadcasting.

Reported-by: Ray Lee <ray-lk@madrabbit.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by:  Ray Lee <ray-lk@madrabbit.org>

diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index 723417d..83328e1 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -930,9 +930,17 @@ EXPORT_SYMBOL(switch_APIC_timer_to_ipi);

 void smp_send_timer_broadcast_ipi(void)
 {
+	int cpu = smp_processor_id();
 	cpumask_t mask;

 	cpus_and(mask, cpu_online_map, timer_interrupt_broadcast_ipi_mask);
+
+	if (cpu_isset(cpu, mask)) {
+		cpu_clear(cpu, mask);
+		add_pda(apic_timer_irqs, 1);
+		smp_local_timer_interrupt();
+	}
+
 	if (!cpus_empty(mask)) {
 		send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
 	}


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 11:40 pm

Ray,



I know that my English sucks.

Thanks,

	tglx


-

From: Ray Lee
Date: Saturday, March 24, 2007 - 11:17 am

Your English is fantastic, and far better than my German ever will be, so
no worries :-).

~r.
-

From: Ingo Molnar
Date: Saturday, March 24, 2007 - 12:11 pm

i think this bug deserves a bit more attention, because similar problems 
could be in other codepaths too.

the problem here is that we tried to send an IPI to ourselves - which 
confused Ray's system which has an IO-APIC, but where due to noapic we 
keep the IO-APIC in its BIOS default.

this isn't a new problem: the new time code just exposed it more 
prominently than it was visible before. (the SMP kernel probably would 
hang in a similar way on Ray's system)

i don't see any clear debugging in the IPI code that excludes self-IPIs. 
I think the only valid way to do that is to use DEST_SELF. Andi?

	Ingo
-

From: Ray Lee
Date: Sunday, March 25, 2007 - 12:24 pm

Taking the hint, yes it does. (I'd never had a reason to test it before.)
Booting an SMP kernel without NOAPIC works, with NOAPIC hangs fairly early
on, implying a real fix to my problem belongs down at the arch level, I

Ray
-

From: Tejun Heo
Date: Monday, March 26, 2007 - 3:01 am

A patch is available, and whether to put it into mainline or not is being
discussed.  libata EH does the right thing after several errors, so
things should work properly from then on.


Further patch submitted.

http://thread.gmane.org/gmane.linux.ide/17444

This should fix all regression cases.  sata_nv has always been broken, so
it isn't a regression.  It needs ACPI tricks, and I don't think it fits
the 2.6.21 cycle.

Thanks.

-- 
tejun
-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 11:50 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : Oops when changing DVB-T adapter
References : http://lkml.org/lkml/2007/3/9/212
Submitter  : CIJOML <cijoml@volny.cz>
Status     : unknown


Subject    : USB: iPod doesn't work
References : http://lkml.org/lkml/2007/3/21/320
Submitter  : Tino Keitel <tino.keitel@gmx.de>
Handled-By : Oliver Neukum <oneukum@suse.de>
Status     : problem is being debugged


Subject    : snd_intel8x0: divide error: 0000
References : http://lkml.org/lkml/2007/3/5/252
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Handled-By : Takashi Iwai <tiwai@suse.de>
Status     : problem is being debugged


Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <kernel@marduk.letterboxes.org>
Handled-By : Ayaz Abdulla <aabdulla@nvidia.com>
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Status     : patch available


-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 11:50 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, a patch
of yours caused a breakage, or I'm considering you in some other way
possibly involved with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : system doesn't come out of suspend  (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/22/391
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
             Soeren Sonnenburg <kernel@nn7.de>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
             Tejun Heo <htejun@gmail.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : first disk access after resume takes several minutes
             ('date' does not advance after resume from RAM, CONFIG_NO_HZ=n)
References : http://lkml.org/lkml/2007/3/8/117
Submitter  : Michael S. Tsirkin <mst@mellanox.co.il>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
Status     : problem is being debugged


Subject    : Dynticks and High resolution Timer hanging the system
             workaround: clocksource=acpi_pm
References : http://lkml.org/lkml/2007/3/7/504
Submitter  : Stephane Casset <sept@logidee.com>
Caused-By  : Thomas Gleixner <tglx@linutronix.de>
Status     : unknown


Subject    : soft lockup detected on CPU#0
References : http://lkml.org/lkml/2007/3/3/152
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Handled-By : Thomas Gleixner <tglx@linutronix.de>
             Ingo Molnar <mingo@elte.hu>
Status     : unknown


Subject    : gettimeofday increments too slowly
References : http://bugzilla.kernel.org/show_bug.cgi?id=8027
Submitter  : David L <idht4n@hotmail.com>
Caused-By  : Thomas Gleixner <tglx@linutronix.de>
             commit ...
From: Thomas Gleixner
Date: Friday, March 23, 2007 - 12:15 pm

Patch available: http://lkml.org/lkml/2007/3/22/301

commit 6b3964cde70cfe6db79d35b42137431ef7d2f7e4

	tglx


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 12:21 pm

Oops. That fixed only one half of the problem. The timeofday one
persists.

John, any idea ?

	tglx




-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 12:15 pm

As far as I understood it, this patch only fixed the bogomips issue
caused by commit e9e2cdb412412326c4827fc78ba27f410d837e6e, but not the 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Chuck Ebbert
Date: Friday, March 23, 2007 - 3:23 pm

For the other issue raised there, clock running too slow, I now
realize there is a similar report:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=231626

-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 3:43 pm

That's a different one, AFAICT. David's problem is probably caused by me
breaking the TSC watchdog. 

/me orders paperbags prophylactically and goes back to look at the code

	tglx


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 4:35 pm

David,

can you please test the patch below ?

	tglx

--------------------->
Subject: [PATCH] clocksource: Fix thinko in watchdog selection

The watchdog implementation unintentionally excludes low res / non
continuous clocksources from being selected as a watchdog reference.

Allow using jiffies/PIT as a watchdog reference as long as no better
clocksource is available. This is necessary to detect TSC breakage on
systems that have no pmtimer/hpet.

The main goal of the initial patch (preventing a switch to highres/nohz
when no reliable fallback clocksource is available) is still guaranteed
by the checks in clocksource_watchdog().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 5b0e46b..fe5c7db 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -151,7 +151,8 @@ static void clocksource_check_watchdog(struct clocksource *cs)
 			watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
 			add_timer(&watchdog_timer);
 		}
-	} else if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS) {
+	} else {
+		if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS)
 			cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
 
 		if (!watchdog || cs->rating > watchdog->rating) {
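
[Editor's sketch] The effect of turning the "else if" into a plain "else"
can be tried out in userspace. This is a simplified model of the
selection step only: the flag values are illustrative, and
struct cs_sketch, watchdog_ref and check_watchdog_sketch() are
hypothetical names. A clocksource that needs verification is skipped; any
other clocksource may become the watchdog reference if it is rated higher
than the current one, while only continuous ones are marked as valid for
high resolution mode.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values, not the kernel's. */
#define CS_IS_CONTINUOUS	0x01u
#define CS_MUST_VERIFY		0x02u
#define CS_VALID_FOR_HRES	0x04u

struct cs_sketch {
	const char *name;
	int rating;
	unsigned int flags;
};

static struct cs_sketch *watchdog_ref;	/* current watchdog reference */

static void check_watchdog_sketch(struct cs_sketch *cs)
{
	/* Sources that must be verified are watched, never the watcher. */
	if (cs->flags & CS_MUST_VERIFY)
		return;

	/* Only continuous sources qualify for high resolution mode. */
	if (cs->flags & CS_IS_CONTINUOUS)
		cs->flags |= CS_VALID_FOR_HRES;

	/* With the plain "else", non-continuous sources get here too. */
	if (!watchdog_ref || cs->rating > watchdog_ref->rating)
		watchdog_ref = cs;
}
```

With the old "else if", a low-rated non-continuous source such as jiffies
would never reach the last test, so a system without pmtimer/hpet ended
up with no watchdog reference at all.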


-

From: Thomas Gleixner
Date: Sunday, March 25, 2007 - 5:42 am

The watchdog implementation unintentionally excludes low res / non
continuous clocksources from being selected as a watchdog reference.

Allow using jiffies/PIT as a watchdog reference as long as no better
clocksource is available. This is necessary to detect TSC breakage on
systems that have no pmtimer/hpet.

The main goal of the initial patch (preventing a switch to highres/nohz
when no reliable fallback clocksource is available) is still guaranteed
by the checks in clocksource_watchdog().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 5b0e46b..fe5c7db 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -151,7 +151,8 @@ static void clocksource_check_watchdog(struct clocksource *cs)
 			watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
 			add_timer(&watchdog_timer);
 		}
-	} else if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS) {
+	} else {
+		if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS)
 			cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
 
 		if (!watchdog || cs->rating > watchdog->rating) {


-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 4:00 pm

It shouldn't be the same issue:
2.6.20-1.2925.fc6 is based on 2.6.20.3-rc1 while this issue is a
2.6.21-rc regression.
Or do the -rc kernels include parts of the -rt patchsets?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Chuck Ebbert
Date: Friday, March 23, 2007 - 4:05 pm

Sometimes problems leak from the next kernel version into -stable
via the stable kernel patches.

Other times the bug may have been there all along but nobody had
found it yet...


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 12:22 pm

The problem is not reproducible on any of my machines.

Emil, is it still there with Linus' latest ?

	tglx


-

From: Thomas Gleixner
Date: Saturday, March 24, 2007 - 6:47 am

Emil,


I've uploaded a patch against 2.6.21-rc4 to

http://tglx.de/private/tglx/2.6.21-rc4-trace/2.6.21-rc4-trace.patch.bz2

It contains all changes in Linus' tree since -rc4 plus the two pending
fixes (http://tglx.de/private/tglx/2.6.21-rc4-pending/) along with a
backport of the latency tracer from the realtime preemption patch.

Can you please apply the patch on top of -rc4 and build it with the
configuration that exposes this strange behaviour? Please also enable
CONFIG_LATENCY_TRACE in the Kernel hacking menu.

When the problem is visible, run trace-it
(http://tglx.de/private/tglx/2.6.21-rc4-trace/trace-it.c) as root:

# trace-it >trace.txt

This captures roughly one second of kernel code paths. Please stick
trace.txt into Bugzilla.

Thanks,

	tglx


-

From: Thomas Gleixner
Date: Sunday, March 25, 2007 - 5:31 am

The rework of next_timer_interrupt() fixed the timer wheel bugs, but
introduced a rounding error versus the next hrtimer event. This is caused
by the conversion of the hrtimer internal representation to relative
jiffies.

This causes bug #8100:
http://bugzilla.kernel.org/show_bug.cgi?id=8100

next_timer_interrupt() returns "now" in such a case and causes the code
in tick_nohz_stop_sched_tick() to trigger the timer softirq, which is
bogus as no timer is due for expiry. This results in endless context
switching between idle and ksoftirqd until a timer is due for expiry.

Modify the hrtimer evaluation so that it returns now + 1 when the
conversion results in a delta < 1 jiffy.

It's confirmed to resolve bug #8100.

Reported-by: Emil Karlson <jkarlson@cc.hut.fi>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/timer.c b/kernel/timer.c
index 797cccb..440048a 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -695,15 +695,28 @@ static unsigned long cmp_next_hrtimer_event(unsigned long now,
 {
 	ktime_t hr_delta = hrtimer_get_next_event();
 	struct timespec tsdelta;
+	unsigned long delta;
 
 	if (hr_delta.tv64 == KTIME_MAX)
 		return expires;
 
-	if (hr_delta.tv64 <= TICK_NSEC)
-		return now;
+	/*
+	 * Expired timer available, let it expire in the next tick
+	 */
+	if (hr_delta.tv64 <= 0)
+		return now + 1;
 
 	tsdelta = ktime_to_timespec(hr_delta);
-	now += timespec_to_jiffies(&tsdelta);
+	delta = timespec_to_jiffies(&tsdelta);
+	/*
+	 * Take rounding errors in to account and make sure, that it
+	 * expires in the next tick. Otherwise we go into an endless
+	 * ping pong due to tick_nohz_stop_sched_tick() retriggering
+	 * the timer softirq
+	 */
+	if (delta < 1)
+		delta = 1;
+	now += delta;
 	if (time_before(now, expires))
 		return now;
 	return expires;
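
For readers who want to poke at the clamping outside the kernel, here is a
minimal user-space model of the patched function. Everything in it is an
assumption made for illustration: the sketch_* names are made up, the tick
length is fixed at 1 ms, and the nanosecond-to-jiffies conversion is modelled
as a plain truncating division rather than ktime_to_timespec() plus
timespec_to_jiffies().

```c
#include <stdint.h>

#define SKETCH_TICK_NSEC  1000000LL   /* assumption: 1 ms per tick */
#define SKETCH_NO_HRTIMER INT64_MAX   /* stand-in for KTIME_MAX */

/* Wrap-safe "a is before b" test for jiffies-style counters. */
static int sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

unsigned long sketch_cmp_next_event(unsigned long now, unsigned long expires,
				    int64_t hr_delta_ns)
{
	unsigned long delta;

	/* No hrtimer pending: the timer wheel expiry stands. */
	if (hr_delta_ns == SKETCH_NO_HRTIMER)
		return expires;

	/* Already expired: let it fire in the next tick, never "now". */
	if (hr_delta_ns <= 0)
		return now + 1;

	/* The conversion may round down to 0; clamp to one tick. */
	delta = (unsigned long)(hr_delta_ns / SKETCH_TICK_NSEC);
	if (delta < 1)
		delta = 1;

	now += delta;
	return sketch_time_before(now, expires) ? now : expires;
}
```

The point of the clamp is that the function can no longer return the unchanged
"now" for a pending hrtimer, which is the value that made
tick_nohz_stop_sched_tick() raise the timer softirq with nothing to expire.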


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 12:49 pm

I lost track of Michael's various nested problems.

Michael, can you please give a summary of _all_ entries in the
regressions list against Linus' latest ?

Thanks,

	tglx


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 1:00 pm

Stephane, does the problem still exist with Linus' latest ?

Thanks,

	tglx


-

From: Thomas Gleixner
Date: Friday, March 23, 2007 - 1:08 pm

Michal,

any news on that one ? 

You said the same problem exists in 2.6.20.1. Has this been resolved in
2.6.20.2/3 ?

Thanks,

	tglx


-

From: Michal Piotrowski
Date: Saturday, March 24, 2007 - 6:59 am

Hi,



Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Thomas Gleixner
Date: Saturday, March 24, 2007 - 8:14 am

Is it solved in Linus' latest too ?

	tglx


-

From: Michal Piotrowski
Date: Saturday, March 24, 2007 - 9:13 am

Yes, it's solved.

Adrian, please remove this bug from known regressions list.
It's fixed in the latest -git and -stable.

Regards,
Michal

-

From: john stultz
Date: Friday, March 23, 2007 - 2:43 pm

The incorrect clocksource selection is resolved w/ this patch:
http://lkml.org/lkml/2007/3/22/287

There is still an issue of why the PIT clocksource hangs, but for the
moment the issue is worked around.

thanks
-john

-

From: Linus Torvalds
Date: Friday, March 23, 2007 - 2:54 pm

Hmm.. I haven't seen it until now. Is it waiting for something?

		Linus
-

From: john stultz
Date: Friday, March 23, 2007 - 5:44 pm

Not really. Just waiting for Andrew to pick it up and push it on.

thanks
-john

-

From: Adrian Bunk
Date: Friday, March 23, 2007 - 11:50 am

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author
of a patch that caused a breakage, or someone I consider possibly involved
in one or more of these issues in some other way.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : ThinkPad X60: resume no longer works  (PCI related?)
References : http://lkml.org/lkml/2007/3/13/3
Submitter  : Dave Jones <davej@redhat.com>
             Jeremy Fitzhardinge <jeremy@goop.org>
Caused-By  : PCI merge
             commit 78149df6d565c36675463352d0bfe0000b02b7a7
Handled-By : Eric W. Biederman <ebiederm@xmission.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : MacMini: doesn't come out of suspend to ram
References : http://lkml.org/lkml/2007/3/21/374
Submitter  : Frédéric RISS <frederic.riss@gmail.com>
             Tino Keitel <tino.keitel@gmx.de>
Caused-By  : Bob Moore <robert.moore@intel.com>
             commit c5a7156959e89b32260ad6072bbf5077bcdfbeee
Status     : unknown


Subject    : Suspend to RAM doesn't work anymore  (ACPI?)
References : http://lkml.org/lkml/2007/3/19/128
             http://bugzilla.kernel.org/show_bug.cgi?id=8247
Submitter  : Tobias Doerffel <tobias.doerffel@gmail.com>
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : s2ram autowake regression  (ACPI?)
References : http://lkml.org/lkml/2007/3/20/96
Submitter  : Pavel Machek <pavel@ucw.cz>
Handled-By : Len Brown <lenb@kernel.org>
Status     : submitter was asked to test a patch


Subject    : ThinkPad doesn't resume from suspend to RAM
References : http://lkml.org/lkml/2007/2/27/80
             http://lkml.org/lkml/2007/2/28/348
Submitter  : Jens Axboe <jens.axboe@oracle.com>
             Jeff Chua <jeff.chua.linux@gmail.com>
Status     : ...
From: Maxim
Date: Friday, March 23, 2007 - 12:07 pm

Hello, 
	It is fixed.

	The problem is that cpu_up/cpu_down is now called with tasks frozen,
	and this can lead to a deadlock if a driver that registered a cpu up/down notifier sleeps.

	On my system it froze in two places: once in XFS, due to freezable workqueues, and once in
	the microcode update driver, which asks the "frozen" userspace for firmware.

	The fix for XFS is already in mainline, and Rafael J. Wysocki has already posted a patch that fixes the microcode issue.
	I will test it.

	But I suspect there are more drivers that can deadlock the system in the same way. On my system S3/S4 works perfectly :-)
	Even the weird hang I had has disappeared.

	Big thanks to Rafael J. Wysocki.


	Best regards,
		Maxim Levitsky
-

From: Rafael J. Wysocki
Date: Friday, March 23, 2007 - 1:53 pm

The problem has been identified as the known issue related to the XFS freezable
workqueues.  There is a patch available (http://lkml.org/lkml/2007/3/21/328),
that has been merged.

Still, there is a problem with the microcode update driver that's being worked
on.

The reporters of the resume problems who use the microcode driver, please
check if the problems go away if you unload the driver before the suspend.

Greetings,
Rafael
-

From: Thomas Meyer
Date: Saturday, March 24, 2007 - 10:04 am

The problem is identified: http://lkml.org/lkml/2007/3/22/150



-

From: Eric W. Biederman
Date: Saturday, March 24, 2007 - 11:02 am

Given the description above I'm a little confused.  Doesn't this
happen every time now?

Or was this happening only the second time before I started my msi
fixes... 

Eric
-

From: Thomas Meyer
Date: Saturday, March 24, 2007 - 11:20 am

With current git head the oops happens in the second suspend to disk,
so I think that the current git head already contains your msi fixes.
I don't know if this already happened before your msi changes, but I can
test 2.6.20 if you like.

-

From: Eric W. Biederman
Date: Saturday, March 24, 2007 - 11:47 am

Odd.  I would have thought the oops happened in the first resume, not
the second. 

Hmm.  It may have something to do with the ``managed'' driver


Sure.  A data point if you boot with nomsi or have a kernel compiled
without msi support would be interesting as well.

As the problem case may not show up without msi support in the picture.

Eric

-

From: Thomas Meyer
Date: Saturday, March 24, 2007 - 1:34 pm

No. I don't think so. The problem is caused by this sequence: (the info
is always before entry of a function and before the exit of a function):

1.) Normal boot
[kernel] ahci 0000:00:1f.2: version 2.1
[kernel] pci_enable_device: dev= c1a59000
[kernel] pci_enable_device: irq= 0
[kernel] pci_enable_device: msi_enabled= 0
[kernel] PCI: Enabling device 0000:00:1f.2 (0005 -> 0007)
[kernel] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) ->
IRQ 19
[kernel] pci_enable_device: dev= c1a59000
[kernel] pci_enable_device: irq= 19
[kernel] pci_enable_device: msi_enabled= 0

2.) msi irq 218 gets assigned

3) First suspend to disk. Consists of
3a) Suspend devices
[kernel] ahci 0000:00:1f.2: freeze
[kernel] pci_disable_device: dev= c1a59000
[kernel] pci_disable_device: irq= 218
[kernel] pci_disable_device: msi_enabled= 1
[kernel] ACPI: PCI interrupt for device 0000:00:1f.2 disabled
[kernel] pci_disable_device: dev= c1a59000
[kernel] pci_disable_device: irq= 218
[kernel] pci_disable_device: msi_enabled= 1

3b) Disable non-boot cpus
3c) Snapshot memory
3d) Enable non-boot cpus
3e) Resume devices (after snapshot!)
[kernel] ahci 0000:00:1f.2: resuming
[kernel] PM: Writing back config space on device 0000:00:1f.2 at offset
1 (was 2b00403, writing 2b00407)
[kernel] pci_enable_device: dev= c1a59000
[kernel] pci_enable_device: irq= 218
[kernel] pci_enable_device: msi_enabled= 1
[kernel] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) ->
IRQ 19
[kernel] pci_enable_device: dev= c1a59000
[kernel] pci_enable_device: irq= 19
[kernel] pci_enable_device: msi_enabled= 1

3f) Write memory image
3g) Power down + reboot

4a) Normal start and restore memory image
4b) Enable non-boot cpus
4c) Resume devices
[kernel] ahci 0000:00:1f.2: resuming
[kernel] PM: Writing back config space on device 0000:00:1f.2 at offset
1 (was 2b00403, writing 2b00407)
[kernel] pci_enable_device: dev= c1a59000
[kernel] pci_enable_device: irq= 218
[kernel] pci_enable_device: msi_enabled= ...
From: Eric W. Biederman
Date: Saturday, March 24, 2007 - 8:39 pm

Ok.  Thanks.   It is the ordering of events that keeps it
from showing up.  The problem happens the first time but only
after we have restored msi state so we don't see the ill effects
until the second time.


Ok, staring at the code and thinking about the problem.  The only
thing that pci_enable_device does (except messing with irqs) is
flip enable bits.   Further, pci_enable_device only messes with irqs
on 5 architectures.   Only ia64 really cares.  On i386 and x86_64
it is simply delaying work until we need it.  frv doesn't
really care; it just pokes the irq value back into the hardware
for some reason.  cris just sets a hard coded value.  Does cris
only have one pci irq?

So I think the right solution is to simply make pci_enable_device
just flip enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in
the release cycle, so I think a quick hack that makes the code
not stomp the irq when msi irq's are enabled should be the first
fix.  Then we can later make the code not change the irqs at all.

Thomas could you verify the patch below makes the problem go away
for you.

Tony, Len: the way pci_disable_device is being used in a suspend/resume
path by a few drivers is completely incompatible with the way irqs
are allocated on ia64.  In particular, the following sequence
occurs in several drivers.

probe:
  pci_enable_device(pdev);
  request_irq(pdev->irq);
suspend:
  pci_disable_device(pdev);
resume:
  pci_enable_device(pdev);
remove:
  free_irq(pdev->irq);
  pci_disable_device(pdev);
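
Why this sequence bites once MSI is in the picture can be shown with a toy
model. The struct and the sketch_* functions below are hypothetical and exist
only for illustration. They are not the PCI core API, and the "guarded"
variant is merely one way to read the quick hack described above.

```c
/* Toy model of the fields that matter here; not struct pci_dev. */
struct sketch_pci_dev {
	unsigned int irq;      /* what the driver reads as pdev->irq */
	unsigned int gsi_irq;  /* legacy irq the arch code would assign */
	int msi_enabled;       /* set once an MSI vector has been assigned */
};

/* MSI setup replaces the legacy irq with the MSI vector. */
void sketch_enable_msi(struct sketch_pci_dev *dev, unsigned int vector)
{
	dev->msi_enabled = 1;
	dev->irq = vector;
}

/* Problematic behaviour: every enable rewrites irq from the GSI. */
void sketch_enable_stomping(struct sketch_pci_dev *dev)
{
	dev->irq = dev->gsi_irq;
}

/* Quick-hack behaviour: leave irq alone while MSI is enabled. */
void sketch_enable_guarded(struct sketch_pci_dev *dev)
{
	if (!dev->msi_enabled)
		dev->irq = dev->gsi_irq;
}
```

In this model, calling sketch_enable_stomping() on resume turns irq 218 back
into 19 while msi_enabled is still 1, which mirrors the log posted earlier in
the thread. A later free_irq(pdev->irq) would then name a different irq than
the one the driver requested.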

What I'm proposing we do is move the irq allocation code out of
pci_enable_device and the irq freeing code out of pci_disable_device
in the future.  If we move ia64 to a model where the irq number equals
the gsi, like we have for x86_64 and are in the middle of for i386, that
should be pretty straightforward.  It would even be relatively simple
to delay vector allocation in that context until request_irq, if we
needed the ...
From: Thomas Meyer
Date: Sunday, March 25, 2007 - 4:41 am

The patch solves the problem. I'm writing this after the third suspend
and resume cycle.
msi irq stays enabled for libata device:
cat /sys/devices/pci0000\:00/0000\:00\:1f.2/irq
218

cat /proc/interrupts
           CPU0       CPU1
  0:     274190          0   IO-APIC-edge      timer
  9:      13417          0   IO-APIC-fasteoi   acpi
 16:        166          0   IO-APIC-fasteoi   uhci_hcd:usb4
 17:      70908      88643   IO-APIC-fasteoi   wifi0
 18:       3060          0   IO-APIC-fasteoi   libata, uhci_hcd:usb3
 19:          8          0   IO-APIC-fasteoi   ohci1394, uhci_hcd:usb2
 20:      46252          0   IO-APIC-fasteoi   HDA Intel
 21:     168437          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5
218:      15896          0   PCI-MSI-edge      libata
219:          1          0   PCI-MSI-edge      eth0
NMI:          0          0
LOC:      87574     123338
ERR:          0
MIS:          0


BUT...


The first suspend to disk is ok. The second suspend to disk shows
strange behaviour:
1.) write pm image
2.) the system disables the non-boot cpus again (I guess this happens in
power_down())
3.) the system doesn't power down.
4.) pressing any key makes the system power down.

The same is true for the third suspend cycle. Maybe an acpi problem?


-

From: Eric W. Biederman
Date: Sunday, March 25, 2007 - 5:03 am

Sounds possible.  You could probably verify it isn't my patch by running
an unpatched kernel without msi support, as I think the crash you saw should
only be reproducible when using devices that support msi.

Unless I hear different I'm going to assume that this second case is a
completely different problem.  You might check to see if the acpi
interrupt is stuck after a suspend/resume cycle.

At this point I'm going to wait a bit for Tony and Len to have a
chance to give their opinion but unless I hear something I'm going to
plan on sending the patch out shortly...

Eric
-

From: Rafael J. Wysocki
Date: Sunday, March 25, 2007 - 5:28 am

Yes, in kernel/power/disk.c:power_down() .

Please comment out the disable_nonboot_cpus() in there and retest (but please

I think it is different too.

Greetings,
Rafael
-

From: Eric W. Biederman
Date: Sunday, March 25, 2007 - 5:56 am

<rant>

Why do we even need a disable_nonboot_cpus in that path?  machine_shutdown
on i386 and x86_64 should take care of that.  Further the code that computes
the boot cpu is bogus (not all architectures require cpu == 0 to be
the boot cpu), and disabling non boot cpus appears to be a strong
x86ism, in the first place.

If the only reason for disable_nonboot_cpus there is to avoid the
WARN_ON in init_low_mappings() we should seriously consider killing
it.  If we can wait for 2.6.22 the relocatable x86_64 patchset that
Andi has queued, has changes that kill the init_low_mapping() hack.

I'm not very comfortable with calling cpu_down in a common code path
right now either.  I'm fairly certain we still don't have that
correct.  So if we confine the mess that is cpu_down to #if
defined(CPU_HOTPLUG) && defined(CONFIG_EXPERIMENTAL) I don't care.
If we start using it everywhere I'm very nervous.  I know the irq
migration when bringing a cpu down is strongly racy, and I don't think
we actually put cpus to sleep properly either.

</rant>


Eric
-

From: Rafael J. Wysocki
Date: Sunday, March 25, 2007 - 12:14 pm

I think we should kill the WARN_ON() right now, perhaps replacing it with

I'm interested in all of the details, please.  I seriously consider dropping
cpu_up()/cpu_down() from the suspend code paths.

Greetings,
Rafael
-

From: Eric W. Biederman
Date: Sunday, March 25, 2007 - 1:37 pm

The problem with the current init_low_mappings is that it hacks the
current page table.  If we can instead use a different page table
the code becomes SMP safe.

I have extracted the patch that addresses this from the relocatable
patchset and appended it for sparking ideas.  It goes a little



So I'm not certain if in a multiple cpu context we can avoid all of the
issues with cpu hotplug but there is a reasonable chance so I will
explain as best I can.

Yanking the appropriate code out of linuxbios the way a processor should stop
itself is to send an INIT IPI to itself.  This puts a cpu into an optimized
wait for startup IPI state where it is otherwise disabled.  This is the state

I'm not certain what to do with the interrupt races.  But I will see
if I can explain what I know.

<braindump>

- Most ioapics are buggy.
- Most ioapics do not follow pci-ordering rules with respect to
  interrupt message delivery, so ensuring all in-flight irqs have
  arrived somewhere is very hard.
- To avoid bugs we always limit ourselves to reprogramming the ioapics
  in the interrupt handler, and not considering an interrupt
  successfully reprogrammed until we have received an irq in the new
  location.
- On x86 we have two basic interrupt handling modes.
  o logical addressing with lowest priority delivery.
  o physical addressing with delivery to a single cpu.
- With logical addressing, as long as a cpu is not available for
  having an interrupt delivered to it, the interrupt will
  never be delivered to that cpu.  Ideally we also update
  the mask in the ioapic to not target that cpu.
- With physical addressing targeting a single cpu we need to reprogram
  the ioapics not to target that specific cpu.  This needs to happen
  in the interrupt handler and we need to wait for the next interrupt
  before we tear down our data structures for handling the interrupt.

  The current cpu hotplug code attempts to reprogram the ioapics from
  process context which is just ...
From: Rafael J. Wysocki
Date: Monday, March 26, 2007 - 2:03 pm

The devices are expected (and in fact required) not to generate interrupts


In the suspend to disk context we must ensure that the other CPUs won't
modify memory in any way while the image is being created, so I think we
should at least make them loop in a safe place and refuse to take any

I'm not sure what you mean.

Anyway, currently cpu_up() is used to enable the other CPUs when we have

Well, we've been using it for suspending SMP boxes for quite some time now
and there haven't been any major low-level problems.  The current problems are
related to the fact that the CPU hotplug calls lots of notifiers that need not


Thanks a lot for the info.

The patch looks a bit too complicated for a quick fix.  I think we'll need to
remove that WARN_ON() in 2.6.21 after all.

Greetings,
Rafael
-

From: Thomas Meyer
Date: Sunday, March 25, 2007 - 7:17 am

Without disable_nonboot_cpus in power_down the computer powers down.
Yes, it's a different problem.
-

From: Rafael J. Wysocki
Date: Sunday, March 25, 2007 - 11:56 am

Do you have CONFIG_NO_HZ set?

Rafael
-

From: Thomas Meyer
Date: Sunday, March 25, 2007 - 6:54 am

Without your patch and with the pci=nomsi option the same error occurs. But I
think this is not an acpi error, because every interrupt seems to
D'accord.
-

From: Adrian Bunk
Date: Sunday, March 25, 2007 - 7:48 am

Is this also present with 2.6.20, or is it a regression?

cu
Adrian


-

From: Thomas Meyer
Date: Sunday, March 25, 2007 - 10:25 am

No, this one is not present in 2.6.20, and this error doesn't occur (head=
317ec6cd00f25d05d153a780bc178c5335f320ee) with NO_HZ=n and
HIGH_RES_TIMERS=n.

This error may be related to this commit:

commit cd05a1f818073a623455a58e756c5b419fc98db9
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sat Mar 17 00:25:52 2007 +0100

    [PATCH] clockevents: Fix suspend/resume to disk hangs

    I finally found a dual core box, which survives suspend/resume without
    crashing in the middle of nowhere. Sigh, I never figured out from the
    code and the bug reports what's going on.

    The observed hangs are caused by a stale state transition of the clock
    event devices, which keeps the RCU synchronization away from completion,
    when the non boot CPU is brought back up.

    The suspend/resume in oneshot mode needs the similar care as the
    periodic mode during suspend to RAM. My assumption that the state
    transitions during the different shutdown/bringups of s2disk would go
    through the periodic boot phase and then switch over to highres resp.
    nohz mode were simply wrong.

    Add the appropriate suspend / resume handling for the non periodic
    modes.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

-

From: Rafael J. Wysocki
Date: Sunday, March 25, 2007 - 12:06 pm

Yes, it is, but I'd rather remove the disable_nonboot_cpus() from
power_down() (as Eric suggested) instead of trying to handle the RCU sync
problem here.

This has been caused by my commit 94985134b7b46848267ed6b734320db01c974e72
(swsusp: disable nonboot CPUs before entering platform suspend) that in such a
case should be reverted.

Greetings,
Rafael
-

From: Rafael J. Wysocki
Date: Sunday, March 25, 2007 - 12:31 pm

s/such a case/the present situation/

Rafael
-

From: Luck, Tony
Date: Monday, March 26, 2007 - 1:01 pm

Sounds rational ... in a world that wasn't dominated by PCI it would
seem to be the logical approach (since the irq code would have much

Long-term-direction-acked-by: Tony Luck <tony.luck@intel.com>

-Tony
-

From: Eric W. Biederman
Date: Monday, March 26, 2007 - 8:29 pm

Right.  We can even do this earlier in the pci code.  Just doing this
on demand when the device driver needs it is problematic, as device
drivers like to keep the irq requested over a pci_disable_device/pci_enable_device
pair.

The big practical issue is that we will likely wind up allocating an irq
number to all usable irqs on ia64, which means we will likely need many
more irq numbers...  Although I guess if we keep it at the pci layer
we should be fairly safe.

I was afraid there was some hotplug reason for waiting until pci_enable_device

Thanks.  Then small surgery will happen now, and I will start queuing up
the major surgery patches.  Although I won't be able to do more than
compile test and code review the ia64 changes.

Eric
-

From: Bjorn Helgaas
Date: Monday, April 2, 2007 - 8:38 am

The main reason we wait until pci_enable_device() to allocate an
IRQ number is that ia64 currently only has about 180 device vectors,
and there are machines with more PCI slots than that.

I also think it's nice that we don't do anything with a device until
we have a driver to claim it.  But there certainly have been cases
where delaying IRQ allocation has caused troubles.

I really like the idea of moving to the IRQ == GSI model for ia64.
But of course, we'll have to get rid of the 180-vector limit to
make that work, too.

Bjorn
-

From: Bjorn Helgaas
Date: Monday, April 2, 2007 - 9:38 am

Sigh, that didn't make much sense, did it?  At the time, ia64 didn't
support sharing IRQ vectors, and we preallocated four vectors for
every slot, including empty ones.  Allocate-on-demand dramatically
increased the number of devices we could support because most cards
use only one IRQ.
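
The arithmetic behind that remark, as a small sketch. It takes the "about 180
device vectors" figure from the previous message at face value and assumes no
vector sharing; sketch_max_devices is a made-up helper, not kernel code.

```c
/* How many devices fit into a fixed pool of vectors (integer division). */
unsigned int sketch_max_devices(unsigned int vectors,
				unsigned int vectors_per_device)
{
	if (vectors_per_device == 0)
		return 0;
	return vectors / vectors_per_device;
}
```

With four vectors preallocated per slot, sketch_max_devices(180, 4) gives 45
slots. With one vector allocated on demand per card,
sketch_max_devices(180, 1) gives 180 cards.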
-

From: Eric W. Biederman
Date: Monday, April 2, 2007 - 12:50 pm

If we don't reserve irqs that the hardware doesn't support we should
be able to simply move the allocation and have about the same cost as

Agreed.  It is the second call to pci_enable_device() by a driver where

Mostly that is a matter of porting the code from x86_64 where that is
already the case.  I'm pretty certain I have worked through all of
the bit issues, but there might be a few small problems that crop up.

If need be I will do the patches as I find time.  But if someone else gets
there before I do that would be great :)

Eric
-

From: Frédéric
Date: Sunday, March 25, 2007 - 2:34 pm

I spent some time this weekend investigating this issue more thoroughly.
In fact the regression caused by this commit has been corrected by
f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38 (ACPI: Disable wake GPEs only
once.)  

However, as I pointed out in the initial report, the MacMini doesn't
come out of suspend to ram because a commit in another merged patchset
broke it. I tracked it down to:

commit e9e2cdb412412326c4827fc78ba27f410d837e6e
parent 79bf2bb335b85db25d27421c798595a2fa2a0e82 
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Feb 16 01:28:04 2007 -0800

    [PATCH] clockevents: i386 drivers
    
This patch has already been mentioned in regression reports, but AFAICS
not in relation to suspend issues.

To be totally clear about what works and what doesn't:

79bf2bb335b85db25d27421c798595a2fa2a0e82 
   + cherry-pick f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38  ==> works

e9e2cdb412412326c4827fc78ba27f410d837e6e
   + cherry-pick f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38  ==> broken

To try to get more information, I commented out the call to
do_suspend_lowlevel in drivers/acpi/sleep/main.c and used
CONFIG_DISABLE_CONSOLE_SUSPEND. Interestingly, the suspend/resume cycle
completes correctly in this mode.

Fred.

-

From: Frédéric
Date: Sunday, March 25, 2007 - 11:45 pm

Additional data point: I just tried with -rc5 and the issue is still
present. The config I used for this test defines neither NO_HZ nor
HIGH_RES_TIMERS.

Fred.

-

From: Thomas Gleixner
Date: Monday, March 26, 2007 - 2:14 am

Do you have CONFIG_HPET_TIMER enabled and does the box have one ?
If yes, can you please turn it off and retry ?

Thanks,

	tglx



-

From: Frederic Riss
Date: Monday, March 26, 2007 - 3:36 am

IIRC the box has a HPET and it gets used. I'll test and confirm when I
get home tonight.

Thanks,
Fred
-

From: Frédéric
Date: Monday, March 26, 2007 - 11:53 am

Indeed, turning off CONFIG_HPET_TIMER does fix the coming-out-of-suspend
issue. (In fact it hangs at the second suspend, but that's another ATA
problem that I think has already been reported).

Fred.

-

From: Adrian Bunk
Date: Monday, March 26, 2007 - 12:02 pm

This sounds like the MSI problem.

Do you have CONFIG_PCI_MSI enabled?
If yes, does disabling it fix it?

cu
Adrian

[1] http://lkml.org/lkml/2007/3/24/136


-

From: Frederic Riss
Date: Monday, March 26, 2007 - 12:39 pm

Yes !

Just to be 100% clear, the hang I was seeing at the second suspend is this one:

Subject    : second suspend to disk in a row results in an oops  (libata?)
References : http://lkml.org/lkml/2007/3/17/43
Submitter  : Thomas Meyer <thomas@m3y3r.de>
Status     : unknown

I'm not sure it was associated with the MSI issue yet.

Thanks a lot,
-

From: Adrian Bunk
Date: Monday, March 26, 2007 - 12:46 pm

That's what I was calling "the MSI problem".

cu
Adrian


-

From: Marcus Better
Date: Monday, March 26, 2007 - 3:00 am


I just tried -rc5. Now suspend to disk seems to work. I think the XFS
workqueue patch fixed this.

It can also suspend to RAM, but resume is worse. The first time around it
resumed but corrupted the vesafb console (greenish blinking character cells),
something that used to work before. But the system responded to input, so I
suspended to RAM again. This time the resume failed, it hung after
printing "Linux!" in yellow at the top of the screen. (Seems to be some
artifact, I have seen it before even with working suspend.)

I'm attaching my config.

Not sure how to bisect this. I guess it would be necessary to keep the XFS
workqueue patch throughout, otherwise it is guaranteed to break.

Marcus


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc5-melech
# Mon Mar 26 10:56:13 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General ...
From: Pavel Machek
Date: Monday, March 26, 2007 - 5:35 am

Yellow Linux! is my debugging trick. It should be there, but it should
also disappear quickly.

Try vga=0 ... text console seems to work for you.
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

From: Adrian Bunk
Date: Monday, March 26, 2007 - 7:34 am

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Marcus Better
Date: Monday, March 26, 2007 - 10:42 am

Yes, it does. The hanging resume problem went away. 

(The display corruption and the instant resume were not affected.)

Marcus


From: Adrian Bunk
Date: Monday, March 26, 2007 - 11:48 am

Thanks for testing.


Not a surprise - it's currently quite common that people run into 

cu
Adrian

[1] http://lkml.org/lkml/2007/3/24/136 [2]
[2] x86_64 uses arch/i386/pci/common.c

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


From: Marcus Better
Date: Tuesday, March 27, 2007 - 2:42 am

Yes, it appears to fix it.

Marcus
From: Adrian Bunk
Date: Saturday, March 24, 2007 - 4:25 am

This email lists some known regressions in Linus' tree compared to 2.6.20
with patches available.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved in
one or more of these issues in some other way.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : suspend to disk hangs  (microcode driver)
References : http://lkml.org/lkml/2007/3/16/126
Submitter  : Maxim Levitsky <maximlevitsky@gmail.com>
Caused-By  : Rafael J. Wysocki <rjw@sisk.pl>
             commit e3c7db621bed4afb8e231cb005057f2feb5db557
             commit ed746e3b18f4df18afa3763155972c5835f284c5
             commit 259130526c267550bc365d3015917d90667732f1
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Patch      : http://lkml.org/lkml/2007/3/23/179
Status     : patch available


Subject    : gettimeofday increments too slowly
References : http://bugzilla.kernel.org/show_bug.cgi?id=8027
             http://lkml.org/lkml/2007/3/23/329
Submitter  : David L <idht4n@hotmail.com>
Caused-By  : Thomas Gleixner <tglx@linutronix.de>
             commit 92c7e00254b2d0efc1e36ac3e45474ce1871b6b2
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Patch      : http://lkml.org/lkml/2007/3/23/329
Status     : patch available


Subject    : boot hangs during IDE detection  (clocksource)
References : http://lkml.org/lkml/2007/3/19/465
Submitter  : Bob Tracy <rct@gherkin.frus.com>
Caused-By  : John Stultz <johnstul@us.ibm.com>
             commit 6bb74df481223731af6c7e0ff3adb31f6442cfcd
Handled-By : John Stultz <johnstul@us.ibm.com>
Patch      : http://lkml.org/lkml/2007/3/22/287
Status     : workaround-patch available


Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <kernel@marduk.letterboxes.org>
Handled-By : Ayaz Abdulla <aabdulla@nvidia.com>
Patch  ...
From: Bob Tracy
Date: Monday, March 26, 2007 - 5:37 am

The subject problem is fixed *without* the workaround patch in
2.6.21-rc5.  The acpi_pm clocksource patch for the case where the
PIIX4 bug is present should probably be included anyway.

-- 
-----------------------------------------------------------------------
Bob Tracy                   WTO + WIPO = DMCA? http://www.anti-dmca.org
rct@frus.com
-----------------------------------------------------------------------
