Re: [GIT PULL] time.c - respin

From: Glauber Costa
Date: Wednesday, August 20, 2008 - 8:07 am

Ingo, please pull the latest master git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/glommer/linux-2.6-x86-integration.git master

into your tree. It solves a bisectability issue, but unfortunately I can't reproduce
the problem you report. So hopefully you'll now be able to bisect your problems to a point
where we can find out what's going on.


Thanks!

	Glauber

------------------>
Glauber Costa (31):
      x86: use user_mode macro
      x86: coalesce tests
      x86: set bp field in pt_regs properly
      x86: use frame pointer information on x86_64 profile_pc
      x86: remove SEGMENT_IS_FLAT_CODE
      x86: use user_mode_vm instead of user_mode
      x86: bind irq0 irq data to cpu0
      x86: factor out irq initialization for x86_64
      x86: make init_ISA_irqs nonstatic
      x86: rename timer_event_interrupt to timer_interrupt
      x86: allow x86_64 to build with subarch support
      x86: replace ISA initialization function
      x86: use generic intr_init call
      x86: use time_init_hook in time_64.c
      x86: use generic time hook
      x86: replace hardcoded number
      x86: unconditionalize timer_ack
      x86: assign timer_ack variable
      x86: wrap MCA_bus test around an ifdef
      x86: wrap conditional inside ifdef at profile_pc
      x86: fix checkpatch errors
      x86: move vgetcpu mode probing to cpu detection
      x86: add pre_time_init_hooks to x86_64 time initialization
      x86: conditionalize interrupt accounting
      x86: fix address reference
      x86: remove header from time_64.c
      x86: reorganize time_32.c headers
      x86: reorganize 64-bit only code
      x86: merge copyright notice
      x86: make time_64 and time_32 equal
      x86: merge time.c

 arch/x86/Makefile                     |    4 -
 arch/x86/kernel/Makefile              |    2 +-
 arch/x86/kernel/cpu/common_64.c       |    9 +++
 arch/x86/kernel/entry_64.S            |    7 ++
 arch/x86/kernel/io_apic.c             |    6 +-
 ...
From: Ingo Molnar
Date: Thursday, August 21, 2008 - 3:22 am

i've tried it, and -tip testing found a hang on 64-bit x86.

It hangs here:

[    0.340029] calling  tc_filter_init+0x0/0x4c
[    0.340029] initcall tc_filter_init+0x0/0x4c returned 0 after 0 msecs
[    0.340029] calling  genl_init+0x0/0xd8
[ hard hang ]

it should have continued with:

[    2.976346] initcall genl_init+0x0/0xd8 returned 0 after 15 msecs
[    2.982303] calling  cipso_v4_init+0x0/0x88
[ ... etc ... ]

i've bisected the hang back to:

0de577d0dd2d1101431d3438d0880fa32a6188d6 is first bad commit
commit 0de577d0dd2d1101431d3438d0880fa32a6188d6
Author: Glauber Costa <gcosta@redhat.com>
Date:   Fri Jul 11 15:43:19 2008 -0300

    x86: use generic intr_init call

    Replace apic initialization code with generic intr_init_hook().

    Signed-off-by: Glauber Costa <gcosta@redhat.com>

the config is at:

 http://redhat.com/~mingo/misc/config-Thu_Aug_21_11_14_36_CEST_2008.bad

but the bug is rather obvious:

-       apic_intr_init();
-
-       if (!acpi_ioapic)
-               setup_irq(2, &irq2);
+       intr_init_hook();

why exactly did you remove the cascade IRQ registration? If it remains
unallocated and a driver happens to use it, funny things might occur.

Also, the commit log does not explain why it's removed.

	Ingo
--
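
A minimal user-space sketch of the concern raised above, not kernel code. It assumes a toy IRQ table in which a line can be claimed only once; every `sketch_` name, the table size and the driver name are hypothetical. Only `setup_irq(2, &irq2)` and the `acpi_ioapic` test come from the quoted diff. The point is that when nothing reserves IRQ 2 during init, a later request for that line succeeds.

```c
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_NR_IRQS 16   /* hypothetical: number of legacy IRQ lines */
#define SKETCH_CASCADE_IRQ 2 /* IRQ 2, the line registered by setup_irq(2, &irq2) */

/* Hypothetical stand-in for per-IRQ ownership; NULL means unallocated. */
const char *sketch_irq_owner[SKETCH_NR_IRQS];

/* Toy version of setup_irq(): claim a line, fail if it is already taken. */
int sketch_setup_irq(unsigned int irq, const char *name)
{
    if (irq >= SKETCH_NR_IRQS || sketch_irq_owner[irq] != NULL)
        return -1;
    sketch_irq_owner[irq] = name;
    return 0;
}

/*
 * Toy init path. reserve_cascade == true models the code before the patch
 * ("if (!acpi_ioapic) setup_irq(2, &irq2);"); false models a hook that no
 * longer performs that registration.
 */
void sketch_init_irqs(bool acpi_ioapic, bool reserve_cascade)
{
    for (int i = 0; i < SKETCH_NR_IRQS; i++)
        sketch_irq_owner[i] = NULL;
    if (reserve_cascade && !acpi_ioapic)
        sketch_setup_irq(SKETCH_CASCADE_IRQ, "cascade");
}

/* Returns true if a (hypothetical) driver can grab IRQ 2 after init. */
bool sketch_driver_can_claim_cascade(bool acpi_ioapic, bool reserve_cascade)
{
    sketch_init_irqs(acpi_ioapic, reserve_cascade);
    return sketch_setup_irq(SKETCH_CASCADE_IRQ, "some-driver") == 0;
}
```

Calling `sketch_driver_can_claim_cascade(false, true)` models the old init path without an IO-APIC, where the cascade line is already owned and the driver's request is refused. Calling it with `reserve_cascade` set to `false` models the situation questioned in the review, where the line is still free for anyone to take.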

From: Glauber Costa
Date: Thursday, August 21, 2008 - 6:07 am

Because it's not (was not supposed to be) removed. It used to be done by
intr_init_hook(), so we're just using the same function for both architectures.
Commit 2ae111cdd8d83ebf9de72e36e68a8c84b6ebbeea changed this behaviour without
me noticing (and my tests didn't hit it).

If this is the way things are going to stay, we'd probably want to drop this patch
completely. Can you remove it cleanly, or does it trigger dependency problems?
--

From: Ingo Molnar
Date: Sunday, August 24, 2008 - 5:31 am

i've dropped that patch and have respun tip/x86/time accordingly, based 
on latest -git - and merged it into tip/master. Started testing the 
result.

	Ingo
--

From: Ingo Molnar
Date: Sunday, August 24, 2008 - 6:16 am

got a spontaneous reboot bug upon bootup on a dual-core system booting a 
32-bit x86 kernel - no message on the console. Config attached.

as this is an integrated branch with a few fixups, i have pushed it out 
into tip/x86/time.broken2 - you can check that end result branch. 
(commit-ID: e8d0f9bf5a0)

the same config boots fine with x86/time removed from the integration. 
(commit-ID: 887f98f62c8)

	Ingo
--

From: Ingo Molnar
Date: Sunday, August 24, 2008 - 6:39 am

bisection is a bit inconclusive:

  ee096f75b69913dbad0e6f7f2572513de5c90002 is first bad commit

that is a merge commit.

here's the bisection log:

 # good: [887f98f6] Merge branch 'irq/sparseirq'
 # bad:  [e8d0f9bf] Merge branch 'x86/time'
 # good: [bea0a6a6] x86: wrap MCA_bus test around an ifdef
 # good: [8def7ec4] x86: reorganize 64-bit only code
 # bad:  [ee096f75] manual merge of x86/time
 # good: [291ae6a3] x86: make time_64 and time_32 equal
 # good: [a2496715] x86: merge time.c

	Ingo
--
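
For readers unfamiliar with how a bisection narrows down a "first bad commit", here is a minimal sketch of the underlying binary search. It assumes a linear, indexed history in which every commit after the first bad one is also bad; real `git bisect` works on a commit graph, so this is a simplification. The names `sketch_first_bad`, `sketch_is_bad_fn` and `sketch_bad_from` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical test callback: returns nonzero if commit `index` is bad. */
typedef int (*sketch_is_bad_fn)(size_t index, void *ctx);

/*
 * Binary search for the first bad commit.
 * Preconditions (assumed, not checked): commit `good` is good, commit `bad`
 * is bad, good < bad, and badness is monotonic in between.
 */
size_t sketch_first_bad(size_t good, size_t bad,
                        sketch_is_bad_fn is_bad, void *ctx)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;   /* breakage is at mid or earlier */
        else
            good = mid;  /* breakage is after mid */
    }
    return bad;
}

/* Example predicate: ctx points to the index where breakage starts. */
int sketch_bad_from(size_t index, void *ctx)
{
    return index >= *(size_t *)ctx;
}
```

When the search ends on a merge commit whose tested parents were both marked good, as in the log above, one possible reading is that the breakage shows up only in the combination of the two sides or in the manual merge resolution. That is an inference; the thread itself only calls the result inconclusive.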
