Re: [RFC PATCH]Multi-threaded Initcall with dependence support

Previous thread: [RFC] LZO de/compression support - take 5 by Nitin Gupta on Sunday, May 27, 2007 - 11:59 pm. (6 messages)

Next thread: Re: [BUG] signal: multithread program returns with wrong errno on receiving SIGSTOP by Oleg Nesterov on Monday, May 28, 2007 - 12:07 am. (6 messages)
From: Yang Sheng
Date: Monday, May 28, 2007 - 12:03 am

Why we need this:

It can speed up the execution of initcalls, which is especially useful for 
some embedded devices. 

Idea:

1. An initcall can declare its execution order with a macro, 
initcall_depend(), to avoid dependence problems when initcalls run in 
multiple threads. Multiple dependences are also allowed.
2. All initcalls in one layer are guaranteed to complete before the 
initcalls of the next layer are called.

Usage:
You can indicate that initcall A must be run after initcall B by calling the 
macro in A's file:

initcall_depend(A, B);

This means initcall A must run after initcall B finishes executing (A depends 
on B).

Note that if you declare that A depends on B and C, you must put the 
declarations together, as follows (their order is not important):

initcall_depend(A, B);
initcall_depend(A, C);
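
The grouping rule above can be illustrated with a small user-space sketch. It 
is not code from the patch: the struct layout, the use of name strings, the 
INITCALL_DEPEND() stand-in macro and the helper deps_are_grouped() are all 
hypothetical, and the pnpacpi_init entries are borrowed from the test 
described later in this thread. The helper checks that all entries for one 
initcall are contiguous, which is the property that lets a single forward 
scan of the table find every dependence of A.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for initcall_depend_t: "call" must run after "prev". */
struct initcall_depend {
    const char *call;
    const char *prev;
};

/* Illustrative only; the macro in the patch emits into a linker section. */
#define INITCALL_DEPEND(a, b) { #a, #b }

static const struct initcall_depend grouped[] = {
    INITCALL_DEPEND(pnpacpi_init, pnp_init),
    INITCALL_DEPEND(pnpacpi_init, acpi_init),
};

static const struct initcall_depend scattered[] = {
    INITCALL_DEPEND(A, B),
    INITCALL_DEPEND(X, Y),
    INITCALL_DEPEND(A, C),   /* A's entries are split by X's entry */
};

/* Return true if every initcall's entries form one contiguous run. */
static bool deps_are_grouped(const struct initcall_depend *tbl, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (strcmp(tbl[i].call, tbl[i - 1].call) == 0)
            continue;               /* still inside the same run */
        /* A new run starts at i: its name must not appear before run i-1. */
        for (size_t j = 0; j + 1 < i; j++)
            if (strcmp(tbl[j].call, tbl[i].call) == 0)
                return false;
    }
    return true;
}
```

With these tables, deps_are_grouped(grouped, 2) holds, while 
deps_are_grouped(scattered, 3) does not, because the second entry for A 
appears after the entry for X.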

The details of the method:

A new section called .initcall.depend was added to 
arch/xxx/kernel/vmlinux.lds.S to hold the dependence relationships. A 
struct called initcall_depend_t stores the relationship between A and B and 
is placed in section .initcall.depend.

All the dependences of A are put together, and the order of the 
initcall_depend_t entries is decided by link order, just as it is for the 
initcalls themselves. So when A is about to run, we can check whether A 
depends on others by looking at the pointer to the current item in the 
dependence table. If the field "call" of initcall_depend_t points to A, we 
know that A depends on something, and we read the prev_addr field of the 
struct to find what it depends on. The field "prev_addr" points to somewhere 
in the .initcall.init section and gives the address (and therefore the order) 
of the initcall depended on, so it can be used to find out whether the other 
threads have finished executing that initcall. If the current position of a 
thread is smaller than prev_addr (which means some thread, not necessarily 
this one, has not finished executing it), we wait; otherwise we go on to 
check the next thread. If all the threads are ok, we run the initcall and go 
to the next ...
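
The check described above might look roughly like the following user-space 
sketch. It is only an illustration under several assumptions, not the code 
from the patch: initcalls are identified by their index in link order rather 
than by address, progress[t] is a hypothetical per-thread counter meaning 
"thread t has finished everything below this index", and the exact boundary 
condition (<=) is a guess, since the description only says "smaller than".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout; the description only names "call" and "prev_addr". */
struct initcall_depend {
    size_t call;       /* index of the dependent initcall (A) */
    size_t prev_addr;  /* index of the initcall it depends on (B) */
};

/*
 * Return true if initcall "a" may run now.
 * dep[cursor] is the current item in the dependence table; all entries
 * for "a" are assumed to be adjacent, starting at the cursor.
 */
static bool initcall_may_run(size_t a,
                             const struct initcall_depend *dep, size_t ndeps,
                             size_t cursor,
                             const size_t *progress, size_t nthreads)
{
    for (size_t i = cursor; i < ndeps && dep[i].call == a; i++) {
        for (size_t t = 0; t < nthreads; t++) {
            /* Thread t has not yet passed the initcall we depend on. */
            if (progress[t] <= dep[i].prev_addr)
                return false;   /* caller should wait and retry */
        }
    }
    return true;   /* no dependence, or every thread is past all of them */
}
```

A caller would loop (or sleep) until initcall_may_run() returns true, run 
the initcall, and then advance the cursor past the entries it consumed.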
From: Randy Dunlap
Date: Monday, May 28, 2007 - 3:52 pm

Can you give concrete example(s) of why we need this?
Any real configs/hardware where it helps and how much it helps.


General:
1/ Patch has 12 lines with trailing whitespace.  Please chop those
off (always).
2/ Try to keep source lines < 80 characters.
3/ Read & use Documentation/CodingStyle, Documentation/SubmittingPatches,

printk needs KERN_* level.

printk needs KERN_* level.

non-existent

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

From: Yang Sheng
Date: Monday, May 28, 2007 - 6:47 pm

We don't have precise data at hand now, because we would first have to build 
a complete, stable set of initcall dependence relationships, and we can't do 
that yet. 

But we have done a relatively stable test on a common x86_64 machine, with 2 
threads and one dependence relation (pnpacpi_init depends on pnp_init and 
acpi_init). The result is that the time spent calling initcalls dropped from 
about _5s_ to _2s_ (kernel built with the defconfig). We analyzed the 
dmesg and found that most of the time was saved by running ide_generic_init 
and piix_init in parallel. 

Of course the dependences in the test case are not sufficient, but the effect 
is shown. 

We think this patch would be very useful for some embedded devices which 
require fast boot speed. Some servers may benefit too because of their long 
device initialization time. 

Thanks for the advice! Next time I will be more careful about coding style 
and check the patch against the documents above. 

The patch below is a prototype for now. Some problems can't be solved at the 
current stage, such as the O(n^2) algorithmic complexity of the map function. 
I have another method for this, but it requires renaming some initcalls with 
duplicate names. 



This is the number of threads which would run at the same time. Should be able 

Yeah, I shouldn't use panic here. In fact, if dpt->prev.func is not found, 
the initcall it depends on does not exist in the current image. But we can 

Thanks!

-- 
regards
Yang, Sheng

From: Dave Jones
Date: Thursday, May 31, 2007 - 1:26 pm

On Tue, May 29, 2007 at 09:47:53AM +0800, Yang Sheng wrote:
 > On Tuesday 29 May 2007 06:52, Randy Dunlap wrote:
 > > On Mon, 28 May 2007 15:03:10 +0800 Yang Sheng wrote:
 > > > Why we need this:
 > > >
 > > > It can speed up the execution of initcalls, which is especially useful
 > > > for some embedded devices.
 > >
 > > Can you give concrete example(s) of why we need this?
 > > Any real configs/hardware where it helps and how much it helps.
 > >
 > 
 > We don't have precise data at hand now, because we would first have to build 
 > a complete, stable set of initcall dependence relationships, and we can't do 
 > that yet. 
 > 
 > But we have done a relatively stable test on a common x86_64 machine, with 2 
 > threads and one dependence relation (pnpacpi_init depends on pnp_init and 
 > acpi_init). The result is that the time spent calling initcalls dropped from 
 > about _5s_ to _2s_ (kernel built with the defconfig). We analyzed the 
 > dmesg and found that most of the time was saved by running ide_generic_init 
 > and piix_init in parallel. 
 > 
 > Of course the dependences in the test case are not sufficient, but the effect 
 > is shown. 
 > 
 > We think this patch would be very useful for some embedded devices which 
 > require fast boot speed. Some servers may benefit too because of their long 
 > device initialization time. 

If we decide to do this, we should also introduce a way to disable it
at runtime with initcall=noparallel or something.  Why?
Because right now when people say "my computer hangs during bootup"
we can ask them to boot with initcall_debug and usually find out
the last thing it did before it locked up.   If we parallelise this,
the output will be a lot harder to decipher.
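
As a rough illustration of such a switch, here is a user-space sketch that 
scans a command-line string for the option. Everything in it is hypothetical: 
the option spelling initcall=noparallel is only the example name floated 
above, the helper initcall_parallel_enabled() is invented for this sketch, 
and an in-kernel version would presumably hook into the kernel's own boot 
parameter handling instead of parsing the string by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return false if the space-separated command line disables parallel initcalls. */
static bool initcall_parallel_enabled(const char *cmdline)
{
    static const char opt[] = "initcall=noparallel";   /* hypothetical name */
    const size_t len = sizeof(opt) - 1;
    const char *p = cmdline;

    while (*p) {
        while (*p == ' ')
            p++;                        /* skip separators */
        const char *end = p;
        while (*end && *end != ' ')
            end++;                      /* find the end of this token */
        /* Match the whole token only, not a prefix of a longer one. */
        if ((size_t)(end - p) == len && strncmp(p, opt, len) == 0)
            return false;
        p = end;
    }
    return true;
}
```

Booting with "root=/dev/sda1 initcall=noparallel quiet" would then fall back 
to the serial order, so initcall_debug output stays easy to read.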

	Dave

-- 
http://www.codemonkey.org.uk

From: Sheng Yang
Date: Sunday, June 3, 2007 - 6:06 pm

Thank you for the advice. I will introduce a parameter to do this. 

But what about the idea itself? I don't know whether people like this... It 
requires a little more work when writing initcalls. 

Maybe we could limit the multithreaded part to device_initcall? It seems most 
of the time is consumed there, and the others in total take less than 1s (at 
least on my machine). 

Thanks. 
-- 
regards
Yang, Sheng