On Wed, 2008-05-07 at 10:11 +0800, Zhang, Yanmin wrote:With my patch+gprof, I collected some data. Below was outputed by gprof. index % time self children called name 0.00 0.00 2/223305376 __down_write_nested [22749] 0.00 0.00 3/223305376 journal_commit_transaction [10526] 0.00 0.00 6/223305376 __down_read [22745] 0.00 0.00 8/223305376 start_this_handle [19167] 0.00 0.00 15/223305376 sys_pause [19808] 0.00 0.00 17/223305376 log_wait_commit [11047] 0.00 0.00 20/223305376 futex_wait [8122] 0.00 0.00 64/223305376 pdflush [14335] 0.00 0.00 71/223305376 do_get_write_access [5367] 0.00 0.00 84/223305376 pipe_wait [14460] 0.00 0.00 111/223305376 kjournald [10726] 0.00 0.00 116/223305376 int_careful [9634] 0.00 0.00 224/223305376 do_nanosleep [5418] 0.00 0.00 1152/223305376 watchdog [22065] 0.00 0.00 4087/223305376 worker_thread [22076] 0.00 0.00 5003/223305376 __mutex_lock_killable_slowpath [23305] 0.00 0.00 7810/223305376 ksoftirqd [10831] 0.00 0.00 9389/223305376 __mutex_lock_slowpath [23306] 0.00 0.00 10642/223305376 io_schedule [9813] 0.00 0.00 23544/223305376 migration_thread [11495] 0.00 0.00 35319/223305376 __cond_resched [22673] 0.00 0.00 49065/223305376 retint_careful [16146] 0.00 0.00 119757/223305376 sysret_careful [20074] 0.00 0.00 151717/223305376 do_wait [5545] 0.00 0.00 250221/223305376 do_exit [5356] 0.00 0.00 303836/223305376 cpu_idle [4350] 0.00 0.00 222333093/223305376 schedule_timeout [2] [1] 0.0 0.00 0.00 223305376 schedule [1] ----------------------------------------------- 0.00 0.00 2/222333093 io_schedule_timeout [9814] 0.00 0.00 4/222333093 journal_stop [10588] 0.00 0.00 8/222333093 cifs_oplock_thread [3760] 0.00 0.00 14/222333093 do_sys_poll [5513] 0.00 0.00 20/222333093 cifs_dnotify_thread [3733] 0.00 0.00 32/222333093 read_chan [15648] 0.00 0.00 47/222333093 wait_for_common [22017] 0.00 0.00 658/222333093 do_select [5479] 0.00 0.00 2000/222333093 inet_stream_connect [9324] 0.00 0.00 222330308/222333093 __down [22577] [2] 0.0 0.00 0.00 222333093 schedule_timeout [2] 0.00 0.00 222333093/223305376 schedule [1] ----------------------------------------------- 0.00 0.00 1/165565 flock_lock_file_wait [7735] 0.00 0.00 7/165565 __posix_lock_file [23371] 0.00 0.00 203/165565 de_put [4665] 0.00 0.00 243/165565 opost [13633] 0.00 0.00 333/165565 proc_root_readdir [14982] 0.00 0.00 358/165565 write_chan [22090] 0.00 0.00 6222/165565 proc_lookup_de [14908] 0.00 0.00 32081/165565 sys_fcntl [19687] 0.00 0.00 36045/165565 vfs_ioctl [21822] 0.00 0.00 42025/165565 tty_release [20818] 0.00 0.00 48047/165565 chrdev_open [3702] [3] 0.0 0.00 0.00 165565 lock_kernel [3] 0.00 0.00 152987/153190 down [4] ----------------------------------------------- 0.00 0.00 203/153190 __reacquire_kernel_lock [23420] 0.00 0.00 152987/153190 lock_kernel [3] [4] 0.0 0.00 0.00 153190 down [4] 0.00 0.00 153190/153190 __down [22577] ----------------------------------------------- 0.00 0.00 153190/153190 down [4] [22577 0.0 0.00 0.00 153190 __down [22577] 0.00 0.00 222330308/222333093 schedule_timeout [2] As system idle is more than 50%, so the schedule/schedule_timeout caller is important information. 1) lock_kernel causes most schedule/schedule_timeout; 2) When lock_kernel calls down, then __down, __down calls schedule_timeout for lots of times in a loop; 3) Caller of lcok_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open. -yanmin --
