Re: twa + dump = sbwait

From: Boris Samorodov
Date: Monday, September 17, 2007 - 12:28 am

Hi!


I can't use dump on -CURRENT with twa. The process goes into the sbwait
state and stays there forever. Here are the details.

$ uname -a
FreeBSD test.ipt.ru 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Sat Sep 15 19:37:17 MSD 2007     bsam@test.ipt.ru:/usr/obj/usr/src/sys/TEST  amd64

The controller:
twa0: <3ware 9000 series Storage Controller> port 0x3000-0x30ff mem 0xd8000000-0xd9ffffff,0xda300000-0xda300fff irq 16 at device 0.0 on pci8
twa0: [GIANT-LOCKED]
twa0: [ITHREAD]
twa0: INFO: (0x04: 0x0053): Battery capacity test is overdue: 
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8 ports, Firmware FE9X 3.06.00.005, BIOS BE9X 3.06.00.002

Two disks in a stripe:
da1 at twa0 bus 0 target 1 lun 0
da1: <AMCC 9650SE-8LP DISK 3.06> Fixed Direct Access SCSI-5 device 
da1: 100.000MB/s transfers
da1: 476816MB (976519168 512 byte sectors: 255H 63S/T 60785C)

Mounted as:
/dev/da1    /s  ufs rw 2 2

The command:
$ dump -0Luan -f s.dump /s
  DUMP: Date of this level 0 dump: Mon Sep 17 10:26:21 2007
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/da1 (/s) to s.dump
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 432054 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
[wait here forever]

The relevant part of iostat -w 10 and top:
-----
      tty             da0              da1             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0  231 16.00   0  0.01  16.00 152  2.37   0  0  1  0 99
   0  257 16.00   1  0.02  16.00 147  2.29   0  0  1  0 99
   0  310 16.00   1  0.01  16.00 146  2.28   0  0  1  0 99
   0  265 16.00   0  0.01  16.00 152  2.38   0  0  1  0 99
   0  442 16.00   0  0.00  15.98 299  4.66   0  0  1  0 99
   0  620 15.54   3  0.04   3.46 1999  6.76   0  0  2  1 96
   0  363 16.00   0  0.01   2.77 5141 13.90   0  0  5  2 93
   0  309  0.00   0  0.00   3.22 114  0.36   0  0  0  0 100
   0  260 16.00   1  0.01   0.00   ...
From: Kostik Belousov
Date: Monday, September 17, 2007 - 1:04 am

Please verify that you have rev. 1.39 of sys/kern/subr_sleepqueue.c.
From: Boris Samorodov
Date: Monday, September 17, 2007 - 1:09 am

% grep FBSDID /sys/kern/subr_sleepqueue.c
__FBSDID("$FreeBSD: src/sys/kern/subr_sleepqueue.c,v 1.39 2007/09/13 09:12:36 attilio Exp $");
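
The revision check above can be scripted when several files need verifying. This is only a sketch: the helper name fbsdid_rev is hypothetical, and it assumes the CVS-style $FreeBSD$ ident string has the layout shown in the grep output.

```shell
# Print the CVS revision found in a $FreeBSD$ ident string.
# Reads the named files, or stdin when no file is given.
# fbsdid_rev is a hypothetical helper name, not an existing tool.
fbsdid_rev() {
    sed -n 's/.*\$FreeBSD: [^ ]*,v \([0-9.][0-9.]*\) .*/\1/p' "$@"
}
```

Usage would be something like: fbsdid_rev /sys/kern/subr_sleepqueue.c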


WBR
-- 
Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD committer, http://www.FreeBSD.org The Power To Serve
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
From: Kostik Belousov
Date: Monday, September 17, 2007 - 1:32 am

Most likely this is a different bug, then. Just to make sure, please show
the output of ps axl | grep dump. The dump processes should be sleeping and
killable (please check that).
From: Boris Samorodov
Date: Monday, September 17, 2007 - 1:45 am

% ps axwwl | grep dump | grep -v grep
 9900 74530 74443   0   8  0 28248 26000 wait   I+    p1    0:00.88 dump -0Luan -f s.dump /s (dump)
 9900 74593 74530   0   4  0 28248 26020 sbwait I+    p1    0:01.05 dump: /dev/da1: pass 4: 44.73% done, finished in 0:00 at Mon Sep 17 10:27:03 2007 (dump)
 9900 74594 74593   1  20  0 28248 26000 pause  I+    p1    0:01.06 dump -0Luan -f s.dump /s (dump)
 9900 74595 74593   0  20  0 28248 26000 pause  I+    p1    0:01.06 dump -0Luan -f s.dump /s (dump)
 9900 74596 74593   0  20  0 28248 26000 pause  I+    p1    0:01.06 dump -0Luan -f s.dump /s (dump)
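
To read the wait channels off a listing like the one above at a glance, the ps output can be reduced to PID and wait channel. A minimal sketch, assuming the FreeBSD "ps axl" column layout seen here (PID in column 2, wait channel in column 9, command starting at column 13); the function name dump_wchans is hypothetical.

```shell
# Read `ps axl`-style lines on stdin and print "PID WCHAN" for every
# process whose command starts with "dump". The header line and the
# grep process itself do not match, so no extra filtering is needed.
# dump_wchans is a hypothetical helper name.
dump_wchans() {
    awk '$13 ~ /^dump/ { print $2, $9 }'
}
```

Usage would be something like: ps axwwl | dump_wchans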


The process is killable:
# killall dump
[...]
  DUMP: Is the new volume mounted and ready to go?: ("yes" or "no") 
  DUMP: "Yes" or "No"?
  DUMP: Is the new volume mounted and ready to go?: ("yes" or "no") no
  DUMP: Do you want to abort?: ("yes" or "no") yes
  DUMP: The ENTIRE dump is aborted.
#


WBR
-- 
Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD committer, http://www.FreeBSD.org The Power To Serve
From: Kostik Belousov
Date: Monday, September 17, 2007 - 1:57 am

And are you sure that no dump processes are left in the system? If so, this
is indeed some other bug.
From: Boris Samorodov
Date: Monday, September 17, 2007 - 2:16 am

% ps auwx | grep dump | grep -v grep
%


WBR
-- 
Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD committer, http://www.FreeBSD.org The Power To Serve
From: Boris Samorodov
Date: Monday, September 17, 2007 - 11:15 am

The same happens with gmirror (plain GENERIC, but without WITNESS* and
INVARIANT*):
-----
$ df
Filesystem             1K-blocks     Used     Avail Capacity  Mounted on
/dev/mirror/gm0s1a        507630   387740     79280    83%    /
devfs                          1        1         0   100%    /dev
/dev/mirror/gm0s1e        507630       16    467004     0%    /tmp
/dev/mirror/gm0s1f      20802680  4463056  14675410    23%    /usr
/dev/mirror/gm0s1d       9117774    53354   8335000     1%    /var
[...]

$ dump -0Luan -f usr.dump /usr
  DUMP: Date of this level 0 dump: Mon Sep 17 21:58:24 2007
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/mirror/gm0s1f (/usr) to usr.dump
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 4485376 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
load: 0.03  cmd: dump 98789 [sbwait] 0.19u 0.44s 0% 2448k
load: 0.01  cmd: dump 98791 [pause] 0.03u 1.00s 0% 2408k
load: 0.00  cmd: dump 98789 [sbwait] 0.19u 0.44s 0% 2448k
[waiting forever]

% sockstat | grep duser
duser    dump       98792 5  stream -> ??
duser    dump       98792 7  stream -> ??
duser    dump       98792 9  stream -> ??
duser    dump       98791 5  stream -> ??
duser    dump       98791 7  stream -> ??
duser    dump       98790 5  stream -> ??
duser    dump       98789 5  stream -> ??
duser    dump       98789 6  stream -> ??
duser    dump       98789 7  stream -> ??
duser    dump       98789 8  stream -> ??
duser    dump       98789 9  stream -> ??
duser    dump       98789 10 stream -> ??
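
Since sbwait means a process is sleeping on a socket buffer, it can help to see how the stream sockets listed above are spread across the dump processes. A small sketch that counts sockets per PID from sockstat output, assuming the column order shown here (user, command, PID, fd, ...); the function name sockets_per_dump_pid is hypothetical.

```shell
# Read sockstat lines on stdin and print "PID COUNT" for each dump
# process, sorted by PID. Assumes column 2 is the command name and
# column 3 is the PID, as in the listing above.
# sockets_per_dump_pid is a hypothetical helper name.
sockets_per_dump_pid() {
    awk '$2 == "dump" { n[$3]++ } END { for (p in n) print p, n[p] }' | sort -n
}
```

Usage would be something like: sockstat | sockets_per_dump_pid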

Top:
last pid: 98864;  load averages:  0.00,  0.01,  0.03    up 1+21:27:20  22:06:20
86 processes:  1 running, 85 sleeping
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.9% idle
Mem: 86M Active, 3331M Inact, 337M Wired, 468K Cache, 214M Buf, 3994M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   ...