NetBSD Problem Report #46462
From www@NetBSD.org Thu May 17 22:33:27 2012
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id D745F63B89C
for <gnats-bugs@gnats.NetBSD.org>; Thu, 17 May 2012 22:33:27 +0000 (UTC)
Message-Id: <20120517223326.D6D8663B882@www.NetBSD.org>
Date: Thu, 17 May 2012 22:33:26 +0000 (UTC)
From: cyber@netbsd.org
Reply-To: cyber@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: Hang trying to run a disklabel on an inserted USB flash drive
X-Send-Pr-Version: www-1.0
>Number: 46462
>Category: kern
>Synopsis: Hang trying to run a disklabel on an inserted USB flash drive
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: feedback
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu May 17 22:35:00 +0000 2012
>Closed-Date:
>Last-Modified: Sat May 27 12:21:19 +0000 2023
>Originator: Erik Berls
>Release:
>Organization:
NetBSD
>Environment:
amd64
>Description:
Plugging in a usb memory stick got me a hung proc.
These details are for rmind@
db> tr/t 0t3707
trace: pid 3707 lid 1 at 0xffffa00048ebc860
sleepq_block() at netbsd:sleepq_block+0xdd
turnstile_block() at netbsd:turnstile_block+0x2a3
mutex_vector_enter() at netbsd:mutex_vector_enter+0xf7
sdopen() at netbsd:sdopen+0x99
spec_open() at netbsd:spec_open+0x15e
VOP_OPEN() at netbsd:VOP_OPEN+0x29
vn_open() at netbsd:vn_open+0x13d
sys_open() at netbsd:sys_open+0xeb
syscall() at netbsd:syscall+0xb4
db>
db> ps/l
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
23700 1 3 0 84 ffffa00048245bc0 pickup kqueue
4394 1 3 0 84 ffffa00049fe9be0 csh ttyraw
3707 1 3 0 1000004 ffffa00048e1f7e0 disklabel tstile
3516 1 3 0 84 ffffa00048e0b7c0 tcsh pause
2981 1 3 0 84 ffffa00048e0b3e0 sshd select
3229 1 3 0 84 ffffa00048e0bba0 sshd netio
3222 1 3 0 84 ffffa00048f8dbc0 mountd select
1015 5 3 0 84 ffffa000488d5420 slave nfsd
4 3 0 84 ffffa000488d5be0 slave nfsd
3 3 0 84 ffffa00048e1f020 slave nfsd
2 3 0 84 ffffa00048e1fbc0 slave nfsd
1 3 0 84 ffffa000488d5800 master select
740 1 3 0 84 ffffa00048ccd000 ssh-agent select
639 1 3 0 84 ffffa00048982000 qmgr kqueue
558 1 3 0 84 ffffa00048192000 login wait
630 1 3 0 84 ffffa00048a14be0 cron nanoslp
618 1 3 0 80 ffffa00048a02400 inetd kqueue
601 1 3 0 84 ffffa00048a02bc0 master kqueue
372 1 3 0 84 ffffa000489827c0 sshd select
373 18 3 0 84 ffffa00048c76420 java parked
17 3 0 80 ffffa00048c76800 java select
16 3 0 84 ffffa00048c76be0 java parked
15 3 0 80 ffffa00048c0c020 java select
14 3 0 84 ffffa00048c0c400 java parked
13 5 0 4 ffffa00048c0c7e0 java
11 3 0 84 ffffa00048b3b000 java parked
10 3 0 84 ffffa00048b3b3e0 java parked
9 3 0 80 ffffa00048b3b7c0 java parked
8 3 0 84 ffffa00048b3bba0 java parked
7 3 0 84 ffffa00048a14040 java parked
6 3 0 80 ffffa00048a14420 java parked
5 3 0 80 ffffa00048a14800 java parked
4 3 0 80 ffffa00048332be0 java parked
3 3 0 84 ffffa00048a02020 java parked
2 3 0 80 ffffa00048a027e0 java select
1 3 0 80 ffffa000489823e0 java parked
297 1 3 0 84 ffffa00048982ba0 mdnsd select
303 1 3 0 84 ffffa00048912040 mDNSResponderPos select
275 2 3 0 84 ffffa00048895ba0 xenbackendd netio
1 3 0 84 ffffa00048332800 xenbackendd parked
262 2 3 0 84 ffffa00048895000 xenconsoled netio
1 3 0 84 ffffa0004890cbc0 xenconsoled select
254 63 5 0 4 ffffa00049bd7400 python2.6
7 3 0 84 ffffa00048912420 python2.6 select
6 3 0 80 ffffa00048912800 python2.6 select
5 3 0 84 ffffa00048912be0 python2.6 netio
4 3 0 84 ffffa0004890c020 python2.6 parked
3 3 0 84 ffffa0004890c400 python2.6 select
2 3 0 80 ffffa0004890c7e0 python2.6 socket
1 3 0 80 ffffa000488953e0 python2.6 select
251 1 3 0 80 ffffa000488957c0 python2.6 wait
235 1 3 0 84 ffffa00048332420 xenstored select
163 1 3 0 84 ffffa000482457e0 rpcbind select
76 1 2 0 4 ffffa00048245400 syslogd
1 1 3 0 84 ffffa0004545f7e0 init wait
0 59 3 0 204 ffffa00049c5b000 scsibus0 biowait
57 5 0 204 ffffa0004544f040 (zombie)
49 3 0 204 ffffa00048245020 physiod physiod
48 3 0 204 ffffa000481923e0 aiodoned aiodoned
47 3 0 204 ffffa000481927c0 ioflush syncer
46 3 0 204 ffffa00048192ba0 pgdaemon pgdaemon
45 3 0 204 ffffa00048188040 xbdbackd xbdbackd
44 3 0 204 ffffa00048188420 cryptoret crypto_wait
43 3 0 204 ffffa0004545b7e0 ipmi0 ipmi0
42 3 0 204 ffffa0004545b020 ipmi ipmi_poll
40 3 0 204 ffffa0004545eba0 usb7 usbevt
39 3 0 204 ffffa0004545d040 usb6 usbevt
38 3 0 204 ffffa0004545d420 usb5 usbevt
37 3 0 204 ffffa0004545b400 usb4 usbevt
36 3 0 204 ffffa0004545d800 usb0 usbevt
35 3 0 204 ffffa0004545dbe0 usb3 usbevt
34 3 0 204 ffffa00048188800 usb2 usbevt
33 3 0 204 ffffa00048188be0 usbtask-dr usbtsk
32 3 0 204 ffffa0004545f020 usbtask-hc usbtsk
31 3 0 204 ffffa0004545f400 usb1 usbevt
30 3 0 204 ffffa0004545fbc0 unpgc unpgc
29 3 0 204 ffffa0004545e000 vmem_rehash vmem_rehash
28 3 0 204 ffffa0004545e3e0 xenbus rdst
27 3 0 204 ffffa0004545e7c0 xenwatch evtsq
18 3 0 204 ffffa0004545bbc0 atabus3 atath
17 3 0 204 ffffa00045459000 atabus2 atath
16 3 0 204 ffffa000454593e0 atabus1 atath
15 3 0 204 ffffa000454597c0 atabus0 atath
14 3 0 284 ffffa00045459ba0 fw0_probe -
12 3 0 204 ffffa0004544f420 sysmon smtaskq
11 3 0 204 ffffa0004544f800 pmfevent pmfevent
10 3 0 204 ffffa0004544fbe0 nfssilly nfssilly
9 3 0 204 ffffa0004544c020 cachegc cachegc
8 3 0 204 ffffa0004544c400 vrele vrele
7 3 0 204 ffffa0004544c7e0 xcall/0 xcall
6 1 0 204 ffffa0004544cbc0 softser/0
5 1 0 204 ffffa0004544a000 softclk/0
4 1 0 204 ffffa0004544a3e0 softbio/0
3 1 0 204 ffffa0004544a7c0 softnet/0
> 2 7 0 205 ffffa0004544aba0 idle/0
1 3 0 204 ffffffff80b5a220 swapper schedule
db>
db> tr/t 0.59
trace: pid 0 lid 59 at 0xffffa00049c1b770
sleepq_block() at netbsd:sleepq_block+0xdd
cv_wait() at netbsd:cv_wait+0xf1
biowait() at netbsd:biowait+0x4f
read_sector() at netbsd:read_sector+0x23
scan_mbr() at netbsd:scan_mbr+0x40
readdisklabel() at netbsd:readdisklabel+0xfb
sdopen() at netbsd:sdopen+0x26a
spec_open() at netbsd:spec_open+0x29a
VOP_OPEN() at netbsd:VOP_OPEN+0x29
dkwedge_discover() at netbsd:dkwedge_discover+0xee
sdattach() at netbsd:sdattach+0x206
config_attach_loc() at netbsd:config_attach_loc+0x15a
scsi_probe_bus() at netbsd:scsi_probe_bus+0x4aa
scsibus_config() at netbsd:scsibus_config+0x5d
scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: cyber@netbsd.org
Subject: Re: kern/46462: Hang trying to run a disklabel on an inserted USB
flash drive
Date: Sat, 19 May 2012 20:29:03 +0100
cyber@netbsd.org wrote:
> >Number: 46462
> >Category: kern
> >Synopsis: Hang trying to run a disklabel on an inserted USB flash
> >drive Confidential: no
> >Severity: serious
> >Priority: medium
> >Responsible: kern-bug-people
> >State: open
> >Class: sw-bug
> >Submitter-Id: net
> >Arrival-Date: Thu May 17 22:35:00 +0000 2012
> >Originator: Erik Berls
> <...>
OK, this thread is likely a problem:
> db> tr/t 0.59
> trace: pid 0 lid 59 at 0xffffa00049c1b770
> sleepq_block() at netbsd:sleepq_block+0xdd
> cv_wait() at netbsd:cv_wait+0xf1
> biowait() at netbsd:biowait+0x4f
> read_sector() at netbsd:read_sector+0x23
> scan_mbr() at netbsd:scan_mbr+0x40
> readdisklabel() at netbsd:readdisklabel+0xfb
> sdopen() at netbsd:sdopen+0x26a
> spec_open() at netbsd:spec_open+0x29a
> VOP_OPEN() at netbsd:VOP_OPEN+0x29
> dkwedge_discover() at netbsd:dkwedge_discover+0xee
> sdattach() at netbsd:sdattach+0x206
> config_attach_loc() at netbsd:config_attach_loc+0x15a
> scsi_probe_bus() at netbsd:scsi_probe_bus+0x4aa
> scsibus_config() at netbsd:scsibus_config+0x5d
> scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23
- scsipi_completion_thread()
This asynchronous completion thread handles the event, which is
in this case attaching of a new device. Callbacks are performed
under splbio().
-- [skipping] .. sdopen()
Acquiring sd->sc_dk.dk_openlock here. We are still at IPL_BIO.
--- [skipping] .. read_sector() -> disk_read_sectors()
Invoking sdstrategy() and entering biowait(). However, since
we are still at IPL_BIO, biodone() does not happen - thread has
deadlocked against interrupt with sd->sc_dk.dk_openlock held.
> db> tr/t 0t3707
> trace: pid 3707 lid 1 at 0xffffa00048ebc860
> sleepq_block() at netbsd:sleepq_block+0xdd
> turnstile_block() at netbsd:turnstile_block+0x2a3
> mutex_vector_enter() at netbsd:mutex_vector_enter+0xf7
> sdopen() at netbsd:sdopen+0x99
> spec_open() at netbsd:spec_open+0x15e
> VOP_OPEN() at netbsd:VOP_OPEN+0x29
> vn_open() at netbsd:vn_open+0x13d
> sys_open() at netbsd:sys_open+0xeb
> syscall() at netbsd:syscall+0xb4
> db>
This thread merely tries to acquire sd->sc_dk.dk_openlock and waits in
the turnstile for deadlocked scsipi_completion_thread().
Fix: rework code to remove top-level splbio() in scsipi_completion_thread().
Long term fix: make scsipi subsystem MP-safe.
--
Mindaugas
State-Changed-From-To: open->analyzed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sat, 19 May 2012 19:32:14 +0000
State-Changed-Why:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Mindaugas Rasiukevicius <rmind@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org, cyber@NetBSD.org
Subject: Re: kern/46462: Hang trying to run a disklabel on an inserted USB
flash drive
Date: Mon, 21 May 2012 14:45:47 +0200
On Sat, May 19, 2012 at 08:29:03PM +0100, Mindaugas Rasiukevicius wrote:
> OK, this thread is likely a problem:
>
> > db> tr/t 0.59
> > trace: pid 0 lid 59 at 0xffffa00049c1b770
> > sleepq_block() at netbsd:sleepq_block+0xdd
> > cv_wait() at netbsd:cv_wait+0xf1
> > biowait() at netbsd:biowait+0x4f
> > read_sector() at netbsd:read_sector+0x23
> > scan_mbr() at netbsd:scan_mbr+0x40
> > readdisklabel() at netbsd:readdisklabel+0xfb
> > sdopen() at netbsd:sdopen+0x26a
> > spec_open() at netbsd:spec_open+0x29a
> > VOP_OPEN() at netbsd:VOP_OPEN+0x29
> > dkwedge_discover() at netbsd:dkwedge_discover+0xee
> > sdattach() at netbsd:sdattach+0x206
> > config_attach_loc() at netbsd:config_attach_loc+0x15a
> > scsi_probe_bus() at netbsd:scsi_probe_bus+0x4aa
> > scsibus_config() at netbsd:scsibus_config+0x5d
> > scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23
>
> - scsipi_completion_thread()
> This asynchronous completion thread handles the event, which is
> in this case attaching of a new device. Callbacks are performed
> under splbio().
>
> -- [skipping] .. sdopen()
> Acquiring sd->sc_dk.dk_openlock here. We are still at IPL_BIO.
>
> --- [skipping] .. read_sector() -> disk_read_sectors()
> Invoking sdstrategy() and entering biowait(). However, since
> we are still at IPL_BIO, biodone() does not happen - thread has
> deadlocked against interrupt with sd->sc_dk.dk_openlock held.
I don't understand how you get to this conclusion. biowait() will cause the
thread to sleep, and IPL_BIO interrupts will be accepted again while
it sleeps. Otherwise, all IPL_VM and lower interrupts would be blocked,
which would probably cause not only disklabel to hang, but most probably
the complete system.
I guess the problem is more that the USB I/O doesn't completes for some
reason.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sat, 14 Jul 2018 19:00:39 +0000
Responsible-Changed-Why:
Found this reference in PR kern/52783, keeping on me just in case it turns
out being similar underlying problem.
Responsible-Changed-From-To: jdolecek->kern-bug-people
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Fri, 10 Aug 2018 22:49:34 +0000
Responsible-Changed-Why:
kern/52783 turned out to be wd(4) specific problem, so returning back
State-Changed-From-To: analyzed->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 27 May 2023 12:21:19 +0000
State-Changed-Why:
scsipi has been made MP-safe and umass has had lots of error branch fixes,
is this still reproducible?
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.