NetBSD Problem Report #49022
From www@NetBSD.org Sat Jul 19 18:29:31 2014
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 024E7A6557
for <gnats-bugs@gnats.NetBSD.org>; Sat, 19 Jul 2014 18:29:31 +0000 (UTC)
Message-Id: <20140719182929.A4E25A6558@mollari.NetBSD.org>
Date: Sat, 19 Jul 2014 18:29:29 +0000 (UTC)
From: n54@gmx.com
Reply-To: n54@gmx.com
To: gnats-bugs@NetBSD.org
Subject: Locking issues in interaction of TTY and USB subsystems (exposed with uhso(4))
X-Send-Pr-Version: www-1.0
>Number: 49022
>Category: kern
>Synopsis: Locking issues in interaction of TTY and USB subsystems (exposed with uhso(4))
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: dead
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 19 18:30:00 +0000 2014
>Closed-Date: Sat Jul 11 09:54:48 +0000 2015
>Last-Modified: Sat Jul 11 09:54:48 +0000 2015
>Originator: Kamil Rytarowski
>Release: current 6.99.47
>Organization:
>Environment:
NetBSD compaq 6.99.47 NetBSD 6.99.47 (NO_DRM) #2: Fri Jul 18 09:54:57 CEST 2014 kamil@compaq:/home/kamil/NetBSD-current/src/sys/arch/amd64/compile/NO_DRM amd64
>Description:
Here is my original mail to current-users [1]:
uhso(4) is a kernel driver for Option modems. It utilizes inter alia USB and TTY subsystems in the kernel. I would like to thank to the NetBSD crew for providing a driver for my modem!
Unfortunatelly, I've got issues with the driver with LOCKDEBUG turned on (expensive locking checks/support), and these issues result in kernel panics.
Grzegorz found at least two bugs and told me that the driver has to be discussed with authors and most likely parly rewritten.
I issue: spin lock held when taking mutex
File: src/sys/kern/subr_devsw.c
int
cdev_write(dev_t dev, struct uio *uio, int flag)
{
const struct cdevsw *d;
int rv, mpflag;
if ((d = cdevsw_lookup(dev)) == NULL)
return ENXIO;
DEV_LOCK(d);
rv = (*d->d_write)(dev, uio, flag);
DEV_UNLOCK(d);
return rv;
}
DEV_LOCK() is a macro defined at the same file src/sys/kern/subr_devsw.c:
/*
* Device access methods.
*/
#define DEV_LOCK(d) \
if ((mpflag = (d->d_flag & D_MPSAFE)) == 0) { \
KERNEL_LOCK(1, NULL); \
}
#define DEV_UNLOCK(d) \
if (mpflag == 0) { \
KERNEL_UNLOCK_ONE(NULL); \
}
And uhso_tty_write() at src/sys/dev/usb/uhso.c takes mutex.
Backtrace:
#0 0xffffffff805b619f in cpu_reboot (howto=260, bootstr=<optimized out>)
at ../../../../arch/amd64/amd64/machdep.c:671
671 dumpsys();
#0 0xffffffff805b619f in cpu_reboot (howto=260, bootstr=<optimized out>)
at ../../../../arch/amd64/amd64/machdep.c:671
#1 0xffffffff807791fd in vpanic (
fmt=0xffffffff80bf982b "LOCKDEBUG: %s error: %s: %s",
ap=0xfffffe8007246950) at ../../../../kern/subr_prf.c:284
#2 0xffffffff807792b1 in panic (fmt=<unavailable>)
at ../../../../kern/subr_prf.c:200
#3 0xffffffff80770501 in lockdebug_abort1 (ld=0xffff800005989658, s=6,
func=0xffffffff80b40220 "lockdebug_barrier",
msg=0xffffffff80bf98de "spin lock held", dopanic=true)
at ../../../../kern/subr_lockdebug.c:783
#4 0xffffffff80539b1c in mutex_vector_enter (mtx=0xfffffe8006d1c010)
at ../../../../kern/kern_mutex.c:487
#5 0xffffffff802aacc2 in ehci_device_bulk_transfer (xfer=0xfffffe8006d69a10)
at ../../../../dev/usb/ehci.c:3547
#6 0xffffffff808473dc in usbd_transfer (xfer=0xfffffe8006d69a10)
at ../../../../dev/usb/usbdi.c:300
#7 0xffffffff80814fe2 in uhso_bulk_write (hp=0xfffffe80be282288)
at ../../../../dev/usb/uhso.c:1187
#8 0xffffffff80816bc6 in uhso_tty_start (tp=0xfffffe80be28c6c8)
at ../../../../dev/usb/uhso.c:1820
#9 0xffffffff807cb840 in ttstart (tp=0xfffffe80be28c6c8)
at ../../../../kern/tty.c:1680
#10 ttwrite (tp=0xfffffe80be28c6c8, uio=0xfffffe8007246de0,
flag=<optimized out>) at ../../../../kern/tty.c:2196
#11 0xffffffff80814af9 in uhso_tty_write (dev=<optimized out>,
uio=0xfffffe8007246de0, flag=16) at ../../../../dev/usb/uhso.c:1633
#12 0xffffffff80767bde in cdev_write (dev=50184, uio=0xfffffe8007246de0,
flag=16) at ../../../../kern/subr_devsw.c:886
#13 0xffffffff8074ea1c in spec_write (v=0xfffffe8007246d30)
at ../../../../miscfs/specfs/spec_vnops.c:790
#14 0xffffffff808a7fa3 in VOP_WRITE (vp=0xfffffe80bd811730,
uio=<optimized out>, ioflag=<optimized out>, cred=<optimized out>)
at ../../../../kern/vnode_if.c:430
#15 0xffffffff8089201e in vn_write (fp=<optimized out>,
offset=0xfffffe80b5b3ad40, uio=0xfffffe8007246de0,
cred=0xfffffe80b6c79c00, flags=1) at ../../../../kern/vfs_vnops.c:588
#16 0xffffffff80788206 in dofilewrite (fd=1, fp=0xfffffe80b5b3ad40,
buf=0x7f7fffffd3af, nbyte=1, offset=<optimized out>,
flags=<optimized out>, retval=0xfffffe8007246eb0)
at ../../../../kern/sys_generic.c:355
#17 0xffffffff80788333 in sys_write (l=<optimized out>,
uap=0xfffffe8007246f00, retval=0xfffffe8007246eb0)
at ../../../../kern/sys_generic.c:323
#18 0xffffffff80793a86 in sy_call (rval=0xfffffe8007246eb0,
uap=0xfffffe8007246f00, l=0xfffffe80b8e60ba0, sy=0xffffffff80ec0580)
at ../../../../sys/syscallvar.h:61
#19 sy_invoke (code=4, rval=0xfffffe8007246eb0, uap=0xfffffe8007246f00,
l=0xfffffe80b8e60ba0, sy=0xffffffff80ec0580)
at ../../../../sys/syscallvar.h:85
#20 syscall (frame=0xfffffe8007246f00)
at ../../../../arch/x86/x86/syscall.c:156
#21 0xffffffff80100691 in ?? ()
#22 0x0000000000000001 in ?? ()
#23 0x00007f7fffffd3af in ?? ()
#24 0x0000000000000001 in ?? ()
#25 0x00007f7ff743952a in ?? ()
#26 0x0000000000000000 in ?? ()
II issue is similar, because adding D_MPSAFE (for test purposes, not a real fix) to d_flag a spin lock is held in ttwrite.
Testing change of d_flag to 'D_TTY | D_MPSAFE' was done at src/sys/dev/usb/uhso.c
const struct cdevsw uhso_cdevsw = {
.d_open = uhso_tty_open,
.d_close = uhso_tty_close,
.d_read = uhso_tty_read,
.d_write = uhso_tty_write,
.d_ioctl = uhso_tty_ioctl,
.d_stop = uhso_tty_stop,
.d_tty = uhso_tty_tty,
.d_poll = uhso_tty_poll,
.d_mmap = nommap,
.d_kqfilter = ttykqfilter,
.d_flag = D_TTY,
};
I asked the author (Iain Hibbert) [2], and here is my answer to his question:
On Saturday, July 19, 2014 at 7:11 PM, "Iain Hibbert" wrote:
> Is this with LOCKDEBUG and no other changes, or with LOCKDEBUG and
> the addition of the MPSAFE flag?
Both, D_MPSAFE was an attempt to walk-around the first mentioned
panic, but we ran into another (mentioned) issue with spin lock held
when taking a mutex.
> there is USB locking and TTY locking which must interact and I am
> not sure what work has been done to make drivers MPSAFE in general.
Thank-you for your confirmantion that it's a design issue in
interaction with (at least) USB and TTY locking.
[1] http://mail-index.netbsd.org/current-users/2014/07/19/msg025295.html
[2] http://mail-index.netbsd.org/current-users/2014/07/19/msg025302.html reply for http://mail-index.netbsd.org/current-users/2014/07/19/msg025300.html
>How-To-Repeat:
1. plug a USB modem
2. # pppd call your-operator
get a core / debug with ddb.onpanic
>Fix:
Temporarily walk-around to make it work (hide bug): Disable kernel locking-correctness asserts (LOCKDEBUG).
>Release-Note:
>Audit-Trail:
From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: n54@gmx.com, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/49022: Locking issues in interaction of TTY and USB subsystems
(exposed with uhso(4))
Date: Sun, 20 Jul 2014 12:19:14 +0100
On 07/19/14 19:30, n54@gmx.com wrote:
> #3 0xffffffff80770501 in lockdebug_abort1 (ld=0xffff800005989658, s=6,
> func=0xffffffff80b40220 "lockdebug_barrier",
> msg=0xffffffff80bf98de "spin lock held", dopanic=true)
> at ../../../../kern/subr_lockdebug.c:783
> #4 0xffffffff80539b1c in mutex_vector_enter (mtx=0xfffffe8006d1c010)
> at ../../../../kern/kern_mutex.c:487
> #5 0xffffffff802aacc2 in ehci_device_bulk_transfer (xfer=0xfffffe8006d69a10)
> at ../../../../dev/usb/ehci.c:3547
> #6 0xffffffff808473dc in usbd_transfer (xfer=0xfffffe8006d69a10)
> at ../../../../dev/usb/usbdi.c:300
> #7 0xffffffff80814fe2 in uhso_bulk_write (hp=0xfffffe80be282288)
> at ../../../../dev/usb/uhso.c:1187
> #8 0xffffffff80816bc6 in uhso_tty_start (tp=0xfffffe80be28c6c8)
> at ../../../../dev/usb/uhso.c:1820
> #9 0xffffffff807cb840 in ttstart (tp=0xfffffe80be28c6c8)
> at ../../../../kern/tty.c:1680
> #10 ttwrite (tp=0xfffffe80be28c6c8, uio=0xfffffe8007246de0,
> flag=<optimized out>) at ../../../../kern/tty.c:2196
tty_lock is held here...
2196 ttstart(tp);
2197 mutex_spin_exit(&tty_lock);
http://nxr.netbsd.org/xref/src/sys/kern/tty.c#2196
Nick
State-Changed-From-To: open->dead
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Sat, 11 Jul 2015 11:54:48 +0200
State-Changed-Why:
The originator no longer has the hardware.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.