NetBSD Problem Report #52462
From he@smistad.uninett.no Fri Aug 4 11:20:33 2017
Return-Path: <he@smistad.uninett.no>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 131F67A17B
for <gnats-bugs@gnats.NetBSD.org>; Fri, 4 Aug 2017 11:20:33 +0000 (UTC)
Message-Id: <20170804112028.5C47C43EA91@smistad.uninett.no>
Date: Fri, 4 Aug 2017 13:20:28 +0200 (CEST)
From: he@NetBSD.org
Reply-To: he@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: aac driver (and possibly others) needs MPification
X-Send-Pr-Version: 3.95
>Number: 52462
>Category: kern
>Synopsis: aac driver (and possibly others) needs MPification
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 04 11:25:00 +0000 2017
>Closed-Date: Mon Sep 25 16:45:51 +0000 2017
>Last-Modified: Mon Sep 25 16:45:51 +0000 2017
>Originator: he@NetBSD.org
>Release: NetBSD 8.0_BETA
>Organization:
I Try...
>Environment:
System: NetBSD 8.0_BETA (GENERIC) #10: Thu Aug 3 13:44:33 CEST 2017 he@vk.urc.uninett.no:/usr/obj/sys/arch/amd64/compile/GENERIC
Architecture: x86_64
Machine: amd64
>Description:
Between netbsd-7 and netbsd-8, it appears that the 'ld' driver
has been adapted to run multithreaded. Under 'ld', there can
be one of several hardware drivers, at least aac, cac, icp,
mlx and nvme. And .. at least the 'aac' driver is not adapted
to run multithreaded; it uses splbio() for exclusion, and
apparently that creates problems.
I discovered this when trying to upgrade a machine from
netbsd-7 to netbsd-8 which has this controller and disks:
aac0 at pci6 dev 0 function 0: IBM ServeRAID 8k
aac0:0 interrupting at ioapic0 pin 17
aac0: Enabling 64-bit address support
aac0: Enable 64-bit array support
aac0: New comm. interface enabled
aac0: MIPS 5KC at 250MHz, 32MB mem (16MB cache), optional battery not installed
ld0 at aac0 unit 0: RAID 1 (Mirror)
ld0: 232 GB, 30378 cyl, 255 head, 63 sec, 512 bytes/sect x 488036352 sectors
ld1 at aac0 unit 1: RAID 10
ld1: 465 GB, 60757 cyl, 255 head, 63 sec, 512 bytes/sect x 976072704 sectors
The 8.0_BETA kernel booted, and I managed to extract the
'base' set and 29% of the 'comp' set before it panic'ed:
fatal page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff8051cee3 cs 0x8 rflags 0x10246 cr2 0 ilevel 0x6 rsp 0xfffffe810e8fecd8
curlwp 0xfffffe823f731840 pid 0.2 lowest kstack 0xfffffe810e8fc2c0
panic: trap
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x96
aac_new_intr() at netbsd:aac_new_intr+0xb0
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf2
--- interrupt ---
x86_stihlt() at netbsd:x86_stihlt+0x6
acpicpu_cstate_idle_enter() at netbsd:acpicpu_cstate_idle_enter+0xdb
acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0xdb
idle_loop() at netbsd:idle_loop+0xf8
cpu0: End traceback...
Catching it in ddb points towards a null pointer de-reference inside
the expansion of the SIMPLEQ_FIRST macro in aac_ccb_enqueue, even
though this looks on the surface of it decidedly nonsensical.
So now the machine is hosed and requires a physical visit, even
though I have a remote serial console.
Apparently the aac driver needs to be properly converted to use the
multithreaded synchronization primitives, but it's not as simple as
just replacing splbio() with mutex_enter() and splx() with
mutex_exit(), because there are certain operations which are not
permitted while holding such a mutex, such as doing malloc() and
calling bus_* functions. So how this driver should be converted but
still retaining its required mutual exclusion is not trivial, and
certainly above my current abilities, so "Help!"
>How-To-Repeat:
Try to minimally stress an 8.0_BETA machine with an 'aac' disk
subsystem, and watch it panic.
>Fix:
Sorry, don't know, ref. the above.
>Release-Note:
>Audit-Trail:
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Sat, 05 Aug 2017 00:13:55 +0200 (CEST)
Hm, the list of drivers 'below' ld is incomplete at best,
e.g. twa.c also uses splbio() for mutual exclusion.
So if the missing concurrency protection is indeed what caused
the problem in the first place, there are more drivers in need of
repair.
Regards,
- H=E5vard
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Sat, 5 Aug 2017 05:44:57 -0000 (UTC)
he@NetBSD.org (Havard Eidnes) writes:
> Hm, the list of drivers 'below' ld is incomplete at best,
> e.g. twa.c also uses splbio() for mutual exclusion.
>
> So if the missing concurrency protection is indeed what caused
> the problem in the first place, there are more drivers in need of
> repair.
These are the drivers behind ld(4):
./arch/usermode/dev/ld_thunkbus.c
./dev/ata/ld_ataraid.c
./dev/i2o/ld_iop.c
./dev/ic/ld_nvme.c
./dev/ic/ld_aac.c
./dev/ic/ld_cac.c
./dev/ic/ld_icp.c
./dev/ic/ld_mlx.c
./dev/pci/ld_amr.c
./dev/pci/ld_twa.c
./dev/pci/ld_twe.c
./dev/pci/ld_virtio.c
./dev/sdmmc/ld_sdmmc.c
nvme and sdmmc are ok. virtio, iop, cac also look good.
But if that's everything, then wrapping the aac driver entry
with kernel lock should have been sufficient.
--
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org, mlelstv@serpens.de
Cc: netbsd-bugs@netbsd.org
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Sat, 05 Aug 2017 13:50:58 +0200 (CEST)
> These are the drivers behind ld(4):
Thanks for the inspection of the drivers.
> But if that's everything, then wrapping the aac driver entry
> with kernel lock should have been sufficient.
Well, it provably wasn't sufficient to do
RCS file: /cvsroot/src/sys/dev/ic/ld_aac.c,v
retrieving revision 1.30
diff -p -u -r1.30 ld_aac.c
--- ld_aac.c 27 Sep 2016 03:33:32 -0000 1.30
+++ ld_aac.c 3 Aug 2017 14:20:19 -0000
@@ -314,9 +314,14 @@ ld_aac_dobio(struct ld_aac_softc *sc, vo
static int
ld_aac_start(struct ld_softc *ld, struct buf *bp)
{
+ int rc;
=
- return (ld_aac_dobio((struct ld_aac_softc *)ld, bp->b_data,
- bp->b_bcount, bp->b_rawblkno, (bp->b_flags & B_READ) =3D=3D=
0, bp));
+ KERNEL_LOCK(1, curlwp);
+ rc =3D ld_aac_dobio((struct ld_aac_softc *)ld, bp->b_data,
+ bp->b_bcount, bp->b_rawblkno, (bp->b_flags & B_READ) =3D=3D=
0, bp);
+ KERNEL_UNLOCK_ONE(curlwp);
+
+ return rc;
}
=
static void
as you provided privately. What then happens is that the kernel
goes into inactivity when the user-land startup tries to start
BIND, and when I break into DDB I repeatedly get this as the
traceback:
db{0}> trace
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x59a
Xintr_ioapic_edge8() at netbsd:Xintr_ioapic_edge8+0xee
--- interrupt ---
x86_pause() at netbsd:x86_pause
lddone() at netbsd:lddone+0x1e
aac_new_intr() at netbsd:aac_new_intr+0xed
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf2
--- interrupt ---
x86_stihlt() at netbsd:x86_stihlt+0x6
acpicpu_cstate_idle_enter() at netbsd:acpicpu_cstate_idle_enter+0xdb
acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0xb6
idle_loop() at netbsd:idle_loop+0x18c
db{0}> c
...
db{0}> tra =
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x59a
Xintr_ioapic_edge8() at netbsd:Xintr_ioapic_edge8+0xee
--- interrupt ---
x86_pause() at netbsd:x86_pause+0x2
lddone() at netbsd:lddone+0x1e
aac_new_intr() at netbsd:aac_new_intr+0xed
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf2
--- interrupt ---
x86_stihlt() at netbsd:x86_stihlt+0x6
acpicpu_cstate_idle_enter() at netbsd:acpicpu_cstate_idle_enter+0xdb
acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0xb6
idle_loop() at netbsd:idle_loop+0x18c
db{0}> call cpu_reset
and no further progress was apparently being made.
Regards,
- H=E5vard
From: Michael van Elst <mlelstv@serpens.de>
To: Havard Eidnes <he@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Sat, 5 Aug 2017 15:24:20 +0200
On Sat, Aug 05, 2017 at 01:50:58PM +0200, Havard Eidnes wrote:
>
> as you provided privately. What then happens is that the kernel
> goes into inactivity when the user-land startup tries to start
> BIND, and when I break into DDB I repeatedly get this as the
> traceback:
>
> db{0}> trace
> breakpoint() at netbsd:breakpoint+0x5
> comintr() at netbsd:comintr+0x59a
> Xintr_ioapic_edge8() at netbsd:Xintr_ioapic_edge8+0xee
> --- interrupt ---
> x86_pause() at netbsd:x86_pause
> lddone() at netbsd:lddone+0x1e
> aac_new_intr() at netbsd:aac_new_intr+0xed
> intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
> Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf2
> --- interrupt ---
> x86_stihlt() at netbsd:x86_stihlt+0x6
> acpicpu_cstate_idle_enter() at netbsd:acpicpu_cstate_idle_enter+0xdb
> acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0xb6
> idle_loop() at netbsd:idle_loop+0x18c
> db{0}> c
The part between lddone and x86_pause is missing, thanks to
an optimizing compiler...
x86_pause is called when a CPU busy-waits for a spin-mutex
which could be anyhere, but the return address (lddone+0x1e)
points to the mutex_enter() call directly in lddone.
I.e. there is something holding the ld driver mutex.
The most likely reason would be someone calling into ld_diskstart
which holds the mutex while calling into ld_aac_start.
So:
some thread calling into ld driver:
- get mutex (in ld_diskstart)
- get kernel lock (in ld_aac_start)
the interrupt:
- get kernel lock (due to non-MPSAFE interrupt)
- get mutex (in lddone).
The wrong locking order may cause a deadlock.
Ok. For the next try: remove the patch and simply declare ld as non-mpsafe
by removing the D_MPSAFE flags.
Greetings,
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org, mlelstv@serpens.de
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Tue, 08 Aug 2017 14:58:16 +0200 (CEST)
> Ok. For the next try: remove the patch and simply declare ld as non-=
mpsafe
> by removing the D_MPSAFE flags.
OK, did that with
vk# cvs diff sys/dev/ld.c
Index: sys/dev/ld.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/dev/ld.c,v
retrieving revision 1.101
diff -u -r1.101 ld.c
--- sys/dev/ld.c 27 Apr 2017 17:07:22 -0000 1.101
+++ sys/dev/ld.c 8 Aug 2017 12:56:06 -0000
@@ -92,7 +92,7 @@
.d_dump =3D lddump,
.d_psize =3D ldsize,
.d_discard =3D lddiscard,
- .d_flag =3D D_DISK | D_MPSAFE
+ .d_flag =3D D_DISK
};
=
const struct cdevsw ld_cdevsw =3D {
@@ -107,7 +107,7 @@
.d_mmap =3D nommap,
.d_kqfilter =3D nokqfilter,
.d_discard =3D lddiscard,
- .d_flag =3D D_DISK | D_MPSAFE
+ .d_flag =3D D_DISK
};
=
static struct dkdriver lddkdriver =3D {
vk#
and now the machine behaves as it should, and no longer panics or
wedges.
Regards,
- H=E5vard
From: Michael van Elst <mlelstv@serpens.de>
To: Havard Eidnes <he@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Tue, 8 Aug 2017 17:58:18 +0200
On Tue, Aug 08, 2017 at 02:58:16PM +0200, Havard Eidnes wrote:
> > Ok. For the next try: remove the patch and simply declare ld as non-mpsafe
> > by removing the D_MPSAFE flags.
>
> OK, did that with
> and now the machine behaves as it should, and no longer panics or
> wedges.
Good.
http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/ld.diff
is a patch that takes KERNEL_LOCK when calling into the backend
(hopefully without the locking order violation), except for
i2o,cac,nvme,virtio and sdmmc which are tagged as safe.
Greetings,
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
From: Havard Eidnes <he@NetBSD.org>
To: mlelstv@serpens.de
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/52462: aac driver (and possibly others) needs MPification
Date: Wed, 09 Aug 2017 09:56:47 +0200 (CEST)
>> > Ok. For the next try: remove the patch and simply declare ld as n=
on-mpsafe
>> > by removing the D_MPSAFE flags.
>>
>> OK, did that with
> ...
>> and now the machine behaves as it should, and no longer panics or
>> wedges.
>
> Good.
>
> http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/ld.diff
>
> is a patch that takes KERNEL_LOCK when calling into the backend
> (hopefully without the locking order violation), except for
> i2o,cac,nvme,virtio and sdmmc which are tagged as safe.
That appears to work nicely as well; I extracted the full
7.1.0_PATCH distribution on a scratch file system, and where that
crashed after 1.29 x sets, it completed without issue, so I'd
call that a success as well, thanks!
Regards,
- H=E5vard
From: "Michael van Elst" <mlelstv@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52462 CVS commit: src/sys/dev
Date: Wed, 9 Aug 2017 16:44:40 +0000
Module Name: src
Committed By: mlelstv
Date: Wed Aug 9 16:44:40 UTC 2017
Modified Files:
src/sys/dev: ld.c ldvar.h
src/sys/dev/i2o: ld_iop.c
src/sys/dev/ic: ld_cac.c ld_nvme.c
src/sys/dev/pci: ld_virtio.c
src/sys/dev/sdmmc: ld_sdmmc.c
Log Message:
While ld(4) is MP safe, many backends are not.
Add a flag for backends that are MP safe. Take KERNEL_LOCK when calling
into a backend that doesn't have the flag set. Do the same for the
discard routine.
Fixes PR 52462.
To generate a diff of this commit:
cvs rdiff -u -r1.101 -r1.102 src/sys/dev/ld.c
cvs rdiff -u -r1.30 -r1.31 src/sys/dev/ldvar.h
cvs rdiff -u -r1.38 -r1.39 src/sys/dev/i2o/ld_iop.c
cvs rdiff -u -r1.30 -r1.31 src/sys/dev/ic/ld_cac.c
cvs rdiff -u -r1.16 -r1.17 src/sys/dev/ic/ld_nvme.c
cvs rdiff -u -r1.15 -r1.16 src/sys/dev/pci/ld_virtio.c
cvs rdiff -u -r1.31 -r1.32 src/sys/dev/sdmmc/ld_sdmmc.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->needs-pullups
State-Changed-By: snj@NetBSD.org
State-Changed-When: Sun, 20 Aug 2017 04:32:49 +0000
State-Changed-Why:
fixed. needs to be pulled up to netbsd-8.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52462 CVS commit: [netbsd-8] src/sys/dev
Date: Fri, 1 Sep 2017 09:59:11 +0000
Module Name: src
Committed By: martin
Date: Fri Sep 1 09:59:11 UTC 2017
Modified Files:
src/sys/dev [netbsd-8]: dksubr.c ld.c ldvar.h
src/sys/dev/i2o [netbsd-8]: ld_iop.c
src/sys/dev/ic [netbsd-8]: ld_cac.c ld_nvme.c
src/sys/dev/pci [netbsd-8]: ld_virtio.c
src/sys/dev/sdmmc [netbsd-8]: ld_sdmmc.c sdmmc_mem.c sdmmcvar.h
Log Message:
Pull up following revision(s) (requested by mlelstv in ticket #261):
sys/dev/sdmmc/ld_sdmmc.c: revision 1.32
sys/dev/sdmmc/ld_sdmmc.c: revision 1.33
sys/dev/sdmmc/ld_sdmmc.c: revision 1.34
sys/dev/sdmmc/sdmmc_mem.c: revision 1.62
sys/dev/i2o/ld_iop.c: revision 1.39
sys/dev/ld.c: revision 1.102
sys/dev/ld.c: revision 1.103
sys/dev/dksubr.c: revision 1.98
sys/dev/dksubr.c: revision 1.99
sys/dev/sdmmc/sdmmcvar.h: revision 1.29
sys/dev/ic/ld_nvme.c: revision 1.17
sys/dev/ldvar.h: revision 1.31
sys/dev/ldvar.h: revision 1.32
sys/dev/ic/ld_cac.c: revision 1.31
sys/dev/pci/ld_virtio.c: revision 1.16
While ld(4) is MP safe, many backends are not.
Add a flag for backends that are MP safe. Take KERNEL_LOCK when calling
into a backend that doesn't have the flag set. Do the same for the
discard routine.
Fixes PR 52462.
Defer sdmmc discard operations to the sdmmc task queue. Fixes a panic
introduced by ld.c r1.102.
validate length for discard operation and split operation when byte length
doesn't fit into 'int'.
make the sc_discard interface for the ld backend asynchronous and
signal completion through new callback lddiscardend. Use a standard
struct buf to pass disk address and range instead of two off_t values.
make lddiscard synchronous again. This is a requirement of the current
ffs discard code.
Initialize error also in the case where len=0, which just succeeds.
while here, assert that the len is indeed non-negative. this is already
confirmed by sys_fdiscard, but let's be sure.
reported by: GCC, but with different compile flags
To generate a diff of this commit:
cvs rdiff -u -r1.97 -r1.97.2.1 src/sys/dev/dksubr.c
cvs rdiff -u -r1.101 -r1.101.2.1 src/sys/dev/ld.c
cvs rdiff -u -r1.30 -r1.30.2.1 src/sys/dev/ldvar.h
cvs rdiff -u -r1.37 -r1.37.6.1 src/sys/dev/i2o/ld_iop.c
cvs rdiff -u -r1.30 -r1.30.8.1 src/sys/dev/ic/ld_cac.c
cvs rdiff -u -r1.16 -r1.16.2.1 src/sys/dev/ic/ld_nvme.c
cvs rdiff -u -r1.15 -r1.15.6.1 src/sys/dev/pci/ld_virtio.c
cvs rdiff -u -r1.26.4.4 -r1.26.4.5 src/sys/dev/sdmmc/ld_sdmmc.c
cvs rdiff -u -r1.56.4.3 -r1.56.4.4 src/sys/dev/sdmmc/sdmmc_mem.c
cvs rdiff -u -r1.23.6.3 -r1.23.6.4 src/sys/dev/sdmmc/sdmmcvar.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: needs-pullups->closed
State-Changed-By: snj@NetBSD.org
State-Changed-When: Mon, 25 Sep 2017 16:45:51 +0000
State-Changed-Why:
pulled up.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.42 2017/01/01 07:07:38 snj Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.