NetBSD Problem Report #57649

From www@netbsd.org  Mon Oct  9 23:34:34 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 5C4371A9238
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  9 Oct 2023 23:34:34 +0000 (UTC)
Message-Id: <20231009233432.807191A923A@mollari.NetBSD.org>
Date: Mon,  9 Oct 2023 23:34:32 +0000 (UTC)
From: logix@foobar.franken.de
Reply-To: logix@foobar.franken.de
To: gnats-bugs@NetBSD.org
Subject: Uninitialized lock in gus.c
X-Send-Pr-Version: www-1.0

>Number:         57649
>Category:       kern
>Synopsis:       Uninitialized lock in gus.c
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 09 23:35:00 +0000 2023
>Last-Modified:  Tue Oct 10 10:35:01 +0000 2023
>Originator:     Harold Gutch
>Release:        NetBSD current
>Organization:
>Environment:
NetBSD  10.99.10 NetBSD 10.99.10 (GENERIC) #0: Sun Oct  8 08:05:08 UTC 2023  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/i386/compile/GENERIC i386
>Description:
Booting a kernel with Gravis UltraSound support on a system with a Gravis UltraSound card takes a code path that ends up calling mutex_enter() on an uninitialized lock.  LOCKDEBUG detects this, and such a system panics.

Here is the relevant part of the panic:

[   1.0271147] gus0 at isa0 port 0x240-0x24f irq 7 drq 1,6
[   1.0271147] gus0: Gravis UltraSound, 1024KB memory
[   1.0271147] audio0 at gus0: playback, capture, full duplex
[   1.0271147] panic: mutex_vector_enter,518: uninitialized lock (lock=0xc1bfc004, from=c0ae182c)
[   1.0271147] cpu0: Begin traceback...
[   1.0271147] vpanic(c1400394,c194a668,c194a69c,c0dafa92,c1400394,c12ba408,206,c1bfc004,c0ae182c,c0ae182c) at netbsd:vpanic+0x184
[   1.0271147] panic(c1400394,c12ba408,206,c1bfc004,c0ae182c,c0ae182c,206,c12ba408,8,0) at netbsd:panic+0x18
[   1.0271147] lockdebug_wantlock(c12ba408,206,c1bfc004,c0ae182c,0,c17376e0,c173eac8,c0daee22,c1664e80,c173eac8) at netbsd:lockdebug_wantlock+0x168
[   1.0271147] mutex_enter(c1bfc004,c194a748,c194a814,0,3,0,0,0,0,0) at netbsd:mutex_enter+0x22a
[   1.0271147] audio_hw_probe(c14001c4,2c,c1bf5938,c0f843da,0,c1b12280,c1735014,c0daee22,c1664e80,c1735014) at netbsd:audio_hw_probe+0x42
[   1.0271147] audioattach(c1c0d000,c1c0d200,c194a944,c1c0d000,0,c194a944,c1c0d000,c1c0d200,c194a8ec,c1671ad8) at netbsd:audioattach+0x872
[   1.0271147] config_attach_internal(c0ae3ffb,c194a8ec,0,0,c12ba3f4,c037e497,0,c14252ab,0,0) at netbsd:config_attach_internal+0x1a0
[   1.0271147] config_found_acquire(c1c0d000,c194a944,c0ae3ffb,c194a950,c194a938,ac44,c194a96c,c0ae9958,c1c0d000,c194a944) at netbsd:config_found_acquire+0xcb
[   1.0271147] config_found(c1c0d000,c194a944,c0ae3ffb,c194a950,0,c118c540,c1bfc000,1,0,0) at netbsd:config_found+0x2f
[   1.0271147] audio_attach_mi(c118c540,c1bfc000,c1c0d000,f,c037edbf,c1bfc000,c118c540,c1bfc000,c1bfc4fc,c194a9b8) at netbsd:audio_attach_mi+0x5c
[   1.0271147] gusattach(c1bdd800,c1c0d000,c194aa90,c1664e80,240,c194aa90,c1bdd800,c1c0d000,c1674a78,c194aa90) at netbsd:gusattach+0x774
[   1.0271147] config_attach_internal(c035ab85,c194aa08,240,c194aa20,c037f96c,0,0,c194aa58,0,0) at netbsd:config_attach_internal+0x1a0
[   1.0271147] config_attach(c1bdd800,c1674a78,c194aa90,c035ab85,c194aa74,7,240,10,ffffffff,0) at netbsd:config_attach+0x3f
[   1.0271147] isasearch(c1bdd800,c1674a78,c194ab90,0,c1674a78,c194ab64,c194ab54,c0d98d19,c1b3e418,c1664e80) at netbsd:isasearch+0x182
[   1.0271147] mapply(c1b3e418,c1664e80,20,c194ab48,c0d77fff,c1b3e418,20,0,c0f80457,0) at netbsd:mapply+0x2f
[   1.0271147] config_search_internal(c118bbb4,c1bdd800,c035afa8,0,c194ab90,0,0,c194abd0,c035af6e,c1bdd800) at netbsd:config_search_internal+0x189
[   1.0271147] config_search(c1bdd800,0,c194abac,0,ffffffff,0,ffffffff,0,ffffffff,ffffffff) at netbsd:config_search+0x6e
[   1.0271147] isarescan(c1bdd800,0,c118bbb4,c1bdd800,c1bad400,c1bdd800,c194aca4,c1675d50,c194ac2c,c0d9a819) at netbsd:isarescan+0xa8
[   1.0271147] isaattach(c1bad400,c1bdd800,c194aca4,c1bad400,0,c194aca4,c1bad400,c1bdd800,c194ac4c,c1674850) at netbsd:isaattach+0xc4
[   1.0271147] config_attach_internal(c035b52f,c194ac4c,c1664e80,c1735014,c1735014,c194ac88,0,c12dad30,0,0) at netbsd:config_attach_internal+0x1a0
[   1.0271147] config_found_acquire(c1bad400,c194aca4,c035b52f,c194acb8,c12dad30,c12dad30,c194acdc,c035680c,c1bad400,c194aca4) at netbsd:config_found_acquire+0xcb
[   1.0271147] config_found(c1bad400,c194aca4,c035b52f,c194acb8,0,c16303e0,c16303c0,c1630dc0,c168b000,1) at netbsd:config_found+0x2f
[   1.0271147] pcibrescan(c1bad400,c12dad30,0,c194ad18,c0d99e1d,c1bad400,0,c1664e80,4,0) at netbsd:pcibrescan+0x9d
[   1.0271147] pcib_callback(c1bad400,0,c1664e80,4,0,c1bad000,c1bad000,c1675d50,c194ad4c,c0d9a896) at netbsd:pcib_callback+0x21
[   1.0271147] config_process_deferred(c1735014,c1bad000,c194ae28,c1af7200,0,c194ae28,c1af7200,c1bad000,c194ad6c,c1673a40) at netbsd:config_process_deferred+0x9d
[   1.0271147] config_attach_internal(c019b2b5,c194ad6c,0,c194ad94,c0d5bc90,8,0,c12dad1c,0,0) at netbsd:config_attach_internal+0x21d
[   1.0271147] config_found_acquire(c1af7200,c194ae28,c019b2b5,c194adcc,c1b4e100,c1af7200,c194adf4,c04cadeb,c1af7200,c194ae28) at netbsd:config_found_acquire+0xcb
[   1.0271147] config_found(c1af7200,c194ae28,c019b2b5,c194adcc,c1af9a00,0,1,0,0,c12dad1c) at netbsd:config_found+0x2f
[   1.0271147] mp_pci_scan(c1af7200,c194ae28,c019b2b5,c194ae0c,1,0,0,c12dad01,0,0) at netbsd:mp_pci_scan+0xab
[   1.0271147] i386_mainbus_rescan(c1af7200,c12dad1c,0,c1af7200,c0138dce,0,c1af7200,0,c0db2a73,1) at netbsd:i386_mainbus_rescan+0x2a4
[   1.0271147] i386_mainbus_attach(0,c1af7200,0,c1664e80,c0d98b8e,0,0,c1af7200,c1673a28,0) at netbsd:i386_mainbus_attach+0x96
[   1.0271147] config_attach_internal(0,c194af1c,0,0,0,0,0,0,0,0) at netbsd:config_attach_internal+0x1a0
[   1.0271147] config_attach(0,c1673a28,0,0,0,cb2,3f7f000,0,c194af6c,c01278fe) at netbsd:config_attach+0x3f
[   1.0271147] config_rootfound(c12d91c1,0,c194afb0,c0f8621f,3,0,64,0,0,0) at netbsd:config_rootfound+0x59
[   1.0271147] cpu_configure(3,0,64,0,0,0,0,0,2a54000,0) at netbsd:cpu_configure+0x3e
[   1.0271147] main(0,0,0,0,0,0,0,0,0,0) at netbsd:main+0x32f
[   1.0271147] cpu0: End traceback...
[   1.0271147] fatal breakpoint trap in supervisor mode
[   1.0271147] trap type 1 code 0 eip 0xc0127eb4 cs 0x8 eflags 0x202 cr2 0 ilevel 0x8 esp 0xc194a64c
[   1.0271147] curlwp 0xc1664e80 pid 0 lid 0 lowest kstack 0xc19482c0
Stopped in pid 0.0 (system) at  netbsd:breakpoint+0x4:  popl    %ebp

>How-To-Repeat:
[ Note:  the following uses i386 because this was found while investigating a related panic reported in private by Eirik Øverby, who experienced a GUS-related panic on i386; it should apply the same to amd64.  Also, according to a quick grep, wss(4) and ym(4) might have the same problem. ]

1)
Build a kernel with GUS support at port 0x240 (to correspond with Qemu) and LOCKDEBUG

src$ cat > sys/arch/i386/conf/MINE << EOF
include "arch/i386/conf/GENERIC"
gus0    at isa? port 0x240 irq 7 drq 1 drq2 6   # Gravis Ultra Sound
options LOCKDEBUG
EOF
src$ ./build.sh -O ../obj.i386 -u -U -m i386 -j 16 tools kernel=MINE

2)
Set up a Qemu i386 VM:
$ qemu-img create -f qcow2 hdd.img 10G
$ qemu-system-i386 -boot d -cdrom NetBSD-10.99.10-i386.iso -m 64 -hda hdd.img
Do installation - once done, boot VM:
$ qemu-system-i386 -boot c -m 64 -hda hdd.img -device gus

3) Transfer kernel built in 1) over to VM, reboot VM, watch panic
>Fix:
Inspired by src/sys/dev/isa/sb_isa.c, which initializes the two mutexes sc_lock and sc_intr_lock in the hardware-dependent caller (there, in sb_isa_attach()):

--- src/sys/dev/isa/gus.c	2021-02-06 08:16:18.000000000 +0100
+++ src/sys/dev/isa/gus.c.mod	2023-10-10 00:38:29.522201667 +0200
@@ -826,6 +826,9 @@ gusattach(device_t parent, device_t self
 	sc->sc_lock = sc->sc_codec.sc_ad1848.sc_lock;
 	sc->sc_intr_lock = sc->sc_codec.sc_ad1848.sc_intr_lock;

+	mutex_init(&sc->sc_lock, MUTEX_DEFAULT, IPL_NONE);
+	mutex_init(&sc->sc_intr_lock, MUTEX_DEFAULT, IPL_AUDIO);
+
 	sc->sc_iot = iot = ia->ia_iot;
 	sc->sc_ic = ia->ia_ic;
 	iobase = ia->ia_io[0].ir_addr;


>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Harold Gutch <logix@foobar.franken.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/57649: Uninitialized lock in gus.c
Date: Tue, 10 Oct 2023 00:12:40 +0000

 I think this can't be right.

 sc->sc_lock and sc->sc_intr_lock appear to be totally unused.  We
 should just flush them, and delete the bogus

 	sc->sc_lock = sc->sc_codec.sc_ad1848.sc_lock;
 	sc->sc_intr_lock = sc->sc_codec.sc_ad1848.sc_intr_lock;

 initialization, which has never made sense.

 The real mutexes, returned by .get_locks = ad1848_get_locks, are
 already initialized by ad1848_init_locks when gusattach calls that
 early on.

 So why are they uninitialized?  Not sure!  Maybe something is trying
 to use sc->sc_mixer and sc->sc_codec (which are in a union) at the
 same time and things are stomping all over each other?  Time to printf
 all the things?
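
The union hypothesis above can be illustrated in isolation.  The following is a minimal userspace sketch under stated assumptions: the layout, the magic value, and every model_* name are hypothetical and are not taken from gus.c.  It shows only the general mechanism by which writing one union member can wipe state stored through the other; nothing here confirms that this is the cause of the panic.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a lock; the magic marks it as initialized. */
struct model_mutex {
	unsigned magic;
};
#define MODEL_MUTEX_MAGIC 0x6d757478u

/* Hypothetical codec and mixer state sharing storage through a union. */
struct model_codec_state {
	struct model_mutex lock;
	int regs[4];
};
struct model_mixer_state {
	unsigned char levels[32];
};

struct model_softc {
	union {
		struct model_codec_state sc_codec;
		struct model_mixer_state sc_mixer;
	} u;
};

/* Marks the codec's lock as initialized. */
static void
model_codec_init(struct model_softc *sc)
{
	sc->u.sc_codec.lock.magic = MODEL_MUTEX_MAGIC;
}

/* Clears the mixer state, which occupies the same bytes as the codec. */
static void
model_mixer_reset(struct model_softc *sc)
{
	memset(&sc->u.sc_mixer, 0, sizeof(sc->u.sc_mixer));
}

/* Returns nonzero if the codec's lock still carries the magic value. */
static int
model_lock_looks_initialized(const struct model_softc *sc)
{
	return sc->u.sc_codec.lock.magic == MODEL_MUTEX_MAGIC;
}
```

Calling model_codec_init() and then model_mixer_reset() on the same softc leaves the lock looking uninitialized, because both members start at the same address.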

From: Harold Gutch <logix@foobar.franken.de>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57649: Uninitialized lock in gus.c
Date: Tue, 10 Oct 2023 12:31:09 +0200

 On Tue, Oct 10, 2023 at 12:15:02AM +0000, Taylor R Campbell wrote:
 > The following reply was made to PR kern/57649; it has been noted by GNATS.
 > 
 > From: Taylor R Campbell <riastradh@NetBSD.org>
 > To: Harold Gutch <logix@foobar.franken.de>
 > Cc: gnats-bugs@NetBSD.org
 > Subject: Re: kern/57649: Uninitialized lock in gus.c
 > Date: Tue, 10 Oct 2023 00:12:40 +0000
 > 
 >  I think this can't be right.
 >  
 >  sc->sc_lock and sc->sc_intr_lock appear to be totally unused.  We
 >  should just flush them, and delete the bogus

 They are used down the line in audio.c.


 >  	sc->sc_lock = sc->sc_codec.sc_ad1848.sc_lock;
 >  	sc->sc_intr_lock = sc->sc_codec.sc_ad1848.sc_intr_lock;
 >  
 >  initialization, which has never made sense.
 >  
 >  The real mutexes, returned by .get_locks = ad1848_get_locks, are
 >  already initialized by ad1848_init_locks when gusattach calls that
 >  early on.

 Ah, indeed, I missed the call to ad1848_init_locks() just a few lines
 above...


 >  So why are they uninitialized?  Not sure!  Maybe something is trying
 >  to use sc->sc_mixer and sc->sc_codec (which are in a union) at the
 >  same time and things are stomping all over each other?  Time to printf
 >  all the things?

 They might not be uninitialized - what would be an easy way to check
 this?

 This might just be LOCKDEBUG messing up because it keeps track of
 locks via the variables used to address them - and the two variables
 sc->sc_lock and sc->sc_codec.sc_ad1848.sc_lock differ, even if their
 value is the same.  A GENERIC kernel with "gus?" on the right base
 port but without LOCKDEBUG works for me.  It is "only" LOCKDEBUG that
 then makes it panic in audio.c because that thinks sc->sc_lock is
 uninitialized.
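
The address-based tracking suggested above can be modeled in userspace.  The following is a minimal sketch, not the kernel's LOCKDEBUG implementation: struct model_mutex, model_mutex_init() and model_lock_is_tracked() are hypothetical names, and the registry is a plain array.  It assumes, as the proposed patch suggests by passing &sc->sc_lock, that the softc members are lock objects rather than pointers, so the assignment copies the contents to a new address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel mutex; not the real kmutex_t. */
struct model_mutex {
	int owner;
};

#define MAX_TRACKED 16

/* Addresses of locks that went through model_mutex_init(). */
static const struct model_mutex *tracked[MAX_TRACKED];
static size_t ntracked;

/* Initializes the lock and records its address in the registry. */
static void
model_mutex_init(struct model_mutex *m)
{
	m->owner = 0;
	assert(ntracked < MAX_TRACKED);
	tracked[ntracked++] = m;
}

/* True only if this exact address was registered by model_mutex_init(). */
static bool
model_lock_is_tracked(const struct model_mutex *m)
{
	for (size_t i = 0; i < ntracked; i++) {
		if (tracked[i] == m)
			return true;
	}
	return false;
}

/* Hypothetical softc with an embedded codec lock and a by-value copy. */
struct model_codec {
	struct model_mutex sc_lock;
};
struct model_softc {
	struct model_codec sc_codec;
	struct model_mutex sc_lock;
};
```

In this model, initializing sc.sc_codec.sc_lock and then assigning sc.sc_lock = sc.sc_codec.sc_lock yields two objects with identical contents, of which only the first is known to the registry.  A checker keyed on addresses would therefore report the copy as uninitialized, while code that does not consult the registry would see nothing wrong.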

 Is there a reasonable way of "copying" a lock like gus.c wants to do
 in the two lines you quoted?
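
One common alternative to copying a lock is to keep pointers to the single initialized object, so every user locks the same address.  The following is a userspace sketch under stated assumptions: all model_* names are hypothetical, and model_get_locks() is only loosely shaped after the get_locks callback mentioned earlier in this PR.  Whether this is the right change for gus.c is not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel mutex; not the real kmutex_t. */
struct model_mutex {
	int initialized;
};

struct model_codec {
	struct model_mutex sc_lock;
	struct model_mutex sc_intr_lock;
};

struct model_softc {
	struct model_codec sc_codec;
	/* Pointers alias the codec's locks instead of copying them. */
	struct model_mutex *sc_lock;
	struct model_mutex *sc_intr_lock;
};

/* Initializes the codec's two locks in place. */
static void
model_codec_init_locks(struct model_codec *c)
{
	c->sc_lock.initialized = 1;
	c->sc_intr_lock.initialized = 1;
}

/* Initializes the locks once, then records where they live. */
static void
model_attach(struct model_softc *sc)
{
	model_codec_init_locks(&sc->sc_codec);
	sc->sc_lock = &sc->sc_codec.sc_lock;
	sc->sc_intr_lock = &sc->sc_codec.sc_intr_lock;
}

/* Hands the same lock objects to a consumer through out-parameters. */
static void
model_get_locks(void *addr, struct model_mutex **intr,
    struct model_mutex **thread)
{
	struct model_softc *sc = addr;

	*intr = sc->sc_intr_lock;
	*thread = sc->sc_lock;
}
```

After model_attach(), the pointers returned by model_get_locks() compare equal to the addresses of the codec's own locks, so an address-based checker would see one object per lock rather than an untracked copy.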


 thanks,
   Harold

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.