NetBSD Problem Report #57350

From www@netbsd.org  Thu Apr 13 15:35:05 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 9EC7F1A9239
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 13 Apr 2023 15:35:05 +0000 (UTC)
Message-Id: <20230413153503.DA4371A923A@mollari.NetBSD.org>
Date: Thu, 13 Apr 2023 15:35:03 +0000 (UTC)
From: palle@lyckegaard.dk
Reply-To: palle@lyckegaard.dk
To: gnats-bugs@NetBSD.org
Subject: Panic during shutdown on sun fire v445 system
X-Send-Pr-Version: www-1.0

>Number:         57350
>Category:       port-sparc64
>Synopsis:       Panic during shutdown on sun fire v445 system
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 13 15:40:00 +0000 2023
>Last-Modified:  Mon Apr 17 19:25:01 +0000 2023
>Originator:     Palle Lyckegaard
>Release:        NetBSD 10.99.2 (GENERIC.DEBUG) #5: Wed Apr 12 16:27:00 UTC 2023
>Organization:
NetBSD
>Environment:
NetBSD 10.99.2 (GENERIC.DEBUG) #5: Wed Apr 12 16:27:00 UTC 2023
>Description:
During shutdown of a 4-CPU Sun Fire V445 system, the following panic occurs:


...

v445# 
v445# halt
Apr 12 18:51:42 v445 halt: halted by root
Apr 12 18:51:42 v445 syslogd[846]: Exiting on signal 15
[ 116.6049802] syncing disks... done
[ 116.6849767] unmounting 0x1095f3000 /var/shm (tmpfs)...
[ 116.7549736] unmounting 0x108abd000 /proc (procfs)...
[ 116.8149715] unmounting 0x108a5f000 /dev/pts (ptyfs)...
[ 116.8749683] unmounting 0x1095f2000 /kern (kernfs)...
[ 116.9349658] unmounting 0x108a5e000 /tmp (tmpfs)...
[ 116.9849636] unmounting 0x108956000 / (/dev/sd1a)...
[ 117.7449313] unmounting 0x108956000 / (/dev/sd1a)...
[ 117.8049279] uhub4: detached
[ 117.8349265] cd0: detached
[ 117.8753897] atapibus0: detached
[ 117.9049235] uhub3: detached
[ 117.9449217] uhub2: detached
[ 117.9749205] uhub1: detached
[ 118.0049191] uhub0: detached
[ 118.0449174] pci14: detached
[ 118.0749161] atabus1: detached
[ 118.1149143] atabus0: detached
[ 118.1449132] usb3: detached
[ 118.1749117] usb2: detached
[ 118.2149100] usb1: detached
[ 118.2449086] usb0: detached
[ 118.2749073] pci4: detached
[ 118.3049060] sd4: detached
[ 118.3349047] dk3 at sd3 (61794d71-0f8b-2764-dfe4-8bc137a269b1) deleted
[ 118.4149014] dk2 at sd3 (3f5c5123-7966-3346-b2bf-d1774016c2bc) deleted
[ 118.4972738] sd3: detached
[ 118.5285251] dk1 at sd2 (4113af4f-dcd1-7760-f006-c2821bb8a1d8) deleted
[ 118.6056071] dk0 at sd2 (zfs) deleted
[ 118.6448915] sd2: detached
[ 118.6748901] sd0: detached
[ 118.7048887] ppb12: detached
[ 118.7548865] brgphy3: detached
[ 118.7948894] bge3: detached
[ 118.8348885] brgphy2: detached
[ 118.8748815] bge2: detached
[ 118.9048802] Skipping crash dump on recursive panic
[ 118.9048802] panic: mutex_vector_enter,511: uninitialized lock (lock=0x108643018, from=01645508)
[ 118.9048802] cpu0: Begin traceback...
[ 118.9048802] cpu0: End traceback...
[ 119.1579070] Mutex error: mutex_vector_enter,511: assertion failed: !cpu_intr_p()

[ 119.1579070] lock address : netbsd:xc_high_pri
[ 119.1579070] type         : sleep/adaptive
[ 119.1579070] initialized  : netbsd:xc_init_cpu+0xf4
[ 119.1579070] shared holds :                  0 exclusive:                  0
[ 119.1579070] shares wanted:                  0 exclusive:                  0
[ 119.1579070] relevant cpu :                  0 last held:                  2
[ 119.1579070] relevant lwp : 0x0000000106ea5140 last held: 000000000000000000
[ 119.1579070] last locked  : netbsd:xc_wait+0x9c
[ 119.1579070] unlocked*    : netbsd:cv_enter+0x13c
[ 119.1579070] owner field  : 000000000000000000 wait/spin:                0/0
[ 119.1579070] Turnstile: no active turnstile for this lock.

[ 119.9980289] Skipping crash dump on recursive panic
[ 120.0574031] panic: LOCKDEBUG: Mutex error: mutex_vector_enter,511: assertion failed: !cpu_intr_p()
[ 120.1646952] cpu0: Begin traceback...
[ 120.2074041] cpu0: End traceback...
[ 130.2043880] cpu0[6 softser/0]: hogging kernel lock
[ 130.2597738] cpu0[6 softser/0]: hogging kernel lock
[ 130.2597738] cpu0[6 softser/0]: hogging kernel lock
[ 130.3743583] cpu0[6 softser/0]: hogging kernel lock
[ 130.3743583] cpu0[6 softser/0]: hogging kernel lock
[ 130.3743583] cpu0[6 softser/0]: hogging kernel lock
[ 130.5462354] cpu0[6 softser/0]: hogging kernel lock
[ 130.6035291] cpu0[6 softser/0]: hogging kernel lock
[ 130.6608210] cpu0[6 softser/0]: hogging kernel lock
[ 130.6608210] cpu0[6 softser/0]: hogging kernel lock
[ 130.6608210] cpu0[6 softser/0]: hogging kernel lock
[ 130.8326987] cpu0[6 softser/0]: hogging kernel lock
[ 130.8326987] cpu0[6 softser/0]: hogging kernel lock
[ 130.9472843] cpu0[6 softser/0]: hogging kernel lock
[ 131.0045776] cpu0[6 softser/0]: hogging kernel lock
[ 131.0618701] cpu0[6 softser/0]: hogging kernel lock
[ 131.1191624] cpu0[6 softser/0]: hogging kernel lock
[ 131.1191624] cpu0[6 softser/0]: hogging kernel lock
[ 131.2337479] cpu0[6 softser/0]: hogging kernel lock
[ 131.2910415] cpu0[6 softser/0]: hogging kernel lock
[ 131.2910415] cpu0[6 softser/0]: hogging kernel lock
[ 131.4056265] cpu0[6 softser/0]: hogging kernel lock
[ 131.4056265] cpu0[6 softser/0]: hogging kernel lock
[ 131.5202120] cpu0[6 softser/0]: hogging kernel lock
[ 131.5775045] cpu0[6 softser/0]: hogging kernel lock
[ 131.5775045] cpu0[6 softser/0]: hogging kernel lock
[ 131.6920900] cpu0[6 softser/0]: hogging kernel lock
[ 
>How-To-Repeat:
Problem seems to be present in NetBSD-9 as well.

Using recent sources, the issue happens on every shutdown of the system, causing the root fs to be fsck'ed on the next boot.
>Fix:
N/A

>Audit-Trail:
From: Palle Lyckegaard <palle@lyckegaard.dk>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/57350: Panic during shutdown on sun fire v445
 system
Date: Thu, 13 Apr 2023 19:13:43 +0000 (UTC)

 Looks like PR 51133 is very similar

 On Thu, 13 Apr 2023, gnats-admin@netbsd.org wrote:

 > Date: Thu, 13 Apr 2023 15:40:00 +0000 (UTC)
 > From: gnats-admin@netbsd.org
 > Reply-To: gnats-bugs@netbsd.org
 > To: palle@lyckegaard.dk
 > Subject: Re: port-sparc64/57350: Panic during shutdown on sun fire v445 system
 > 
 > Thank you very much for your problem report.
 > It has the internal identification `port-sparc64/57350'.
 > The individual assigned to look at your
 > report is: port-sparc64-maintainer.
 >
 >> Category:       port-sparc64
 >> Responsible:    port-sparc64-maintainer
 >> Synopsis:       Panic during shutdown on sun fire v445 system
 >> Arrival-Date:   Thu Apr 13 15:40:00 +0000 2023
 >
 >

From: Taylor R Campbell <riastradh@NetBSD.org>
To: palle@lyckegaard.dk
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-sparc64/57350: Panic during shutdown on sun fire v445 system
Date: Thu, 13 Apr 2023 20:14:37 +0000

 > [ 118.9048802] panic: mutex_vector_enter,511: uninitialized lock (lock=0x108643018, from=01645508)

 Do you have netbsd.gdb?  Can you get `list *(0x01645508)' output from
 it in gdb?  Is 0x01645508 a reasonable kernel text address on sparc64?

 > [ 118.9048802] cpu0: Begin traceback...
 > [ 118.9048802] cpu0: End traceback...

 Seems bad.  Do stack traces work on this machine normally?  Can you
 test with `sysctl -w debug.crashme_enable=1 debug.crashme.panic=1'?
 (This will deliberately panic the system.)

 > [ 119.1579070] Mutex error: mutex_vector_enter,511: assertion failed: !cpu_intr_p()
 > [ 119.1579070] lock address : netbsd:xc_high_pri

 Looks like something is trying to issue a cross-call from interrupt
 context.  I wonder what could be doing that?  Too bad we don't have a
 stack trace...

From: Palle Lyckegaard <palle@lyckegaard.dk>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/57350: Panic during shutdown on sun fire v445
 system
Date: Mon, 17 Apr 2023 19:17:16 +0000 (UTC)

 And if I disable 'ehci' in the kernel configuration, the shutdown is OK.
 (Without ehci disabled, the crash occurs right after bge2 is detached.)

 ...
 v445# halt
 Apr 16 19:34:21 v445 halt: halted by root
 Apr 16 19:34:21 [ 4177.0857584] syncing disks... done
 [ 4177.4057446] unmounting 0x109604000 /var/shm (tmpfs)...
 [ 4177.4657419] unmounting 0x108ad3000 /proc (procfs)...
 [ 4177.5357391] unmounting 0x108aa8000 /dev/pts (ptyfs)...
 [ 4177.5957362] unmounting 0x108aa9000 /kern (kernfs)...
 [ 4177.6557337] unmounting 0x108ad2000 /tmp (tmpfs)...
 [ 4177.7157311] unmounting 0x108947000 / (/dev/sd1a)...
 [ 4182.4455256] unmounting 0x108947000 / (/dev/sd1a)...
 [ 4182.4955224] uhub3: detached
 [ 4182.5355208] cd0: detached
 [ 4182.5655195] atapibus0: detached
 [ 4182.6255182] uhub2: detached
 [ 4182.6555155] uhub1: detached
 [ 4182.6855142] uhub0: detached
 [ 4182.7281882] pci14: detached
 [ 4182.7555128] atabus1: detached
 [ 4182.7955094] atabus0: detached
 [ 4182.8355078] usb2: detached
 [ 4182.8688109] usb1: detached
 [ 4182.9021442] usb0: detached
 [ 4182.9355033] pci4: detached
 [ 4182.9655020] sd4: detached
 [ 4182.9955024] dk3 at sd3 (61794d71-0f8b-2764-dfe4-8bc137a269b1) deleted
 [ 4183.0754971] dk2 at sd3 (3f5c5123-7966-3346-b2bf-d1774016c2bc) deleted
 [ 4183.1554938] sd3: detached
 [ 4183.1896511] dk1 at sd2 (4113af4f-dcd1-7760-f006-c2821bb8a1d8) deleted
 [ 4183.2677784] dk0 at sd2 (zfs) deleted
 [ 4183.3054873] sd2: detached
 [ 4183.3354859] sd0: detached
 [ 4183.3754847] ppb12: detached
 [ 4183.4154826] brgphy3: detached
 [ 4183.4654812] bge3: detached
 [ 4183.5054787] brgphy2: detached
 [ 4183.5456978] bge2: detached
 [ 4183.5835129] ohci2: detached
 [ 4183.6178874] ohci1: detached
 [ 4183.6522631] ohci0: detached
 [ 4183.6866375] ppb3: detached
 [ 4183.7254729] brgphy1: detached
 [ 4183.7686184] bge1: detached
 [ 4183.8354722] brgphy0: detached
 [ 4183.8754637] bge0: detached
 [ 4183.9054611] pci13: detached
 [ 4183.9454593] pci3: detached
 [ 4183.9754584] ppb11: detached
 [ 4184.0154563] ppb2: detached
 [ 4184.0454550] pci16: detached
 [ 4184.0831053] pci15: detached
 [ 4184.1174783] pci12: detached
 [ 4184.1454519] pci7: detached
 [ 4184.1754493] pci6: detached
 [ 4184.2154477] pci2: detached
 [ 4184.2454476] ppb14: detached
 [ 4184.2862322] ppb13: detached
 [ 4184.3206071] ppb10: detached
 [ 4184.3554422] ppb6: detached
 [ 4184.3854403] ppb5: detached
 [ 4184.4154403] ppb1: detached
 [ 4184.4554371] unmounting 0x108947000 / (/dev/sd1a)...
 [ 4184.5154361] forcefully unmounting / (/dev/sd1a)...
 [ 4184.6054311] turning off swap...
 [ 4184.6354293] turning off swap on /dev/sd1b... done
 [ 4184.7054270] sd1: detached
 [ 4184.7393674] scsibus0: detached
 [ 4184.7754232] cpu0: shutting down
 [ 4184.7754232] cpu3: shutting down
 [ 4184.7754232] cpu2: shutting down
 [ 4184.7754232] cpu1: halted

 Program terminated
 {0} ok
 ...



