NetBSD Problem Report #56413

From martin@aprisoft.de  Tue Sep 21 09:00:27 2021
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id A782C1A921F
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 21 Sep 2021 09:00:27 +0000 (UTC)
Message-Id: <20210921090018.C861E5CC803@emmas.aprisoft.de>
Date: Tue, 21 Sep 2021 11:00:18 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: bad mutex owner in soreceive
X-Send-Pr-Version: 3.95

>Number:         56413
>Category:       kern
>Synopsis:       bad mutex owner in soreceive
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Sep 21 09:05:00 +0000 2021
>Last-Modified:  Tue Jan 10 11:35:01 +0000 2023
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.88
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD gethsemane.aprisoft.de 9.99.88 NetBSD 9.99.88 (GETHSEMANE) #129: Tue Sep 14 11:00:39 CEST 2021 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/macppc/compile/GETHSEMANE macppc
Architecture: powerpc
Machine: macppc
>Description:

While scp'ing some files over to this macppc I got an assertion failure:

[ 120.1950992] Mutex error: mutex_vector_exit,753: assertion failed: MUTEX_OWNER(mtx->mtx_owner) == curthread

[ 120.2151045] lock address : 0x000000005fbcb080
[ 120.2251102] current cpu  :                  1
[ 120.2351143] current lwp  : 0x0000000010c8e400
[ 120.2451197] owner field  : 000000000000000000 wait/spin:                0/0

[ 120.2551264] panic: lock error: Mutex: mutex_vector_exit,753: assertion failed: MUTEX_OWNER(mtx->mtx_owner) == curthread: lock 0x5fbcb080 cpu 1 lwp 0x10c8e400
[ 120.2851394] cpu1: Begin traceback...
[ 120.2951449] 0x10e56d20: at vpanic+0x12c
[ 120.3051490] 0x10e56d50: at panic+0x50
[ 120.3151540] 0x10e56d90: at lockdebug_abort+0xe4
[ 120.3251596] 0x10e56db0: at mutex_spin_exit+0x104
[ 120.3351645] 0x10e56dc0: at soreceive+0x7a0
[ 120.3451697] 0x10e56e60: at dofileread+0x88
[ 120.3551755] 0x10e56eb0: at syscall+0x350
[ 120.3651792] 0x10e56f20: user SC trap #3 by 0xfd0d09a8: srr1=0xd032
[ 120.3751842]             r1=0xffffa3e0 cr=0x48000442 xer=0x20000000 ctr=0xfd0d09a0
[ 120.3851899] cpu1: End traceback...
[ 120.3951946] Failed to pause: cpu0
Stopped in pid 1082.1082 (sshd) at      netbsd:vpanic+0x130:    or      r3, r26, r26
db{1}> show lock 0x000000005fbcb080
Sorry, kernel not built with the LOCKDEBUG option.


>How-To-Repeat:
See above (not reliably reproducible)

>Fix:
n/a

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/56413: bad mutex owner in soreceive
Date: Tue, 10 Jan 2023 12:30:12 +0100

 I have filed two other reports that probably show the same error:
 PR port-macppc/56912 and PR port-powerpc/56941. It probably makes
 more sense to track all of this in a single PR, so I will close the
 other two.

 Today's variant of the issue happened while extracting *.tgz sets from
 an NFS share:

 [ 2566.1971149] Mutex error: mutex_vector_exit,752: assertion failed: MUTEX_OWNER(mtx->mtx_owner) == curthread

 [ 2566.2371317] lock address : 5fbcb080
 [ 2566.2471366] current cpu  :                  1
 [ 2566.2471366] current lwp  : 0x00000000144efd00
 [ 2566.2571418] owner field  : 0x000000005fb28340 wait/spin:                0/0

 [ 2566.2771540] panic: lock error: Mutex: mutex_vector_exit,752: assertion failed: MUTEX_OWNER(mtx->mtx_owner) == curthread: lock 0x5fbcb080 cpu 1 lwp 0x144efd00
 [ 2566.3071678] cpu1: Begin traceback...
 [ 2566.3071678] 0x1495ca90: at vpanic+0x158
 [ 2566.3171716] 0x1495cac0: at panic+0x50
 [ 2566.3271774] 0x1495cb00: at lockdebug_abort+0xf8
 [ 2566.3371820] 0x1495cba0: at mutex_spin_exit+0x104
 [ 2566.3471872] 0x1495cbb0: at soreceive+0x7dc
 [ 2566.3571917] 0x1495cc80: at nfs_request+0xea0
 [ 2566.3671976] 0x1495cdc0: at nfs_readrpc+0x1e4
 [ 2566.3772035] 0x1495ce30: at nfs_doio+0x7ec
 [ 2566.3872074] 0x1495cee0: at nfssvc_iod+0x2ac
 [ 2566.3972118] 0x1495cf20: at cpu_lwp_bootstrap+0xc
 [ 2566.4072175] 0x1495cfe8: at 0xfffffffc
 [ 2566.4172225] cpu1: End traceback...
 [ 2566.4272269] Failed to pause: cpu0
 Stopped in pid 0.549 (system) at        netbsd:vpanic+0x15c:    or      r3, r26, r26
 db{1}> ps/a
 PID   COMMAND               STRUCT PROC *            UAREA *     VMSPACE/VM_MAP
 1341  tar                        11d48740           12071000           10d16ab0
 1465  sh                         10466880           10714000           10b059c8
 1640  tcsh                       111a8180           10fa0000           10473720
 1078  login                      1145f780           12404000           10b05c68
 1077  getty                      10fbec40           10f10000           10d16110
 1067  getty                      111a8c80           11498000           10b05728
 1254  getty                      111a8700           10fb4000           10473d40
 1056  getty                      10466300           104c4000           104733a0
 1088  cron                       111a8440           10fa4000           10473e20
 1098  inetd                      10fbe6c0           10e91000           10d16b90
 1127  sshd                       1145f4c0           12340000           10b05e28
 1125  upsmon                     11d48a00           12152000           10d16570
 976   upsmon                     11d48cc0           12163000           10b05808
 942   upsd                       11d481c0           11c40000           10b05488
 929   usbhid-ups                 10fbe980           10ec0000           10d162d0
 973   powerd                     111a89c0           11030000           10b05648
 863   ntpd                       11d48480           11c50000           10b05568
 587   syslogd                    10ac93c0           10ca4000           10473aa0
 307   dhcpcd                     10fbe140           10d40000           10b05d48
 303   dhcpcd                     10ac9c00           10d30000           10b05b88
 305   dhcpcd                     10ac9940           10cc1000           10b05aa8
 304   dhcpcd                     10ac9680           10cb2000           10b052c8
 206   wdogctl                    10825340           1085a000           104738e0
 1     init                       10466040           1045c000           10473020
 0     system                       c232c0           14ac8000             c54a90

 db{1}> mach cpu
 CPU CPUID STATE CPUINFO  CPL INT MTX IPIS
   0 0x000 R-P-- 0xbe56c0   7   1   0 0x00000000
   1 0x001 ----- 0xbe6140   7  -1   0 0x00000000
 db{1}> mach cpu 0
 Stopped in pid 0.3 (system) at  netbsd:cpu_pause+0xa4:  b       netbsd:cpu_pause+0x74
 db{0}> bt
 0x10007a30: at ipi_intr+0xa4
 0x10007a50: at intr_deliver+0x98
 0x10007a90: at pic_handle_intr+0x108
 0x10007af0: at trapstart+0x6b0
 0x10007bc0: at pics+0x1c
 0x10007c00: at intr_deliver+0xc8
 0x10007c40: at pic_handle_intr+0x108
 0x10007ca0: at trapstart+0x6b0
 0x10007d70: at sysctl_net_inet_ip_forwsrcrt+0x114
 0x10007db0: at ipintr+0x68
 0x10007ea0: at softint_thread+0x1a4
 0x10007f20: at cpu_lwp_bootstrap+0xc
 0x10007fe8: at 0x932fafc

 Martin

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.