NetBSD Problem Report #57481

From www@netbsd.org  Thu Jun 22 17:01:53 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B9BF51A923E
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Jun 2023 17:01:53 +0000 (UTC)
Message-Id: <20230622170121.989C11A9241@mollari.NetBSD.org>
Date: Thu, 22 Jun 2023 17:01:21 +0000 (UTC)
From: brandon@burn.net
Reply-To: brandon@burn.net
To: gnats-bugs@NetBSD.org
Subject: Sparcstation 10 + cgsix gfx results in panic on shutdown
X-Send-Pr-Version: www-1.0

>Number:         57481
>Category:       port-sparc
>Synopsis:       Sparcstation 10 + cgsix gfx results in panic on shutdown
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-sparc-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jun 22 17:05:00 +0000 2023
>Closed-Date:    Tue Jun 27 06:33:20 +0000 2023
>Last-Modified:  Tue Jun 27 06:33:20 +0000 2023
>Originator:     Brandon Applegate
>Release:        9.3
>Organization:
>Environment:
NetBSD ss10.internal.burn.net 9.3 NetBSD 9.3 (GENERIC.MP) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/sparc/compile/GENERIC.MP sparc
>Description:
Here is my mailing list thread on this:

https://mail-index.netbsd.org/port-sparc/2023/05/20/msg002577.html

SS10, TGX (cgsix) FB.  If I run in console only (i.e. never start X) - shutdowns go clean.

If I run X at all, I get a panic on shutdown, whether I shut down from inside X or drop back to the console first.

I also have an SS20 running 9.3; however, that machine has SX graphics (8 MB VSIMM).  I never have this issue over there.

As suggested in the mailing list thread - I also tried a nightly 10 kernel + modules.  Unfortunately same behavior.

I've had this issue on 2 separate SS10 machines with different video cards as well (both cgsix / TGX, just 2 physically different cards).

Apologies in advance, I've never submitted a PR involving a panic.  If I'm missing pertinent info please let me know.

I'm including the log lines just before the panic as well to illustrate the timing (i.e. just as shutdown is proceeding, umounts, etc).  Here are 2 example panics I managed to catch:

May 19 00:27:50 ss10 upsmon[364]: Signal 15: exiting
May 19 00:27:50 ss10 upsmon[317]: upsmon parent: read
May 19 00:28:04 ss10 syslogd[163]: Exiting on signal 15
May 19 00:30:29 ss10 syslogd[207]: restart
May 19 00:30:29 ss10 /netbsd: [ 2692.9486587] syncing disks... done
May 19 00:30:29 ss10 /netbsd: [ 2693.0107884] unmounting file systems...
May 19 00:30:29 ss10 /netbsd: [ 2693.0287015] unmounted /dev/sd1a on /mnt/sd1a type ffs
May 19 00:30:29 ss10 /netbsd: [ 2693.0386959] unmounted procfs on /proc type procfs
May 19 00:30:29 ss10 /netbsd: [ 2693.0486897] unmounted ptyfs on /dev/pts type ptyfs
May 19 00:30:29 ss10 /netbsd: [ 2693.0586951] unmounted kernfs on /kern type kernfs
May 19 00:30:29 ss10 /netbsd: [ 2693.6187362] Skipping crash dump on recursive panic
May 19 00:30:29 ss10 /netbsd: [ 2693.6379128] panic: pmap_remove_all: empty vreg
May 19 00:30:29 ss10 /netbsd: [ 2693.6586471] cpu0: Begin traceback...
May 19 00:30:29 ss10 /netbsd: [ 2693.6588026] 0x0(0xf03ba350, 0xf7f755f8, 0xf0484800, 0xf0485800, 0xf04856c8, 0x4) at netbsd:panic+0x20
May 19 00:30:29 ss10 /netbsd: [ 2693.7087483] panic(0xf03ba350, 0x0, 0x4, 0xf045f07c, 0xf0fb6700, 0x0) at netbsd:pmap_page_protect4m+0x320
May 19 00:30:29 ss10 /netbsd: [ 2693.7587541] pmap_page_protect4m(0xf057897c, 0x0, 0x2c, 0xf0a78bc0, 0x7fffffff, 0xfffff000) at netbsd:genfs_do_putpages+0x820
May 19 00:30:29 ss10 /netbsd: [ 2693.7987515] genfs_do_putpages(0x201a, 0x0, 0x0, 0x1fffff, 0xf0578940, 0x1) at netbsd:genfs_putpages+0x24
May 19 00:30:29 ss10 /netbsd: [ 2693.8487617] genfs_putpages(0xf7f758e0, 0x0, 0xf0fb6700, 0x0, 0x0, 0x0) at netbsd:VOP_PUTPAGES+0x38
May 19 00:30:29 ss10 /netbsd: [ 2693.8887526] VOP_PUTPAGES(0xf0d1dcb8, 0x0, 0x0, 0x0, 0x0, 0x201b) at netbsd:vinvalbuf+0x34
May 19 00:30:29 ss10 /netbsd: [ 2693.9187557] vinvalbuf(0xf0d1dcb8, 0x1, 0xffffffff, 0xf0fb6700, 0x0, 0x0) at netbsd:vcache_reclaim+0x74
May 19 00:30:29 ss10 /netbsd: [ 2693.9687754] vcache_reclaim(0xf0d1dcb8, 0x1, 0x8, 0xf09ee000, 0xf7f759d8, 0xf0fb6700) at netbsd:vrecycle+0xdc
May 19 00:30:29 ss10 /netbsd: [ 2694.0087733] vrecycle(0xf0d1dcb8, 0x0, 0x0, 0x0, 0x0, 0x3) at netbsd:vflush+0x164
May 19 00:30:29 ss10 /netbsd: [ 2694.0387859] vflush(0xf09ee000, 0x0, 0x2, 0xf0480d80, 0x41b8f, 0xf0d1dcb8) at netbsd:ffs_flushfiles+0x11c
May 19 00:30:29 ss10 /netbsd: [ 2694.0787818] ffs_flushfiles(0xf09ee000, 0x2, 0xf0fb6700, 0xf0468c00, 0xf08ed880, 0x0) at netbsd:ffs_unmount+0x48
May 19 00:30:29 ss10 /netbsd: [ 2694.1187842] ffs_unmount(0xf09ee000, 0x80000, 0xf0fb6700, 0xf09e8800, 0xf08ed880, 0xf09ee000) at netbsd:VFS_UNMOUNT+0x10
May 19 00:30:29 ss10 /netbsd: [ 2694.1587881] VFS_UNMOUNT(0xf09ee000, 0x80000, 0xf0465400, 0x0, 0xf0e2c7a0, 0x0) at netbsd:dounmount+0x8c
May 19 00:30:29 ss10 /netbsd: [ 2694.1987930] dounmount(0xf09ee000, 0x80000, 0xf0fb6700, 0x0, 0x740, 0x5000) at netbsd:vfs_unmountall1+0x58
May 19 00:30:29 ss10 /netbsd: [ 2694.2487947] vfs_unmountall1(0xf0fb6700, 0xf03eda80, 0x1, 0x80000, 0x0, 0xf09ee000) at netbsd:cpu_reboot+0x17c
May 19 00:30:29 ss10 /netbsd: [ 2694.2988000] cpu_reboot(0x808, 0x0, 0x0, 0x0, 0xf0480c00, 0x0) at netbsd:sys_reboot+0x44
May 19 00:30:29 ss10 /netbsd: [ 2694.3388084] sys_reboot(0x0, 0xf7f75f30, 0xf7f75f28, 0x808, 0x0, 0x14752c) at netbsd:syscall+0xe8
May 19 00:30:29 ss10 /netbsd: [ 2694.3888066] syscall(0xcd0, 0xf7f75fb0, 0xedcd4398, 0xd0, 0x4e, 0xf0fb6700) at netbsd:memfault_sun4m+0x3f4
May 19 00:30:29 ss10 /netbsd: [ 2694.4274910] cpu0: End traceback...
May 19 00:30:29 ss10 /netbsd: [ 2694.4375343] rebooting

Jun 21 18:35:41 ss10 /netbsd: [ 2998.6503038] syncing disks... done
Jun 21 18:35:41 ss10 /netbsd: [ 2998.7403108] unmounting file systems...
Jun 21 18:35:41 ss10 /netbsd: [ 2998.8403554] unmounted /dev/sd1a on /mnt/sd1a type ffs
Jun 21 18:35:41 ss10 /netbsd: [ 2998.8903237] unmounted procfs on /proc type procfs
Jun 21 18:35:41 ss10 /netbsd: [ 2998.9403850] unmounted ptyfs on /dev/pts type ptyfs
Jun 21 18:35:41 ss10 /netbsd: [ 2998.9803281] unmounted kernfs on /kern type kernfs
Jun 21 18:35:41 ss10 /netbsd: [ 3000.8304806] Skipping crash dump on recursive panic
Jun 21 18:35:41 ss10 /netbsd: [ 3000.8501879] panic: pmap_remove_all: empty vreg
Jun 21 18:35:41 ss10 /netbsd: [ 3000.8668687] cpu0: Begin traceback...
Jun 21 18:35:41 ss10 /netbsd: [ 3000.8730953] 0x0(0xf03c20e8, 0xf9ce4600, 0xf048d000, 0xf048dc00, 0xf048dcf8, 0x4) at netbsd:panic+0x20
Jun 21 18:35:41 ss10 /netbsd: [ 3000.9104919] panic(0xf03c20e8, 0xf048dc00, 0x4, 0xf0002000, 0x138, 0x0) at netbsd:pmap_page_protect4m+0x2c8
Jun 21 18:35:41 ss10 /netbsd: [ 3000.9404788] pmap_page_protect4m(0xf09a8570, 0xf09a85ac, 0xf09f785c, 0x1b, 0xf16e3c80, 0xf0c29b80) at netbsd:genfs_do_putpages+0x820
Jun 21 18:35:41 ss10 /netbsd: [ 3000.9804845] genfs_do_putpages(0x201a, 0x0, 0x0, 0x1fffff, 0xf09a8570, 0x1) at netbsd:genfs_putpages+0x24
Jun 21 18:35:41 ss10 /netbsd: [ 3001.0205026] genfs_putpages(0xf9ce48e0, 0x0, 0xf0e35960, 0x0, 0x0, 0x0) at netbsd:VOP_PUTPAGES+0x48
Jun 21 18:35:41 ss10 /netbsd: [ 3001.0504801] VOP_PUTPAGES(0xf17cff10, 0x0, 0x0, 0x0, 0x0, 0x201b) at netbsd:vinvalbuf+0x34
Jun 21 18:35:41 ss10 /netbsd: [ 3001.0707704] vinvalbuf(0xf17cff10, 0x1, 0xffffffff, 0xf0e35960, 0x0, 0x0) at netbsd:vcache_reclaim+0x74
Jun 21 18:35:41 ss10 /netbsd: [ 3001.1205162] vcache_reclaim(0xf17cff10, 0x1, 0x8, 0xf0c3a000, 0xf9ce49d8, 0xf0e35960) at netbsd:vrecycle+0xdc
Jun 21 18:35:41 ss10 /netbsd: [ 3001.1406985] vrecycle(0xf17cff10, 0x0, 0x0, 0x0, 0x0, 0x3) at netbsd:vflush+0x164
Jun 21 18:35:41 ss10 /netbsd: [ 3001.1613034] vflush(0xf0c3a000, 0x0, 0x2, 0xf0489380, 0x49390, 0xf17cff10) at netbsd:ffs_flushfiles+0x11c
Jun 21 18:35:41 ss10 /netbsd: [ 3001.2005025] ffs_flushfiles(0xf0c3a000, 0x2, 0xf0e35960, 0xf048dc00, 0xf0e2e100, 0x0) at netbsd:ffs_unmount+0x48
Jun 21 18:35:41 ss10 /netbsd: [ 3001.2405376] ffs_unmount(0xf0c3a000, 0x80000, 0xf0e35960, 0xf0e31000, 0xf0e2e100, 0xf0c3a000) at netbsd:VFS_UNMOUNT+0x18
Jun 21 18:35:41 ss10 /netbsd: [ 3001.2705291] VFS_UNMOUNT(0xf0c3a000, 0x80000, 0xf046d800, 0x0, 0xf0fd99d8, 0xf0c29bc0) at netbsd:dounmount+0x8c
Jun 21 18:35:41 ss10 /netbsd: [ 3001.3004866] dounmount(0xf0c3a000, 0x80000, 0xf0e35960, 0x0, 0x740, 0x5000) at netbsd:vfs_unmountall1+0x58
Jun 21 18:35:41 ss10 /netbsd: [ 3001.3405162] vfs_unmountall1(0xf0e35960, 0xf03f58d8, 0x1, 0x80000, 0x0, 0xf0c3a000) at netbsd:cpu_reboot+0x18c
Jun 21 18:35:41 ss10 /netbsd: [ 3001.3805404] cpu_reboot(0x0, 0x0, 0xf0e35960, 0xf05c2000, 0xf0489000, 0xf046ee80) at netbsd:sys_reboot+0x54
Jun 21 18:35:41 ss10 /netbsd: [ 3001.4104998] sys_reboot(0x0, 0xf9ce4f30, 0xf9ce4f28, 0x0, 0x0, 0x0) at netbsd:syscall+0xe8
Jun 21 18:35:41 ss10 /netbsd: [ 3001.4405083] syscall(0xcd0, 0xf9ce4fb0, 0xedc74398, 0xd0, 0x4e, 0xf0e35960) at

(That most recent one got truncated by the system...)
>How-To-Repeat:
Start X on SS10 running 9.3.  Run for some time, issue "shutdown -h/-p now".  This can be done from inside X or from console.
>Fix:

>Release-Note:

>Audit-Trail:
From: Brandon Applegate <brandon@burn.net>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/57481: Sparcstation 10 + cgsix gfx results in panic on
 shutdown
Date: Mon, 26 Jun 2023 23:43:43 -0400

 I’m a bit embarrassed to say I think I’ve figured this out and it seems to be a hardware issue (mix of DIMM sizes (or possibly bad DIMMs ?)).  Ever since reconfiguring the DIMM installation to be homogeneous I haven’t had a panic (knock on wood).

 We can close this PR - sorry for the noise.

State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Tue, 27 Jun 2023 06:33:20 +0000
State-Changed-Why:
Probably a HW issue, closing on submitter's request.


>Unformatted:
