NetBSD Problem Report #56087
From he@smistad.uninett.no Thu Apr 1 08:56:02 2021
Return-Path: <he@smistad.uninett.no>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 1BE6C1A9217
for <gnats-bugs@gnats.NetBSD.org>; Thu, 1 Apr 2021 08:56:02 +0000 (UTC)
Message-Id: <20210401085556.940CE43FC5B@smistad.uninett.no>
Date: Thu, 1 Apr 2021 10:55:56 +0200 (CEST)
From: he@NetBSD.org
Reply-To: he@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: dual-CPU macppc panic: pr_phinpage_check: [pmap_upvopl] ...
X-Send-Pr-Version: 3.95
>Number: 56087
>Category: kern
>Synopsis: dual-CPU macppc panic: pr_phinpage_check: [pmap_upvopl] ...
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Apr 01 09:00:01 +0000 2021
>Last-Modified: Thu Apr 01 13:10:01 +0000 2021
>Originator: he@NetBSD.org
>Release: NetBSD 9.99.81
>Organization:
I try...
>Environment:
System: NetBSD bramley.urc.uninett.no 9.99.81 NetBSD 9.99.81 (GENERIC.MP) #0: Tue Mar 30 19:45:04 UTC 2021 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/macppc/compile/GENERIC.MP macppc
Architecture: powerpc
Machine: macppc
>Description:
I recently had to re-install this "mirror drive doors" G4
powermac because I got errors writing to the (old, re-used)
drive.
While extracting the new src.tar.gz, and both running "systat
-w 5 vm" and just after having started "top", I see out of the
corner of my eye the console screen go black, and once it's
back up again, I see this backtrace in the message buffer:
[ 4454.5805633] panic: pr_phinpage_check: [pmap_upvopl] item 0x4c87ce0 poolid 36 != 1
[ 4454.5805633] cpu0: Begin traceback...
[ 4454.5805633] 0x37707a80: at vpanic+0x12c
[ 4454.5805633] 0x37707ab0: at panic+0x50
[ 4454.5805633] 0x37707af0: at pool_put+0x580
[ 4454.5805633] 0x37707b40: at pmap_pvo_free_list.isra.0+0x6c
[ 4454.5805633] 0x37707b60: at pmap_remove+0x10c
[ 4454.5805633] 0x37707b90: at uvm_pagermapout+0x24
[ 4454.5805633] 0x37707bc0: at genfs_getpages+0x12d4
[ 4454.5805633] 0x37707cd0: at VOP_GETPAGES+0x6c
[ 4454.5805633] 0x37707d10: at ufs_balloc_range+0x184
[ 4454.5805633] 0x37707d80: at ffs_write+0x67c
[ 4454.5805633] 0x37707e10: at VOP_WRITE+0x50
[ 4454.5805633] 0x37707e40: at vn_write+0x140
[ 4454.5805633] 0x37707e70: at dofilewrite+0x8c
[ 4454.5805633] 0x37707ec0: at syscall+0x2a4
[ 4454.5805633] 0x37707f20: user SC trap #4 by 0xfdd2451c: srr1=0xd032
[ 4454.5805633] r1=0xffffe4e0 cr=0x44444484 xer=0 ctr=0xfdd24514
[ 4454.5805633] cpu0: End traceback...
[ 4454.5805633] halting CPU 1
[ 4454.5805633] dumpsys: TBD
[ 4454.5805633] rebooting
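
For context, the panic text reads as a pool ownership check: the item being returned to the pmap_upvopl pool appears to sit on a page whose header records pool id 36, while the pool itself has id 1. The following is a toy user-space model of such a check in Python, assuming that reading of the message. It is not the NetBSD kernel code, and the names Pool, PageHeader, PoolPanic, phinpage_check and this pool_put are hypothetical stand-ins.

```python
from dataclasses import dataclass


class PoolPanic(Exception):
    """Stands in for a kernel panic() in this toy model."""


@dataclass
class Pool:
    name: str    # pool name as shown in brackets, e.g. "pmap_upvopl"
    poolid: int  # id assumed to be assigned to this pool


@dataclass
class PageHeader:
    poolid: int  # id of the pool assumed to own the page holding the item


def phinpage_check(pool: Pool, header: PageHeader, item_addr: int) -> None:
    """Raise PoolPanic if the item's page claims a different owning pool."""
    if header.poolid != pool.poolid:
        raise PoolPanic(
            f"pr_phinpage_check: [{pool.name}] item {item_addr:#x} "
            f"poolid {header.poolid} != {pool.poolid}"
        )


def pool_put(pool: Pool, header: PageHeader, item_addr: int) -> bool:
    """Return an item to a pool, checking ownership first (model only)."""
    phinpage_check(pool, header, item_addr)
    return True
```

In this model, returning an item whose page header carries id 36 to a pool with id 1 raises an error with the same text as the panic above, which is consistent with an item being freed to a pool other than the one its page belongs to.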
With this particular drive I've not seen any write errors on
the console, so I don't think that is the cause of this problem.
I'm running with both CPUs active if that makes any
difference:
[ 1.0000000] mainbus0 (root)
[ 1.0000000] cpu0 at mainbus0: 7455 (Revision 3.3), ID 0 (primary)
[ 1.0000000] cpu0: HID0 0x84d0c1bc<EMCP,TBEN,HIGH_BAT_EN,NAP,DPM,ICE,DCE,XBSEN,SGE,BTIC,LRSTK,FOLD,BHT>, powersave: 1
[ 1.0000000] cpu0: 1250.00 MHz, 256KB L2 cache no parity, 2MB no-parity L3 cache (DDR SRAM) at 6:1 ratio
[ 1.0000000] cpu1 at mainbus0: 7455 (Revision 3.3), ID 1
[ 1.0000000] cpu1: HID0 0x84d0c1bc<EMCP,TBEN,HIGH_BAT_EN,NAP,DPM,ICE,DCE,XBSEN,SGE,BTIC,LRSTK,FOLD,BHT>, powersave: 1
[ 1.0000000] cpu1: 1250.00 MHz, 256KB L2 cache no parity, 2MB no-parity L3 cache (DDR SRAM) at 6:1 ratio
I realize that this *may* be a powerpc- or even
macppc-specific problem, but so far I've categorized it as a "kern"
bug. (I did notice that the 9.1 MP kernel doesn't really come
properly up, seemingly stuck before initializing IPsec, though
that's probably a different problem.)
>How-To-Repeat:
See above.
>Fix:
No idea, sorry...
>Audit-Trail:
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@netbsd.org, Havard Eidnes <he@netbsd.org>
Cc:
Subject: Re: kern/56087: dual-CPU macppc panic: pr_phinpage_check: [pmap_upvopl] ...
Date: Thu, 1 Apr 2021 18:06:12 +0900
Probably same problem as reported by port-powerpc/55325:
http://gnats.netbsd.org/55325
The patch attached to the PR may improve the situation, but
we need a real fix for pmap for powerpc/oea...
Thanks,
rin
From: Havard Eidnes <he@NetBSD.org>
To: rokuyama.rk@gmail.com
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/56087: dual-CPU macppc panic: pr_phinpage_check: [pmap_upvopl] ...
Date: Thu, 01 Apr 2021 15:08:17 +0200 (CEST)
> Probably same problem as reported by port-powerpc/55325:
> http://gnats.netbsd.org/55325
>
> The patch attached to the PR may improve the situation, but
> we need a real fix for pmap for powerpc/oea...
First, thanks for the suggestion!
Sadly, in my case it didn't. I applied the patch quoted in the
PR to -current, and built the GENERIC.MP kernel and tried booting
it. It seemed to have a negative influence on the interrupt
system, and I didn't fully get the new kernel up; the console
showed (transcribed from a photo of the screen):
[ 4.6999901] virq != 0, value 10
[ 4.6999901] virq != 0, value 10
[ 4.6999901] virq != 0, value 10
[ 4.6999901] virq != 0, value 10
[ 4.6999901] virq != 0, value 10
[ 4.7100758] virq != 0, value 10
[ 4.7100758] virq != 0, value 10
[ 4.7100758] virq != 0, value 10
[ 4.7100758] virq != 0, value 10
[ 4.7100758] virq != 0, value 10
[ 4.7200755] uhidev0 at uhub2 port 1 configuration 1
[ 4.7200755] uhidev0: Mitsumi Electric (0x05ac) Apple Extended USB Keyboard
[ 4.7200755] virq != 0, value 10
[ 4.7200755] virq != 0, value 10
[ 4.7200755] virq != 0, value 10
[ 4.7200755] ukbd0 at uhidev0
[ 4.7200755] virq != 0, value 1
[ 4.7200755] virq != 0, value 10
[ 4.7200755] virq != 0, value 1
[ 4.7200755] virq != 0, value 10
etc. ending in
[ 5.0999900] virq != 0, value 10
[ 5.1299891] virq != 0, value 1
[ 5.1299891] wskbd0 at ukbd0: console keyboard using wsdisplay0
[ 5.1299891] uhidev1 at uhub2 port 1 configurat
and that's it; it seems to be wedged there.
Also, as far as I know there is no serial console possibility on
this machine, so I can't capture the start of the boot messages,
and the power button is the only possible next action.
Aand... I just now heard the "gong" sound from my basement where
this Mac is placed, indicating that this, perhaps unsurprisingly,
is also a problem for single-CPU systems (I got the same
backtrace as originally reported in this PR in my kernel message
buffer).
Regards,
- Havard
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.