NetBSD Problem Report #56999
From www@netbsd.org Mon Sep 5 15:41:12 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id E81F61A921F
for <gnats-bugs@gnats.NetBSD.org>; Mon, 5 Sep 2022 15:41:11 +0000 (UTC)
Message-Id: <20220905154110.715491A9244@mollari.NetBSD.org>
Date: Mon, 5 Sep 2022 15:41:10 +0000 (UTC)
From: piotr@durlej.net
Reply-To: piotr@durlej.net
To: gnats-bugs@NetBSD.org
Subject: NetBSD installer crash (kernel page fault)
X-Send-Pr-Version: www-1.0
>Number: 56999
>Category: kern
>Synopsis: NetBSD installer crash (kernel page fault)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Sep 05 15:45:00 +0000 2022
>Last-Modified: Sun Sep 11 21:05:01 +0000 2022
>Originator: Piotr Durlej
>Release: 9.3
>Organization:
Piotr Durlej
>Environment:
Unable to obtain.
>Description:
The NetBSD installer image reliably crashes during boot on an HP Microserver Gen10+.
This is a kernel page fault at svs_lwp_switch+0x2c.
Please see http://home.durlej.net/pdurlej/nbsd-hp-crash.png
>How-To-Repeat:
Try booting the NetBSD 9.3/amd64 USB installation image on an HP Microserver Gen10+.
>Fix:
>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sat, 10 Sep 2022 04:48:59 +0000
On Mon, Sep 05, 2022 at 03:45:00PM +0000, piotr@durlej.net wrote:
> Please see http://home.durlej.net/pdurlej/nbsd-hp-crash.png
Transcript:
(without the timestamps; they are all 1.0077928)
ppb6 at pci0 dev 29 function 3: vendor 8086 product a333 (rev. 0xf0)
ppb6: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x1 @ 8.0GT/s
pci7 at ppb6 bus 5
pcib0 at pci0 dev 31 function 0: vendor 8086 product a30a (rev. 0x10)
vendor 8086 product a324 (miscellaneous serial bus, revision 0x10) at pci0 dev 31 function 5 not configured
isa0 at pcib0
pckbc0 at isa0 port 0x60-0x64
uhub0 at usb0: NetBSD (0000) xHCI root hub (0000), class 9/0, rev 3.00/1.00, addr 0
uhub1 at usb1: NetBSD (0000) xHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 0
uvm_fault(0xffffa5edf01e8000, 0x0, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 0x2 rip 0xfffffff8024b927 cs 0x8 rflags 0x10246 cr2 0x8 ilevel 0x8 rsp 0xffffcb013c26ffc8
curlwp 0xffffa5edf0200a80 pid 1.1 lowest kstack 0xffffcb013c26d2c0
kernel: page fault trap, code=0
Stopped in pid1.1 (system) at netbsd:svs_lwp_switch+0x2c: movq $0,8(%rcx)
That is, %rcx contains 0: the store targets 8(%rcx), and cr2 reports
the faulting address as 0x8.
Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
too much you can tell it how many entries to print with e.g. "bt,6")
The offending code is "utls->scratch = 0" on line 672 of svs.c, which
seems to mean that utls isn't set; but we also shouldn't be getting
userlevel traps at this point during boot so I'm not sure why we're
even getting to that code. Consequently, a backtrace would be very
helpful...
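For readers following the arithmetic, here is a minimal sketch of how
the register dump lines up with a null "utls" pointer. It is
illustrative only: the SvsUtls layout below (three 64-bit fields with
"scratch" second) is an assumption chosen to match the 8(%rcx) operand
in the disassembly, not a copy of the kernel source, and the helper
names fault_address and decode_pf_code are hypothetical.

```python
import ctypes


# ASSUMED layout: three 64-bit fields with "scratch" at offset 8, chosen to
# match the "movq $0,8(%rcx)" instruction in the transcript.
class SvsUtls(ctypes.Structure):
    _fields_ = [
        ("kpdirpa", ctypes.c_uint64),
        ("scratch", ctypes.c_uint64),
        ("rsp0", ctypes.c_uint64),
    ]


def fault_address(base, field):
    """Address touched by a store to `field` through a pointer equal to `base`."""
    return base + getattr(SvsUtls, field).offset


def decode_pf_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x1),  # False: the page was not present
        "write": bool(code & 0x2),    # True: the access was a write
        "user": bool(code & 0x4),     # False: fault taken in supervisor mode
    }


# A null base pointer plus the field offset gives the cr2 value in the dump.
print(hex(fault_address(0, "scratch")))
# The "code 0x2" from the trap line.
print(decode_pf_code(0x2))
```

Under these assumptions, a store through a null pointer lands on
address 0x8, which matches cr2, and code 0x2 decodes to a supervisor-mode
write to a non-present page, which matches the "fatal page fault in
supervisor mode" message.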
--
David A. Holland
dholland@netbsd.org
From: Piotr Durlej <piotr@durlej.net>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sat, 10 Sep 2022 16:31:28 +0200
On 9/10/2022 6:50 AM, David Holland wrote:
> Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
> too much you can tell it how many entries to print with e.g. "bt,6")
Unfortunately not.
The machine doesn't have an iLO Enabled Kit and the system does not
respond to the USB keyboard after the crash happens.
Kind regards,
Piotr Durlej
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sun, 11 Sep 2022 21:02:58 +0000
On Sat, Sep 10, 2022 at 04:20:01PM +0000, Piotr Durlej wrote:
> On 9/10/2022 6:50 AM, David Holland wrote:
> > Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
> > too much you can tell it how many entries to print with e.g. "bt,6")
>
> Unfortunately not.
>
> The machine doesn't have an iLO Enabled Kit and the system does not
> respond to the USB keyboard after the crash happens.
Drat. Oh well... hopefully someone here has some ideas about what
might have triggered it. (I don't)
--
David A. Holland
dholland@netbsd.org
Copyright © 1994-2022
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.