NetBSD Problem Report #56999

From www@netbsd.org  Mon Sep  5 15:41:12 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E81F61A921F
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  5 Sep 2022 15:41:11 +0000 (UTC)
Message-Id: <20220905154110.715491A9244@mollari.NetBSD.org>
Date: Mon,  5 Sep 2022 15:41:10 +0000 (UTC)
From: piotr@durlej.net
Reply-To: piotr@durlej.net
To: gnats-bugs@NetBSD.org
Subject: NetBSD installer crash (kernel page fault)
X-Send-Pr-Version: www-1.0

>Number:         56999
>Category:       kern
>Synopsis:       NetBSD installer crash (kernel page fault)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Sep 05 15:45:00 +0000 2022
>Last-Modified:  Sun Sep 11 21:05:01 +0000 2022
>Originator:     Piotr Durlej
>Release:        9.3
>Organization:
Piotr Durlej
>Environment:
Unable to obtain.
>Description:
The NetBSD installer image reliably crashes during boot on an HP Microserver Gen10+.

This is a kernel page fault at svs_lwp_switch+0x2c.

Please see http://home.durlej.net/pdurlej/nbsd-hp-crash.png
>How-To-Repeat:
Try booting the NetBSD 9.3/amd64 USB installation image on an HP Microserver Gen10+.
>Fix:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sat, 10 Sep 2022 04:48:59 +0000

 On Mon, Sep 05, 2022 at 03:45:00PM +0000, piotr@durlej.net wrote:
  > Please see http://home.durlej.net/pdurlej/nbsd-hp-crash.png

 Transcript:
 (without the timestamps; they are all 1.0077928)

 ppb6 at pci0 dev 29 function 3: vendor 8086 product a333 (rev. 0xf0)
 ppb6: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x1 @ 8.0GT/s
 pci7 at ppb6 bus 5
 pcib0 at pci0 dev 31 function 0: vendor 8086 product a30a (rev. 0x10)
 vendor 8086 product a324 (miscellaneous serial bus, revision 0x10) at pci0 dev 31 function 5 not configured
 isa0 at pcib0
 pckbc0 at isa0 port 0x60-0x64
 uhub0 at usb0: NetBSD (0000) xHCI root hub (0000), class 9/0, rev 3.00/1.00, addr 0
 uhub1 at usb1: NetBSD (0000) xHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 0
 uvm_fault(0xffffa5edf01e8000, 0x0, 2) -> e
 fatal page fault in supervisor mode
 trap type 6 code 0x2 rip 0xfffffff8024b927 cs 0x8 rflags 0x10246 cr2 0x8 ilevel 0x8 rsp 0xffffcb013c26ffc8
 curlwp 0xffffa5edf0200a80 pid 1.1 lowest kstack 0xffffcb013c26d2c0
 kernel: page fault trap, code=0
 Stopped in pid1.1 (system) at  netbsd:svs_lwp_switch+0x2c:    movq    $0,8(%rcx)


 That is, %rcx contains 0.

 Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
 too much you can tell it how many entries to print with e.g. "bt,6")

 The offending code is "utls->scratch = 0" on line 672 of svs.c, which
 seems to mean that utls isn't set; but we also shouldn't be getting
 userlevel traps at this point during boot so I'm not sure why we're
 even getting to that code. Consequently, a backtrace would be very
 helpful...
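
 The step from "cr2 0x8" to "%rcx contains 0" is plain offset arithmetic,
 which can be sketched as follows. This is only an illustration: the
 struct below is a hypothetical stand-in, not the real definition from
 svs.c. The only detail taken from the report is that the store targets
 8(%rcx), i.e. a member 8 bytes into whatever %rcx points at. The names
 utls_sketch, first_field, store_address and base_from_fault are made up
 for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the structure that utls points to.
 * Assumption: some 8-byte member precedes `scratch`, which places
 * `scratch` at offset 8 and matches the 8(%rcx) operand in the trap.
 */
struct utls_sketch {
	uint64_t first_field;	/* assumed 8-byte member at offset 0 */
	uint64_t scratch;	/* the member being zeroed */
};

static_assert(offsetof(struct utls_sketch, scratch) == 8,
    "scratch must sit at offset 8 to match 8(%rcx)");

/* Address that a store to p->scratch touches, given the pointer value. */
static inline uint64_t
store_address(uint64_t base)
{
	return base + offsetof(struct utls_sketch, scratch);
}

/* Recover the base pointer (%rcx) from the faulting address (cr2). */
static inline uint64_t
base_from_fault(uint64_t cr2)
{
	return cr2 - offsetof(struct utls_sketch, scratch);
}
```

 Under these assumptions, base_from_fault(0x8) yields 0, which is
 consistent with a write through a null utls pointer rather than through
 a stray but nonzero one.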

 -- 
 David A. Holland
 dholland@netbsd.org

From: Piotr Durlej <piotr@durlej.net>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sat, 10 Sep 2022 16:31:28 +0200

 On 9/10/2022 6:50 AM, David Holland wrote:
 >   Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
 >   too much you can tell it how many entries to print with e.g. "bt,6")

 Unfortunately not.

 The machine doesn't have an iLO Enabled Kit and the system does not 
 respond to the USB keyboard after the crash happens.

 Kind regards,
 Piotr Durlej

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56999: NetBSD installer crash (kernel page fault)
Date: Sun, 11 Sep 2022 21:02:58 +0000

 On Sat, Sep 10, 2022 at 04:20:01PM +0000, Piotr Durlej wrote:
  >  On 9/10/2022 6:50 AM, David Holland wrote:
  >  >   Can you get a backtrace? (type "bt" at the ddb prompt; if it prints
  >  >   too much you can tell it how many entries to print with e.g. "bt,6")
  >  
  >  Unfortunately not.
  >  
  >  The machine doesn't have an iLO Enabled Kit and the system does not 
  >  respond to the USB keyboard after the crash happens.

 Drat. Oh well... hopefully someone here has some ideas about what
 might have triggered it. (I don't)

 -- 
 David A. Holland
 dholland@netbsd.org
