NetBSD Problem Report #59371
From imil@netbsd.org Mon Apr 28 05:37:06 2025
Return-Path: <imil@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 0925C1A9239
for <gnats-bugs@gnats.NetBSD.org>; Mon, 28 Apr 2025 05:37:06 +0000 (UTC)
Message-Id: <20250428053704.7AD151A923C@mollari.NetBSD.org>
Date: Mon, 28 Apr 2025 05:37:04 +0000 (UTC)
From: imil@home.imil.net
Reply-To: imil@home.imil.net
To: gnats-bugs@NetBSD.org
Subject: Xen domU uvm_fault since FPU state allocation patch
X-Send-Pr-Version: 3.95
>Number: 59371
>Category: kern
>Synopsis: Xen domU uvm_fault since FPU state allocation patch
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: riastradh
>State: feedback
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Apr 28 05:40:01 +0000 2025
>Closed-Date:
>Last-Modified: Mon Apr 28 13:05:01 +0000 2025
>Originator: Emile `iMil' Heitor
>Release: NetBSD 10.99.14
>Organization:
NetBSD
>Environment:
System: NetBSD outcast 10.99.14 NetBSD 10.99.14 (XEN3_DOM0) #1: Wed Apr 23 12:03:35 CEST 2025 imil@tatooine:/home/imil/src/github.com/NetBSD-src/sys/arch/amd64/compile/obj/XEN3_DOM0 amd64
Architecture: x86_64
Machine: amd64
>Description:
Since April 25, 2025, the NetBSD domU kernel crashes early in boot with a uvm_fault:
[ 1.0000000] cpu_rng: rdrand/rdseed
[ 1.0000000] entropy: ready
[ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[ 1.0000000] 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
[ 1.0000000] 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023,
[ 1.0000000] 2024, 2025
[ 1.0000000] The NetBSD Foundation, Inc. All rights reserved.
[ 1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.0000000] The Regents of the University of California. All rights reserved.
[ 1.0000000] NetBSD 10.99.14 (XEN3_DOMU) #7: Mon Apr 28 06:06:42 CEST 2025
[ 1.0000000] imil@tatooine:/home/imil/src/NetBSD/src/sys/arch/amd64/compile/obj/XEN3_DOMU
[ 1.0000000] total memory = 256 MB
[ 1.0000000] avail memory = 235 MB
[ 1.0000000] mainbus0 (root)
[ 1.0000000] hypervisor0 at mainbus0: Xen version 4.18.4_20241221nb0
[ 1.0000000] vcpu0 at hypervisor0
[ 1.0000000] vcpu0: Intel(R) Core(TM) i5-7300U CPU @ 2.60GHz, id 0x806e9
[ 1.0000000] vcpu0: node 0, package 0, core 0, smt 0
[ 1.0000000] xenbus0 at hypervisor0: Xen Virtual Bus Interface
[ 1.0000000] xencons0 at hypervisor0: Xen Virtual Console Driver
[ 1.0000030] xenbus0: can't get state for device/suspend/event-channel (2)
[ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e
[ 1.0000030] fatal page fault in supervisor mode
[ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38
[ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0
kernel: page fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi)
memset() at netbsd:memset+0x2c
lwp_create() at netbsd:lwp_create+0x2f1
fork1() at netbsd:fork1+0x42c
main() at netbsd:main+0x44f
ds 40
es 100
fs 1
gs 107
rdi 0
rsi 200
rbp ffffffff80adad90
rbx ffff930042b2e000
rdx 200
rcx 40
rax 0
r8 ffffffff8047a38c start_init
r9 0
r10 fffffe00
r11 ffffffff80adabcc
r12 0
r13 ffff93000092e800
r14 ffffffff8078f880 lwp0
r15 0
rip ffffffff8062795c memset+0x2c
cs e030
rflags 10202
rsp ffffffff80adad38
ss e02b
netbsd:memset+0x2c: repe stosq %es:(%rdi)
db{0}>
This behavior seems linked to this commit:
https://mail-index.netbsd.org/source-changes/2025/04/24/msg156552.html
Riastradh@ suggested that I try this workaround:
--- sys/arch/x86/x86/fpu.c 2025-04-24 16:57:38.905367169 +0200
+++ sys/arch/x86/x86/fpu.c.patch 2025-04-24 16:58:39.368608934 +0200
@@ -475,6 +475,9 @@
 		return;
 	}
 
+#ifdef XENPV
+	pcb2->pcb_savefpu = &pcb2->pcb_savefpusmall;
+#endif
 	/* For init(8). */
 	if (__predict_false(l1->l_flag & LW_SYSTEM)) {
 		memset(pcb2->pcb_savefpu, 0, x86_fpu_save_size);
This workaround did allow the domU to proceed with boot.
>How-To-Repeat:
Boot a NetBSD Xen domU kernel built from sources dated April 25, 2025.
>Fix:
Please.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Mon, 28 Apr 2025 12:39:57 +0000
Responsible-Changed-Why:
my bug, will provide better workaround shortly and then propose a
proper fix to be considered
State-Changed-From-To: open->analyzed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 28 Apr 2025 12:46:46 +0000
State-Changed-Why:
This happens because amd64 defines __HAVE_CPU_UAREA_ROUTINES only under
#if !defined(XENPV):
114 #if defined(__x86_64__) && !defined(XENPV)
115 #if !defined(KASAN) && !defined(KMSAN)
116 #define __HAVE_PCPU_AREA 1
117 #define __HAVE_DIRECT_MAP 1
118 #endif
119 #define __HAVE_CPU_UAREA_ROUTINES 1
120 #endif
https://nxr.netbsd.org/xref/src/sys/arch/amd64/include/types.h?r=1.71#114
So the logic in x86/vm_machdep.c's cpu_uarea_alloc/free to initialize
the FPU save area of a newly allocated uarea doesn't kick in. It's not
clear to me why separate uarea allocation is disabled under XENPV --
they were introduced by maxv@ for a stack guard page (`redzone') back
in March 2020:
https://mail-index.netbsd.org/port-amd64/2020/03/14/msg003179.html
https://mail-index.netbsd.org/source-changes/2020/03/17/msg115178.html
--- a/sys/arch/amd64/include/types.h
+++ b/sys/arch/amd64/include/types.h
...
@@ -114,10 +114,11 @@ typedef unsigned char __cpu_simple_lock_nv_t;
#if defined(__x86_64__) && !defined(XENPV)
#if !defined(KASAN) && !defined(KMSAN)
#define __HAVE_PCPU_AREA 1
#define __HAVE_DIRECT_MAP 1
#endif
+#define __HAVE_CPU_UAREA_ROUTINES 1
#if !defined(NO_PCI_MSI_MSIX)
#define __HAVE_PCI_MSI_MSIX
#endif
#endif
It's possible maxv@ just didn't want to think about XENPV implications
or test it. It's possible that the guard page works badly in XENPV for
some reason, or somehow hurts performance, I dunno.
So, as a stop-gap measure, let's first just use the same approach as
i386 (no pointer to savefpu area, disable Intel AMX), and then think
about enabling the guard page -- and if the guard page doesn't work for
some reason, we can either leave it at that or make the uarea routine
skip the guard page under XENPV.
State-Changed-From-To: analyzed->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 28 Apr 2025 13:02:32 +0000
State-Changed-Why:
workaround committed to HEAD, please test
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59371 CVS commit: src/sys/arch
Date: Mon, 28 Apr 2025 13:01:28 +0000
Module Name: src
Committed By: riastradh
Date: Mon Apr 28 13:01:28 UTC 2025
Modified Files:
src/sys/arch/amd64/include: pcb.h
src/sys/arch/x86/include: specialreg.h
src/sys/arch/x86/x86: fpu.c
Log Message:
xen: Stop-gap FPU PCB fix; disable Intel AMX for now.
Since the custom cpu_uarea_alloc/free are disabled under XENPV,
nothing would initialize struct pcb::pcb_savefpu to point either to
struct pcb::pcb_savefpusmall, or to a separately allocated large area
on machines with Intel AMX TILECFG/TILEDATA requiring it. So the
memset in fpu_lwp_fork would crash on null pointer dereference:
[ 1.0000030] uvm_fault(0xffffffff8094a300, 0x0, 2) -> e
[ 1.0000030] fatal page fault in supervisor mode
[ 1.0000030] trap type 6 code 0x2 rip 0xffffffff8062795c cs 0xe030 rflags 0x10202 cr2 0 ilevel 0 rsp 0xffffffff80adad38
[ 1.0000030] curlwp 0xffffffff8078f880 pid 0.0 lowest kstack 0xffffffff80ad62c0
kernel: page fault trap, code=0
Stopped in pid 0.0 (system) at netbsd:memset+0x2c: repe stosq %es:(%rdi)
memset() at netbsd:memset+0x2c
lwp_create() at netbsd:lwp_create+0x2f1
fork1() at netbsd:fork1+0x42c
main() at netbsd:main+0x44f
In order to support Intel AMX TILECFG/TILEDATA, or any other CPU
extensions that increase the XSAVE area beyond what fits in a single
page after struct pcb, we would need to enable the custom
cpu_uarea_alloc/free. Currently that would imply allocating stack
guard pages (`redzone') under XENPV; if there's some reason the stack
guard pages don't work, we could also push #ifdef XENPV conditionals
into cpu_uarea_alloc/free to cover the guard pages -- to be
considered.
PR kern/59371: Xen domU uvm_fault since FPU state allocation patch
PR port-amd64/57661: Crash when booting on Xeon Silver 4416+ in
KVM/Qemu
To generate a diff of this commit:
cvs rdiff -u -r1.34 -r1.35 src/sys/arch/amd64/include/pcb.h
cvs rdiff -u -r1.218 -r1.219 src/sys/arch/x86/include/specialreg.h
cvs rdiff -u -r1.90 -r1.91 src/sys/arch/x86/x86/fpu.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted: