NetBSD Problem Report #55790
From www@netbsd.org Fri Nov 6 22:58:44 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5717C1A9246
for <gnats-bugs@gnats.NetBSD.org>; Fri, 6 Nov 2020 22:58:44 +0000 (UTC)
Message-Id: <20201106225843.043621A925D@mollari.NetBSD.org>
Date: Fri, 6 Nov 2020 22:58:42 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Floating point exception crashes DIAGNOSTIC kernel
X-Send-Pr-Version: www-1.0
>Number: 55790
>Category: port-arm
>Synopsis: Floating point exception crashes DIAGNOSTIC kernel
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-arm-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Nov 06 23:00:00 +0000 2020
>Closed-Date: Sun Sep 05 06:15:35 +0000 2021
>Last-Modified: Sun Sep 05 06:15:35 +0000 2021
>Originator: Rin Okuyama
>Release: 9.99.75
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD rpi0w 9.99.75 NetBSD 9.99.75 (RPI) #68: Sat Nov 7 07:24:01 JST 2020 rin@latipes:/sys/arch/evbarm/compile/RPI evbarm earmv6hf
>Description:
For CPU which raises floating point exceptions, e.g., ARM1176 in RPI[01],
a FPE crashes DIAGNOSTIC kernels:
----
$ cat fpe.c
#include <fenv.h>
#include <stdlib.h>
int
main(void)
{
volatile double x, y;
feenableexcept(FE_ALL_EXCEPT);
x = atoi("1");
y = atoi("0");
return (int)(x / y);
}
$ cc fpe.c -lm && ./a.out
[ 16754.2789480] panic: kernel diagnostic assertion "(armreg_fpexc_read() & VFP_FPEXC_EN) == 0" failed: file "../../../../arch/arm/vfp/vfp_init.c", line 547
[ 16754.2789480] cpu0: Begin traceback...
[ 16754.2789480] 0xc9c11d94: netbsd:db_panic+0xc
[ 16754.2789480] 0xc9c11dac: netbsd:vpanic+0xc4
[ 16754.2789480] 0xc9c11dc4: netbsd:kern_assert+0x3c
[ 16754.2789480] 0xc9c11dfc: netbsd:vfp_state_load+0x144
[ 16754.2789480] 0xc9c11e5c: netbsd:pcu_load+0x1e0
[ 16754.2789480] 0xc9c11ef4: netbsd:vfp_handler+0x64
[ 16754.2789480] 0xc9c11fac: netbsd:undefinedinstruction+0x124
[ 16754.2789480] cpu0: End traceback...
Stopped in pid 309.309 (a.out) at netbsd:cpu_Debugger+0x4: bx r14
----
In vfp_handler(),
https://nxr.netbsd.org/xref/src/sys/arch/arm/vfp/vfp_init.c#421
421 /* The real handler for VFP bounces. */
422 static int
423 vfp_handler(u_int address, u_int insn, trapframe_t *frame, int fault_code)
...
437 /*
438 * If we already own the FPU and it's enabled (and no exception), raise
439 * SIGILL. If there is an exception, drop through to raise a SIGFPE.
440 */
441 if (curcpu()->ci_pcu_curlwp[PCU_FPU] == curlwp
442 && (armreg_fpexc_read() & (VFP_FPEXC_EX|VFP_FPEXC_EN)) == VFP_FPEXC_EN)
443 return 1;
444
445 /*
446 * Make sure we own the FP.
447 */
448 pcu_load(&arm_vfp_ops);
449
450 uint32_t fpexc = armreg_fpexc_read();
451 if (fpexc & VFP_FPEXC_EX) {
... ----> raise SIGFPE
as comment says, when curlwp owns FPU and it is enabled, if there's no
exception, it raises SIGILL. Otherwise, it falls through in order to raise
SIGFPE. However, since curlwp already owns enabled FPU, the KASSERT fires
in vfp_state_load() called from pcu_load(), as described above.
Therefore, we need to skip pcu_load() in this case:
http://www.netbsd.org/~rin/vfp_init_20201107.patch
----
Index: sys/arch/arm/vfp/vfp_init.c
===================================================================
RCS file: /home/netbsd/src/sys/arch/arm/vfp/vfp_init.c,v
retrieving revision 1.72
diff -p -u -r1.72 vfp_init.c
--- sys/arch/arm/vfp/vfp_init.c 30 Oct 2020 18:54:37 -0000 1.72
+++ sys/arch/arm/vfp/vfp_init.c 6 Nov 2020 21:59:29 -0000
@@ -423,6 +423,7 @@ static int
vfp_handler(u_int address, u_int insn, trapframe_t *frame, int fault_code)
{
struct cpu_info * const ci = curcpu();
+ uint32_t fpexc;
/* This shouldn't ever happen. */
if (fault_code != FAULT_USER &&
@@ -438,20 +439,30 @@ vfp_handler(u_int address, u_int insn, t
* If we already own the FPU and it's enabled (and no exception), raise
* SIGILL. If there is an exception, drop through to raise a SIGFPE.
*/
- if (curcpu()->ci_pcu_curlwp[PCU_FPU] == curlwp
- && (armreg_fpexc_read() & (VFP_FPEXC_EX|VFP_FPEXC_EN)) == VFP_FPEXC_EN)
- return 1;
+ if (curlwp->l_pcu_cpu[PCU_FPU] == ci) {
+ KASSERT(ci->ci_pcu_curlwp[PCU_FPU] == curlwp);
+
+ fpexc = armreg_fpexc_read();
+ if (fpexc & VFP_FPEXC_EN) {
+ if ((fpexc & VFP_FPEXC_EX) == 0) {
+ return 1; /* SIGILL */
+ } else {
+ goto fpe; /* SIGFPE; skip pcu_load(9) */
+ }
+ }
+ }
/*
* Make sure we own the FP.
*/
pcu_load(&arm_vfp_ops);
- uint32_t fpexc = armreg_fpexc_read();
+ fpexc = armreg_fpexc_read();
if (fpexc & VFP_FPEXC_EX) {
ksiginfo_t ksi;
KASSERT(fpexc & VFP_FPEXC_EN);
+fpe:
curcpu()->ci_vfp_evs[2].ev_count++;
/*
----
Then, the process receives SIGFPE as expected. Note that the original
code checks ``curcpu()->ci_pcu_curlwp[PCU_FPU] == curlwp'' to determine
whether curlwp owns FPU. But, code in pcu_load(9) checks
``curlwp->l_pcu_cpu[PCU_FPU] == curcpu()'' for this purpose, and asserts
the former condition:
https://nxr.netbsd.org/xref/src/sys/kern/subr_pcu.c#305
In this patch, I follow this manner for the clarity.
No new regression in tests/lib/libm is observed with this patch on RPI0.
>How-To-Repeat:
Described above; cause FPE on Raspberry Pi 1 or ZERO.
>Fix:
Described above; http://www.netbsd.org/~rin/vfp_init_20201107.patch
>Release-Note:
>Audit-Trail:
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55790 CVS commit: src/sys/arch/arm/vfp
Date: Tue, 1 Jun 2021 00:13:19 +0000
Module Name: src
Committed By: rin
Date: Tue Jun 1 00:13:19 UTC 2021
Modified Files:
src/sys/arch/arm/vfp: vfp_init.c
Log Message:
PR port-arm/55790
Style fix for clarity, in preparation of main fix.
Replace condition ``curcpu()->ci_pcu_curlwp[PCU_FPU] == curlwp'' with
``curlwp->l_pcu_cpu[PCU_FPU] == curcpu()''. And add KASSERT to check
the two conditions are equivalent, as done for MI pcu code:
https://nxr.netbsd.org/xref/src/sys/kern/subr_pcu.c#323
No functional changes.
To generate a diff of this commit:
cvs rdiff -u -r1.72 -r1.73 src/sys/arch/arm/vfp/vfp_init.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55790 CVS commit: src/sys/arch/arm/vfp
Date: Tue, 1 Jun 2021 00:30:22 +0000
Module Name: src
Committed By: rin
Date: Tue Jun 1 00:30:22 UTC 2021
Modified Files:
src/sys/arch/arm/vfp: vfp_init.c
Log Message:
PR port-arm/55790
Fix KASSERT failure with floating-point exception in userland.
Consider the case in which curlwp owns enabled FPU in vfp_handler().
If FPE is raised, we must skip pcu_load(9) rather than just falling
through. Otherwise, KASSERT fires in vfp_state_load(), since curlwp
already owns enabled FPU.
No regression for ATF is introduced.
To generate a diff of this commit:
cvs rdiff -u -r1.73 -r1.74 src/sys/arch/arm/vfp/vfp_init.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Sun, 05 Sep 2021 06:15:35 +0000
State-Changed-Why:
patch applied.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.