NetBSD Problem Report #58135

From www@netbsd.org  Wed Apr 10 03:32:28 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C28B51A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 10 Apr 2024 03:32:27 +0000 (UTC)
Message-Id: <20240410033226.473881A923A@mollari.NetBSD.org>
Date: Wed, 10 Apr 2024 03:32:26 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: reproducible pmap KASSERT failure for armv7 with NFS root
X-Send-Pr-Version: www-1.0

>Number:         58135
>Category:       port-arm
>Synopsis:       reproducible pmap KASSERT failure for armv7 with NFS root
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 10 03:35:00 +0000 2024
>Closed-Date:    
>Last-Modified:  Sat Apr 13 12:29:01 +0000 2024
>Originator:     Rin Okuyama
>Release:        10.99.10 and 10.0
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD console 10.99.10 NetBSD 10.99.10 (SUNXI_NET_MPSAFE) #1: Thu Feb  8 23:00:33 JST 2024  rin@sakaizumii.local:/home/rin/src/sys/arch/evbarm/compile/SUNXI_NET_MPSAFE evbarm earmv7hfeb
>Description:
Running armv7-based system with NFS root, KASSERT failure for
pmap can be easily triggered for few hours. For example:

$ cd /usr/pkgsrc/lang/perl5 && make install
...
Finding dependencies for av.o
[ 28405.9114372] panic: kernel diagnostic assertion "(((uintptr_t)ptep / sizeof(pte)) & (L2_L_SIZE / L2_S_SIZE - 1)) == 0" failed: file "./arm/arm32/pmap.h", line 603 0x90061930
[ 28405.9114372] cpu0: Begin traceback...
[ 28405.9114372] 0xa876fbac: netbsd:db_panic+0x14
[ 28405.9114372] 0xa876fbcc: netbsd:vpanic+0x114
[ 28405.9114372] 0xa876fbe4: netbsd:kern_assert+0x40
[ 28405.9114372] 0xa876fc6c: netbsd:pmap_clearbit+0x490
[ 28405.9114372] 0xa876fc84: netbsd:pmap_page_protect+0x78
[ 28405.9114372] 0xa876fcac: netbsd:nfs_gop_write+0x48
[ 28405.9114372] 0xa876fe1c: netbsd:genfs_do_putpages+0x7b8
[ 28405.9114372] 0xa876fe44: netbsd:genfs_putpages+0x34
[ 28405.9114372] 0xa876fe7c: netbsd:VOP_PUTPAGES+0x78
[ 28405.9114372] 0xa876fea4: netbsd:nfs_fsync+0x60
[ 28405.9114372] 0xa876fedc: netbsd:VOP_FSYNC+0x7c
[ 28405.9114372] 0xa876ff24: netbsd:nfs_sync+0x80
[ 28405.9114372] 0xa876ff44: netbsd:VFS_SYNC+0x3c
[ 28405.9114372] 0xa876ffac: netbsd:sched_sync+0xdc
[ 28405.9114372] cpu0: End traceback...

This takes place both for -current and 10.0.

I've found this can be explained as follows:

(1) nfs_gop_write() unconditionally calls pmap_page_protect:

    280 	for (i = 0; i < npages; i++) {
    281 		pmap_page_protect(pgs[i], VM_PROT_READ);
    282 	}
https://nxr.netbsd.org/xref/src/sys/nfs/nfs_node.c#281

and

(2) pmap_clearbit() turns on L2_XS_XN == __BIT(0) without checking
the target page is valid (i.e., L2_TYPE_S == __BIT(1) is turned on):

   2326 static void
   2327 pmap_clearbit(struct vm_page_md *md, paddr_t pa, u_int maskbits)
   2328 {
...
   2333 #ifdef ARM_MMU_EXTENDED
   2334 	const u_int execbits = (maskbits & PVF_EXEC) ? L2_XS_XN : 0;
   2335 #else
...
   2422 		pt_entry_t * const ptep = &l2b->l2b_kva[l2pte_index(va)];
   2423 		const pt_entry_t opte = *ptep;
   2424 		pt_entry_t npte = opte | execbits;

https://nxr.netbsd.org/xref/src/sys/arch/arm/arm32/pmap.c#2334
https://nxr.netbsd.org/xref/src/sys/arch/arm/arm32/pmap.c#2424

L2_XS_XN is regarded as L2_TYPE_L if L2_TYPE_S is cleared, which
results in the KASSERT above.

I'm not sure whether NFS code is optimal, but (2) should be fixed
anyway.
>How-To-Repeat:
Boot armv7 system with NFS root, and build some pkgsrc.
>Fix:
Check page validity (as well as executability) before set L2_XS_XN:

diff --git a/sys/arch/arm/arm32/pmap.c b/sys/arch/arm/arm32/pmap.c
index 64797e18a0c..1305344be5f 100644
--- a/sys/arch/arm/arm32/pmap.c
+++ b/sys/arch/arm/arm32/pmap.c
@@ -2330,15 +2330,10 @@ pmap_clearbit(struct vm_page_md *md, paddr_t pa, u_int maskbits)
 #ifdef PMAP_CACHE_VIPT
 	const bool want_syncicache = PV_IS_EXEC_P(md->pvh_attrs);
 	bool need_syncicache = false;
-#ifdef ARM_MMU_EXTENDED
-	const u_int execbits = (maskbits & PVF_EXEC) ? L2_XS_XN : 0;
-#else
-	const u_int execbits = 0;
+#ifndef ARM_MMU_EXTENDED
 	bool need_vac_me_harder = false;
 #endif
-#else
-	const u_int execbits = 0;
-#endif
+#endif /* PMAP_CACHE_VIPT */

 	UVMHIST_FUNC(__func__);
 	UVMHIST_CALLARGS(maphist, "md %#jx pa %#jx maskbits %#jx",
@@ -2421,7 +2416,14 @@ pmap_clearbit(struct vm_page_md *md, paddr_t pa, u_int maskbits)

 		pt_entry_t * const ptep = &l2b->l2b_kva[l2pte_index(va)];
 		const pt_entry_t opte = *ptep;
-		pt_entry_t npte = opte | execbits;
+		pt_entry_t npte = opte;
+
+#if defined(ARM_MMU_EXTENDED) && defined(PMAP_CACHE_VIPT)
+		if ((maskbits & PVF_EXEC) != 0 && l2pte_valid_p(opte)) {
+			KASSERT((opte & L2_TYPE_S) != 0);
+			npte |= L2_XS_XN;
+		}
+#endif

 #ifdef ARM_MMU_EXTENDED
 		KASSERT((opte & L2_XS_nG) == (pm == pmap_kernel() ? 0 : L2_XS_nG));

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Sat, 13 Apr 2024 12:29:01 +0000
State-Changed-Why:
I've committed a slightly simplified patch. If you're happy I'll request pullups.
Thanks


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.