NetBSD Problem Report #56381
From www@netbsd.org Mon Aug 30 01:49:40 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 322181A9239
for <gnats-bugs@gnats.NetBSD.org>; Mon, 30 Aug 2021 01:49:40 +0000 (UTC)
Message-Id: <20210830014939.0352C1A923A@mollari.NetBSD.org>
Date: Mon, 30 Aug 2021 01:49:38 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Infinite loop in pmap_page_protect()
X-Send-Pr-Version: www-1.0
>Number: 56381
>Category: port-sh3
>Synopsis: Infinite loop in pmap_page_protect()
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-sh3-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Aug 30 01:50:00 +0000 2021
>Closed-Date: Fri Jul 21 08:56:28 +0000 2023
>Last-Modified: Fri Jul 21 08:56:28 +0000 2023
>Originator: Rin Okuyama
>Release: 9.99.88
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD hdlu 9.99.88 NetBSD 9.99.88 (GENERIC) #2: Thu Aug 26 20:34:10 JST 2021 rin@latipes:/sys/arch/landisk/compile/GENERIC landisk
>Description:
Sometimes kernel is trapped into infinite loop in pmap_page_protect(),
where the user cannot do anything but entering DDB from console.
This patch seems to work around the problem:
----
RCS file: /home/netbsd/src/sys/arch/sh3/sh3/pmap.c,v
retrieving revision 1.85
diff -p -u -r1.85 pmap.c
--- sys/arch/sh3/sh3/pmap.c 26 Jul 2021 21:43:11 -0000 1.85
+++ sys/arch/sh3/sh3/pmap.c 24 Aug 2021 04:35:22 -0000
@@ -723,10 +723,28 @@ pmap_page_protect(struct vm_page *pg, vm
default:
/* Remove all */
s = splvm();
+#if 0 /* XXXRO */
while ((pv = SLIST_FIRST(&pvh->pvh_head)) != NULL) {
va = pv->pv_va;
pmap_remove(pv->pv_pmap, va, va + PAGE_SIZE);
}
+#else
+ while ((pv = SLIST_FIRST(&pvh->pvh_head)) != NULL) {
+ pt_entry_t *pte;
+ pmap = pv->pv_pmap;
+ va = pv->pv_va;
+ while ((pte = __pmap_pte_lookup(pmap, va)) == NULL ||
+ *pte == 0) {
+ pv = SLIST_NEXT(pv, pv_link);
+ if (pv == NULL)
+ goto out;
+ pmap = pv->pv_pmap;
+ va = pv->pv_va;
+ }
+ pmap_remove(pmap, va, va + PAGE_SIZE);
+ }
+ out:
+#endif
splx(s);
}
}
----
Note that similar check for pte and *pte is already done for
pmap_extract(). However,
(1) this may cause memory leak for pv, and moreover,
(2) I'm afraid that (pte == NULL || *pte == 0) indicates that
pmap_enter() is partially executed at when pmap_remove() is invoked.
Also,
(3) usage of SLIST, and internal interfaces around pmap_remove() seems
suboptimal.
Although, (3) should be addressed after the main problem is fixed.
>How-To-Repeat:
Make pkgsrc's for few hours to one day on landisk.
>Fix:
Workaround is provided above, but probably not a correct fix...
>Release-Note:
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-sh3/56381: Infinite loop in pmap_page_protect()
Date: Mon, 30 Aug 2021 12:22:59 +0200
It seems I have run into this too and added instrumentation (see below),
but then never got that to fire again.
Martin
Index: pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/sh3/sh3/pmap.c,v
retrieving revision 1.85
diff -u -p -r1.85 pmap.c
--- pmap.c 26 Jul 2021 21:43:11 -0000 1.85
+++ pmap.c 30 Aug 2021 10:20:59 -0000
@@ -325,6 +325,10 @@ pmap_enter(pmap_t pmap, vaddr_t va, padd
pt_entry_t entry, *pte;
bool kva = (pmap == pmap_kernel());
+ UVMHIST_FUNC(__func__);
+ UVMHIST_CALLARGS(maphist, "pmap %#jx va %#jx pa %#jx flags %#jx",
+ (uintptr_t)pmap, (uintptr_t)va, (uintptr_t)pa, (uintptr_t)flags);
+
/* "flags" never exceed "prot" */
KDASSERT(prot != 0 && ((flags & VM_PROT_ALL) & ~prot) == 0);
@@ -345,6 +349,8 @@ pmap_enter(pmap_t pmap, vaddr_t va, padd
entry |= PG_V;
pvh->pvh_flags |= PVH_REFERENCED;
}
+ UVMHIST_LOG(maphist, "added PVH_REFERENCED pg: %#jx pvh_flags %#jx",
+ (uintptr_t)pg, (uintptr_t)pvh->pvh_flags, 0, 0);
/* Protection */
if ((prot & VM_PROT_WRITE) && (pvh->pvh_flags & PVH_MODIFIED)) {
@@ -391,6 +397,7 @@ pmap_enter(pmap_t pmap, vaddr_t va, padd
}
}
+ KDASSERT(entry != 0);
*pte = entry;
if (pmap->pm_asid != -1)
@@ -441,6 +448,7 @@ __pmap_map_change(pmap_t pmap, vaddr_t v
if (oentry & _PG_WIRED) {
if (!(entry & _PG_WIRED)) {
/* wired -> unwired */
+ KDASSERT(entry != 0);
*pte = entry;
/* "wired" is software bits. no need to update TLB */
pmap->pm_stats.wired_count--;
@@ -500,7 +508,10 @@ __pmap_pv_enter(pmap_t pmap, struct vm_p
}
void
-pmap_remove(pmap_t pmap, vaddr_t sva, vaddr_t eva)
+pmap_remove_int(pmap_t pmap, vaddr_t sva, vaddr_t eva, bool must_exist);
+
+void
+pmap_remove_int(pmap_t pmap, vaddr_t sva, vaddr_t eva, bool must_exist)
{
struct vm_page *pg;
pt_entry_t *pte, entry;
@@ -509,11 +520,19 @@ pmap_remove(pmap_t pmap, vaddr_t sva, va
KDASSERT((sva & PGOFSET) == 0);
for (va = sva; va < eva; va += PAGE_SIZE) {
- if ((pte = __pmap_pte_lookup(pmap, va)) == NULL ||
- (entry = *pte) == 0)
+ pte = __pmap_pte_lookup(pmap, va);
+ KDASSERTMSG(!must_exist || pte != NULL, "va: %lx", va);
+ if (pte == NULL)
continue;
- if ((pg = PHYS_TO_VM_PAGE(entry & PG_PPN)) != NULL)
+ entry = *pte;
+ KDASSERTMSG(!must_exist || entry != 0, "va: %lx", va);
+ if (entry == 0)
+ continue;
+
+ pg = PHYS_TO_VM_PAGE(entry & PG_PPN);
+ KDASSERTMSG(!must_exist || pg != NULL, "va: %lx", va);
+ if (pg != NULL)
__pmap_pv_remove(pmap, pg, va);
if (entry & _PG_WIRED)
@@ -530,6 +549,12 @@ pmap_remove(pmap_t pmap, vaddr_t sva, va
}
}
+void
+pmap_remove(pmap_t pmap, vaddr_t sva, vaddr_t eva)
+{
+ pmap_remove_int(pmap, sva, eva, false);
+}
+
/*
* void __pmap_pv_remove(pmap_t pmap, struct vm_page *pg, vaddr_t vaddr):
* Remove physical-virtual map from vm_page.
@@ -564,7 +589,9 @@ __pmap_pv_remove(pmap_t pmap, struct vm_
#ifdef DEBUG
/* Check duplicated map. */
SLIST_FOREACH(pv, &pvh->pvh_head, pv_link)
- KDASSERT(!(pv->pv_pmap == pmap && pv->pv_va == vaddr));
+ KDASSERTMSG(!(pv->pv_pmap == pmap && pv->pv_va == vaddr),
+ "pmap: %p pv->pv_pmap: %p, pv->pv_va: %lx, vaddr: %lx\n",
+ pmap, pv->pv_pmap, pv->pv_va, vaddr);
#endif
splx(s);
}
@@ -589,6 +616,7 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
pte = __pmap_kpte_lookup(va);
KDASSERT(*pte == 0);
+ KDASSERT(entry != 0);
*pte = entry;
sh_tlb_update(0, va, entry);
@@ -684,6 +712,7 @@ pmap_protect(pmap_t pmap, vaddr_t sva, v
}
entry = (entry & ~PG_PR_MASK) | protbits;
+ KDASSERT(entry != 0);
*pte = entry;
if (pmap->pm_asid != -1)
@@ -725,7 +754,7 @@ pmap_page_protect(struct vm_page *pg, vm
s = splvm();
while ((pv = SLIST_FIRST(&pvh->pvh_head)) != NULL) {
va = pv->pv_va;
- pmap_remove(pv->pv_pmap, va, va + PAGE_SIZE);
+ pmap_remove_int(pv->pv_pmap, va, va + PAGE_SIZE, true);
}
splx(s);
}
@@ -801,6 +830,8 @@ pmap_clear_reference(struct vm_page *pg)
pmap_t pmap;
vaddr_t va;
int s;
+ UVMHIST_FUNC(__func__);
+ UVMHIST_CALLARGS(maphist, "pg: %#jx", (uintptr_t)pg, 0, 0, 0);
if ((pvh->pvh_flags & PVH_REFERENCED) == 0)
return (false);
@@ -1016,6 +1047,9 @@ __pmap_pte_load(pmap_t pmap, vaddr_t va,
struct vm_page *pg;
pt_entry_t *pte;
pt_entry_t entry;
+ UVMHIST_FUNC(__func__);
+ UVMHIST_CALLARGS(maphist, "pmap: %#jx va: %#jx flags: %#jx",
+ (uintptr_t)pmap, (uintptr_t)va, (uintptr_t)flags, 0);
KDASSERT((((int)va < 0) && (pmap == pmap_kernel())) ||
(((int)va >= 0) && (pmap != pmap_kernel())));
@@ -1034,6 +1068,7 @@ __pmap_pte_load(pmap_t pmap, vaddr_t va,
if (flags & PVH_REFERENCED) {
pvh->pvh_flags |= PVH_REFERENCED;
entry |= PG_V;
+ UVMHIST_LOG(maphist, "cleared PVH_REFERENCED",0,0,0,0);
}
if (flags & PVH_MODIFIED) {
pvh->pvh_flags |= PVH_MODIFIED;
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56381 CVS commit: src/sys/arch/sh3/sh3
Date: Thu, 2 Sep 2021 07:55:56 +0000
Module Name: src
Committed By: rin
Date: Thu Sep 2 07:55:56 UTC 2021
Modified Files:
src/sys/arch/sh3/sh3: pmap.c
Log Message:
PR port-sh3/56381
pmap_enter() returns ENOMEM if __pmap_pte_alloc() fails and PMAP_CANFAIL
flag is specified. In this case, remove pv via __pmap_pv_remove() if it is
added to p-v map list via __pmap_pv_enter().
Otherwise, pmap becomes an inconsistent state, which results in an infinite
loop in pmap_page_protect(), as reported in the PR.
Also, KASSERT's are added for sure, in order to detect the infinite loops.
Great thanks to chs@ for finding out this bug!!
To generate a diff of this commit:
cvs rdiff -u -r1.85 -r1.86 src/sys/arch/sh3/sh3/pmap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56381 CVS commit: src/sys/arch/sh3/sh3
Date: Wed, 8 Sep 2021 00:24:29 +0000
Module Name: src
Committed By: rin
Date: Wed Sep 8 00:24:29 UTC 2021
Modified Files:
src/sys/arch/sh3/sh3: pmap.c
Log Message:
Turn KASSERT's added for PR port-sh3/56381 into KDASSERT's;
they are less likely expected to fire again.
To generate a diff of this commit:
cvs rdiff -u -r1.87 -r1.88 src/sys/arch/sh3/sh3/pmap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: rin@NetBSD.org
State-Changed-When: Fri, 21 Jul 2023 08:56:28 +0000
State-Changed-Why:
Fix will be shipped with NetBSD 10.0.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.