NetBSD Problem Report #56200

From www@netbsd.org  Sun May 23 15:26:02 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 04CAC1A9266
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 23 May 2021 15:26:02 +0000 (UTC)
Message-Id: <20210523152600.98B671A929D@mollari.NetBSD.org>
Date: Sun, 23 May 2021 15:26:00 +0000 (UTC)
From: thorpej@me.com
Reply-To: thorpej@me.com
To: gnats-bugs@NetBSD.org
Subject: TLB shootdown assertion in pmap_enter() L2-delref case
X-Send-Pr-Version: www-1.0

>Number:         56200
>Category:       port-alpha
>Synopsis:       TLB shootdown assertion in pmap_enter() L2-delref case
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    thorpej
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun May 23 15:30:00 +0000 2021
>Closed-Date:    Sun May 23 19:17:17 +0000 2021
>Last-Modified:  Sun May 23 19:17:17 +0000 2021
>Originator:     Jason Thorpe
>Release:        NetBSD 9.99.82
>Organization:
RISCy Business
>Environment:
NetBSD alpha-vm 9.99.82 NetBSD 9.99.82 (GENERIC-$Revision: 1.410 $) #1: Sat May 22 11:30:30 PDT 2021  thorpej@the-ripe-vessel:/space/src/sys/arch/alpha/compile/GENERIC.QEMU alpha

Configured for 32MB of RAM.
>Description:
[ 2391.0239881] panic: kernel diagnostic assertion "tlbctx->t_pmap != NULL" failed: file "../../../../arch/alpha/alpha/pmap.c", line 934 
[ 2391.0239881] cpu0: Begin traceback...
[ 2391.0239881] alpha trace requires known PC =eject=
[ 2391.0239881] cpu0: End traceback...
Stopped in pid 8438.8438 (sh) at        netbsd:cpu_Debugger+0x4:        ret     zero,(ra)
db{0}> trace
cpu_Debugger() at netbsd:cpu_Debugger+0x4
db_panic() at netbsd:db_panic+0xc8
vpanic() at netbsd:vpanic+0x108
kern_assert() at netbsd:kern_assert+0x70
pmap_tlb_shootnow.part.0() at netbsd:pmap_tlb_shootnow.part.0+0x3b0
pmap_enter_l2pt_delref() at netbsd:pmap_enter_l2pt_delref+0x7c
pmap_enter() at netbsd:pmap_enter+0x814
uvm_fault_internal() at netbsd:uvm_fault_internal+0x1884
trap() at netbsd:trap+0x51c
XentMM() at netbsd:XentMM+0x20
--- memory management fault (from ipl 0) ---
--- user mode ---
db{0}>
>How-To-Repeat:
I encountered this while running the ATF tests on a configuration with 32MB of RAM.
>Fix:
N/A

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: port-alpha-maintainer->thorpej
Responsible-Changed-By: thorpej@NetBSD.org
Responsible-Changed-When: Sun, 23 May 2021 18:05:22 +0000
Responsible-Changed-Why:
Take.


State-Changed-From-To: open->analyzed
State-Changed-By: thorpej@NetBSD.org
State-Changed-When: Sun, 23 May 2021 18:05:22 +0000
State-Changed-Why:
I understand the root cause.


From: "Jason R Thorpe" <thorpej@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56200 CVS commit: src/sys/arch/alpha/alpha
Date: Sun, 23 May 2021 19:13:27 +0000

 Module Name:	src
 Committed By:	thorpej
 Date:		Sun May 23 19:13:27 UTC 2021

 Modified Files:
 	src/sys/arch/alpha/alpha: pmap.c

 Log Message:
 Fix a bug in pmap_tlb_shootdown_all_user(), where it was not
 stashing away the pointer to the pmap in the TLB context structure
 like pmap_tlb_shootdown() was doing.  This would result in the
 following failure scenario:

 - Page fault handler calls pmap_enter() to map a page.  The mapping
   is the first one under that L2 PT and L3 PT, meaning that both an
   L2 PT and an L3 PT must be allocated.
 - L2 PT allocation succeeds.
 - L3 PT allocation fails under memory pressure.
 - pmap_enter() goes to drop the reference on the L2 PT, which, because
   it was the first of its mappings, frees the L2 PT.  Because PALcode
   may have already tried to service a TLB miss through that L2 PT, we
   must issue an all-user-VA shootdown, and call pmap_tlb_shootdown_all_user()
   to do so.
 - pmap_tlb_shootnow() is called and an assert fires because the TLB
   context structure does not point to a pmap.
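
 The unwind path in the steps above can be sketched roughly as follows.
 This is a toy model only, not the code in pmap.c: every toy_* name, the
 reference-count layout, and the result structure are hypothetical and
 exist purely to show why a failed L3 PT allocation ends in an
 all-user-VA shootdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy page-table page: just a reference count. */
struct toy_pt {
	unsigned refs;
};

/* What one toy_enter() call did, recorded for inspection. */
struct toy_result {
	bool l2_freed;           /* L2 PT was released while unwinding */
	bool all_user_shootdown; /* an all-user-VA invalidation is needed */
	int error;               /* 0 on success, -1 on allocation failure */
};

/* Allocate a PT page holding one reference, or fail on request. */
static struct toy_pt *
toy_pt_alloc(bool fail)
{
	struct toy_pt *pt;

	if (fail)
		return NULL;
	pt = calloc(1, sizeof(*pt));
	if (pt != NULL)
		pt->refs = 1;
	return pt;
}

/* Drop one reference; free the page and return true if it was the last. */
static bool
toy_pt_delref(struct toy_pt *pt)
{
	assert(pt->refs > 0);
	if (--pt->refs > 0)
		return false;
	free(pt);
	return true;
}

/* First mapping in a fresh region: needs a new L2 PT and a new L3 PT. */
static struct toy_result
toy_enter(bool l3_alloc_fails)
{
	struct toy_result r = { false, false, 0 };
	struct toy_pt *l2, *l3;

	l2 = toy_pt_alloc(false);
	if (l2 == NULL) {
		r.error = -1;
		return r;
	}
	l3 = toy_pt_alloc(l3_alloc_fails);
	if (l3 == NULL) {
		/* Unwind: our reference was the only one, so the L2 PT goes away. */
		r.error = -1;
		r.l2_freed = toy_pt_delref(l2);
		/* A TLB miss may already have walked through it: flush all user VAs. */
		if (r.l2_freed)
			r.all_user_shootdown = true;
		return r;
	}
	/* Success path; the toy simply releases both pages again. */
	(void)toy_pt_delref(l3);
	(void)toy_pt_delref(l2);
	return r;
}
```

 In this model, toy_enter(true) returns with error set to -1 and both
 l2_freed and all_user_shootdown set, which is the point at which the
 real code calls pmap_tlb_shootdown_all_user().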

 This did not fail in the pmap_remove() scenario because the TLB context
 would have already had at least one call to pmap_tlb_shootdown(), which
 was initializing the pmap pointer properly.
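
 The difference between the two scenarios can be illustrated with a small
 model of the TLB context. This is a sketch under stated assumptions, not
 the actual change in pmap.c rev 1.277: only the t_pmap field name comes
 from the assertion message above, and the other fields and all toy_*
 functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pmap. */
struct toy_pmap {
	int id;
};

/* Toy TLB context; t_pmap mirrors the field named in the assertion. */
struct toy_tlb_context {
	struct toy_pmap *t_pmap; /* pmap the queued invalidations belong to */
	unsigned t_count;        /* single-VA invalidations queued so far */
	bool t_all_user;         /* invalidate every user VA */
};

/* Per-VA shootdown: stashes the pmap pointer in the context. */
static void
toy_shootdown(struct toy_pmap *pmap, struct toy_tlb_context *ctx)
{
	assert(pmap != NULL);
	ctx->t_pmap = pmap;
	ctx->t_count++;
}

/* Pre-fix behaviour as described in the log: pmap pointer never stashed. */
static void
toy_shootdown_all_user_buggy(struct toy_pmap *pmap, struct toy_tlb_context *ctx)
{
	(void)pmap;
	ctx->t_all_user = true;
}

/* Post-fix behaviour as described in the log: stash the pmap pointer too. */
static void
toy_shootdown_all_user_fixed(struct toy_pmap *pmap, struct toy_tlb_context *ctx)
{
	ctx->t_pmap = pmap;
	ctx->t_all_user = true;
}

/* Stand-in for the assertion: reports whether t_pmap != NULL would hold. */
static bool
toy_shootnow_ok(const struct toy_tlb_context *ctx)
{
	return ctx->t_pmap != NULL;
}
```

 With a zeroed context, calling only toy_shootdown_all_user_buggy() leaves
 toy_shootnow_ok() false (the pmap_enter() case), whereas a preceding
 toy_shootdown() call makes it true (the pmap_remove() case). The fixed
 variant makes it true on its own.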

 PR port-alpha/56200


 To generate a diff of this commit:
 cvs rdiff -u -r1.276 -r1.277 src/sys/arch/alpha/alpha/pmap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->closed
State-Changed-By: thorpej@NetBSD.org
State-Changed-When: Sun, 23 May 2021 19:17:17 +0000
State-Changed-Why:
Fixed.


>Unformatted:
