NetBSD Problem Report #49061
From martin@duskware.de Fri Aug 1 18:37:39 2014
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 8CC33A5808
for <gnats-bugs@gnats.NetBSD.org>; Fri, 1 Aug 2014 18:37:39 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: armv4 crashes inside network interrupt
X-Send-Pr-Version: 3.95
>Number: 49061
>Category: port-arm
>Synopsis: armv4 crashes inside network interrupt
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-arm-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 01 18:40:00 +0000 2014
>Closed-Date: Thu Oct 30 21:32:23 +0000 2014
>Last-Modified: Thu Oct 30 21:32:23 +0000 2014
>Originator: Martin Husemann
>Release: NetBSD 6.99.49
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-rest.duskware.de 6.99.49 NetBSD 6.99.49 (EGENERIC) #21: Fri Aug 1 09:39:08 CEST 2014 martin@seven-days-to-the-wolves.aprisoft.de:/usr/src/sys/arch/shark/compile/EGENERIC shark
Architecture: earmv4
Machine: shark
>Description:
Under load + using the network I can "often" (but not really reproducibly)
crash my armv4 machine. Matt Thomas gave me a small patch as a workaround,
which is why the panic below is slightly odd - with that patch the problem
happens less often, but it is not solved.
The problem always shows a cs interrupt triggering some pmap action, which
then hits the KASSERT.
Latest sample, from slightly patched kernel (see above):
data_abort_handler: data_aborts fsr=0x8621003 far=0x4030c071
panic: kernel diagnostic assertion "md == NULL || page_locked_p || !pmap_page_locked_p(md)" failed: file "../../../../arch/arm/arm32/pmap.c", line 3786
db> bt
0xf607cb4c: netbsd:cpu_Debugger+0xc
0xf607cb64: netbsd:db_panic+0x24
0xf607cb8c: netbsd:vpanic+0x20c
0xf607cba4: netbsd:kern_assert+0x38
0xf607cc24: netbsd:pmap_kenter_pa+0x4b4
0xf607cc7c: netbsd:uvm_km_kmem_alloc+0x254
0xf607cca4: netbsd:pool_page_alloc+0x5c
0xf607ccc4: netbsd:pool_allocator_alloc+0x3c
0xf607cce4: netbsd:pool_grow+0x3c
0xf607cd04: netbsd:pool_catchup+0x2c
0xf607cd34: netbsd:pool_get+0x724
0xf607cd84: netbsd:pool_cache_get_slow+0x328
0xf607cdc4: netbsd:pool_cache_get_paddr+0x31c
0xf607cdec: netbsd:m_get+0x80
0xf607ce0c: netbsd:m_gethdr+0x24
0xf607ce54: netbsd:cs_process_rx_dma+0x1e8
0xf607ce6c: netbsd:cs_buffer_event+0x68
0xf607ce94: netbsd:cs_intr+0x224
0xf607cf14: netbsd:irq_entry+0x16c
0xf607cf34: netbsd:uvmpdpol_pagedeactivate+0x1a8
0xf607cf5c: netbsd:uvmpdpol_balancequeue+0xb8
0xf607cf74: netbsd:uvmpd_scan+0xc4
0xf607cfac: netbsd:uvm_pageout+0x1e8
>How-To-Repeat:
Stress your poor little armv4 while using the network.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-arm/49061: armv4 crashes inside network interrupt
Date: Mon, 4 Aug 2014 09:22:39 +0200
Here is the actual patch I'm using - mostly from Matt, but #ifdef mess
added by me to allow all evbarm kernels to compile.
Martin
Index: pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/arm/arm32/pmap.c,v
retrieving revision 1.295
diff -u -p -r1.295 pmap.c
--- pmap.c 25 Jul 2014 15:09:43 -0000 1.295
+++ pmap.c 4 Aug 2014 07:20:17 -0000
@@ -559,7 +559,9 @@ pmap_release_page_lock(struct vm_page_md
mutex_exit(&pmap_lock);
}
-#ifdef DIAGNOSTIC
+
+#if defined(DIAGNOSTIC) || \
+ (defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED))
static inline int
pmap_page_locked_p(struct vm_page_md *md)
{
@@ -3644,6 +3646,11 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
#endif
#endif
struct vm_page_md *md = pg != NULL ? VM_PAGE_TO_MD(pg) : NULL;
+#if defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED)
+ const bool page_locked_p = md ? pmap_page_locked_p(md) : false;
+#elif defined(DIAGNOSTIC)
+ const bool page_locked_p = false;
+#endif
UVMHIST_FUNC(__func__);
@@ -3685,9 +3692,13 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
KASSERT((omd->pvh_attrs & PVF_KMPAGE) == 0);
KASSERT((flags & PMAP_KMPAGE) == 0);
#ifndef ARM_MMU_EXTENDED
- pmap_acquire_page_lock(omd);
- pv = pmap_kremove_pg(opg, va);
- pmap_release_page_lock(omd);
+ if (pmap_page_locked_p(omd)) {
+ pv = pmap_kremove_pg(opg, va);
+ } else {
+ pmap_acquire_page_lock(omd);
+ pv = pmap_kremove_pg(opg, va);
+ pmap_release_page_lock(omd);
+ }
#endif
}
#endif
@@ -3752,7 +3763,8 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
pv = pool_get(&pmap_pv_pool, PR_NOWAIT);
KASSERT(pv != NULL);
}
- pmap_acquire_page_lock(md);
+ if (!page_locked_p)
+ pmap_acquire_page_lock(md);
pmap_enter_pv(md, pa, pv, pmap_kernel(), va,
PVF_WIRED | PVF_KENTRY
| (prot & VM_PROT_WRITE ? PVF_WRITE : 0));
@@ -3761,7 +3773,8 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
md->pvh_attrs |= PVF_DIRTY;
KASSERT((prot & VM_PROT_WRITE) == 0 || (md->pvh_attrs & (PVF_DIRTY|PVF_NC)));
pmap_vac_me_harder(md, pa, pmap_kernel(), va);
- pmap_release_page_lock(md);
+ if (!page_locked_p)
+ pmap_release_page_lock(md);
#endif
}
#if defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED)
@@ -3770,7 +3783,7 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
pool_put(&pmap_pv_pool, pv);
#endif
}
- KASSERT(md == NULL || !pmap_page_locked_p(md));
+ KASSERT(md == NULL || page_locked_p || !pmap_page_locked_p(md));
if (pmap_initialized) {
UVMHIST_LOG(maphist, " <-- done (ptep %p: %#x -> %#x)",
ptep, opte, npte, 0);
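
The locking pattern in the patch above can be modelled in userland. This is
only a minimal sketch: everything named toy_* is hypothetical and stands in
for the kernel mutex and for pmap_kenter_pa, and it assumes a uniprocessor,
where a lock found held on entry can only belong to the code that was
interrupted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pmap lock; the real code uses a kernel mutex. */
struct toy_lock {
	bool held;
	int acquisitions;	/* counts real acquire calls, for illustration only */
};

static void
toy_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* non-recursive: taking it twice is a bug */
	l->held = true;
	l->acquisitions++;
}

static void
toy_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Same shape as the patched pmap_kenter_pa: sample the lock state once on
 * entry, take the lock only if it was free, and leave the lock in the state
 * it was found in.
 */
static void
toy_kenter(struct toy_lock *l)
{
	const bool was_locked = l->held;

	if (!was_locked)
		toy_acquire(l);
	/* ... work that needs the lock would go here ... */
	assert(l->held);
	if (!was_locked)
		toy_release(l);

	/* Same shape as the adjusted KASSERT in the patch. */
	assert(was_locked || !l->held);
}
```

Calling toy_kenter() on a free lock takes and drops it; calling it while the
lock is already held leaves it held and does not touch it. Note that this
only avoids the double acquisition. It does not protect whatever the
interrupted lock holder was in the middle of, which may be why the patched
kernel still crashes occasionally.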
From: Masao Uebayashi <uebayasi@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-arm/49061: armv4 crashes inside network interrupt
Date: Mon, 4 Aug 2014 17:56:32 +0900
> data_abort_handler: data_aborts fsr=0x8621003 far=0x4030c071
> panic: kernel diagnostic assertion "md == NULL || page_locked_p || !pmap_page_locked_p(md)" failed: file "../../../../arch/arm/arm32/pmap.c", line 3786
> db> bt
> 0xf607cb4c: netbsd:cpu_Debugger+0xc
> 0xf607cb64: netbsd:db_panic+0x24
> 0xf607cb8c: netbsd:vpanic+0x20c
> 0xf607cba4: netbsd:kern_assert+0x38
> 0xf607cc24: netbsd:pmap_kenter_pa+0x4b4
> 0xf607cc7c: netbsd:uvm_km_kmem_alloc+0x254
> 0xf607cca4: netbsd:pool_page_alloc+0x5c
> 0xf607ccc4: netbsd:pool_allocator_alloc+0x3c
> 0xf607cce4: netbsd:pool_grow+0x3c
> 0xf607cd04: netbsd:pool_catchup+0x2c
> 0xf607cd34: netbsd:pool_get+0x724
> 0xf607cd84: netbsd:pool_cache_get_slow+0x328
> 0xf607cdc4: netbsd:pool_cache_get_paddr+0x31c
> 0xf607cdec: netbsd:m_get+0x80
> 0xf607ce0c: netbsd:m_gethdr+0x24
> 0xf607ce54: netbsd:cs_process_rx_dma+0x1e8
> 0xf607ce6c: netbsd:cs_buffer_event+0x68
> 0xf607ce94: netbsd:cs_intr+0x224
> 0xf607cf14: netbsd:irq_entry+0x16c
> 0xf607cf34: netbsd:uvmpdpol_pagedeactivate+0x1a8
> 0xf607cf5c: netbsd:uvmpdpol_balancequeue+0xb8
> 0xf607cf74: netbsd:uvmpd_scan+0xc4
> 0xf607cfac: netbsd:uvm_pageout+0x1e8
This is quite a complex task to be executing from a network interface interrupt...
How about just giving up on the pool_cache_get_slow path in interrupt context
on those platforms without "direct" pool pages?
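
A rough userland sketch of that suggestion, not the real pool(9) code: the
toy_pool structure, its fields and toy_pool_get() are all hypothetical, and
has_direct_pages is only an assumed reading of "direct" pool pages as pages
that can be used without a pmap_kenter_pa call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ITEMS_PER_PAGE 4	/* arbitrary value for the model */

/* Hypothetical pool model; none of this is the real pool(9) layout. */
struct toy_pool {
	int free_items;		/* items available without growing */
	int pages_mapped;	/* times the pool had to map a new page */
	bool has_direct_pages;	/* assumed: growing needs no pmap work */
};

/*
 * Returns true if an item was handed out.  When the pool is empty and the
 * caller is in interrupt context on a platform without direct pool pages,
 * fail immediately instead of growing the pool, so the interrupt path never
 * reaches the page-mapping code.
 */
static bool
toy_pool_get(struct toy_pool *p, bool in_interrupt)
{
	assert(p != NULL);

	if (p->free_items == 0) {
		if (in_interrupt && !p->has_direct_pages)
			return false;	/* caller must cope, e.g. drop the packet */
		p->pages_mapped++;
		p->free_items = TOY_ITEMS_PER_PAGE;
	}
	p->free_items--;
	return true;
}
```

With this shape an interrupt-time request against an empty pool simply
fails, while the same request from thread context grows the pool as before.
The cost is that the driver has to tolerate allocation failures more often
under load.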
From: "Matt Thomas" <matt@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49061 CVS commit: src/sys/arch/arm/arm32
Date: Wed, 13 Aug 2014 15:06:28 +0000
Module Name: src
Committed By: matt
Date: Wed Aug 13 15:06:28 UTC 2014
Modified Files:
src/sys/arch/arm/arm32: pmap.c
Log Message:
Fix for PR/49061
only kassert in pmap_kenter_pa if PMAP_CACHE_PIVT && !ARM_MMU_EXTENDED
To generate a diff of this commit:
cvs rdiff -u -r1.296 -r1.297 src/sys/arch/arm/arm32/pmap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Thu, 30 Oct 2014 21:32:23 +0000
State-Changed-Why:
Matt adjusted the ASSERT some time ago
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.