NetBSD Problem Report #49061

From martin@duskware.de  Fri Aug  1 18:37:39 2014
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 8CC33A5808
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  1 Aug 2014 18:37:39 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: armv4 crashes inside network interrupt
X-Send-Pr-Version: 3.95

>Number:         49061
>Category:       port-arm
>Synopsis:       armv4 crashes inside network interrupt
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-arm-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 01 18:40:00 +0000 2014
>Closed-Date:    Thu Oct 30 21:32:23 +0000 2014
>Last-Modified:  Thu Oct 30 21:32:23 +0000 2014
>Originator:     Martin Husemann
>Release:        NetBSD 6.99.49
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-rest.duskware.de 6.99.49 NetBSD 6.99.49 (EGENERIC) #21: Fri Aug 1 09:39:08 CEST 2014 martin@seven-days-to-the-wolves.aprisoft.de:/usr/src/sys/arch/shark/compile/EGENERIC shark
Architecture: earmv4
Machine: shark
>Description:

Under load + using the network I can "often" (but not really reproducibly)
crash my armv4 machine. Matt Thomas gave me a small patch as a workaround,
which is why the panic below is slightly odd - with that patch the problem
happens less often, but it is not solved.

The problem always shows a cs interrupt (cs_intr) triggering some pmap action
that then hits the KASSERT.

Latest sample, from slightly patched kernel (see above):

data_abort_handler: data_aborts fsr=0x8621003 far=0x4030c071
panic: kernel diagnostic assertion "md == NULL || page_locked_p || !pmap_page_locked_p(md)" failed: file "../../../../arch/arm/arm32/pmap.c", line 3786
db> bt                                                                 
0xf607cb4c: netbsd:cpu_Debugger+0xc
0xf607cb64: netbsd:db_panic+0x24   
0xf607cb8c: netbsd:vpanic+0x20c 
0xf607cba4: netbsd:kern_assert+0x38
0xf607cc24: netbsd:pmap_kenter_pa+0x4b4
0xf607cc7c: netbsd:uvm_km_kmem_alloc+0x254
0xf607cca4: netbsd:pool_page_alloc+0x5c   
0xf607ccc4: netbsd:pool_allocator_alloc+0x3c
0xf607cce4: netbsd:pool_grow+0x3c           
0xf607cd04: netbsd:pool_catchup+0x2c
0xf607cd34: netbsd:pool_get+0x724   
0xf607cd84: netbsd:pool_cache_get_slow+0x328
0xf607cdc4: netbsd:pool_cache_get_paddr+0x31c
0xf607cdec: netbsd:m_get+0x80                
0xf607ce0c: netbsd:m_gethdr+0x24
0xf607ce54: netbsd:cs_process_rx_dma+0x1e8
0xf607ce6c: netbsd:cs_buffer_event+0x68   
0xf607ce94: netbsd:cs_intr+0x224       
0xf607cf14: netbsd:irq_entry+0x16c
0xf607cf34: netbsd:uvmpdpol_pagedeactivate+0x1a8
0xf607cf5c: netbsd:uvmpdpol_balancequeue+0xb8   
0xf607cf74: netbsd:uvmpd_scan+0xc4           
0xf607cfac: netbsd:uvm_pageout+0x1e8

>How-To-Repeat:
Stress your poor little armv4 while using the network.

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/49061: armv4 crashes inside network interrupt
Date: Mon, 4 Aug 2014 09:22:39 +0200

 Here is the actual patch I'm using - mostly from Matt, but #ifdef mess
 added by me to allow all evbarm kernels to compile.

 Martin

 Index: pmap.c
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/arm/arm32/pmap.c,v
 retrieving revision 1.295
 diff -u -p -r1.295 pmap.c
 --- pmap.c	25 Jul 2014 15:09:43 -0000	1.295
 +++ pmap.c	4 Aug 2014 07:20:17 -0000
 @@ -559,7 +559,9 @@ pmap_release_page_lock(struct vm_page_md
  	mutex_exit(&pmap_lock);
  }

 -#ifdef DIAGNOSTIC
 +
 +#if defined(DIAGNOSTIC) || \
 +	(defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED))
  static inline int
  pmap_page_locked_p(struct vm_page_md *md)
  {
 @@ -3644,6 +3646,11 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
  #endif
  #endif
  	struct vm_page_md *md = pg != NULL ? VM_PAGE_TO_MD(pg) : NULL;
 +#if defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED)
 +	const bool page_locked_p = md ? pmap_page_locked_p(md) : false;
 +#elif defined(DIAGNOSTIC)
 +	const bool page_locked_p = false;
 +#endif

  	UVMHIST_FUNC(__func__);

 @@ -3685,9 +3692,13 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
  			KASSERT((omd->pvh_attrs & PVF_KMPAGE) == 0);
  			KASSERT((flags & PMAP_KMPAGE) == 0);
  #ifndef ARM_MMU_EXTENDED
 -			pmap_acquire_page_lock(omd);
 -			pv = pmap_kremove_pg(opg, va);
 -			pmap_release_page_lock(omd);
 +			if (pmap_page_locked_p(omd)) {
 +				pv = pmap_kremove_pg(opg, va);
 +			} else {
 +				pmap_acquire_page_lock(omd);
 +				pv = pmap_kremove_pg(opg, va);
 +				pmap_release_page_lock(omd);
 +			}
  #endif
  		}
  #endif
 @@ -3752,7 +3763,8 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
  				pv = pool_get(&pmap_pv_pool, PR_NOWAIT);
  				KASSERT(pv != NULL);
  			}
 -			pmap_acquire_page_lock(md);
 +			if (!page_locked_p)
 +				pmap_acquire_page_lock(md);
  			pmap_enter_pv(md, pa, pv, pmap_kernel(), va,
  			    PVF_WIRED | PVF_KENTRY
  			    | (prot & VM_PROT_WRITE ? PVF_WRITE : 0));
 @@ -3761,7 +3773,8 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
  				md->pvh_attrs |= PVF_DIRTY;
  			KASSERT((prot & VM_PROT_WRITE) == 0 || (md->pvh_attrs & (PVF_DIRTY|PVF_NC)));
  			pmap_vac_me_harder(md, pa, pmap_kernel(), va);
 -			pmap_release_page_lock(md);
 +			if (!page_locked_p)
 +				pmap_release_page_lock(md);
  #endif
  		}
  #if defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED)
 @@ -3770,7 +3783,7 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
  			pool_put(&pmap_pv_pool, pv);
  #endif
  	}
 -	KASSERT(md == NULL || !pmap_page_locked_p(md));
 +	KASSERT(md == NULL || page_locked_p || !pmap_page_locked_p(md));
  	if (pmap_initialized) {
  		UVMHIST_LOG(maphist, "  <-- done (ptep %p: %#x -> %#x)",
  		    ptep, opte, npte, 0);
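
For readers following the patch: the locking change in pmap_kenter_pa() boils down to remembering whether the page lock was already held on entry, and only acquiring and releasing it when it was not. Below is a minimal user-space sketch of that shape, assuming a toy non-recursive lock. struct fake_lock, the fake_* helpers and kenter_sketch() are hypothetical names for illustration only; they are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pmap page lock; not the kernel mutex. */
struct fake_lock {
	bool held;
	int acquisitions;	/* how many times the lock was actually taken */
};

static bool
fake_locked_p(const struct fake_lock *l)
{
	return l->held;
}

static void
fake_acquire(struct fake_lock *l)
{
	assert(!l->held);	/* non-recursive: taking it twice is a bug */
	l->held = true;
	l->acquisitions++;
}

static void
fake_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Mirrors the shape of the patched function, not its real work. */
static void
kenter_sketch(struct fake_lock *l)
{
	const bool locked_on_entry = fake_locked_p(l);

	if (!locked_on_entry)
		fake_acquire(l);
	assert(fake_locked_p(l));	/* work needing the lock goes here */
	if (!locked_on_entry)
		fake_release(l);

	/* Relaxed exit check: the lock may stay held if it was held on entry. */
	assert(locked_on_entry || !fake_locked_p(l));
}
```

Calling kenter_sketch() with the lock free takes and drops it once; calling it with the lock already held leaves the lock untouched. As noted in the description, this only makes the panic less frequent; it is a workaround rather than a fix.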

From: Masao Uebayashi <uebayasi@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/49061: armv4 crashes inside network interrupt
Date: Mon, 4 Aug 2014 17:56:32 +0900

 > data_abort_handler: data_aborts fsr=0x8621003 far=0x4030c071
 > panic: kernel diagnostic assertion "md == NULL || page_locked_p || !pmap_page_locked_p(md)" failed: file "../../../../arch/arm/arm32/pmap.c", line 3786
 > db> bt
 > 0xf607cb4c: netbsd:cpu_Debugger+0xc
 > 0xf607cb64: netbsd:db_panic+0x24
 > 0xf607cb8c: netbsd:vpanic+0x20c
 > 0xf607cba4: netbsd:kern_assert+0x38
 > 0xf607cc24: netbsd:pmap_kenter_pa+0x4b4
 > 0xf607cc7c: netbsd:uvm_km_kmem_alloc+0x254
 > 0xf607cca4: netbsd:pool_page_alloc+0x5c
 > 0xf607ccc4: netbsd:pool_allocator_alloc+0x3c
 > 0xf607cce4: netbsd:pool_grow+0x3c
 > 0xf607cd04: netbsd:pool_catchup+0x2c
 > 0xf607cd34: netbsd:pool_get+0x724
 > 0xf607cd84: netbsd:pool_cache_get_slow+0x328
 > 0xf607cdc4: netbsd:pool_cache_get_paddr+0x31c
 > 0xf607cdec: netbsd:m_get+0x80
 > 0xf607ce0c: netbsd:m_gethdr+0x24
 > 0xf607ce54: netbsd:cs_process_rx_dma+0x1e8
 > 0xf607ce6c: netbsd:cs_buffer_event+0x68
 > 0xf607ce94: netbsd:cs_intr+0x224
 > 0xf607cf14: netbsd:irq_entry+0x16c
 > 0xf607cf34: netbsd:uvmpdpol_pagedeactivate+0x1a8
 > 0xf607cf5c: netbsd:uvmpdpol_balancequeue+0xb8
 > 0xf607cf74: netbsd:uvmpd_scan+0xc4
 > 0xf607cfac: netbsd:uvm_pageout+0x1e8

 This is quite a complex task to be executing from a network interface interrupt...

 How about just giving up on the pool_cache_get_slow path in interrupt context
 on those platforms without "direct" pool pages?
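
The suggestion above can be pictured with a small sketch: hand out items that are already available, but refuse to grow the pool when called from interrupt context, so that no page mapping happens there. This is a toy model under stated assumptions, not the pool(9) implementation. struct toy_pool, toy_pool_get(), ITEMS_PER_PAGE and the in_interrupt flag are all hypothetical, and a real caller would have to cope with the failure itself, for example by dropping the received frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ITEMS_PER_PAGE 4	/* arbitrary value for the sketch */

/* Hypothetical toy pool; it only counts items, it manages no memory. */
struct toy_pool {
	size_t nfree;	/* items available without mapping a new page */
	size_t ngrow;	/* how often the slow path "mapped" a new page */
};

static bool
toy_pool_get(struct toy_pool *p, bool in_interrupt)
{
	assert(p != NULL);

	if (p->nfree > 0) {		/* fast path: no page mapping needed */
		p->nfree--;
		return true;
	}
	if (in_interrupt)		/* slow path refused: caller must cope */
		return false;

	p->ngrow++;			/* stands in for mapping a new page */
	p->nfree = ITEMS_PER_PAGE - 1;	/* one item of the new page is handed out */
	return true;
}
```

The trade-off in this sketch is that an allocation can fail under load even though memory is available, in exchange for never entering the page-mapping path from an interrupt.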

From: "Matt Thomas" <matt@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49061 CVS commit: src/sys/arch/arm/arm32
Date: Wed, 13 Aug 2014 15:06:28 +0000

 Module Name:	src
 Committed By:	matt
 Date:		Wed Aug 13 15:06:28 UTC 2014

 Modified Files:
 	src/sys/arch/arm/arm32: pmap.c

 Log Message:
 Fix for PR/49061
 only kassert in pmap_kenter_pa if PMAP_CACHE_VIPT && !ARM_MMU_EXTENDED


 To generate a diff of this commit:
 cvs rdiff -u -r1.296 -r1.297 src/sys/arch/arm/arm32/pmap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
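
The diff for revision 1.297 is not reproduced in this report, so the following is only an illustration of what the log message describes: an assertion that is compiled in for one cache/MMU configuration and compiled out otherwise. It uses the PMAP_CACHE_VIPT spelling from the patch earlier in this report. PAGE_UNLOCKED_ASSERT() and kenter_exit_check() are hypothetical names, and the real change may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Only check "page not locked" where the configuration named in the
 * log message applies; elsewhere the check is compiled out entirely.
 */
#if defined(PMAP_CACHE_VIPT) && !defined(ARM_MMU_EXTENDED)
#define PAGE_UNLOCKED_ASSERT(locked)	assert(!(locked))
#else
#define PAGE_UNLOCKED_ASSERT(locked)	((void)(locked))
#endif

/* Hypothetical stand-in for the exit check at the end of the function. */
static int
kenter_exit_check(bool page_locked)
{
	PAGE_UNLOCKED_ASSERT(page_locked);
	return 0;
}
```

Built without either macro defined, kenter_exit_check(true) returns normally, which corresponds to a configuration where the assertion no longer fires.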

State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Thu, 30 Oct 2014 21:32:23 +0000
State-Changed-Why:
Matt adjusted the ASSERT some time ago


>Unformatted:

Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.