NetBSD Problem Report #46224

From petar@starling.smokva.net  Mon Mar 19 02:29:38 2012
Return-Path: <petar@starling.smokva.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id A32CC63B946
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 19 Mar 2012 02:29:38 +0000 (UTC)
Message-Id: <20120319022943.C33CD17830A0@starling.smokva.net>
Date: Mon, 19 Mar 2012 03:29:43 +0100 (CET)
From: Petar Bogdanovic <petar@smokva.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: fatal page fault, kernfs_readdir()
X-Send-Pr-Version: 3.95

>Number:         46224
>Category:       kern
>Synopsis:       kernel crash: fatal page fault in kernfs_readdir()
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 19 02:30:01 +0000 2012
>Last-Modified:  Sun Apr 15 14:54:17 +0000 2012
>Originator:     Petar Bogdanovic
>Release:        NetBSD 6.0_BETA (16.03.2012)
>Organization:
>Environment:
amd64
>Description:
	A fairly recent netbsd-6 kernel (date: 16.03., arch: amd64) has
	crashed several times.  The bug seems reproducible and does not
	appear when no kernfs is involved:

	$ mount
	/dev/raid0a on / type ffs (log, NFS exported, local)
	kernfs on /kern type kernfs (local)

	$ sudo find / -name '*,v'
	/etc/mtree/special.local,v
	(...many more lines...)
	/var/backups/boot.cfg.current,v
	uvm_fault(0xfffffe8114c4dbd0, 0x0, 1) -> e
	fatal page fault in supervisor mode
	trap type 6 code 0 rip ffffffff804f4ceb cs 8 rflags 10297 cr2  0 cpl 0
	rsp fffffe80016077a0
	kernel: page fault trap, code=0
	Stopped in pid 847.1 (find) at  netbsd:kernfs_readdir+0x687:    movq    7fb0b30e(%rip),%rdi
	db{1}> bt
	kernfs_readdir() at netbsd:kernfs_readdir+0x687
	VOP_READDIR() at netbsd:VOP_READDIR+0x65
	vn_readdir() at netbsd:vn_readdir+0xf6
	sys___getdents30() at netbsd:sys___getdents30+0x76
	syscall() at netbsd:syscall+0xc4
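
	On x86, cr2 in the trap line holds the faulting address and rip the
	faulting instruction.  Here cr2 is 0, matching the 0x0 passed to
	uvm_fault, which suggests a NULL pointer dereference.  A minimal
	sketch for pulling such fields out of a saved console log; the
	helper name trap_field is hypothetical and not part of any NetBSD
	tool:

```shell
# Hypothetical helper: print the value that follows a given key
# (e.g. rip, cr2) on a "trap type ..." line read from stdin.
trap_field() {
    awk -v key="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == key) { print $(i + 1); exit }
    }'
}

# Example with the trap line from this report:
line='trap type 6 code 0 rip ffffffff804f4ceb cs 8 rflags 10297 cr2  0 cpl 0'
printf '%s\n' "$line" | trap_field rip
printf '%s\n' "$line" | trap_field cr2
```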


	The same situation yields a slightly different result when
	ddb.onpanic=0 and ends with what seems to be a complete meltdown
	after the core was successfully dumped:

	uvm_fault(0xfffffe811556ad40, 0x0, 1) -> e
	fatal page fault in supervisor mode
	trap type 6 code 0 rip ffffffff804f4ceb cs 8 rflags 10297 cr2  0 cpl 0 rsp fffffe80015b77a0
	panic: trap
	cpu1: Begin traceback...
	printf_nolog() at netbsd:printf_nolog
	startlwp() at netbsd:startlwp
	alltraps() at netbsd:alltraps+0xa2
	VOP_READDIR() at netbsd:VOP_READDIR+0x65
	vn_readdir() at netbsd:vn_readdir+0xf6
	sys___getdents30() at netbsd:sys___getdents30+0x76
	syscall() at netbsd:syscall+0xc4
	cpu1: End traceback...

	(..dump begins, finishes..)

	pmap_kenter_pa: mapping already present
	pmap_kenter_pa: mapping already present
	pmap_kenter_pa: mapping already present

	(..many, many more identical lines..)
	(..takes as long as the core dump..)

	pmap_kenter_pa: mapping already present
	pmap_kenter_pa: mapping already present
	pmap_kenter_pa: mapping already present
	succeeded


	Skipping crash dump on recursive panic
	panic: wdc_exec_command: polled command not done
	cpu1: Begin traceback...
	printf_nolog() at netbsd:printf_nolog
	wdccommand() at netbsd:wdccommand
	wd_flushcache() at netbsd:wd_flushcache+0xd7
	wd_shutdown() at netbsd:wd_shutdown+0x3e
	pmf_system_shutdown() at netbsd:pmf_system_shutdown+0x81
	cpu_reboot() at netbsd:cpu_reboot+0x2c
	vpanic() at netbsd:vpanic+0x1dd
	printf_nolog() at netbsd:printf_nolog
	startlwp() at netbsd:startlwp
	alltraps() at netbsd:alltraps+0xa2
	VOP_READDIR() at netbsd:VOP_READDIR+0x65
	vn_readdir() at netbsd:vn_readdir+0xf6
	sys___getdents30() at netbsd:sys___getdents30+0x76
	syscall() at netbsd:syscall+0xc4
	cpu1: End traceback...
	rebooting...

>How-To-Repeat:
	find /kern -ls
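
	Before running the command above, it can help to confirm which
	mount points are kernfs.  A minimal sketch, assuming mount(8)
	prints lines in the "X on /path type fstype (...)" form shown in
	the Description.  kernfs_mounts is a hypothetical helper name, and
	mount points containing spaces are not handled:

```shell
# Hypothetical helper: read `mount` output on stdin and print the
# mount point of every file system whose type is kernfs.
kernfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "kernfs" { print $3 }'
}

# Usage (this will crash an affected kernel, so run it only on a test box):
#   for m in $(mount | kernfs_mounts); do find "$m" -ls; done
```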
>Fix:
	none

>Release-Note:

>Audit-Trail:

From: Greg Oster <oster@cs.usask.ca>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46224: fatal page fault, kernfs_readdir()
Date: Mon, 19 Mar 2012 08:53:57 -0600

 I don't know if I'm seeing quite the same error, but I've been chasing
 a similar issue the last few days... What I see is:

 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff80133415 cs e030 rflags 282 cr2 7f7ff7327080 cpl 0 rsp ffffa0005b72d9a0
 Stopped in pid 396.1 (find) at netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 pool_cache_put_paddr() at netbsd:pool_cache_put_paddr+0x25
 static_qc_pools() at ffffffff80661100
 static_qc_pools() at ffffffff80661480
 Bad frame pointer: 0xffffffff8078a7e0
 ds          ffff
 es          a14a
 fs          0
 gs          b278
 rdi         0
 rsi         fffffffe
 rbp         ffffa0005b72d9a0
 rbx         ffffa0005b72dad0
 rdx         1000000
 rcx         ffffa0000456b000
 rax         ffffffff80d0b0c0
 r8          ffffa0000456b000
 r9          400
 r10         2
 r11         ffffa0000460308d
 r12         ffffa00004603000
 r13         ffffa0000739a870
 r14         ffffffffffffffff
 r15         ffffa00004603098
 rip         ffffffff80133415    breakpoint+0x5
 cs          e030
 rflags      282
 rsp         ffffa0005b72d9a0
 ss          e02b
 netbsd:breakpoint+0x5:  leave
 db{3}> 

 and I can trigger it on-demand with a:  find -x / -name "ajsdf" -print
 The kernel is a netbsd-6 XEN3_DOMU kernel on amd64, with DEBUG and
 debug_freecheck turned on.  

 Later...

 Greg Oster

From: Lars Heidieker <lars@heidieker.de>
To: gnats-bugs@NetBSD.org, oster@cs.usask.ca
Cc: 
Subject: Re: kern/46224: fatal page fault, kernfs_readdir()
Date: Thu, 22 Mar 2012 17:58:39 +0100

 If I haven't missed anything, debug_freecheck is broken.  I hacked my way
 around two problems.  First, the disable logic for running out of slots is
 the wrong way round.  Second, once that is corrected, startup fails.  After
 I circumvented that with a hack, the system kept running until it ran out
 of slots (which I made panic so I couldn't miss it).

 The bug(s) that are out there aren't those indicated by debug_freecheck.

 Lars

>Unformatted:
