NetBSD Problem Report #39548

From simonb@thistledown.com.au  Mon Sep 15 01:25:22 2008
Return-Path: <simonb@thistledown.com.au>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 94E6F63B842
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 15 Sep 2008 01:25:22 +0000 (UTC)
Message-Id: <20080915012520.469BDAFD04@thoreau.thistledown.com.au>
Date: Mon, 15 Sep 2008 11:25:20 +1000 (EST)
From: Simon Burge <simonb@NetBSD.org>
Reply-To: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@gnats.NetBSD.org
Subject: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed
X-Send-Pr-Version: 3.95

>Number:         39548
>Category:       kern
>Synopsis:       kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          analyzed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Sep 15 01:30:00 +0000 2008
>Closed-Date:    
>Last-Modified:  Wed May 06 23:15:03 +0000 2009
>Originator:     Simon Burge
>Release:        NetBSD 4.0_STABLE (sources from netbsd-4 branch on Jul 15 2008)
>Organization:
>Environment:
System: NetBSD thoreau 4.0_STABLE NetBSD 4.0_STABLE (THOREAU) #6: Tue Jul 15 00:48:08 EST 2008 simonb@thoreau:/usr/obj/sys/arch/i386/compile/THOREAU i386
Architecture: i386
Machine: i386
>Description:
	I got a

panic: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed: file "./sys/miscfs/genfs/genfs_vnops.c", line 1283

	panic just after starting to move some large files from one
	filesystem to another.  I was in X at the time, but do have a
	crash dump.  The backtrace is:

#17 0xc045978c in panic (
    fmt=0xc099211c "kernel %sassertion \"%s\" failed: file \"%s\", line %d")
    at ./sys/kern/subr_prf.c:235
#18 0xc075a8a6 in __assert (t=0xc08f1884 "debugging ", 
    f=0xc095eb88 "./sys/miscfs/genfs/genfs_vnops.c", l=1283, 
    e=0xc09053c6 "(vp->v_flag & VONWORKLST)")
    at ../.././sys/lib/libkern/__assert.c:45
#19 0xc0499f6f in genfs_do_putpages (vp=0xd4c3d938, startoff=259100672, 
    endoff=259104768, flags=9, busypg=0x0)
    at ./sys/miscfs/genfs/genfs_vnops.c:1283
#20 0xc0499fe9 in genfs_putpages (v=0xcfe81b60)
    at ./sys/miscfs/genfs/genfs_vnops.c:1055
#21 0xc0496ccc in VOP_PUTPAGES (vp=0xd4c3d938, offlo=0, offhi=0, flags=9)
    at ./sys/kern/vnode_if.c:1592
#22 0xc03f3cc5 in uvn_put (uobj=0xd4c3d938, offlo=259100672, offhi=259104768, 
    flags=9) at ./sys/uvm/uvm_vnode.c:273
#23 0xc03ef9b3 in uvm_pageout (arg=0xcf418c1c) at ./sys/uvm/uvm_pdaemon.c:732
#24 0xc01002e1 in proc_trampoline ()

	The earlier frames are for a secondary "panic: wdc_exec_command:
	polled command not done" panic that happened after the dump (and
	then it proceeded to dump again!).

	This is with a UP kernel on a machine with 2GB of RAM, a mix of
	raidframe and standalone disks (swap is to a RF mirror).  The
	kernel has DIAGNOSTIC, LOCKDEBUG, DEBUG.

>How-To-Repeat:
	Not sure - only observed once so far.

>Fix:
	None given.

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->analyzed
State-Changed-By: pooka@NetBSD.org
State-Changed-When: Wed, 17 Sep 2008 18:58:39 +0300
State-Changed-Why:
some sort of analysis attempted.

I'd like to re-prioritize this as "low", since it is unlikely to affect
people running without DEBUG.


From: Antti Kantee <pooka@cs.hut.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/39548: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed
Date: Wed, 17 Sep 2008 18:57:24 +0300

 On Mon Sep 15 2008 at 01:30:00 +0000, Simon Burge wrote:
 > System: NetBSD thoreau 4.0_STABLE NetBSD 4.0_STABLE (THOREAU) #6: Tue Jul 15 00:48:08 EST 2008 simonb@thoreau:/usr/obj/sys/arch/i386/compile/THOREAU i386
 > 
 > panic: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed: file "./sys/miscfs/genfs/genfs_vnops.c", line 1283
 > 
 > 	panic just after starting to move some large files from one
 > 	filesystem to another.  I was in X at the time, but do have a
 > 	crash dump.  The backtrace is:
 > 
 > #17 0xc045978c in panic (
 >     fmt=0xc099211c "kernel %sassertion \"%s\" failed: file \"%s\", line %d")
 >     at ./sys/kern/subr_prf.c:235
 > #18 0xc075a8a6 in __assert (t=0xc08f1884 "debugging ", 
 >     f=0xc095eb88 "./sys/miscfs/genfs/genfs_vnops.c", l=1283, 
 >     e=0xc09053c6 "(vp->v_flag & VONWORKLST)")
 >     at ../.././sys/lib/libkern/__assert.c:45
 > #19 0xc0499f6f in genfs_do_putpages (vp=0xd4c3d938, startoff=259100672, 
 >     endoff=259104768, flags=9, busypg=0x0)
 >     at ./sys/miscfs/genfs/genfs_vnops.c:1283
 > #20 0xc0499fe9 in genfs_putpages (v=0xcfe81b60)
 >     at ./sys/miscfs/genfs/genfs_vnops.c:1055
 > #21 0xc0496ccc in VOP_PUTPAGES (vp=0xd4c3d938, offlo=0, offhi=0, flags=9)
 >     at ./sys/kern/vnode_if.c:1592
 > #22 0xc03f3cc5 in uvn_put (uobj=0xd4c3d938, offlo=259100672, offhi=259104768, 
 >     flags=9) at ./sys/uvm/uvm_vnode.c:273
 > #23 0xc03ef9b3 in uvm_pageout (arg=0xcf418c1c) at ./sys/uvm/uvm_pdaemon.c:732
 > #24 0xc01002e1 in proc_trampoline ()
 > 
 > 	The earlier frames are for a secondary "panic: wdc_exec_command:
 > 	polled command not done" panic that happened after the dump (and
 > 	then it proceeded to dump again!).
 > 
 > 	This is with a UP kernel on a machine with 2GB of RAM, a mix of
 > 	raidframe and standalone disks (swap is to a RF mirror).  The
 > 	kernel has DIAGNOSTIC, LOCKDEBUG, DEBUG.

 Looks like a bad race condition in moving the vnode on and off the
 worklist, together with a conflict between the syncer and the pagedaemon.
 To hit it, I think you need memory to be exhausted, the pagedaemon to
 take an interest in the situation, and a case of very bad luck.

 Since this code has changed much for -current, I'd say a) sacrifice
 more chicken (poulet de bresse preferred) b) run without DEBUG or c)
 remove the KDASSERT and compile a new kernel.

 But, you could still print the vnode and append to the PR.  Also,
 applying meditation and trying to figure out what exactly protects
 VONWORKLST can't hurt.

From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, Simon Burge <simonb@NetBSD.org>
Subject: Re: kern/39548: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed 
Date: Thu, 18 Sep 2008 15:16:19 +1000

 Antti Kantee wrote:

 >  Looks like a bad race condition in moving the vnode on and off the
 >  worklist, together with a conflict between the syncer and the pagedaemon.
 >  To hit it, I think you need memory to be exhausted, the pagedaemon to
 >  take an interest in the situation, and a case of very bad luck.
 >  
 >  Since this code has changed much for -current, I'd say a) sacrifice
 >  more chicken (poulet de bresse preferred) b) run without DEBUG or c)
 >  remove the KDASSERT and compile a new kernel.

 I might run with c) for now.  If that KDASSERT is subject to a race,
 should we just remove it on netbsd-4?  And is it still potentially a
 problem with -current too?  I guess some meditation might apply here...

 >  But, you could still print the vnode and append to the PR.

 (gdb) print *(struct vnode *)0xd4c3d938
 $3 = {v_uobj = {vmobjlock = {lock_data = 1, 
       lock_file = 0xc08ff85b "./sys/uvm/uvm_pdaemon.c", 
       unlock_file = 0xc095eb88 "./sys/miscfs/genfs/genfs_vnops.c", 
       lock_line = 406, unlock_line = 1464, list = {tqe_next = 0x0, 
         tqe_prev = 0xc09a6f70}, lock_holder = 0}, pgops = 0xc09a484c, memq = {
       tqh_first = 0xc30c0a68, tqh_last = 0xc216c938}, uo_npages = 191, 
     uo_refs = 1}, v_size = 367638528, v_flag = 65664, v_numoutput = 0, 
   v_writecount = 0, v_holdcnt = 4, v_mount = 0xc37c7000, v_op = 0xc39d4000, 
   v_freelist = {tqe_next = 0xda0aa590, tqe_prev = 0xde71d6dc}, v_mntvnodes = {
     tqe_next = 0xd2010d04, tqe_prev = 0xdfb2c33c}, v_cleanblkhd = {
     lh_first = 0xd3ea85dc}, v_dirtyblkhd = {lh_first = 0x0}, 
   v_synclist_slot = 31, v_synclist = {tqe_next = 0xdd5be3b0, 
     tqe_prev = 0xc3753ef8}, v_dnclist = {lh_first = 0x0}, v_nclist = {
     lh_first = 0x0}, v_un = {vu_mountedhere = 0xe9540408, 
     vu_socket = 0xe9540408, vu_specinfo = 0xe9540408, 
     vu_fifoinfo = 0xe9540408, vu_ractx = 0xe9540408}, v_lease = 0x0, 
   v_type = VREG, v_tag = VT_UFS, v_lock = {lk_interlock = {lock_data = 0, 
       lock_file = 0xc0901702 "./sys/kern/kern_lock.c", 
       unlock_file = 0xc0901702 "./sys/kern/kern_lock.c", lock_line = 568, 
       unlock_line = 920, list = {tqe_next = 0x0, tqe_prev = 0x0}, 
       lock_holder = 4294967295}, lk_flags = 0, lk_sharecount = 0, 
     lk_exclusivecount = 0, lk_recurselevel = 0, lk_waitcount = 0, 
     lk_wmesg = 0xc0904d6e "vnlock", lk_un = {lk_un_sleep = {
         lk_sleep_lockholder = -1, lk_sleep_locklwp = 0, lk_sleep_prio = 20, 
         lk_sleep_timo = 0, lk_newlock = 0x0}, lk_un_spin = {
         lk_spin_cpu = 4294967295, lk_spin_list = {tqe_next = 0x0, 
           tqe_prev = 0x14}}}, 
     lk_lock_file = 0xc095eb88 "./sys/miscfs/genfs/genfs_vnops.c", 
     lk_unlock_file = 0xc095eb88 "./sys/miscfs/genfs/genfs_vnops.c", 
     lk_lock_line = 309, lk_unlock_line = 325}, v_vnlock = 0xd4c3d9c4, 
   v_data = 0xd977978c, v_klist = {slh_first = 0x0}}

 Simon.

From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
    gnats-admin@netbsd.org, pooka@NetBSD.org
Subject: Re: kern/39548 (kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed) 
Date: Thu, 18 Sep 2008 15:20:36 +1000

 pooka@NetBSD.org wrote:

 > Synopsis: kernel debugging assertion "(vp->v_flag & VONWORKLST)" failed
 > 
 > State-Changed-From-To: open->analyzed
 > State-Changed-By: pooka@NetBSD.org
 > State-Changed-When: Wed, 17 Sep 2008 18:58:39 +0300
 > State-Changed-Why:
 > some sort of analysis attempted.
 > 
 > I'd like to re-prioritize this as "low", since it is unlikely to affect
 > people running without DEBUG.

 Sounds OK, I've done this.

 Thanks,
 Simon.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/39548: kernel debugging assertion "(vp->v_flag &
	VONWORKLST)" failed
Date: Wed, 6 May 2009 23:12:44 +0000

 see also 41157.

 -- 
 David A. Holland
 dholland@netbsd.org

>Unformatted:
