NetBSD Problem Report #51178
From mlelstv@tazz.1st.de Sun May 29 08:52:58 2016
Return-Path: <mlelstv@tazz.1st.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 56F4F7A218
for <gnats-bugs@gnats.NetBSD.org>; Sun, 29 May 2016 08:52:58 +0000 (UTC)
Message-Id: <20160529085202.6C67A269FA@tazz.1st.de>
Date: Sun, 29 May 2016 10:52:02 +0200 (CEST)
From: mlelstv@serpens.de
Reply-To: mlelstv@serpens.de
To: gnats-bugs@NetBSD.org
Subject: forced umount panics with wapbl
X-Send-Pr-Version: 3.95
>Number: 51178
>Category: kern
>Synopsis: forced umount panics with wapbl
>Confidential: no
>Severity: critical
>Priority: low
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun May 29 08:55:00 +0000 2016
>Closed-Date: Wed Jan 20 21:47:38 +0000 2021
>Last-Modified: Wed Jan 20 21:47:38 +0000 2021
>Originator: Michael van Elst
>Release: NetBSD 7.99.29
>Organization:
>Environment:
System: NetBSD tazz 7.99.29 NetBSD 7.99.29 (TAZZ) #5: Sat May 28 08:36:59 CEST 2016 mlelstv@gossam:/home/netbsd-current/obj.amd64/home/netbsd-current/src/sys/arch/amd64/compile/TAZZ amd64
Architecture: x86_64
Machine: amd64
>Description:
An iSCSI device that was no longer responding was mounted with -o log.
When the system is rebooted, the regular umount fails and the system then
tries a forced umount, which triggers:
kernel diagnostic assertion "bp->b_freelistindex == -1" failed: file "/home/netbsd-current/src/sys/kern/vfs_bio.c", line 334 double free of buffer? bp=0xfffffe811ca9b008, b_freelistindex=0
gdb shows the following backtrace:
#11 0xffffffff807b1d2d in binstailfree (dp=0xffffffff813f6b20 <bufqueues>,
bp=0xfffffe811ca9b008) at /home/netbsd-current/src/sys/kern/vfs_bio.c:333
#12 brelsel (bp=0xfffffe811ca9b008, set=<optimized out>)
at /home/netbsd-current/src/sys/kern/vfs_bio.c:1092
#13 0xffffffff807cf7e6 in wapbl_discard (wl=0xfffffe8116eefc08)
at /home/netbsd-current/src/sys/kern/vfs_wapbl.c:661
#14 0xffffffff807c985d in vclean (vp=vp@entry=0xfffffe811be41bd0)
at /home/netbsd-current/src/sys/kern/vfs_vnode.c:1032
#15 0xffffffff807cb45c in vrecycle (vp=vp@entry=0xfffffe811be41bd0)
at /home/netbsd-current/src/sys/kern/vfs_vnode.c:1099
#16 0xffffffff807bd42d in vflush (mp=mp@entry=0xfffffe8113de4008,
skipvp=skipvp@entry=0x0, flags=flags@entry=3)
at /home/netbsd-current/src/sys/kern/vfs_mount.c:525
#17 0xffffffff806ce0a5 in ffs_flushfiles (mp=mp@entry=0xfffffe8113de4008,
flags=flags@entry=2, l=l@entry=0xfffffe8117f6b0c0)
at /home/netbsd-current/src/sys/ufs/ffs/ffs_vfsops.c:1775
#18 0xffffffff806cebdf in ffs_unmount (mp=0xfffffe8113de4008,
mntflags=<optimized out>)
at /home/netbsd-current/src/sys/ufs/ffs/ffs_vfsops.c:1693
#19 0xffffffff807c0cd5 in VFS_UNMOUNT (mp=mp@entry=0xfffffe8113de4008,
a=a@entry=524288) at /home/netbsd-current/src/sys/kern/vfs_subr.c:1296
#20 0xffffffff807bd5dd in dounmount (mp=mp@entry=0xfffffe8113de4008,
flags=flags@entry=524288, l=l@entry=0xfffffe8117f6b0c0)
at /home/netbsd-current/src/sys/kern/vfs_mount.c:856
#21 0xffffffff807bd87b in vfs_unmount_forceone (l=0xfffffe8117f6b0c0)
at /home/netbsd-current/src/sys/kern/vfs_mount.c:953
#22 0xffffffff80119928 in cpu_reboot (howto=0, bootstr=bootstr@entry=0x0)
at /home/netbsd-current/src/sys/arch/amd64/amd64/machdep.c:659
#23 0xffffffff80758bd5 in sys_reboot (l=<optimized out>,
uap=0xfffffe8040ca6f00, retval=<optimized out>)
at /home/netbsd-current/src/sys/kern/kern_xxx.c:82
#24 0xffffffff8013cd65 in sy_call (rval=0xfffffe8040ca6eb0,
uap=0xfffffe8040ca6f00, l=0xfffffe8117f6b0c0,
sy=0xffffffff80e268e0 <sysent+4992>)
at /home/netbsd-current/src/sys/sys/syscallvar.h:65
#25 sy_invoke (code=208, rval=0xfffffe8040ca6eb0, uap=0xfffffe8040ca6f00,
l=0xfffffe8117f6b0c0, sy=0xffffffff80e268e0 <sysent+4992>)
at /home/netbsd-current/src/sys/sys/syscallvar.h:94
#26 syscall (frame=0xfffffe8040ca6f00)
at /home/netbsd-current/src/sys/arch/x86/x86/syscall.c:156
#27 0xffffffff80100731 in Xsyscall ()
>How-To-Repeat:
Forcefully umount a file system mounted with -o log whose
device no longer responds.
A write operation probably needs to be in progress.
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Tue, 01 Nov 2016 19:45:19 +0000
Responsible-Changed-Why:
I'm looking into wapbl-related bugs.
State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Tue, 01 Nov 2016 19:45:19 +0000
State-Changed-Why:
I can reproduce the problem by starting i/o (untar base.tgz) on a logged
disk mounted via iscsi-initiator/vnd0, then cutting the server's network
connection, killing iscsi-initiator, and rebooting. I'm working on
tracking down the bug.
State-Changed-From-To: analyzed->open
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Thu, 10 Nov 2016 21:42:37 +0000
State-Changed-Why:
I haven't found the root cause yet, just confirmed I can trigger this.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51178 CVS commit: src/sys/kern
Date: Sun, 12 Apr 2020 17:02:53 +0000
Module Name: src
Committed By: jdolecek
Date: Sun Apr 12 17:02:52 UTC 2020
Modified Files:
src/sys/kern: vfs_wapbl.c
Log Message:
fix wapbl_discard() to actually discard the queued bufs properly - need
to set BC_INVAL for them, and also need to explicitly remove them
from the BQ_LOCKED queue
fixes DIAGNOSTIC panic when force unmounting unresponsive disk device
PR kern/51178 by Michael van Elst
To generate a diff of this commit:
cvs rdiff -u -r1.107 -r1.108 src/sys/kern/vfs_wapbl.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
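
The log message above can be illustrated with a toy user-space model. This
is only a sketch under stated assumptions, not the code from vfs_wapbl.c or
vfs_bio.c: every toy_* name is hypothetical, the queues are reduced to
counters, and the failed assertion is modelled as a -1 return value instead
of a panic. It shows why releasing a buffer that still sits on the locked
queue trips the "double free of buffer?" check, and why marking it invalid
and taking it off that queue first avoids the problem.

```c
#include <assert.h>

/* Toy stand-ins; the real kernel constants and structures differ. */
#define TOY_BQ_NONE   (-1)  /* buffer is not on any free queue */
#define TOY_BQ_LOCKED 0     /* stand-in for BQ_LOCKED */
#define TOY_BQ_AGE    1     /* stand-in for the queue taking invalid buffers */
#define TOY_BC_INVAL  0x01  /* stand-in for BC_INVAL */

struct toy_buf {
	int b_freelistindex; /* queue holding the buffer, or TOY_BQ_NONE */
	int b_cflags;        /* TOY_BC_INVAL or 0 */
};

static int toy_queue_len[2]; /* number of buffers on each toy queue */

/* Append to a queue; returns -1 where the kernel assertion would fire. */
static int
toy_binstailfree(struct toy_buf *bp, int queue)
{
	if (bp->b_freelistindex != TOY_BQ_NONE)
		return -1; /* "double free of buffer?" */
	bp->b_freelistindex = queue;
	toy_queue_len[queue]++;
	return 0;
}

/* Take the buffer off whatever queue it is on, if any. */
static void
toy_bremfree(struct toy_buf *bp)
{
	if (bp->b_freelistindex == TOY_BQ_NONE)
		return;
	assert(toy_queue_len[bp->b_freelistindex] > 0);
	toy_queue_len[bp->b_freelistindex]--;
	bp->b_freelistindex = TOY_BQ_NONE;
}

/* Release: invalid buffers go to the age queue, others stay locked. */
static int
toy_brelse(struct toy_buf *bp)
{
	int queue = (bp->b_cflags & TOY_BC_INVAL) ? TOY_BQ_AGE : TOY_BQ_LOCKED;

	return toy_binstailfree(bp, queue);
}

/* Assumed pre-fix behaviour: release without dequeueing first. */
static int
toy_discard_buggy(struct toy_buf *bp)
{
	return toy_brelse(bp);
}

/* Behaviour described in the log message: invalidate, dequeue, release. */
static int
toy_discard_fixed(struct toy_buf *bp)
{
	bp->b_cflags |= TOY_BC_INVAL;
	toy_bremfree(bp);
	return toy_brelse(bp);
}
```

In this model, a buffer first placed on TOY_BQ_LOCKED makes
toy_discard_buggy() return -1, while toy_discard_fixed() returns 0 and
leaves the buffer on TOY_BQ_AGE.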
State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 12 Apr 2020 17:07:00 +0000
State-Changed-Why:
This should be fixed in rev. 1.108 of sys/kern/vfs_wapbl.c. Can you please
confirm?
State-Changed-From-To: feedback->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Wed, 20 Jan 2021 21:47:38 +0000
State-Changed-Why:
Feedback timeout, this should be fixed by sys/kern/vfs_wapbl.c rev. 1.108.
>Unformatted: