NetBSD Problem Report #8964
Received: (qmail 736 invoked from network); 7 Dec 1999 06:46:03 -0000
Message-Id: <199912070645.PAA16552@icnmp9.icg.tnr.sharp.co.jp>
Date: Tue, 7 Dec 1999 15:45:15 +0900 (JST)
From: itohy@netbsd.org
Reply-To: itohy@netbsd.org
To: gnats-bugs@gnats.netbsd.org
Subject: panic on reboot(2) if an LFS is mounted read-only
X-Send-Pr-Version: 3.95
>Number: 8964
>Category: kern
>Synopsis: panic on reboot(2) if an LFS is mounted read-only
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Dec 06 22:48:00 +0000 1999
>Closed-Date: Sat Jul 14 09:39:20 +0000 2018
>Last-Modified: Sat Jul 14 09:39:20 +0000 2018
>Originator: ITOH Yasufumi
>Release: 1.4P (Dec. 3, 1999)
>Organization:
>Environment:
System: NetBSD 1.4P NetBSD 1.4P (ACHA.elf) #20: Sun Dec 5 14:55:44 JST 1999 itohy@pino.my.domain:/usr/src/sys/arch/x68k/compile/ACHA.elf x68k
no SOFTDEP in the kernel config.
>Description:
The system panic()s on reboot if
1. an LFS is mounted, and
2. the filesystem is read-only.
This problem seems to be introduced by the softdep merge.
>How-To-Repeat:
Mount an LFS in read-only mode and reboot the system.
Here's an example when the root filesystem is an LFS.
# mount
root_device on / type lfs (local, read-only)
# reboot
Dec 6 08:29:42 init: kernel security level changed from 0 to 1
syncing disks... panic: bawrite LFS buffer
Stopped in reboot at cpu_Debugger+0x6: unlk a6
db> trace
cpu_Debugger(2004,1,2710,519c00,54fc54) + 6
panic(116d2c,54fe68,45bd8,54fe60,12abd4) + 56
lfs_bwrite(54fe60,12abd4,519c00,54fe9c,4bd16) + 1c
bawrite(519c00) + 36
vfs_shutdown(54ff3c,300e0,0,0,0) + be
cpu_reboot(0,0,0,530640,1) + 3c
sys_reboot(530640,54ff88,54ff80) + 52
syscall(d0) + 196
trap0() + e
db>
>Fix:
Unknown.
LFS buffers should not be marked asynchronous.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->perseant
Responsible-Changed-By: perseant
Responsible-Changed-When: Tue Dec 7 11:54:43 PST 1999
Responsible-Changed-Why:
I have a solution.
State-Changed-From-To: open->feedback
State-Changed-By: perseant
State-Changed-When: Tue Dec 7 11:55:36 PST 1999
State-Changed-Why:
Suggested patch.
From: Konrad Schroder <perseant@hhhh.org>
To: itohy@netbsd.org, fvdl@netbsd.org
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: kern/8964: panic on reboot(2) if an LFS is mounted read-only
Date: Tue, 7 Dec 1999 11:54:18 -0800 (PST)
On Tue, 7 Dec 1999 itohy@netbsd.org wrote:
> The system panic()s on reboot if
> 1. an LFS is mounted, and
> 2. the filesystem is read-only.
>
> This problem seems to be introduced by the softdep merge.
Itoh, thanks, I see it. There are actually two problems here...the first
(bawrite an lfs buffer) masks the fact that lfs buffers can be marked
dirty (and written to disk under some circumstances) even if the
filesystem is mounted ro. This patch should fix both...please let me know
whether it works for you.
Frank, is my reading of vfs_shutdown correct in that we only have to
rewrite buffers from softdep filesystems?
Konrad Schroder
perseant@hhhh.org
Index: kern/vfs_subr.c
===================================================================
RCS file: /cvsroot/syssrc/sys/kern/vfs_subr.c,v
retrieving revision 1.115
diff -u -r1.115 vfs_subr.c
--- vfs_subr.c 1999/11/23 23:52:40 1.115
+++ vfs_subr.c 1999/12/07 19:25:02
@@ -2294,7 +2294,9 @@
* written will be remarked as dirty until other
* buffers are written.
*/
- if (bp->b_flags & B_DELWRI) {
+ if (bp->b_vp && bp->b_vp->v_mount
+ && (bp->b_vp->v_mount->mnt_flag & MNT_SOFTDEP)
+ && (bp->b_flags & B_DELWRI)) {
s = splbio();
bremfree(bp);
bp->b_flags |= B_BUSY;
Index: ufs/lfs/lfs_bio.c
===================================================================
RCS file: /cvsroot/syssrc/sys/ufs/lfs/lfs_bio.c,v
retrieving revision 1.14
diff -u -r1.14 lfs_bio.c
--- lfs_bio.c 1999/11/23 23:52:42 1.14
+++ lfs_bio.c 1999/12/07 19:25:11
@@ -140,8 +140,10 @@
register struct buf *bp = ap->a_bp;
#ifdef DIAGNOSTIC
- if(bp->b_flags & B_ASYNC)
+ if(VTOI(bp->b_vp)->i_lfs->lfs_ronly == 0
+ && (bp->b_flags & B_ASYNC)) {
panic("bawrite LFS buffer");
+ }
#endif /* DIAGNOSTIC */
return lfs_bwrite_ext(bp,0);
}
@@ -184,6 +186,23 @@
struct inode *ip;
int db, error, s;
+#ifdef LFS_HONOR_RDONLY
+ /*
+ * Don't write *any* blocks if we're mounted read-only.
+ * In particular the cleaner can't write blocks either.
+ */
+ if(VTOI(bp->b_vp)->i_lfs->lfs_ronly) {
+ bp->b_flags &= ~(B_DELWRI|B_LOCKED|B_READ|B_ERROR);
+ s = splbio();
+ reassignbuf(bp, bp->b_vp);
+ splx(s);
+ if(bp->b_flags & B_CALL)
+ bp->b_flags &= ~B_BUSY;
+ else
+ brelse(bp);
+ return EROFS;
+ }
+#endif
/*
* Set the delayed write flag and use reassignbuf to move the buffer
* from the clean list to the dirty one.
@@ -242,17 +261,7 @@
++locked_queue_count;
locked_queue_bytes += bp->b_bufsize;
s = splbio();
-#ifdef LFS_HONOR_RDONLY
- /*
- * XXX KS - Don't write blocks if we're mounted ro.
- * Placement here means that the cleaner can't write
- * blocks either.
- */
- if(VTOI(bp->b_vp)->i_lfs->lfs_ronly)
- bp->b_flags &= ~(B_DELWRI|B_LOCKED);
- else
-#endif
- bp->b_flags |= B_DELWRI | B_LOCKED;
+ bp->b_flags |= B_DELWRI | B_LOCKED;
bp->b_flags &= ~(B_READ | B_ERROR);
reassignbuf(bp, bp->b_vp);
splx(s);
@@ -316,8 +325,12 @@
if(lfs_dostats)
++lfs_stats.write_exceeded;
- if (lfs_writing && flags==0) /* XXX flags */
+ if (lfs_writing && flags==0) {/* XXX flags */
+#ifdef DEBUG_LFS
+ printf("lfs_flush: not flushing because another flush is active\n");
+#endif
return;
+ }
lfs_writing = 1;
simple_lock(&mountlist_slock);
@@ -378,6 +391,10 @@
{
if(lfs_dostats)
++lfs_stats.wait_exceeded;
+#ifdef DEBUG_LFS
+ printf("lfs_check: waiting: count=%d, bytes=%ld\n",
+ locked_queue_count, locked_queue_bytes);
+#endif
error = tsleep(&locked_queue_count, PCATCH | PUSER,
"buffers", hz * LFS_BUFWAIT);
}
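
The vfs_shutdown() hunk above can be looked at in isolation. What follows is a
minimal user-space sketch, not the kernel code: the sk_* structures and the
SK_* flag values are hypothetical stand-ins, and only the shape of the
condition is taken from the patch. It shows that a delayed-write buffer is
rewritten at shutdown only when it belongs to a softdep mount, so an LFS
buffer never reaches bawrite().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; flag values are illustrative. */
#define SK_B_DELWRI    0x01	/* buffer holds a delayed write */
#define SK_MNT_SOFTDEP 0x02	/* mount uses soft dependencies */

struct sk_mount { int mnt_flag; };
struct sk_vnode { struct sk_mount *v_mount; };
struct sk_buf   { int b_flags; struct sk_vnode *b_vp; };

/*
 * Return 1 if the shutdown loop should rewrite this buffer, 0 otherwise.
 * The condition has the same shape as the one the patch adds to
 * vfs_shutdown(): vnode and mount must exist, the mount must be a softdep
 * mount, and the buffer must be a delayed write.
 */
static int
shutdown_should_rewrite(const struct sk_buf *bp)
{
	assert(bp != NULL);
	return bp->b_vp != NULL
	    && bp->b_vp->v_mount != NULL
	    && (bp->b_vp->v_mount->mnt_flag & SK_MNT_SOFTDEP) != 0
	    && (bp->b_flags & SK_B_DELWRI) != 0;
}
```

With this predicate, a dirty buffer on a non-softdep mount (such as the
read-only LFS root in the report) is skipped instead of being handed to
bawrite().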
From: Frank van der Linden <frank@wins.uva.nl>
To: Konrad Schroder <perseant@hhhh.org>
Cc: itohy@netbsd.org, fvdl@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/8964: panic on reboot(2) if an LFS is mounted read-only
Date: Wed, 8 Dec 1999 20:15:49 +0100
On Tue, Dec 07, 1999 at 11:54:18AM -0800, Konrad Schroder wrote:
> Frank, is my reading of vfs_shutdown correct in that we only have to
> rewrite buffers from softdep filesystems?
That is correct. However, I would feel much better about this if
read-only LFS actually meant read only. Isn't it simpler to
make the read-only option imply the "noclean" (-n) flag?
- Frank
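
Frank's suggestion can be sketched as a small option fix-up. This is only an
illustration under stated assumptions: struct lfs_mount_opts,
lfs_apply_rdonly() and SK_MNT_RDONLY are hypothetical names, and the actual
mount_lfs source is not shown in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value; the real MNT_RDONLY lives in <sys/mount.h>. */
#define SK_MNT_RDONLY 0x01

/* Hypothetical option block for a mount_lfs-like program. */
struct lfs_mount_opts {
	int mntflags;	/* generic mount flags */
	int noclean;	/* nonzero: do not start the cleaner (-n) */
};

/* Make a read-only mount imply "noclean", so no cleaner ever writes to it. */
static void
lfs_apply_rdonly(struct lfs_mount_opts *opts)
{
	assert(opts != NULL);
	if (opts->mntflags & SK_MNT_RDONLY)
		opts->noclean = 1;
}
```

A read/write mount keeps whatever the user asked for; only the read-only case
is forced.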
From: Konrad Schroder <perseant@hhhh.org>
To: Frank van der Linden <frank@wins.uva.nl>
Cc: itohy@netbsd.org, fvdl@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/8964: panic on reboot(2) if an LFS is mounted read-only
Date: Wed, 8 Dec 1999 14:52:19 -0800 (PST)
On Wed, 8 Dec 1999, Frank van der Linden wrote:
> That is correct. However, I would feel much better about this if
> read-only LFS actually meant read only. Isn't it simpler to
> make the read-only option imply the "noclean" (-n) flag?
Good point, I've just made mount_lfs do this.
Two things though: I don't think that the softdep part of this problem
affected only read-only fss; in particular if the LFS is deadlocked[*]
within a dirop, and I reboot it from the debugger, I also get this
behavior. Also, the LFS portion of the patch does ensure that lfs_bwrite
never marks blocks dirty if the fs is mounted read-only.
[* - another open PR; the LFS does dirop accounting wrong and is unable to
write dirty blocks. But bawrite still is the wrong thing to do.]
Konrad Schroder
perseant@hhhh.org
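
The read-only guarantee described above can be sketched in user space as
follows. This is a simplified model of the LFS_HONOR_RDONLY block in the
patch, not the kernel function: the sk_* types, the SK_B_* flag values and
bwrite_sketch() are hypothetical, and the reassignbuf()/brelse() handling is
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative flag values, not the kernel's. */
#define SK_B_DELWRI 0x01
#define SK_B_LOCKED 0x02
#define SK_B_READ   0x04
#define SK_B_ERROR  0x08

struct sk_lfs { int lfs_ronly; };			/* nonzero: mounted read-only */
struct sk_buf { int b_flags; struct sk_lfs *b_fs; };

/*
 * Model of the write path: on a read-only filesystem the buffer is cleaned
 * and EROFS is returned before it can be marked dirty; otherwise it is
 * marked as a locked delayed write.
 */
static int
bwrite_sketch(struct sk_buf *bp)
{
	assert(bp != NULL && bp->b_fs != NULL);
	if (bp->b_fs->lfs_ronly) {
		bp->b_flags &= ~(SK_B_DELWRI | SK_B_LOCKED | SK_B_READ | SK_B_ERROR);
		return EROFS;
	}
	bp->b_flags |= SK_B_DELWRI | SK_B_LOCKED;
	bp->b_flags &= ~(SK_B_READ | SK_B_ERROR);
	return 0;
}
```

Placing the check before any flag is set is the point of the change: the
earlier code tested lfs_ronly only after the locked-queue counters had
already been incremented.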
From: itohy@netbsd.org (ITOH Yasufumi)
To: perseant@hhhh.org
Cc: fvdl@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/8964: panic on reboot(2) if an LFS is mounted read-only
Date: Thu, 9 Dec 1999 08:33:35 +0900 (JST)
In article <Pine.NEB.4.10.9912071142580.4110-100000@hhhh.hitl.washington.edu>
perseant@hhhh.org writes:
> On Tue, 7 Dec 1999 itohy@netbsd.org wrote:
>
> > The system panic()s on reboot if
> > 1. an LFS is mounted, and
> > 2. the filesystem is read-only.
> >
> > This problem seems to be introduced by the softdep merge.
>
> Itoh, thanks, I see it. There are actually two problems here...the first
> (bawrite an lfs buffer) masks the fact that lfs buffers can be marked
> dirty (and written to disk under some circumstances) even if the
> filesystem is mounted ro. This patch should fix both...please let me know
> whether it works for you.
I tried the patch and confirmed that rebooting with a read-only LFS
now works fine. Thanks.
However, I ran into another problem.
I'm not sure whether it has something to do with this change, or is
another LFS problem, or an MD problem.
Unfortunately, I didn't record the details of the problem.
I thought it could be reproduced, but once I had tried the previous version
of the kernel (where the problem didn't appear), the problem disappeared....
The problem was,
(boot single-user on LFS)
# mount -u /dev/sd2a / # remount read/write
# mount -r /dev/sd0a /mnt # this is usual root (FFS)
# cp /mnt/netbsd-* / # netbsd-1.4C, netbsd-1.4M, ...
[hang (tty echoback is alive)]
I pressed the INTERRUPT button and saw the trace, but didn't write down
the output.... The trace output was something like
tsleep()
lfs_check()
lfs_balloc() ?? probably
...
I recall that the wait channel was "buffers" (shown by typing the
status character on the console).
fsck_lfs found an unreferenced file in the filesystem.
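
For reference, the "buffers" wait channel is the tsleep() in lfs_check shown
in the patch, which sleeps on &locked_queue_count. The sketch below is an
assumption about the kind of condition that guards that sleep, based only on
the two counters the DEBUG_LFS message prints; lfs_must_wait() and its
threshold parameters are hypothetical and are not taken from the kernel
source.

```c
#include <assert.h>

/*
 * Hypothetical wait predicate: a writer must sleep while the locked queue
 * holds more buffers, or more bytes, than the given limits allow.
 */
static int
lfs_must_wait(int count, long bytes, int max_count, long max_bytes)
{
	assert(max_count >= 0 && max_bytes >= 0);
	return count > max_count || bytes > max_bytes;
}
```

Under this model, a writer stays asleep for as long as nothing flushes the
locked buffers and lowers the counters, which would be consistent with the
hang seen during cp.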
Oops, I hit another problem after writing the above.
I'm not sure whether this is a problem in LFS, the MD part, or the hardware.
(boot single user)
# mount -u /dev/sd2a / # remount r/w
# rm netbsd-*
# date 199912092352 # yeah, I set the system date
# sync
# sync
# reboot
Dec 9 23:52:40 init: kernel security level changed from 0 to 1
syncing disks... done
unmounting / (/dev/sd2a)...
uvm_vnp_terminate(0x5521a8): terminating active vnode (refs=2)
uvm_vnp_terminate(0x52b72c): terminating active vnode (refs=2)
[hang with continuous disk access (tty echo is alive)]
[press INTERRUPT button]
Got a keyboard NMI
Stopped in reboot at cpu_Debugger+0x6: unlk a6
db> trace
cpu_Debugger(55bb44,4f8,ffffff08,51,8) + 6
nmihand(ffffff08,51,8,a96021,80a96021) + 42
lev7intr(31ec00,2004,320a00,328d80,55bb9c) + 12
intio_intr(55bb34) + 48
intiotrap(328d80,52b198,55bc20,b9bde,55bc18) + 8 # SCSI interrupt?
spec_strategy(55bc18) + 46
lfs_writeseg(325c00,30e280) + 610
lfs_segwrite(324400,4,52c0d8,55bcac,b7012) + 2e8
lfs_flush_fs(324400,4) + 38
lfs_update(55bcb8,12ac94,52b330,0,0) + 170
lfs_fsync(55bd04) + 40
vinvalbuf(52b330,1,ffffffff,530780,0,0) + b6
vclean(52b330,8,530780) + a4
vgonel(52b330,530780) + 40
vflush(324400,52b264,2) + 72
lfs_unmount(324400,80000,530780) + 2a
dounmount(324400,80000,530780) + d4
vfs_unmountall(0,130,d0,2,0) + 6a
vfs_shutdown(55bf3c,300f0,0,0,0) + 184
cpu_reboot(0,0,0,530780,1) + 3c
sys_reboot(530780,55bf88,55bf80) + 52
syscall(d0) + 196
trap0() + e
db> c
(continue and stop at another timing)
Got a keyboard NMI
Stopped in reboot at cpu_Debugger+0x6: unlk a6
db> trace
cpu_Debugger(55bb88,4f8,2304,300,2304) + 6
nmihand(2304,300,2304,1,52b198) + 42
lev7intr(?)
bgetvp(52b198,328780) + 12
lfs_newbuf(52b198,54eb4,200,4,1) + a6
lfs_initseg(325c00,1e8,2b,0,3c,2b,0) + 158
lfs_seglock(325c00,4,2004,2004,530780) + 9e
lfs_segwrite(324400,4,52c0d8,55bcac,b7012) + ae
lfs_flush_fs(324400,4) + 38
lfs_update(55bcb8,12ac94,52b330,0,0) + 170
lfs_fsync(55bd04) + 40
vinvalbuf(52b330,1,ffffffff,530780,0,0) + b6
vclean(52b330,8,530780) + a4
vgonel(52b330,530780) + 40
vflush(324400,52b264,2) + 72
lfs_unmount(324400,80000,530780) + 2a
dounmount(324400,80000,530780) + d4
vfs_unmountall(0,130,d0,2,0) + 6a
vfs_shutdown(55bf3c,300f0,0,0,0) + 184
cpu_reboot(0,0,0,530780,1) + 3c
sys_reboot(530780,55bf88,55bf80) + 52
syscall(d0) + 196
trap0() + e
db>
rm and reboot worked in the previous session.
--
ITOH, Yasufumi <itohy@netbsd.org>
State-Changed-From-To: feedback->analyzed
State-Changed-By: fair
State-Changed-When: Sun Apr 23 02:46:30 PDT 2000
State-Changed-Why:
Feedback was provided, and the original problem was apparently solved.
However, the submitter reported a new problem with LFS. Ideally, this new
problem should be reported in a new PR, but I leave it up to the responsible
developer to decide whether to keep working on it in this PR, or require a
new one.
Responsible-Changed-From-To: perseant->kern-bug-people
Responsible-Changed-By: perseant
Responsible-Changed-When: Thu Nov 20 19:56:50 UTC 2003
Responsible-Changed-Why:
Trying to be realistic
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/8964: panic on reboot(2) if an LFS is mounted read-only
Date: Sat, 3 May 2008 03:37:16 +0000
Is this (partly?) a symptom of the silly softdep rootfs/syncer thing
that ad just fixed? ad?
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: analyzed->closed
State-Changed-By: zafer@NetBSD.org
State-Changed-When: Sat, 14 Jul 2018 09:39:20 +0000
State-Changed-Why:
I tested thoroughly and I cannot reproduce the issue anymore.
Thank you for the problem report ITOH Yasufumi.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.