NetBSD Problem Report #52301
From dholland@netbsd.org Thu Jun 15 06:10:56 2017
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 6FEF47A269
for <gnats-bugs@gnats.NetBSD.org>; Thu, 15 Jun 2017 06:10:56 +0000 (UTC)
Message-Id: <20170615061056.337D284CD8@mail.netbsd.org>
Date: Thu, 15 Jun 2017 06:10:56 +0000 (UTC)
From: dholland@NetBSD.org
Reply-To: dholland@NetBSD.org
To: gnats-bugs@NetBSD.org
Cc: coypu@sdf.org
Subject: lfs deadlock between lfs_fsync and lfs_create
X-Send-Pr-Version: 3.95
>Number: 52301
>Category: kern
>Synopsis: lfs deadlock between lfs_fsync and lfs_create
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Jun 15 06:15:00 +0000 2017
>Closed-Date: Sun Jun 20 22:24:42 +0000 2021
>Last-Modified: Sun Jun 20 22:24:42 +0000 2021
>Originator: David A. Holland
>Release: NetBSD 7.99.75 (20170603)
>Organization:
>Environment:
System: NetBSD loggy 7.99.75 NetBSD 7.99.75 (GENERIC) #66: Sat Jun 10 05:26:28
+IDT 2017 fly@loggy:/home/fly/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
lfs can deadlock as follows:
1. process A calls lfs_fsync
- which calls lfs_vflush
- which takes the seglock and calls lfs_segwrite
2. meanwhile process B calls lfs_create
- which starts a dirop
- and then down inside lfs_valloc blocks waiting for the seglock.
3. now process A in lfs_segwrite calls lfs_writer_enter
- which blocks waiting for dirops to clear
- but they can't.
We found this by analyzing a crashdump, and confirmed it in the source
of a 7.99.75 tree from 20170603.
>How-To-Repeat:
See above.
>Fix:
Not obvious.
I wonder if this was introduced by the vnode cache changes... but even
if so reverting those isn't the answer.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->maya
Responsible-Changed-By: maya@NetBSD.org
Responsible-Changed-When: Thu, 15 Jun 2017 13:59:19 +0000
Responsible-Changed-Why:
Mine
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Thu, 15 Jun 2017 14:20:29 +0000
--bg08WKrSYDhXBjb5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
I am torture testing the following diff.
It boots, let's see how it holds up :-)
--bg08WKrSYDhXBjb5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="lfs12.diff"
It isn't safe to drain dirops with seglock held, this leads
to a deadlock. drain before grabbing seglock.
Index: lfs_segment.c
===================================================================
RCS file: /cvsroot/src/sys/ufs/lfs/lfs_segment.c,v
retrieving revision 1.271
diff -u -r1.271 lfs_segment.c
--- lfs_segment.c 12 Jun 2017 15:02:32 -0000 1.271
+++ lfs_segment.c 15 Jun 2017 14:18:13 -0000
@@ -603,7 +603,6 @@
SEGUSE *segusep;
int do_ckp, did_ckp, error;
unsigned n, segleft, maxseg, sn, i, curseg;
- int writer_set = 0;
int dirty;
int redo;
SEGSUM *ssp;
@@ -628,6 +627,8 @@
if (do_ckp)
flags &= ~SEGM_SINGLE;
+ lfs_writer_enter(fs, "lfs segwrite");
+
lfs_seglock(fs, flags | (do_ckp ? SEGM_CKP : 0));
sp = fs->lfs_sp;
if (sp->seg_flags & (SEGM_CLEAN | SEGM_CKP))
@@ -653,11 +654,7 @@
break;
}
- if (do_ckp || fs->lfs_dirops == 0) {
- if (!writer_set) {
- lfs_writer_enter(fs, "lfs writer");
- writer_set = 1;
- }
+ if (do_ckp) {
error = lfs_writevnodes(fs, mp, sp, VN_DIROP);
if (um_error == 0)
um_error = error;
@@ -806,8 +803,7 @@
/* Note Ifile no longer needs to be written */
fs->lfs_doifile = 0;
- if (writer_set)
- lfs_writer_leave(fs);
+ lfs_writer_leave(fs);
/*
* If we didn't write the Ifile, we didn't really do anything.
--bg08WKrSYDhXBjb5--
From: "Maya Rashish" <maya@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52301 CVS commit: src/sys/ufs/lfs
Date: Thu, 15 Jun 2017 14:37:31 +0000
Module Name: src
Committed By: maya
Date: Thu Jun 15 14:37:31 UTC 2017
Modified Files:
src/sys/ufs/lfs: lfs_segment.c
Log Message:
It isn't safe to drain dirops with seglock held, it'll deadlock if there
are any dirops. drain before grabbing seglock.
lfs_dirops == 0 is always true (as we already drained dirops), so omit
that part of the comparison.
Fixes a lot of LFS deadlocks. PR kern/52301
Many thanks to dholland for help analyzing coredumps
To generate a diff of this commit:
cvs rdiff -u -r1.271 -r1.272 src/sys/ufs/lfs/lfs_segment.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Thu, 15 Jun 2017 14:43:42 +0000
Now I'm unconditionally lfs_writer_enter'ing, whereas previously it did
so if there were no dirops pending (in a racy check) or if doing a
checkpoint. This feels more correct but I am not sure yet. Certainly it
feels slower, but at least it doesn't deadlock all the time.
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Mon, 19 Jun 2017 12:18:10 +0000
lfs_fcntl:
case LFCNRECLAIM:
/*
* Flush dirops and write Ifile, allowing empty segments
* to be immediately reclaimed.
*/
lfs_writer_enter(fs, "pndirop");
off = lfs_sb_getoffset(fs);
lfs_seglock(fs, SEGM_FORCE_CKP | SEGM_CKP);
lfs_flush_dirops(fs);
LFS_CLEANERINFO(cip, fs, bp);
oclean = lfs_ci_getclean(fs, cip);
LFS_SYNC_CLEANERINFO(cip, fs, bp, 1);
lfs_segwrite(ap->a_vp->v_mount, SEGM_FORCE_CKP);
fs->lfs_sp->seg_flags |= SEGM_PROT;
lfs_segunlock(fs);
lfs_writer_leave(fs);
From: "Maya Rashish" <maya@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52301 CVS commit: src/sys/ufs/lfs
Date: Wed, 26 Jul 2017 15:07:27 +0000
Module Name: src
Committed By: maya
Date: Wed Jul 26 15:07:27 UTC 2017
Modified Files:
src/sys/ufs/lfs: lfs_segment.c
Log Message:
Revert r1.272 fix to PR kern/52301, the performance hit is making things
unusable.
To generate a diff of this commit:
cvs rdiff -u -r1.272 -r1.273 src/sys/ufs/lfs/lfs_segment.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: PR/52301 CVS commit: src/sys/ufs/lfs
Date: Wed, 26 Jul 2017 17:15:10 +0000
On Wed, Jul 26, 2017 at 03:10:01PM +0000, Maya Rashish wrote:
> Log Message:
> Revert r1.272 fix to PR kern/52301, the performance hit is making things
> unusable.
I wonder if we should put that fix in -8 though, at least for now...
(pretty sure this issue's a behavioral regression relative to -7)
--
David A. Holland
dholland@netbsd.org
From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52301 CVS commit: [netbsd-8] src
Date: Mon, 30 Oct 2017 09:29:04 +0000
Module Name: src
Committed By: snj
Date: Mon Oct 30 09:29:04 UTC 2017
Modified Files:
src/sbin/fsck_lfs [netbsd-8]: inode.c lfs.c pass6.c segwrite.c
src/sys/ufs/lfs [netbsd-8]: lfs.h lfs_accessors.h lfs_alloc.c
lfs_balloc.c lfs_bio.c lfs_extern.h lfs_inode.c lfs_inode.h
lfs_itimes.c lfs_pages.c lfs_rename.c lfs_segment.c lfs_subr.c
lfs_syscalls.c lfs_vfsops.c lfs_vnops.c ulfs_inode.c ulfs_inode.h
ulfs_lookup.c ulfs_quota2.c ulfs_readwrite.c ulfs_vnops.c
Log Message:
Pull up following revision(s) (requested by maya in ticket #330):
sbin/fsck_lfs/inode.c: 1.69
sbin/fsck_lfs/lfs.c: 1.73
sbin/fsck_lfs/pass6.c: 1.50
sbin/fsck_lfs/segwrite.c: 1.46
sys/ufs/lfs/lfs.h: 1.202-1.203
sys/ufs/lfs/lfs_accessors.h: 1.48
sys/ufs/lfs/lfs_alloc.c: 1.136-1.137
sys/ufs/lfs/lfs_balloc.c: 1.94
sys/ufs/lfs/lfs_bio.c: 1.141
sys/ufs/lfs/lfs_extern.h: 1.113
sys/ufs/lfs/lfs_inode.c: 1.156-1.157
sys/ufs/lfs/lfs_inode.h: 1.20, 1.21, 1.23
sys/ufs/lfs/lfs_itimes.c: 1.20
sys/ufs/lfs/lfs_pages.c: 1.13-1.15
sys/ufs/lfs/lfs_rename.c: 1.22
sys/ufs/lfs/lfs_segment.c: 1.270-1.275
sys/ufs/lfs/lfs_subr.c: 1.94-1.97
sys/ufs/lfs/lfs_syscalls.c: 1.175
sys/ufs/lfs/lfs_vfsops.c: 1.360
sys/ufs/lfs/lfs_vnops.c: 1.316-1.321
sys/ufs/lfs/ulfs_inode.c: 1.20
sys/ufs/lfs/ulfs_inode.h: 1.24
sys/ufs/lfs/ulfs_lookup.c: 1.41
sys/ufs/lfs/ulfs_quota2.c: 1.31
sys/ufs/lfs/ulfs_readwrite.c: 1.24
sys/ufs/lfs/ulfs_vnops.c: 1.49-1.50
Update inode member i_flag --> i_state to keep up with kernel changes
Move definition of IN_ALLMOD near the flag it's a mask for.
Now we can see that it doesn't match all the flags, but changing that will
require more careful thought.
Correct confusion between i_flag and i_flags
These will have to be renamed.
Spotted by Riastradh, thanks!
Add an XXX about the missing flags so it's not buried in a commit
message.
now the XXX count for LFS is 260
Rename i_flag to i_state.
The similarity to i_flags has previously caused errors.
Use continue to denote the no-op loop to match netbsd style
newline for extra clarity.
It isn't safe to drain dirops with seglock held, it'll deadlock if there
are any dirops. drain before grabbing seglock.
lfs_dirops == 0 is always true (as we already drained dirops), so omit
that part of the comparison.
Fixes a lot of LFS deadlocks. PR kern/52301
Many thanks to dholland for help analyzing coredumps
Ifdef out KDASSERT which fires on my machine.
Deduplicate sanity check that seglock is held on segunlock
Revert r1.272 fix to PR kern/52301, the performance hit is making things
unusable.
change lfs_nextsegsleep and lfs_allclean_wakeup to use condvar
XXX had to use lfs_lock in lfs_segwait, removed kernel_lock, is this
appropriate?
fix buffer overflow/KASSERT when cookies are supplied
lfs no longer uses the ffs-style struct direct, use the correct minimum
size
from dholland
XXX more wrong
Consistently use {,UN}MARK_VNODE macros rather than function calls.
Not much point doing anything after a panic call
Ask some question about the code in a XXX comment
XXX question our double-flushing of dirops
Fix typo in comment
To generate a diff of this commit:
cvs rdiff -u -r1.68 -r1.68.8.1 src/sbin/fsck_lfs/inode.c
cvs rdiff -u -r1.72 -r1.72.6.1 src/sbin/fsck_lfs/lfs.c
cvs rdiff -u -r1.49 -r1.49.8.1 src/sbin/fsck_lfs/pass6.c
cvs rdiff -u -r1.45 -r1.45.8.1 src/sbin/fsck_lfs/segwrite.c
cvs rdiff -u -r1.201 -r1.201.6.1 src/sys/ufs/lfs/lfs.h
cvs rdiff -u -r1.47 -r1.47.8.1 src/sys/ufs/lfs/lfs_accessors.h
cvs rdiff -u -r1.135 -r1.135.6.1 src/sys/ufs/lfs/lfs_alloc.c
cvs rdiff -u -r1.92 -r1.92.6.1 src/sys/ufs/lfs/lfs_balloc.c \
src/sys/ufs/lfs/lfs_subr.c
cvs rdiff -u -r1.139 -r1.139.4.1 src/sys/ufs/lfs/lfs_bio.c
cvs rdiff -u -r1.111 -r1.111.10.1 src/sys/ufs/lfs/lfs_extern.h
cvs rdiff -u -r1.155 -r1.155.6.1 src/sys/ufs/lfs/lfs_inode.c
cvs rdiff -u -r1.19 -r1.19.6.1 src/sys/ufs/lfs/lfs_inode.h
cvs rdiff -u -r1.19 -r1.19.10.1 src/sys/ufs/lfs/lfs_itimes.c
cvs rdiff -u -r1.11.6.1 -r1.11.6.2 src/sys/ufs/lfs/lfs_pages.c
cvs rdiff -u -r1.21 -r1.21.10.1 src/sys/ufs/lfs/lfs_rename.c
cvs rdiff -u -r1.269 -r1.269.6.1 src/sys/ufs/lfs/lfs_segment.c
cvs rdiff -u -r1.174 -r1.174.4.1 src/sys/ufs/lfs/lfs_syscalls.c
cvs rdiff -u -r1.359 -r1.359.4.1 src/sys/ufs/lfs/lfs_vfsops.c
cvs rdiff -u -r1.315 -r1.315.2.1 src/sys/ufs/lfs/lfs_vnops.c
cvs rdiff -u -r1.19 -r1.19.2.1 src/sys/ufs/lfs/ulfs_inode.c
cvs rdiff -u -r1.22 -r1.22.10.1 src/sys/ufs/lfs/ulfs_inode.h
cvs rdiff -u -r1.40 -r1.40.6.1 src/sys/ufs/lfs/ulfs_lookup.c
cvs rdiff -u -r1.30 -r1.30.6.1 src/sys/ufs/lfs/ulfs_quota2.c
cvs rdiff -u -r1.23 -r1.23.6.1 src/sys/ufs/lfs/ulfs_readwrite.c
cvs rdiff -u -r1.48 -r1.48.4.1 src/sys/ufs/lfs/ulfs_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Responsible-Changed-From-To: maya->kern-bug-people
Responsible-Changed-By: maya@NetBSD.org
Responsible-Changed-When: Mon, 13 Nov 2017 19:24:51 +0000
Responsible-Changed-Why:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: Konrad Schroder <perseant@netbsd.org>
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Mon, 14 Jan 2019 00:50:43 +0000
On Thu, Jun 15, 2017 at 06:15:00AM +0000, dholland@NetBSD.org wrote:
> 1. process A calls lfs_fsync
> - which calls lfs_vflush
> - which takes the seglock and calls lfs_segwrite
> 2. meanwhile process B calls lfs_create
> - which starts a dirop
> - and then down inside lfs_valloc blocks waiting for the seglock.
> 3. now process A in lfs_segwrite calls lfs_writer_enter
> - which blocks waiting for dirops to clear
> - but they can't.
So I was just looking into fixing this by having lfs_writer_enter
temporarily release the seglock, which would allow lfs_create to
finish. However, this led me to discover something I'd either not
realized or forgotten: lfs_seglock isn't just a lock, it's closely
tied to the segment writing mechanism; to wit, taking lfs_seglock is
not just a locking action but also beginning to write out a segment,
and releasing it means waiting until it's done; you can't release it
temporarily without compromising core design invariants.
But.
Why, then, is lfs_valloc taking the seglock? It seems like this means
that every call to lfs_valloc will write out its own (partial) segment
containing just that allocation; this may technically be correct in
the sense of not corrupting the volume, but it's certainly not
desirable.
(Even if this segment collects other pending writes, which it might --
didn't see the logic for that but didn't look very carefully -- it
still puts every newly created file in a different segment which will
be unfortunate for things like untar.)
This behavior seems to have been introduced for locking reasons a long
time ago and for the moment I'm not clear on what those were or
whether they're still relevant.
Am I wrong somewhere?
Cc: perseant@ in case he's awake and remembers anything helpful...
--
David A. Holland
dholland@netbsd.org
From: Konrad Schroder <perseant@hhhh.org>
To: David Holland <dholland-bugs@netbsd.org>, gnats-bugs@netbsd.org
Cc: Konrad Schroder <perseant@netbsd.org>
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Fri, 18 Jan 2019 08:56:53 -0800
My recollection is that lfs_valloc takes the segment lock because
otherwise the Ifile free list could become corrupted on disk if a
segment was being written at the same time. The segment lock doesn't
cause a segment to be written when it is released, per se, so there's no
harm in using it to prevent this.
The handling of dirops needs to be thought out very carefully. The easy
way would be to prohibit dirops from beginning when a segment write is
in progress. This would indicate that the locking order in your point
#2 below is wrong. It might end up starving dirops, however.
A better approach might be to note, when taking the segment lock,
whether dirops are ready to be written, and only prohibit new dirops in
that case...but if the segment lock was taken when no dirops are ready,
the segment writer wouldn't write any VDIROP vnodes and wouldn't wait
for dirops to complete before doing anything.
Take care,
-Konrad
On 1/13/19 4:50 PM, David Holland wrote:
> On Thu, Jun 15, 2017 at 06:15:00AM +0000, dholland@NetBSD.org wrote:
> > 1. process A calls lfs_fsync
> > - which calls lfs_vflush
> > - which takes the seglock and calls lfs_segwrite
> > 2. meanwhile process B calls lfs_create
> > - which starts a dirop
> > - and then down inside lfs_valloc blocks waiting for the seglock.
> > 3. now process A in lfs_segwrite calls lfs_writer_enter
> > - which blocks waiting for dirops to clear
> > - but they can't.
>
> So I was just looking into fixing this by having lfs_writer_enter
> temporarily release the seglock, which would allow lfs_create to
> finish. However, this led me to discover something I'd either not
> realized or forgotten: lfs_seglock isn't just a lock, it's closely
> tied to the segment writing mechanism; to wit, taking lfs_seglock is
> not just a locking action but also beginning to write out a segment,
> and releasing it means waiting until it's done; you can't release it
> temporarily without compromising core design invariants.
>
> But.
>
> Why, then, is lfs_valloc taking the seglock? It seems like this means
> that every call to lfs_valloc will write out its own (partial) segment
> containing just that allocation; this may technically be correct in
> the sense of not corrupting the volume, but it's certainly not
> desirable.
>
> (Even if this segment collects other pending writes, which it might --
> didn't see the logic for that but didn't look very carefully -- it
> still puts every newly created file in a different segment which will
> be unfortunate for things like untar.)
>
> This behavior seems to have been introduced for locking reasons a long
> time ago and for the moment I'm not clear on what those were or
> whether they're still relevant.
>
> Am I wrong somewhere?
>
> Cc: perseant@ in case he's awake and remembers anything helpful...
>
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52301 CVS commit: src/sys/ufs/lfs
Date: Sun, 23 Feb 2020 08:40:37 +0000
Module Name: src
Committed By: riastradh
Date: Sun Feb 23 08:40:37 UTC 2020
Modified Files:
src/sys/ufs/lfs: lfs_extern.h lfs_segment.c lfs_subr.c
Log Message:
Break deadlock in PR kern/52301.
The lock order is lfs_writer -> lfs_seglock. The problem in 52301 is
that lfs_segwrite violates this lock order by sometimes doing
lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b),
opportunistically, when there are no dirops pending. Both cases can
deadlock, because dirops sometimes take the seglock (lfs_truncate,
lfs_valloc, lfs_vfree):
(a) There may be dirops pending, and they may be waiting for the
seglock, so we can't wait for them to complete while holding the
seglock.
(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state
may change by the time lfs_writer_enter acquires lfs_lock.
To resolve this in each case:
(a) Do lfs_writer_enter before lfs_seglock, since we will need it
unconditionally anyway. The worst performance impact of this should
be that some dirops get delayed a little bit.
(b) Create a new lfs_writer_tryenter to use at this point so that the
test for fs->lfs_dirops == 0 and the acquisition of lfs_writer happen
atomically under lfs_lock.
To generate a diff of this commit:
cvs rdiff -u -r1.115 -r1.116 src/sys/ufs/lfs/lfs_extern.h
cvs rdiff -u -r1.284 -r1.285 src/sys/ufs/lfs/lfs_segment.c
cvs rdiff -u -r1.98 -r1.99 src/sys/ufs/lfs/lfs_subr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Tue, 21 Apr 2020 16:50:54 +0000
State-Changed-Why:
Hi david, do you find the latest fix satisfactory?
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52301 CVS commit: [netbsd-9] src
Date: Mon, 17 Aug 2020 10:30:23 +0000
Module Name: src
Committed By: martin
Date: Mon Aug 17 10:30:23 UTC 2020
Modified Files:
src/lib/libp2k [netbsd-9]: p2k.c
src/sbin/fsck_lfs [netbsd-9]: pass1.c
src/sys/arch/i386/stand/efiboot/bootx64 [netbsd-9]: Makefile
src/sys/rump/fs/lib/liblfs [netbsd-9]: Makefile
src/sys/ufs/lfs [netbsd-9]: lfs.h lfs_accessors.h lfs_alloc.c
lfs_balloc.c lfs_bio.c lfs_debug.c lfs_extern.h lfs_inode.c
lfs_inode.h lfs_pages.c lfs_rename.c lfs_segment.c lfs_subr.c
lfs_vfsops.c lfs_vnops.c
src/usr.sbin/dumplfs [netbsd-9]: dumplfs.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1050):
sys/ufs/lfs/lfs_subr.c: revision 1.101
sys/ufs/lfs/lfs_subr.c: revision 1.102
sys/ufs/lfs/lfs_inode.c: revision 1.158
sys/ufs/lfs/lfs_inode.h: revision 1.25
sys/ufs/lfs/lfs_balloc.c: revision 1.95
sys/ufs/lfs/lfs_pages.c: revision 1.21
sys/ufs/lfs/lfs_vnops.c: revision 1.330
sys/ufs/lfs/lfs_alloc.c: revision 1.140 (patch)
sys/ufs/lfs/lfs_alloc.c: revision 1.141 (patch)
lib/libp2k/p2k.c: revision 1.72
sys/ufs/lfs/lfs.h: revision 1.205
sys/ufs/lfs/lfs.h: revision 1.206
sys/ufs/lfs/lfs_segment.c: revision 1.284
sys/ufs/lfs/lfs.h: revision 1.207
sys/ufs/lfs/lfs_segment.c: revision 1.285
sys/ufs/lfs/lfs_debug.c: revision 1.55
sys/ufs/lfs/lfs_rename.c: revision 1.23
usr.sbin/dumplfs/dumplfs.c: revision 1.65
sys/ufs/lfs/lfs_vfsops.c: revision 1.371
sys/arch/i386/stand/efiboot/bootx64/Makefile: revision 1.3
sys/ufs/lfs/lfs_vfsops.c: revision 1.372
sys/ufs/lfs/lfs_vfsops.c: revision 1.373
sbin/fsck_lfs/pass1.c: revision 1.46
sys/ufs/lfs/lfs_vnops.c: revision 1.326
sys/ufs/lfs/lfs_vnops.c: revision 1.327
sys/ufs/lfs/lfs_vfsops.c: revision 1.375 (patch)
sys/ufs/lfs/lfs_vnops.c: revision 1.328
sys/ufs/lfs/lfs_subr.c: revision 1.98
sys/ufs/lfs/lfs_extern.h: revision 1.116
sys/ufs/lfs/lfs_vnops.c: revision 1.329
sys/ufs/lfs/lfs_subr.c: revision 1.99
sys/ufs/lfs/lfs_extern.h: revision 1.117
sys/ufs/lfs/lfs_accessors.h: revision 1.49
sys/ufs/lfs/lfs_extern.h: revision 1.118
sys/rump/fs/lib/liblfs/Makefile: revision 1.15
sys/ufs/lfs/lfs_bio.c: revision 1.146 (patch)
sys/ufs/lfs/lfs_bio.c: revision 1.147
sys/ufs/lfs/lfs_subr.c: revision 1.100
Fix kassert in lfs by initializing vp first.
Use a marker node to iterate lfs_dchainhd / i_lfs_dchain.
I believe elements can be removed while the lock is dropped,
including the next node we're hanging on to.
Just use VOP_BWRITE for lfs_bwrite_log.
Hope this doesn't cause trouble with vfs_suspend.
Teach lfs to transition ro<->rw.
Prevent new dirops while we issue lfs_flush_dirops.
lfs_flush_dirops assumes (by KASSERT((ip->i_state & IN_ADIROP) == 0))
that vnodes on the dchain will not become involved in active dirops
even while holding no other locks (lfs_lock, v_interlock), so we must
set lfs_writer here. All other callers already set lfs_writer.
We set fs->lfs_writer++ without explicitly doing lfs_writer_enter
because
(a) we already waited for the dirops to drain, and
(b) we hold lfs_lock and cannot drop it before setting lfs_writer.
Assert lfs_writer where I think we can now prove it.
Serialize access to the splay tree with lfs_lock.
Change some cheap KDASSERT into KASSERT.
Take a reference and fix assertions in lfs_flush_dirops.
Fixes panic:
KASSERT((ip->i_state & IN_ADIROP) == 0) at lfs_vnops.c:1670
lfs_flush_dirops
lfs_check
lfs_setattr
VOP_SETATTR
change_mode
sys_fchmod
syscall
This assertion -- and the assertion that vp->v_uflag has VU_DIROP set
-- is valid only until we release lfs_lock, because we may race with
lfs_unmark_dirop which will remove the nodes and change the flags.
Further, vp itself is valid only as long as it is referenced, which it
is as long as it's on the dchain, but lfs_unmark_dirop drops the
dchain's reference.
Don't lfs_writer_enter while holding v_interlock.
There's no need to lfs_writer_enter at all here, as far as I can see.
lfs_flush_fs will do it for us.
Break deadlock in PR kern/52301.
The lock order is lfs_writer -> lfs_seglock. The problem in 52301 is
that lfs_segwrite violates this lock order by sometimes doing
lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b),
opportunistically, when there are no dirops pending. Both cases can
deadlock, because dirops sometimes take the seglock (lfs_truncate,
lfs_valloc, lfs_vfree):
(a) There may be dirops pending, and they may be waiting for the
seglock, so we can't wait for them to complete while holding the
seglock.
(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state
may change by the time lfs_writer_enter acquires lfs_lock.
To resolve this in each case:
(a) Do lfs_writer_enter before lfs_seglock, since we will need it
unconditionally anyway. The worst performance impact of this should
be that some dirops get delayed a little bit.
(b) Create a new lfs_writer_tryenter to use at this point so that the
test for fs->lfs_dirops == 0 and the acquisition of lfs_writer happen
atomically under lfs_lock.
Initialize/destroy lfs_allclean_wakeup in modcmd, not lfs_mountfs.
Fixes reloading lfs.kmod.
In lfs_update, hold lfs_writer around lfs_vflush.
Otherwise, we might do
lfs_vflush
-> lfs_seglock
-> lfs_segwait(SEGM_CKP)
-> lfs_writer_enter
which is the reverse of the lfs_writer -> lfs_seglock ordering.
Call lfs_orphan in lfs_rename while we're still in the dirop.
lfs_writer_enter can't fail; keep it simple and don't pretend it can.
Assert that mtsleep can't fail either -- it doesn't catch signals and
there's no timeout.
Teach LFS_ORPHAN_NEXTFREE about lfs64.
Dust off the orphan detection code and try to make it work.
Fix !DIAGNOSTIC compile
Fix userland references to LFS_ORPHAN_NEXTFREE.
Forgot to grep for these or do a full distribution build, oops!
Fix missing <sys/evcnt.h> by removing the evcnts instead.
Just wanted to confirm that a race might happen, and indeed it did.
These serve little diagnostic value otherwise.
OR into bp->b_cflags; don't overwrite.
CTASSERT lfs on-disk structure sizes.
Avoid misaligned access to lfs64 on-disk records in memory.
lfs64 directory entries are only 32-bit aligned in order to conserve
space in directory blocks, and we had a hack to stuff a 64-bit inode
in them. This replaces the hack by __aligned(4) __packed, and goes
further:
1. It's not clear that all the other lfs64 data structures are 64-bit
aligned on disk to begin with. We can go through these later and
upgrade them from
struct foo64 {
...
} __aligned(4) __packed;
union foo {
struct foo64 f64;
...
};
to
struct foo64 {
...
};
union foo {
struct foo64 f64 __aligned(8);
...
} __aligned(4) __packed;
if we really want to take advantage of 64-bit memory accesses.
However, the __aligned(4) __packed must remain on the union
because:
2. We access even the lfs32 data structures via a union that has
lfs64 members, and it turns out that compilers will assume access
through a union with 64-bit aligned members implies the whole
union has 64-bit alignment, even if we're only accessing a 32-bit
aligned member.
Fix clang build after packed lfs64 accessor change.
Suppress spurious address-of-packed error in rump lfs too.
To generate a diff of this commit:
cvs rdiff -u -r1.70 -r1.70.14.1 src/lib/libp2k/p2k.c
cvs rdiff -u -r1.45 -r1.45.18.1 src/sbin/fsck_lfs/pass1.c
cvs rdiff -u -r1.1.26.1 -r1.1.26.2 \
src/sys/arch/i386/stand/efiboot/bootx64/Makefile
cvs rdiff -u -r1.14 -r1.14.22.1 src/sys/rump/fs/lib/liblfs/Makefile
cvs rdiff -u -r1.204 -r1.204.4.1 src/sys/ufs/lfs/lfs.h
cvs rdiff -u -r1.48 -r1.48.12.1 src/sys/ufs/lfs/lfs_accessors.h
cvs rdiff -u -r1.137 -r1.137.8.1 src/sys/ufs/lfs/lfs_alloc.c
cvs rdiff -u -r1.94 -r1.94.10.1 src/sys/ufs/lfs/lfs_balloc.c
cvs rdiff -u -r1.142 -r1.142.6.1 src/sys/ufs/lfs/lfs_bio.c
cvs rdiff -u -r1.54 -r1.54.22.1 src/sys/ufs/lfs/lfs_debug.c
cvs rdiff -u -r1.114 -r1.114.4.1 src/sys/ufs/lfs/lfs_extern.h
cvs rdiff -u -r1.157 -r1.157.10.1 src/sys/ufs/lfs/lfs_inode.c
cvs rdiff -u -r1.23 -r1.23.10.1 src/sys/ufs/lfs/lfs_inode.h
cvs rdiff -u -r1.15 -r1.15.8.1 src/sys/ufs/lfs/lfs_pages.c
cvs rdiff -u -r1.22 -r1.22.10.1 src/sys/ufs/lfs/lfs_rename.c
cvs rdiff -u -r1.278 -r1.278.4.1 src/sys/ufs/lfs/lfs_segment.c
cvs rdiff -u -r1.97 -r1.97.8.1 src/sys/ufs/lfs/lfs_subr.c
cvs rdiff -u -r1.365 -r1.365.2.1 src/sys/ufs/lfs/lfs_vfsops.c
cvs rdiff -u -r1.324 -r1.324.2.1 src/sys/ufs/lfs/lfs_vnops.c
cvs rdiff -u -r1.64 -r1.64.4.1 src/usr.sbin/dumplfs/dumplfs.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 20 Jun 2021 22:24:42 +0000
State-Changed-Why:
riastradh fixed this properly some time back, and while lfs dirops still
need to be redesigned, keeping this PR open (especially in feedback) doesn't
do anything towards making that happen.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.