NetBSD Problem Report #40562

From yamt@mwd.biglobe.ne.jp  Fri Feb  6 00:06:07 2009
Return-Path: <yamt@mwd.biglobe.ne.jp>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id B211063C07D
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  6 Feb 2009 00:06:07 +0000 (UTC)
Message-Id: <20090206000602.DB7A811704@yamt.dyndns.org>
Date: Fri,  6 Feb 2009 09:06:02 +0900 (JST)
From: yamt@mwd.biglobe.ne.jp
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@gnats.NetBSD.org
Subject: busy loop in ffs_sync when unmounting a file system
X-Send-Pr-Version: 3.95

>Number:         40562
>Category:       kern
>Synopsis:       busy loop in ffs_sync when unmounting a file system
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Feb 06 00:10:00 +0000 2009
>Closed-Date:    
>Last-Modified:  Mon Apr 09 05:57:35 +0000 2012
>Originator:     YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release:        NetBSD 5.99.7
>Organization:

>Environment:


	amd64
>Description:
	it seems that ffs_fsync failed to write out dirty blocks
	for VBLK if the file system which the VBLK special file is on
	is mounted with wapbl.  it causes ffs_sync busy-loop with calling
	VOP_FSYNC on devvp.

>How-To-Repeat:
	make /dev an ffs with logging, run the following,
	and see that umount takes too long.

	rm foo
	dd if=/dev/zero of=foo bs=1m count=16
	vnconfig vnd0 foo
	newfs -F /dev/rvnd0d
	mount /dev/vnd0d /mnt
	touch /mnt/0
	umount /dev/vnd0d

>Fix:
	ensure that ffs_fsync do vflushbuf equivalent for VBLK.

	i'd suggest:

	- ensure that all VOP_FSYNC implementations call VFS_FSYNC
	  for VBLK if a file system is mounted on it.
	- make ffs VFS_FSYNC have its own function, say, ffs_vfs_fsync,
	  rather than sharing ffs_full_fsync or ffs_fsync.
	- move the softdep/wapbl VBLK handling for a mounted file system
	  from ffs_full_fsync to ffs_vfs_fsync.
	- kill FSYNC_VFS.

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->ad
Responsible-Changed-By: ad@NetBSD.org
Responsible-Changed-When: Fri, 06 Feb 2009 09:58:23 +0000
Responsible-Changed-Why:
i'm (still) working on this


State-Changed-From-To: open->feedback
State-Changed-By: ad@NetBSD.org
State-Changed-When: Sun, 22 Feb 2009 20:13:38 +0000
State-Changed-Why:
fixed
last time i looked i didn't see missing calls to VFS_FSYNC()
did you?


From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/40562 CVS commit: src/sys
Date: Sun, 22 Feb 2009 20:10:25 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Sun Feb 22 20:10:25 UTC 2009

 Modified Files:
 	src/sys/kern: vfs_wapbl.c
 	src/sys/miscfs/syncfs: sync_subr.c sync_vnops.c
 	src/sys/ufs/ffs: ffs_alloc.c ffs_vfsops.c ffs_vnops.c

 Log Message:
 PR kern/39564 wapbl performance issues with disk cache flushing
 PR kern/40361 WAPBL locking panic in -current
 PR kern/40361 WAPBL locking panic in -current
 PR kern/40470 WAPBL corrupts ext2fs
 PR kern/40562 busy loop in ffs_sync when unmounting a file system
 PR kern/40525 panic: ffs_valloc: dup alloc

 - A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
   buffers being invalidated. Problem discovered and patch by dholland@.

 - If the syncer fails to lazily sync a vnode due to lock contention,
   retry 1 second later instead of 30 seconds later.

 - Flush inode atime updates every ~10 seconds (this makes most sense with
   logging). Presently they didn't hit the disk for read-only files or
   devices until the file system was unmounted. It would be better to trickle
   the updates out but that would require more extensive changes.

 - Fix issues with file system corruption, busy looping and other nasty
   problems when logging and non-logging file systems are intermixed,
   with one being the root file system.

 - For logging, do not flush metadata on an inode-at-a-time basis if the sync
   has been requested by ioflush. Previously, we could try hundreds of log
   sync operations a second due to inode update activity, causing the syncer
   to fall behind and metadata updates to be serialized across the entire
   file system. Instead, burst out metadata and log flushes at a minimum
   interval of every 10 seconds on an active file system (happens more often
   if the log becomes full). Note this does not change the operation of
   fsync() etc.

 - With the flush issue fixed, re-enable concurrent metadata updates in
   vfs_wapbl.c.


 To generate a diff of this commit:
 cvs rdiff -r1.22 -r1.23 src/sys/kern/vfs_wapbl.c
 cvs rdiff -r1.35 -r1.36 src/sys/miscfs/syncfs/sync_subr.c
 cvs rdiff -r1.25 -r1.26 src/sys/miscfs/syncfs/sync_vnops.c
 cvs rdiff -r1.120 -r1.121 src/sys/ufs/ffs/ffs_alloc.c
 cvs rdiff -r1.241 -r1.242 src/sys/ufs/ffs/ffs_vfsops.c
 cvs rdiff -r1.109 -r1.110 src/sys/ufs/ffs/ffs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
To: gnats-bugs@NetBSD.org
Cc: ad@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
	ad@NetBSD.org
Subject: Re: kern/40562 (busy loop in ffs_sync when unmounting a file system)
Date: Mon, 23 Feb 2009 08:54:55 +0900 (JST)

 > Synopsis: busy loop in ffs_sync when unmounting a file system
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: ad@NetBSD.org
 > State-Changed-When: Sun, 22 Feb 2009 20:13:38 +0000
 > State-Changed-Why:
 > fixed
 > last time i looked i didn't see missing calls to VFS_FSYNC()
 > did you?

 at least ffs is missing the call.
 i haven't checked other filesystems.

 YAMAMOTO Takashi

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/40562 CVS commit: [netbsd-5] src/sys
Date: Tue, 24 Feb 2009 04:13:35 +0000 (UTC)

 Module Name:	src
 Committed By:	snj
 Date:		Tue Feb 24 04:13:35 UTC 2009

 Modified Files:
 	src/sys/kern [netbsd-5]: vfs_wapbl.c
 	src/sys/miscfs/syncfs [netbsd-5]: sync_subr.c sync_vnops.c
 	src/sys/ufs/ffs [netbsd-5]: ffs_alloc.c ffs_vfsops.c ffs_vnops.c

 Log Message:
 Pull up following revision(s) (requested by ad in ticket #490):
 	sys/kern/vfs_wapbl.c: revision 1.23
 	sys/miscfs/syncfs/sync_subr.c: revision 1.36
 	sys/miscfs/syncfs/sync_vnops.c: revision 1.26
 	sys/ufs/ffs/ffs_alloc.c: revision 1.121
 	sys/ufs/ffs/ffs_vfsops.c: revision 1.242
 	sys/ufs/ffs/ffs_vnops.c: revision 1.110
 PR kern/39564 wapbl performance issues with disk cache flushing
 PR kern/40361 WAPBL locking panic in -current
 PR kern/40361 WAPBL locking panic in -current
 PR kern/40470 WAPBL corrupts ext2fs
 PR kern/40562 busy loop in ffs_sync when unmounting a file system
 PR kern/40525 panic: ffs_valloc: dup alloc
 - A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
   buffers being invalidated. Problem discovered and patch by dholland@.
 - If the syncer fails to lazily sync a vnode due to lock contention,
   retry 1 second later instead of 30 seconds later.
 - Flush inode atime updates every ~10 seconds (this makes most sense with
   logging). Presently they didn't hit the disk for read-only files or
   devices until the file system was unmounted. It would be better to trickle
   the updates out but that would require more extensive changes.
 - Fix issues with file system corruption, busy looping and other nasty
   problems when logging and non-logging file systems are intermixed,
   with one being the root file system.
 - For logging, do not flush metadata on an inode-at-a-time basis if the sync
   has been requested by ioflush. Previously, we could try hundreds of log
   sync operations a second due to inode update activity, causing the syncer
   to fall behind and metadata updates to be serialized across the entire
   file system. Instead, burst out metadata and log flushes at a minimum
   interval of every 10 seconds on an active file system (happens more often
   if the log becomes full). Note this does not change the operation of
   fsync() etc.
 - With the flush issue fixed, re-enable concurrent metadata updates in
   vfs_wapbl.c.


 To generate a diff of this commit:
 cvs rdiff -r1.3 -r1.3.8.1 src/sys/kern/vfs_wapbl.c
 cvs rdiff -r1.34 -r1.34.20.1 src/sys/miscfs/syncfs/sync_subr.c
 cvs rdiff -r1.25 -r1.25.10.1 src/sys/miscfs/syncfs/sync_vnops.c
 cvs rdiff -r1.113 -r1.113.4.1 src/sys/ufs/ffs/ffs_alloc.c
 cvs rdiff -r1.239 -r1.239.2.1 src/sys/ufs/ffs/ffs_vfsops.c
 cvs rdiff -r1.104.4.5 -r1.104.4.6 src/sys/ufs/ffs/ffs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 15 Nov 2009 02:26:37 +0000
State-Changed-Why:
feedback was received in February


Responsible-Changed-From-To: ad->kern-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Mon, 09 Apr 2012 05:57:35 +0000
Responsible-Changed-Why:
ad resigned, should not own PRs any more


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.