NetBSD Problem Report #41189

From root@xen-2.its.iastate.edu  Sat Apr 11 19:09:55 2009
Return-Path: <root@xen-2.its.iastate.edu>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 87C1063C166
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 11 Apr 2009 19:09:55 +0000 (UTC)
Message-Id: <20090411190953.BE5E4ABC27@xen-2.its.iastate.edu>
Date: Sat, 11 Apr 2009 14:09:53 -0500 (CDT)
From: jdwhite@iastate.edu
Reply-To: jdwhite@iastate.edu
To: gnats-bugs@gnats.NetBSD.org
Subject: kernel panic xen dom0 using mke2fs & WAPBL
X-Send-Pr-Version: 3.95

>Number:         41189
>Category:       kern
>Synopsis:       kernel panic xen dom0 using mke2fs & WAPBL
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Apr 11 19:10:00 +0000 2009
>Closed-Date:    Sat Dec 19 02:32:24 +0000 2015
>Last-Modified:  Sat Dec 19 02:32:24 +0000 2015
>Originator:     Jason White
>Release:        NetBSD 5.0_RC3
>Organization:
Iowa State University
>Environment:
System: NetBSD xen-2.its.iastate.edu 5.0_RC3 NetBSD 5.0_RC3 (XEN3_DOM0) #0: Fri Mar 20 07:11:29 UTC 2009 builds@b6.netbsd.org:/home/builds/ab/netbsd-5-0-RC3/amd64/200903200521Z-obj/home/builds/ab/netbsd-5-0-RC3/src/sys/arch/amd64/compile/XEN3_DOM0 amd64
Architecture: x86_64
Machine: amd64
>Description:
When the root filesystem is mounted with the 'log' option (which enables
WAPBL), running mke2fs from sysutils/e2fsprogs panics the kernel:

panic: kernel diagnostic assertion "LIST_EMPTY(&vp->v_dirtyblkhd)" failed: file "/home/builds/ab/netbsd-5-0-RC3/src/sys/kern/vfs_subr.c", line 872
Begin traceback...
copyright() at 0xffffffff808bd18c
End traceback...
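
The assertion says that a vnode's dirty-buffer list must be empty at the point of the check. A minimal user-space sketch of that invariant follows, assuming the check runs after a sync step; every toy_* name is hypothetical and none of this is kernel code. It shows one way such a check can fail: the sync routine leaves buffers queued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct buf / struct vnode. */
struct toy_buf {
    struct toy_buf *next;
};

struct toy_vnode {
    struct toy_buf *dirty_head;   /* plays the role of v_dirtyblkhd */
};

/* Queue a buffer on the vnode's dirty list. */
void toy_mark_dirty(struct toy_vnode *vp, struct toy_buf *bp)
{
    bp->next = vp->dirty_head;
    vp->dirty_head = bp;
}

/* A "full" sync writes out every dirty buffer, leaving the list empty. */
void toy_full_fsync(struct toy_vnode *vp)
{
    while (vp->dirty_head != NULL) {
        struct toy_buf *bp = vp->dirty_head;
        vp->dirty_head = bp->next;
        bp->next = NULL;
    }
}

/* A sync routine that skips buffers (here: all of them) leaves them queued. */
void toy_partial_fsync(struct toy_vnode *vp)
{
    (void)vp;
}

/*
 * Sync, then require an empty dirty list.
 * Returns false where a kernel assertion of this shape would panic.
 */
bool toy_invalidate(struct toy_vnode *vp, void (*sync)(struct toy_vnode *))
{
    sync(vp);
    return vp->dirty_head == NULL;
}
```

Calling toy_invalidate() with toy_partial_fsync on a vnode that has queued buffers returns false; with toy_full_fsync it returns true.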

>How-To-Repeat:
Mount the root filesystem with the 'log' option to enable WAPBL, then use
mke2fs to create an ext2 filesystem.

>Fix:
Workaround: mount the root filesystem without the 'log' option.

>Release-Note:

>Audit-Trail:
From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Tue, 14 Apr 2009 14:14:07 -0500

 Just tried netbsd-5 200904130000Z and the problem persists.

 -Jason

 -- 
 Jason White
 Systems Analyst
 Information Technology Services
 Iowa State University

From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Thu, 16 Apr 2009 16:47:18 -0500

 # uname -a
 NetBSD xen-2.its.iastate.edu 5.0_RC4 NetBSD 5.0_RC4 (XEN3_DOM0) #0: Wed Apr 15 23:03:24 PDT 2009  builds@wb28:/home/builds/ab/netbsd-5/amd64/200904150002Z-obj/home/builds/ab/netbsd-5/src/sys/arch/amd64/compile/XEN3_DOM0 amd64

 # mke2fs -V
 mke2fs 1.40.7 (28-Feb-2008)
         Using EXT2FS Library version 1.40.7

 # mke2fs -v -I 128 /dev/sd0m
 [...]
 panic: kernel diagnostic assertion "LIST_EMPTY(&vp->v_dirtyblkhd)" failed: file "/home/builds/ab/netbsd-5/src/sys/kern/vfs_subr.c", line 872
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff804bfded cs e030 rflags 246 cr2 7f7ffdfdc000 cpl 0 rsp ffffa0001e8aa960
 Stopped in pid 66.1 (mke2fs) at netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x242
 __kernassert() at netbsd:__kernassert+0x2d
 vinvalbuf() at netbsd:vinvalbuf+0x206
 spec_close() at netbsd:spec_close+0x8a
 VOP_CLOSE() at netbsd:VOP_CLOSE+0x29
 vn_close() at netbsd:vn_close+0x51
 closef() at netbsd:closef+0x68
 fd_close() at netbsd:fd_close+0x134
 syscall() at netbsd:syscall+0xb4
 ds          0xa970
 es          0x121c
 fs          0xa970
 gs          0x12f7
 rdi         0
 rsi         0x1
 rbp         0xffffa0001e8aa960
 rbx         0xffffa0001e8aa970
 rdx         0
 rcx         0
 rax         0x1
 r8          0xffffffff80b56000  cpu_info_primary
 r9          0x1
 r10         0xffffa0001e8aa880
 r11         0xffffffff804fd2b0  xenconscn_putc
 r12         0x104
 r13         0xffffffff809f62d8
 r14         0xffffa0001e8aab20
 r15         0
 rip         0xffffffff804bfded  breakpoint+0x5
 cs          0xe030
 rflags      0x246
 rsp         0xffffa0001e8aa960
 ss          0xe02b
 netbsd:breakpoint+0x5:  leave
 db> show uvm
 Current UVM status:
   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
   60079 VM pages: 12663 active, 0 inactive, 265 wired, 2068 free
   pages  7293 anon, 3389 file, 2246 exec
   freemin=256, free-target=341, wired-max=20026
   faults=55481, traps=54788, intrs=111349, ctxswitch=56744
   softint=47907, syscalls=157734, swapins=0, swapouts=0
   fault counts:
     noram=0, noanon=0, pgwait=0, pgrele=0
     ok relocks(total)=814(814), anget(retrys)=16556(0), amapcopy=5401
     neighbor anon/obj pg=13847/78621, gets(lock/unlock)=22813/814
     cases: anon=8604, anoncow=7942, obj=18376, prcopy=4437, przero=13214
   daemon and swap counts:
     woke=0, revs=0, scans=0, obscans=0, anscans=0
     busy=0, freed=0, reactivate=0, deactivate=0
     pageouts=0, pending=0, nswget=0
     nswapdev=1, swpgavail=131071
     swpages=131071, swpginuse=0, swpgonly=0, paging=0
 db> show event
 evcnt type 0: vmcmd kills = 203
 evcnt type 0: vmcmd extends = 6
 evcnt type 0: vmcmd calls = 1722
 evcnt type 0: bus_dma bounces = 92383
 evcnt type 0: bus_dma loads = 181023
 evcnt type 0: bus_dma nbouncebufs = 3104
 evcnt type 0: softint net/0 = 10311
 evcnt type 0: softint bio/0 = 17963
 evcnt type 0: softint clk/0 = 3667
 evcnt type 0: softint ser/0 = 786
 evcnt type 0: callout late/0 = 15
 evcnt type 0: crosscall unicast = 3
 evcnt type 0: namecache entries collected = 185
 evcnt type 0: namecache under scan target = 151
 evcnt type 1: vcpu0 xencons = 1
 evcnt type 1: vcpu0 ioapic0 pin 16 = 2593
 evcnt type 1: vcpu0 ioapic1 pin 14 = 88838
 evcnt type 1: vcpu0 ioapic0 pin 21 = 19
 evcnt type 1: vcpu0 ioapic0 = 8
 evcnt type 1: vcpu0 clock = 17389
 evcnt type 1: vcpu0 xenbus = 1046
 evcnt type 1: vcpu0 xbd1.1 = 1341
 evcnt type 1: vcpu0 xvif1.0 = 114
 db> show all pool
 POOL CACHEfatal protection fault in supervisor mode
 trap type 4 code 0 rip ffffffff806c025a cs e030 rflags 10202 cr2  
 7f7ffdfdc000 cpl 8 rsp ffffa0001e8aa2f8
 kernel: protection fault trap, code=0
 Faulted in DDB; continuing...
 db>

From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Thu, 16 Apr 2009 17:09:50 -0500

 # uname -a
 NetBSD xen-2.its.iastate.edu 5.0_RC4 NetBSD 5.0_RC4 (XEN3_DOM0) #0: Wed Apr 15 23:03:24 PDT 2009  builds@wb28:/home/builds/ab/netbsd-5/amd64/200904150002Z-obj/home/builds/ab/netbsd-5/src/sys/arch/amd64/compile/XEN3_DOM0 amd64

 Another panic.  This time I was unmounting an ext2 filesystem, so the 
 problem isn't with mke2fs, but seemingly with ext2 filesystems in 
 general.

 panic: kernel diagnostic assertion "LIST_EMPTY(&vp->v_dirtyblkhd)" failed: file "/home/builds/ab/netbsd-5/src/sys/kern/vfs_subr.c", line 872
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff804bfded cs e030 rflags 246 cr2 7f7ffda04000 cpl 0 rsp ffffa0001e8c7900
 Stopped in pid 752.1 (umount) at        netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x242
 __kernassert() at netbsd:__kernassert+0x2d
 vinvalbuf() at netbsd:vinvalbuf+0x206
 spec_close() at netbsd:spec_close+0x8a
 VOP_CLOSE() at netbsd:VOP_CLOSE+0x29
 ext2fs_unmount() at netbsd:ext2fs_unmount+0xa3
 dounmount() at netbsd:dounmount+0xd5
 sys_unmount() at netbsd:sys_unmount+0x11c
 syscall() at netbsd:syscall+0xb4
 ds          0x7910
 es          0x121c
 fs          0x7910
 gs          0x12f7
 rdi         0
 rsi         0x1
 rbp         0xffffa0001e8c7900
 rbx         0xffffa0001e8c7910
 rdx         0
 rcx         0
 rax         0x1
 r8          0xffffffff80b56000  cpu_info_primary
 r9          0x1
 r10         0xffffa0001e8c7820
 r11         0xffffffff804fd2b0  xenconscn_putc
 r12         0x104
 r13         0xffffffff809f62d8
 r14         0xffffa0001e8c7ac0
 r15         0
 rip         0xffffffff804bfded  breakpoint+0x5
 cs          0xe030
 rflags      0x246
 rsp         0xffffa0001e8c7900
 ss          0xe02b
 netbsd:breakpoint+0x5:  leave

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Thu, 16 Apr 2009 23:56:46 -0400

 On Apr 16, 10:10pm, jdwhite@iastate.edu (Jason White) wrote:
 -- Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL

 | The following reply was made to PR kern/41189; it has been noted by GNATS.
 | 
 | From: Jason White <jdwhite@iastate.edu>
 | To: gnats-bugs@NetBSD.org
 | Cc: 
 | Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
 | Date: Thu, 16 Apr 2009 17:09:50 -0500
 | 
 |  [...]

 I may not know what I am talking about, but could something like this
 be the issue?

 Index: ext2fs/ext2fs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ext2fs/ext2fs_vnops.c,v
 retrieving revision 1.83
 diff -u -u -r1.83 ext2fs_vnops.c
 --- ext2fs/ext2fs_vnops.c	23 Nov 2008 10:09:25 -0000	1.83
 +++ ext2fs/ext2fs_vnops.c	17 Apr 2009 03:54:58 -0000
 @@ -1349,6 +1349,10 @@
  	int wait;
  	int error;

 +	if ((ap->a_offlo == 0 && ap->a_offhi == 0) || (vp->v_type != VREG)) {
 +		error = ufs_full_fsync(vp, ap->a_flags, ext2fs_update);
 +		goto out;
 +	}
  	wait = (ap->a_flags & FSYNC_WAIT) != 0;

  	if (vp->v_type == VBLK)
 @@ -1365,7 +1369,7 @@
  		error = VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &l, FWRITE,
  		    curlwp->l_cred);
  	}
 -
 +out:
  	return error;
  }

 Index: ffs/ffs_extern.h
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ffs/ffs_extern.h,v
 retrieving revision 1.75
 diff -u -u -r1.75 ffs_extern.h
 --- ffs/ffs_extern.h	22 Feb 2009 20:28:06 -0000	1.75
 +++ ffs/ffs_extern.h	17 Apr 2009 03:54:58 -0000
 @@ -138,7 +138,6 @@
  int	ffs_lock(void *);
  int	ffs_unlock(void *);
  int	ffs_islocked(void *);
 -int	ffs_full_fsync(struct vnode *, int);

  /*
   * Snapshot function prototypes.
 Index: ffs/ffs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vnops.c,v
 retrieving revision 1.112
 diff -u -u -r1.112 ffs_vnops.c
 --- ffs/ffs_vnops.c	29 Mar 2009 10:29:00 -0000	1.112
 +++ ffs/ffs_vnops.c	17 Apr 2009 03:54:58 -0000
 @@ -290,7 +290,7 @@

  	fstrans_start(vp->v_mount, FSTRANS_LAZY);
  	if ((ap->a_offlo == 0 && ap->a_offhi == 0) || (vp->v_type != VREG)) {
 -		error = ffs_full_fsync(vp, ap->a_flags);
 +		error = ufs_full_fsync(vp, ap->a_flags, ffs_update);
  		goto out;
  	}

 @@ -394,179 +394,6 @@
  }

  /*
 - * Synch an open file.  Called for VOP_FSYNC().
 - */
 -/* ARGSUSED */
 -int
 -ffs_full_fsync(struct vnode *vp, int flags)
 -{
 -	struct buf *bp, *nbp;
 -	int error, passes, skipmeta, waitfor, i;
 -	struct mount *mp;
 -
 -	KASSERT(VTOI(vp) != NULL);
 -	KASSERT(vp->v_tag == VT_UFS);
 -
 -	error = 0;
 -
 -	mp = vp->v_mount;
 -	if (vp->v_type == VBLK && vp->v_specmountpoint != NULL) {
 -		mp = vp->v_specmountpoint;
 -	} else {
 -		mp = vp->v_mount;
 -	}
 -
 -	/*
 -	 * Flush all dirty data associated with the vnode.
 -	 */
 -	if (vp->v_type == VREG || vp->v_type == VBLK) {
 -		int pflags = PGO_ALLPAGES | PGO_CLEANIT;
 -
 -		if ((flags & FSYNC_WAIT))
 -			pflags |= PGO_SYNCIO;
 -		if (vp->v_type == VREG &&
 -		    fstrans_getstate(mp) == FSTRANS_SUSPENDING)
 -			pflags |= PGO_FREE;
 -		mutex_enter(&vp->v_interlock);
 -		error = VOP_PUTPAGES(vp, 0, 0, pflags);
 -		if (error)
 -			return error;
 -	}
 -
 -#ifdef WAPBL
 -	mp = wapbl_vptomp(vp);
 -	if (mp && mp->mnt_wapbl) {
 -		/*
 -		 * Don't bother writing out metadata if the syncer is
 -		 * making the request.  We will let the sync vnode
 -		 * write it out in a single burst through a call to
 -		 * VFS_SYNC().
 -		 */
 -		if ((flags & (FSYNC_DATAONLY | FSYNC_LAZY)) != 0)
 -			return 0;
 -
 -		if ((VTOI(vp)->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE
 -		    | IN_MODIFY | IN_MODIFIED | IN_ACCESSED)) != 0) {
 -			error = UFS_WAPBL_BEGIN(mp);
 -			if (error)
 -				return error;
 -			error = ffs_update(vp, NULL, NULL, UPDATE_CLOSE |
 -			    ((flags & FSYNC_WAIT) ? UPDATE_WAIT : 0));
 -			UFS_WAPBL_END(mp);
 -		}
 -		if (error || (flags & FSYNC_NOLOG) != 0)
 -			return error;
 -
 -		/*
 -		 * Don't flush the log if the vnode being flushed
 -		 * contains no dirty buffers that could be in the log.
 -		 */
 -		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 -			error = wapbl_flush(mp->mnt_wapbl, 0);
 -			if (error)
 -				return error;
 -		}
 -
 -		if ((flags & FSYNC_WAIT) != 0) {
 -			mutex_enter(&vp->v_interlock);
 -			while (vp->v_numoutput != 0)
 -				cv_wait(&vp->v_cv, &vp->v_interlock);
 -			mutex_exit(&vp->v_interlock);
 -		}
 -
 -		return error;
 -	}
 -#endif /* WAPBL */
 -
 -	/*
 -	 * Write out metadata for non-logging file systems. XXX This block
 -	 * should be simplified now that softdep is gone.
 -	 */
 -	passes = NIADDR + 1;
 -	skipmeta = 0;
 -	if (flags & FSYNC_WAIT)
 -		skipmeta = 1;
 -
 -loop:
 -	mutex_enter(&bufcache_lock);
 -	LIST_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs) {
 -		bp->b_cflags &= ~BC_SCANNED;
 -	}
 -	for (bp = LIST_FIRST(&vp->v_dirtyblkhd); bp; bp = nbp) {
 -		nbp = LIST_NEXT(bp, b_vnbufs);
 -		if (bp->b_cflags & (BC_BUSY | BC_SCANNED))
 -			continue;
 -		if ((bp->b_oflags & BO_DELWRI) == 0)
 -			panic("ffs_fsync: not dirty");
 -		if (skipmeta && bp->b_lblkno < 0)
 -			continue;
 -		bp->b_cflags |= BC_BUSY | BC_VFLUSH | BC_SCANNED;
 -		mutex_exit(&bufcache_lock);
 -		/*
 -		 * On our final pass through, do all I/O synchronously
 -		 * so that we can find out if our flush is failing
 -		 * because of write errors.
 -		 */
 -		if (passes > 0 || !(flags & FSYNC_WAIT))
 -			(void) bawrite(bp);
 -		else if ((error = bwrite(bp)) != 0)
 -			return (error);
 -		/*
 -		 * Since we unlocked during the I/O, we need
 -		 * to start from a known point.
 -		 */
 -		mutex_enter(&bufcache_lock);
 -		nbp = LIST_FIRST(&vp->v_dirtyblkhd);
 -	}
 -	mutex_exit(&bufcache_lock);
 -	if (skipmeta) {
 -		skipmeta = 0;
 -		goto loop;
 -	}
 -
 -	if ((flags & FSYNC_WAIT) != 0) {
 -		mutex_enter(&vp->v_interlock);
 -		while (vp->v_numoutput) {
 -			cv_wait(&vp->v_cv, &vp->v_interlock);
 -		}
 -		mutex_exit(&vp->v_interlock);
 -
 -		/*
 -		 * Ensure that any filesystem metadata associated
 -		 * with the vnode has been written.
 -		 */
 -		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 -			/*
 -			* Block devices associated with filesystems may
 -			* have new I/O requests posted for them even if
 -			* the vnode is locked, so no amount of trying will
 -			* get them clean. Thus we give block devices a
 -			* good effort, then just give up. For all other file
 -			* types, go around and try again until it is clean.
 -			*/
 -			if (passes > 0) {
 -				passes--;
 -				goto loop;
 -			}
 -#ifdef DIAGNOSTIC
 -			if (vp->v_type != VBLK)
 -				vprint("ffs_fsync: dirty", vp);
 -#endif
 -		}
 -	}
 -
 -	waitfor = (flags & FSYNC_WAIT) ? UPDATE_WAIT : 0;
 -	error = ffs_update(vp, NULL, NULL, UPDATE_CLOSE | waitfor);
 -
 -	if (error == 0 && (flags & FSYNC_CACHE) != 0) {
 -		(void)VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &i, FWRITE,
 -		    kauth_cred_get());
 -	}
 -
 -	return error;
 -}
 -
 -/*
   * Reclaim an inode so that it can be used for other purposes.
   */
  int
 Index: ufs/ufs_extern.h
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ufs/ufs_extern.h,v
 retrieving revision 1.61
 diff -u -u -r1.61 ufs_extern.h
 --- ufs/ufs_extern.h	22 Feb 2009 20:28:07 -0000	1.61
 +++ ufs/ufs_extern.h	17 Apr 2009 03:54:58 -0000
 @@ -165,6 +165,9 @@
  		      struct componentname *);
  int	ufs_gop_alloc(struct vnode *, off_t, off_t, int, kauth_cred_t);
  void	ufs_gop_markupdate(struct vnode *, int);
 +int	ufs_full_fsync(struct vnode *, int, int (*)(struct vnode *,
 +    const struct timespec *, const struct timespec *, int));
 +

  /*
   * Snapshot function prototypes.
 Index: ufs/ufs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ufs/ufs_vnops.c,v
 retrieving revision 1.173
 diff -u -u -r1.173 ufs_vnops.c
 --- ufs/ufs_vnops.c	22 Feb 2009 20:28:07 -0000	1.173
 +++ ufs/ufs_vnops.c	17 Apr 2009 03:54:59 -0000
 @@ -2332,3 +2332,177 @@
  		ip->i_flag |= mask;
  	}
  }
 +
 +/*
 + * Synch an open file.  Called for VOP_FSYNC().
 + */
 +/* ARGSUSED */
 +int
 +ufs_full_fsync(struct vnode *vp, int flags, int (*update)(struct vnode *,
 +    const struct timespec *, const struct timespec *, int))
 +{
 +	struct buf *bp, *nbp;
 +	int error, passes, skipmeta, waitfor, i;
 +	struct mount *mp;
 +
 +	KASSERT(VTOI(vp) != NULL);
 +	KASSERT(vp->v_tag == VT_UFS);
 +
 +	error = 0;
 +
 +	mp = vp->v_mount;
 +	if (vp->v_type == VBLK && vp->v_specmountpoint != NULL) {
 +		mp = vp->v_specmountpoint;
 +	} else {
 +		mp = vp->v_mount;
 +	}
 +
 +	/*
 +	 * Flush all dirty data associated with the vnode.
 +	 */
 +	if (vp->v_type == VREG || vp->v_type == VBLK) {
 +		int pflags = PGO_ALLPAGES | PGO_CLEANIT;
 +
 +		if ((flags & FSYNC_WAIT))
 +			pflags |= PGO_SYNCIO;
 +		if (vp->v_type == VREG &&
 +		    fstrans_getstate(mp) == FSTRANS_SUSPENDING)
 +			pflags |= PGO_FREE;
 +		mutex_enter(&vp->v_interlock);
 +		error = VOP_PUTPAGES(vp, 0, 0, pflags);
 +		if (error)
 +			return error;
 +	}
 +
 +#ifdef WAPBL
 +	mp = wapbl_vptomp(vp);
 +	if (mp && mp->mnt_wapbl) {
 +		/*
 +		 * Don't bother writing out metadata if the syncer is
 +		 * making the request.  We will let the sync vnode
 +		 * write it out in a single burst through a call to
 +		 * VFS_SYNC().
 +		 */
 +		if ((flags & (FSYNC_DATAONLY | FSYNC_LAZY)) != 0)
 +			return 0;
 +
 +		if ((VTOI(vp)->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE
 +		    | IN_MODIFY | IN_MODIFIED | IN_ACCESSED)) != 0) {
 +			error = UFS_WAPBL_BEGIN(mp);
 +			if (error)
 +				return error;
 +			error = (*update)(vp, NULL, NULL, UPDATE_CLOSE |
 +			    ((flags & FSYNC_WAIT) ? UPDATE_WAIT : 0));
 +			UFS_WAPBL_END(mp);
 +		}
 +		if (error || (flags & FSYNC_NOLOG) != 0)
 +			return error;
 +
 +		/*
 +		 * Don't flush the log if the vnode being flushed
 +		 * contains no dirty buffers that could be in the log.
 +		 */
 +		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 +			error = wapbl_flush(mp->mnt_wapbl, 0);
 +			if (error)
 +				return error;
 +		}
 +
 +		if ((flags & FSYNC_WAIT) != 0) {
 +			mutex_enter(&vp->v_interlock);
 +			while (vp->v_numoutput != 0)
 +				cv_wait(&vp->v_cv, &vp->v_interlock);
 +			mutex_exit(&vp->v_interlock);
 +		}
 +
 +		return error;
 +	}
 +#endif /* WAPBL */
 +
 +	/*
 +	 * Write out metadata for non-logging file systems. XXX This block
 +	 * should be simplified now that softdep is gone.
 +	 */
 +	passes = NIADDR + 1;
 +	skipmeta = 0;
 +	if (flags & FSYNC_WAIT)
 +		skipmeta = 1;
 +
 +loop:
 +	mutex_enter(&bufcache_lock);
 +	LIST_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs) {
 +		bp->b_cflags &= ~BC_SCANNED;
 +	}
 +	for (bp = LIST_FIRST(&vp->v_dirtyblkhd); bp; bp = nbp) {
 +		nbp = LIST_NEXT(bp, b_vnbufs);
 +		if (bp->b_cflags & (BC_BUSY | BC_SCANNED))
 +			continue;
 +		if ((bp->b_oflags & BO_DELWRI) == 0)
 +			panic("ufs_fsync: not dirty");
 +		if (skipmeta && bp->b_lblkno < 0)
 +			continue;
 +		bp->b_cflags |= BC_BUSY | BC_VFLUSH | BC_SCANNED;
 +		mutex_exit(&bufcache_lock);
 +		/*
 +		 * On our final pass through, do all I/O synchronously
 +		 * so that we can find out if our flush is failing
 +		 * because of write errors.
 +		 */
 +		if (passes > 0 || !(flags & FSYNC_WAIT))
 +			(void) bawrite(bp);
 +		else if ((error = bwrite(bp)) != 0)
 +			return (error);
 +		/*
 +		 * Since we unlocked during the I/O, we need
 +		 * to start from a known point.
 +		 */
 +		mutex_enter(&bufcache_lock);
 +		nbp = LIST_FIRST(&vp->v_dirtyblkhd);
 +	}
 +	mutex_exit(&bufcache_lock);
 +	if (skipmeta) {
 +		skipmeta = 0;
 +		goto loop;
 +	}
 +
 +	if ((flags & FSYNC_WAIT) != 0) {
 +		mutex_enter(&vp->v_interlock);
 +		while (vp->v_numoutput) {
 +			cv_wait(&vp->v_cv, &vp->v_interlock);
 +		}
 +		mutex_exit(&vp->v_interlock);
 +
 +		/*
 +		 * Ensure that any filesystem metadata associated
 +		 * with the vnode has been written.
 +		 */
 +		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 +			/*
 +			* Block devices associated with filesystems may
 +			* have new I/O requests posted for them even if
 +			* the vnode is locked, so no amount of trying will
 +			* get them clean. Thus we give block devices a
 +			* good effort, then just give up. For all other file
 +			* types, go around and try again until it is clean.
 +			*/
 +			if (passes > 0) {
 +				passes--;
 +				goto loop;
 +			}
 +#ifdef DIAGNOSTIC
 +			if (vp->v_type != VBLK)
 +				vprint("ufs_fsync: dirty", vp);
 +#endif
 +		}
 +	}
 +
 +	waitfor = (flags & FSYNC_WAIT) ? UPDATE_WAIT : 0;
 +	error = (*update)(vp, NULL, NULL, UPDATE_CLOSE | waitfor);
 +
 +	if (error == 0 && (flags & FSYNC_CACHE) != 0) {
 +		(void)VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &i, FWRITE,
 +		    kauth_cred_get());
 +	}
 +
 +	return error;
 +}
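
The patch above moves the full-sync logic into one shared routine and lets each filesystem pass its own inode-update function as a pointer. A minimal user-space sketch of that callback pattern follows. All toy_* names are hypothetical, the callback drops the two timespec arguments for brevity, and the body stands in for the real flushing work.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vnode; only what the sketch needs. */
struct toy_vnode {
    int updates_ffs;      /* times the FFS-style update ran */
    int updates_ext2fs;   /* times the ext2fs-style update ran */
};

/* Same idea as the callback in the patch, minus the timespec arguments. */
typedef int (*toy_update_fn)(struct toy_vnode *vp, int flags);

int toy_ffs_update(struct toy_vnode *vp, int flags)
{
    (void)flags;
    vp->updates_ffs++;
    return 0;
}

int toy_ext2fs_update(struct toy_vnode *vp, int flags)
{
    (void)flags;
    vp->updates_ext2fs++;
    return 0;
}

/*
 * Shared routine: the common flushing would go here, followed by the
 * per-filesystem inode update supplied by the caller.
 */
int toy_full_fsync(struct toy_vnode *vp, int flags, toy_update_fn update)
{
    if (vp == NULL || update == NULL)
        return -1;
    return (*update)(vp, flags);
}
```

An FFS-style caller would pass toy_ffs_update and an ext2fs-style caller toy_ext2fs_update, so the shared body never needs to know which filesystem it is serving.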

From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 14:10:34 -0500

 Tried the above patch against today's 5.0RC4 sources:

 Right after I invoke mke2fs I get a panic:

 panic: kernel diagnostic assertion "rw_lock_held(&wl->wl_rwlock)" failed: file "/usr/src/sys/kern/vfs_wapbl.c", line 1540
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff804bfddd cs e030 rflags 246 cr2 40a5d0 cpl 0 rsp ffffa0001e828790
 Stopped in pid 521.1 (mke2fs) at        netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x242
 __kernassert() at netbsd:__kernassert+0x2d
 wapbl_add_buf() at netbsd:wapbl_add_buf+0x42
 bdwrite() at netbsd:bdwrite+0xc0
 ffs_update() at netbsd:ffs_update+0x1ef
 ufs_full_fsync() at netbsd:ufs_full_fsync+0x3a7
 ffs_fsync() at netbsd:ffs_fsync+0x64
 VOP_FSYNC() at netbsd:VOP_FSYNC+0x34
 vinvalbuf() at netbsd:vinvalbuf+0xf6
 spec_close() at netbsd:spec_close+0x8a
 VOP_CLOSE() at netbsd:VOP_CLOSE+0x29
 vn_close() at netbsd:vn_close+0x51
 closef() at netbsd:closef+0x68
 fd_close() at netbsd:fd_close+0x134
 syscall() at netbsd:syscall+0xb4
 ds          0x87a0
 es          0x121c
 fs          0x87a0
 gs          0x12f7
 rdi         0
 rsi         0x1
 rbp         0xffffa0001e828790
 rbx         0xffffa0001e8287a0
 rdx         0
 rcx         0
 rax         0x1
 r8          0xffffffff80b53700  cpu_info_primary
 r9          0x1
 r10         0xffffa0001e8286b0
 r11         0xffffffff804fd2b0  xenconscn_putc
 r12         0x104
 r13         0xffffffff809f39a0
 r14         0x5
 r15         0xffffa0001e1c2e58
 rip         0xffffffff804bfddd  breakpoint+0x5
 cs          0xe030
 rflags      0x246
 rsp         0xffffa0001e828790
 ss          0xe02b
 netbsd:breakpoint+0x5:  leave
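
The assertion in this trace requires the log's lock to be held when a buffer is added to the journal. In the patch, the update call is bracketed by UFS_WAPBL_BEGIN()/UFS_WAPBL_END() only inside the WAPBL branch; which call site produced this trace is not established in the thread. The sketch below is a toy model of the invariant only. It assumes that beginning a transaction takes the lock and ending it releases the lock, and all toy_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the journal state; not the kernel's struct wapbl. */
struct toy_log {
    bool locked;   /* plays the role of wl_rwlock being held */
    int  nbufs;    /* buffers recorded in the journal so far */
};

/* Begin a transaction: assumed to take the log lock. */
void toy_log_begin(struct toy_log *wl)
{
    wl->locked = true;
}

/* End a transaction: assumed to release the log lock. */
void toy_log_end(struct toy_log *wl)
{
    wl->locked = false;
}

/*
 * Record a buffer in the journal.
 * Returns false where a lock-held assertion would fire.
 */
bool toy_log_add_buf(struct toy_log *wl)
{
    if (!wl->locked)
        return false;
    wl->nbufs++;
    return true;
}
```

In this model, a write issued between toy_log_begin() and toy_log_end() is recorded, while the same write issued outside a transaction trips the check.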

 ------

 This panic happens when I unmount an ext2 partition:

 panic: kernel diagnostic assertion "vp->v_tag == VT_UFS" failed: file "/usr/src/sys/ufs/ufs/ufs_vnops.c", line 2438
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff804bfddd cs e030 rflags 246 cr2 7f7ffda04000 cpl 0 rsp ffffa0001e82a810
 Stopped in pid 570.1 (umount) at        netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x242
 __kernassert() at netbsd:__kernassert+0x2d
 ufs_full_fsync() at netbsd:ufs_full_fsync+0x40b
 ext2fs_fsync() at netbsd:ext2fs_fsync+0x42
 VOP_FSYNC() at netbsd:VOP_FSYNC+0x34
 vinvalbuf() at netbsd:vinvalbuf+0xf6
 vclean() at netbsd:vclean+0x1cf
 vflush() at netbsd:vflush+0x2cd
 ext2fs_unmount() at netbsd:ext2fs_unmount+0x33
 dounmount() at netbsd:dounmount+0xd5
 sys_unmount() at netbsd:sys_unmount+0x11c
 syscall() at netbsd:syscall+0xb4
 ds          0xa820
 es          0x121c
 fs          0xa820
 gs          0x12f7
 rdi         0
 rsi         0x1
 rbp         0xffffa0001e82a810
 rbx         0xffffa0001e82a820
 rdx         0
 rcx         0
 rax         0x1
 r8          0xffffffff80b53700  cpu_info_primary
 r9          0x1
 r10         0xffffa0001e82a730
 r11         0xffffffff804fd2b0  xenconscn_putc
 r12         0x104
 r13         0xffffffff809f39a0
 r14         0xffffa0001e7797c0
 r15         0
 rip         0xffffffff804bfddd  breakpoint+0x5
 cs          0xe030
 rflags      0x246
 rsp         0xffffa0001e82a810
 ss          0xe02b
 netbsd:breakpoint+0x5:  leave
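
This second assertion fires because the shared ufs_full_fsync() still checks for VT_UFS while being called from ext2fs_fsync() on an ext2fs vnode. A minimal sketch of the difference between a hard-coded tag check and a caller-supplied one follows. The toy_* names and tag values are hypothetical, and how the real routine would use a tag argument is an assumption.

```c
#include <stdbool.h>

/* Hypothetical tag values; the kernel's vnode tag enumeration has many more. */
enum toy_vtag { TOY_VT_UFS, TOY_VT_EXT2FS };

/* Hypothetical stand-in for struct vnode; only the tag matters here. */
struct toy_vnode {
    enum toy_vtag tag;
};

/* Hard-coded check: only UFS-tagged vnodes pass. */
bool toy_check_fixed(const struct toy_vnode *vp)
{
    return vp->tag == TOY_VT_UFS;
}

/* Caller-supplied check: each filesystem names the tag it expects. */
bool toy_check_param(const struct toy_vnode *vp, enum toy_vtag expected)
{
    return vp->tag == expected;
}
```

With the hard-coded form an ext2fs-tagged vnode fails the check; with the parameterised form the ext2fs caller passes its own tag and the check succeeds, while a mismatched tag is still caught.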

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 16:25:59 -0400

 On Apr 17,  7:15pm, jdwhite@iastate.edu (Jason White) wrote:
 -- Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL

 | The following reply was made to PR kern/41189; it has been noted by GNATS.
 | 
 | From: Jason White <jdwhite@iastate.edu>
 | To: gnats-bugs@NetBSD.org
 | Cc: 
 | Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
 | Date: Fri, 17 Apr 2009 14:10:34 -0500
 | 
 |  Tried the above patch against today's 5.0RC4 sources:
 |  
 |  Right after I invoke mke2fs I get a panic:

 Did you run mke2fs on a partition that used to contain an ffs filesystem?
 Was that filesystem mounted at the time?

 |  This panic happens when I unmount an ext2 partition:
 |  
 |  panic: kernel diagnostic assertion "vp->v_tag == VT_UFS" failed: file 

 This is my fault; new patch:

 Index: ext2fs/ext2fs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ext2fs/ext2fs_vnops.c,v
 retrieving revision 1.83
 diff -u -u -r1.83 ext2fs_vnops.c
 --- ext2fs/ext2fs_vnops.c	23 Nov 2008 10:09:25 -0000	1.83
 +++ ext2fs/ext2fs_vnops.c	17 Apr 2009 20:24:10 -0000
 @@ -1349,6 +1349,11 @@
  	int wait;
  	int error;

 +	if ((ap->a_offlo == 0 && ap->a_offhi == 0) || (vp->v_type != VREG)) {
 +		error = ufs_full_fsync(vp, ap->a_flags, ext2fs_update,
 +		    VT_EXT2FS);
 +		goto out;
 +	}
  	wait = (ap->a_flags & FSYNC_WAIT) != 0;

  	if (vp->v_type == VBLK)
 @@ -1365,7 +1370,7 @@
  		error = VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &l, FWRITE,
  		    curlwp->l_cred);
  	}
 -
 +out:
  	return error;
  }

 Index: ffs/ffs_extern.h
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ffs/ffs_extern.h,v
 retrieving revision 1.75
 diff -u -u -r1.75 ffs_extern.h
 --- ffs/ffs_extern.h	22 Feb 2009 20:28:06 -0000	1.75
 +++ ffs/ffs_extern.h	17 Apr 2009 20:24:10 -0000
 @@ -138,7 +138,6 @@
  int	ffs_lock(void *);
  int	ffs_unlock(void *);
  int	ffs_islocked(void *);
 -int	ffs_full_fsync(struct vnode *, int);

  /*
   * Snapshot function prototypes.
 Index: ffs/ffs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vnops.c,v
 retrieving revision 1.112
 diff -u -u -r1.112 ffs_vnops.c
 --- ffs/ffs_vnops.c	29 Mar 2009 10:29:00 -0000	1.112
 +++ ffs/ffs_vnops.c	17 Apr 2009 20:24:10 -0000
 @@ -290,7 +290,7 @@

  	fstrans_start(vp->v_mount, FSTRANS_LAZY);
  	if ((ap->a_offlo == 0 && ap->a_offhi == 0) || (vp->v_type != VREG)) {
 -		error = ffs_full_fsync(vp, ap->a_flags);
 +		error = ufs_full_fsync(vp, ap->a_flags, ffs_update, VT_UFS);
  		goto out;
  	}

 @@ -394,179 +394,6 @@
  }

  /*
 - * Synch an open file.  Called for VOP_FSYNC().
 - */
 -/* ARGSUSED */
 -int
 -ffs_full_fsync(struct vnode *vp, int flags)
 -{
 -	struct buf *bp, *nbp;
 -	int error, passes, skipmeta, waitfor, i;
 -	struct mount *mp;
 -
 -	KASSERT(VTOI(vp) != NULL);
 -	KASSERT(vp->v_tag == VT_UFS);
 -
 -	error = 0;
 -
 -	mp = vp->v_mount;
 -	if (vp->v_type == VBLK && vp->v_specmountpoint != NULL) {
 -		mp = vp->v_specmountpoint;
 -	} else {
 -		mp = vp->v_mount;
 -	}
 -
 -	/*
 -	 * Flush all dirty data associated with the vnode.
 -	 */
 -	if (vp->v_type == VREG || vp->v_type == VBLK) {
 -		int pflags = PGO_ALLPAGES | PGO_CLEANIT;
 -
 -		if ((flags & FSYNC_WAIT))
 -			pflags |= PGO_SYNCIO;
 -		if (vp->v_type == VREG &&
 -		    fstrans_getstate(mp) == FSTRANS_SUSPENDING)
 -			pflags |= PGO_FREE;
 -		mutex_enter(&vp->v_interlock);
 -		error = VOP_PUTPAGES(vp, 0, 0, pflags);
 -		if (error)
 -			return error;
 -	}
 -
 -#ifdef WAPBL
 -	mp = wapbl_vptomp(vp);
 -	if (mp && mp->mnt_wapbl) {
 -		/*
 -		 * Don't bother writing out metadata if the syncer is
 -		 * making the request.  We will let the sync vnode
 -		 * write it out in a single burst through a call to
 -		 * VFS_SYNC().
 -		 */
 -		if ((flags & (FSYNC_DATAONLY | FSYNC_LAZY)) != 0)
 -			return 0;
 -
 -		if ((VTOI(vp)->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE
 -		    | IN_MODIFY | IN_MODIFIED | IN_ACCESSED)) != 0) {
 -			error = UFS_WAPBL_BEGIN(mp);
 -			if (error)
 -				return error;
 -			error = ffs_update(vp, NULL, NULL, UPDATE_CLOSE |
 -			    ((flags & FSYNC_WAIT) ? UPDATE_WAIT : 0));
 -			UFS_WAPBL_END(mp);
 -		}
 -		if (error || (flags & FSYNC_NOLOG) != 0)
 -			return error;
 -
 -		/*
 -		 * Don't flush the log if the vnode being flushed
 -		 * contains no dirty buffers that could be in the log.
 -		 */
 -		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 -			error = wapbl_flush(mp->mnt_wapbl, 0);
 -			if (error)
 -				return error;
 -		}
 -
 -		if ((flags & FSYNC_WAIT) != 0) {
 -			mutex_enter(&vp->v_interlock);
 -			while (vp->v_numoutput != 0)
 -				cv_wait(&vp->v_cv, &vp->v_interlock);
 -			mutex_exit(&vp->v_interlock);
 -		}
 -
 -		return error;
 -	}
 -#endif /* WAPBL */
 -
 -	/*
 -	 * Write out metadata for non-logging file systems. XXX This block
 -	 * should be simplified now that softdep is gone.
 -	 */
 -	passes = NIADDR + 1;
 -	skipmeta = 0;
 -	if (flags & FSYNC_WAIT)
 -		skipmeta = 1;
 -
 -loop:
 -	mutex_enter(&bufcache_lock);
 -	LIST_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs) {
 -		bp->b_cflags &= ~BC_SCANNED;
 -	}
 -	for (bp = LIST_FIRST(&vp->v_dirtyblkhd); bp; bp = nbp) {
 -		nbp = LIST_NEXT(bp, b_vnbufs);
 -		if (bp->b_cflags & (BC_BUSY | BC_SCANNED))
 -			continue;
 -		if ((bp->b_oflags & BO_DELWRI) == 0)
 -			panic("ffs_fsync: not dirty");
 -		if (skipmeta && bp->b_lblkno < 0)
 -			continue;
 -		bp->b_cflags |= BC_BUSY | BC_VFLUSH | BC_SCANNED;
 -		mutex_exit(&bufcache_lock);
 -		/*
 -		 * On our final pass through, do all I/O synchronously
 -		 * so that we can find out if our flush is failing
 -		 * because of write errors.
 -		 */
 -		if (passes > 0 || !(flags & FSYNC_WAIT))
 -			(void) bawrite(bp);
 -		else if ((error = bwrite(bp)) != 0)
 -			return (error);
 -		/*
 -		 * Since we unlocked during the I/O, we need
 -		 * to start from a known point.
 -		 */
 -		mutex_enter(&bufcache_lock);
 -		nbp = LIST_FIRST(&vp->v_dirtyblkhd);
 -	}
 -	mutex_exit(&bufcache_lock);
 -	if (skipmeta) {
 -		skipmeta = 0;
 -		goto loop;
 -	}
 -
 -	if ((flags & FSYNC_WAIT) != 0) {
 -		mutex_enter(&vp->v_interlock);
 -		while (vp->v_numoutput) {
 -			cv_wait(&vp->v_cv, &vp->v_interlock);
 -		}
 -		mutex_exit(&vp->v_interlock);
 -
 -		/*
 -		 * Ensure that any filesystem metadata associated
 -		 * with the vnode has been written.
 -		 */
 -		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 -			/*
 -			* Block devices associated with filesystems may
 -			* have new I/O requests posted for them even if
 -			* the vnode is locked, so no amount of trying will
 -			* get them clean. Thus we give block devices a
 -			* good effort, then just give up. For all other file
 -			* types, go around and try again until it is clean.
 -			*/
 -			if (passes > 0) {
 -				passes--;
 -				goto loop;
 -			}
 -#ifdef DIAGNOSTIC
 -			if (vp->v_type != VBLK)
 -				vprint("ffs_fsync: dirty", vp);
 -#endif
 -		}
 -	}
 -
 -	waitfor = (flags & FSYNC_WAIT) ? UPDATE_WAIT : 0;
 -	error = ffs_update(vp, NULL, NULL, UPDATE_CLOSE | waitfor);
 -
 -	if (error == 0 && (flags & FSYNC_CACHE) != 0) {
 -		(void)VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &i, FWRITE,
 -		    kauth_cred_get());
 -	}
 -
 -	return error;
 -}
 -
 -/*
   * Reclaim an inode so that it can be used for other purposes.
   */
  int
 Index: ufs/ufs_extern.h
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ufs/ufs_extern.h,v
 retrieving revision 1.61
 diff -u -u -r1.61 ufs_extern.h
 --- ufs/ufs_extern.h	22 Feb 2009 20:28:07 -0000	1.61
 +++ ufs/ufs_extern.h	17 Apr 2009 20:24:11 -0000
 @@ -165,6 +165,9 @@
  		      struct componentname *);
  int	ufs_gop_alloc(struct vnode *, off_t, off_t, int, kauth_cred_t);
  void	ufs_gop_markupdate(struct vnode *, int);
 +int	ufs_full_fsync(struct vnode *, int, int (*)(struct vnode *,
 +    const struct timespec *, const struct timespec *, int), int);
 +

  /*
   * Snapshot function prototypes.
 Index: ufs/ufs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ufs/ufs_vnops.c,v
 retrieving revision 1.173
 diff -u -u -r1.173 ufs_vnops.c
 --- ufs/ufs_vnops.c	22 Feb 2009 20:28:07 -0000	1.173
 +++ ufs/ufs_vnops.c	17 Apr 2009 20:24:11 -0000
 @@ -2332,3 +2332,179 @@
  		ip->i_flag |= mask;
  	}
  }
 +
 +/*
 + * Synch an open file.  Called for VOP_FSYNC().
 + */
 +/* ARGSUSED */
 +int
 +ufs_full_fsync(struct vnode *vp, int flags, int (*update)(struct vnode *,
 +    const struct timespec *, const struct timespec *, int), int vtag)
 +{
 +	struct buf *bp, *nbp;
 +	int error, passes, skipmeta, waitfor, i;
 +	struct mount *mp;
 +
 +	KASSERT(VTOI(vp) != NULL);
 +	KASSERT(vp->v_tag == vtag);
 +
 +	error = 0;
 +
 +	mp = vp->v_mount;
 +	if (vp->v_type == VBLK && vp->v_specmountpoint != NULL) {
 +		mp = vp->v_specmountpoint;
 +	} else {
 +		mp = vp->v_mount;
 +	}
 +
 +	/*
 +	 * Flush all dirty data associated with the vnode.
 +	 */
 +	if (vp->v_type == VREG || vp->v_type == VBLK) {
 +		int pflags = PGO_ALLPAGES | PGO_CLEANIT;
 +
 +		if ((flags & FSYNC_WAIT))
 +			pflags |= PGO_SYNCIO;
 +		if (vp->v_type == VREG &&
 +		    fstrans_getstate(mp) == FSTRANS_SUSPENDING)
 +			pflags |= PGO_FREE;
 +		mutex_enter(&vp->v_interlock);
 +		error = VOP_PUTPAGES(vp, 0, 0, pflags);
 +		if (error)
 +			return error;
 +	}
 +
 +#ifdef WAPBL
 +	if (vtag == VT_UFS) {
 +	mp = wapbl_vptomp(vp);
 +	if (mp && mp->mnt_wapbl) {
 +		/*
 +		 * Don't bother writing out metadata if the syncer is
 +		 * making the request.  We will let the sync vnode
 +		 * write it out in a single burst through a call to
 +		 * VFS_SYNC().
 +		 */
 +		if ((flags & (FSYNC_DATAONLY | FSYNC_LAZY)) != 0)
 +			return 0;
 +
 +		if ((VTOI(vp)->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE
 +		    | IN_MODIFY | IN_MODIFIED | IN_ACCESSED)) != 0) {
 +			error = UFS_WAPBL_BEGIN(mp);
 +			if (error)
 +				return error;
 +			error = (*update)(vp, NULL, NULL, UPDATE_CLOSE |
 +			    ((flags & FSYNC_WAIT) ? UPDATE_WAIT : 0));
 +			UFS_WAPBL_END(mp);
 +		}
 +		if (error || (flags & FSYNC_NOLOG) != 0)
 +			return error;
 +
 +		/*
 +		 * Don't flush the log if the vnode being flushed
 +		 * contains no dirty buffers that could be in the log.
 +		 */
 +		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 +			error = wapbl_flush(mp->mnt_wapbl, 0);
 +			if (error)
 +				return error;
 +		}
 +
 +		if ((flags & FSYNC_WAIT) != 0) {
 +			mutex_enter(&vp->v_interlock);
 +			while (vp->v_numoutput != 0)
 +				cv_wait(&vp->v_cv, &vp->v_interlock);
 +			mutex_exit(&vp->v_interlock);
 +		}
 +
 +		return error;
 +	}
 +	}
 +#endif /* WAPBL */
 +
 +	/*
 +	 * Write out metadata for non-logging file systems. XXX This block
 +	 * should be simplified now that softdep is gone.
 +	 */
 +	passes = NIADDR + 1;
 +	skipmeta = 0;
 +	if (flags & FSYNC_WAIT)
 +		skipmeta = 1;
 +
 +loop:
 +	mutex_enter(&bufcache_lock);
 +	LIST_FOREACH(bp, &vp->v_dirtyblkhd, b_vnbufs) {
 +		bp->b_cflags &= ~BC_SCANNED;
 +	}
 +	for (bp = LIST_FIRST(&vp->v_dirtyblkhd); bp; bp = nbp) {
 +		nbp = LIST_NEXT(bp, b_vnbufs);
 +		if (bp->b_cflags & (BC_BUSY | BC_SCANNED))
 +			continue;
 +		if ((bp->b_oflags & BO_DELWRI) == 0)
 +			panic("ufs_fsync: not dirty");
 +		if (skipmeta && bp->b_lblkno < 0)
 +			continue;
 +		bp->b_cflags |= BC_BUSY | BC_VFLUSH | BC_SCANNED;
 +		mutex_exit(&bufcache_lock);
 +		/*
 +		 * On our final pass through, do all I/O synchronously
 +		 * so that we can find out if our flush is failing
 +		 * because of write errors.
 +		 */
 +		if (passes > 0 || !(flags & FSYNC_WAIT))
 +			(void) bawrite(bp);
 +		else if ((error = bwrite(bp)) != 0)
 +			return (error);
 +		/*
 +		 * Since we unlocked during the I/O, we need
 +		 * to start from a known point.
 +		 */
 +		mutex_enter(&bufcache_lock);
 +		nbp = LIST_FIRST(&vp->v_dirtyblkhd);
 +	}
 +	mutex_exit(&bufcache_lock);
 +	if (skipmeta) {
 +		skipmeta = 0;
 +		goto loop;
 +	}
 +
 +	if ((flags & FSYNC_WAIT) != 0) {
 +		mutex_enter(&vp->v_interlock);
 +		while (vp->v_numoutput) {
 +			cv_wait(&vp->v_cv, &vp->v_interlock);
 +		}
 +		mutex_exit(&vp->v_interlock);
 +
 +		/*
 +		 * Ensure that any filesystem metadata associated
 +		 * with the vnode has been written.
 +		 */
 +		if (!LIST_EMPTY(&vp->v_dirtyblkhd)) {
 +			/*
 +			* Block devices associated with filesystems may
 +			* have new I/O requests posted for them even if
 +			* the vnode is locked, so no amount of trying will
 +			* get them clean. Thus we give block devices a
 +			* good effort, then just give up. For all other file
 +			* types, go around and try again until it is clean.
 +			*/
 +			if (passes > 0) {
 +				passes--;
 +				goto loop;
 +			}
 +#ifdef DIAGNOSTIC
 +			if (vp->v_type != VBLK)
 +				vprint("ufs_fsync: dirty", vp);
 +#endif
 +		}
 +	}
 +
 +	waitfor = (flags & FSYNC_WAIT) ? UPDATE_WAIT : 0;
 +	error = (*update)(vp, NULL, NULL, UPDATE_CLOSE | waitfor);
 +
 +	if (error == 0 && (flags & FSYNC_CACHE) != 0) {
 +		(void)VOP_IOCTL(VTOI(vp)->i_devvp, DIOCCACHESYNC, &i, FWRITE,
 +		    kauth_cred_get());
 +	}
 +
 +	return error;
 +}

From: Andrew Doran <ad@netbsd.org>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 20:37:08 +0000

 It was my understanding that no ext2fs file systems are involved, other
 than the one being created with mke2fs. Have I misread the PR?

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 20:51:04 +0000

 I don't know exactly what is happening in this case but there are at least
 two issues that I see. Ok so ffs_full_fsync() is called for the vnode
 representing /dev/sd0m:

 (1) WAPBL processing should happen on this vnode, because the corresponding
 inode is from a logging file system. However it represents a block device,
 and so the call to wapbl_vptomp() ignores the important fact that it's from
 a logging FS and returns NULL. We skip the WAPBL block.

 (2) Since we are not doing the logging update in ffs_full_fsync(), we descend
 into the regular UFS case. This flushes any delayed writes resulting from
 block I/O from userspace (or a mounted file system) to the device. This
 happens correctly. Note that if (1) is fixed naively, this will cease to
 happen. In theory this should flush the buffers for spec_close(). Once this
 is done, we incorrectly do the inode update as if there was no logging.

 Some background information. ffs hangs its dirty buffers in two places:

 - vp->v_dirtyblkhd

   VREG/VDIR: indirect blocks, to track per-inode space allocation
   VBLK: dirty blocks for a block device
   Others: not used

 - VTOI(vp)->i_ump->um_devvp->v_dirtyblkhd

   All other metadata, e.g. on-disk inodes.

 Since block devices don't have indirect blocks, there shouldn't be a clash.
 The whole thing is a big mess. We could fix a slew of bugs once and for all
 with devfs. It would greatly simplify this crud. In the meantime I am going
 to crack open a beer. :-)

From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 16:05:06 -0500

 On Fri, Apr 17, 2009 at 08:30PM +0000, Christos Zoulas wrote:
 >
 > Did you run mke2fs on a partition that used to contain an ffs filesystem?

 	No.

 > Was that mounted?

 	No.

 > |  This panic happens when I unmount an ext2 partition:
 > |
 > |  panic: kernel diagnostic assertion "vp->v_tag == VT_UFS" failed: file
 >
 > This is my fault, new patch:
 [...]
 Running 'mke2fs -v -I 128 /dev/sd0m':

 panic: kernel diagnostic assertion "rw_lock_held(&wl->wl_rwlock)" 
 failed: file "/usr/src/sys/kern/vfs_wapbl.c", line 1540
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff804bfddd cs e030 rflags 246 cr2  40a5d0 
 cpl 0 rsp ffffa0001e829790
 Stopped in pid 504.1 (mke2fs) at        netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x242
 __kernassert() at netbsd:__kernassert+0x2d
 wapbl_add_buf() at netbsd:wapbl_add_buf+0x42
 bdwrite() at netbsd:bdwrite+0xc0
 ffs_update() at netbsd:ffs_update+0x1ef
 ufs_full_fsync() at netbsd:ufs_full_fsync+0x35d
 ffs_fsync() at netbsd:ffs_fsync+0x69
 VOP_FSYNC() at netbsd:VOP_FSYNC+0x34
 vinvalbuf() at netbsd:vinvalbuf+0xf6
 spec_close() at netbsd:spec_close+0x8a
 VOP_CLOSE() at netbsd:VOP_CLOSE+0x29
 vn_close() at netbsd:vn_close+0x51
 closef() at netbsd:closef+0x68
 fd_close() at netbsd:fd_close+0x134
 syscall() at netbsd:syscall+0xb4
 ds          0x97a0
 es          0x121c
 fs          0x97a0
 gs          0x12f7
 rdi         0
 rsi         0x1
 rbp         0xffffa0001e829790
 rbx         0xffffa0001e8297a0
 rdx         0
 rcx         0
 rax         0x1
 r8          0xffffffff80b53700  cpu_info_primary
 r9          0x1
 r10         0xffffa0001e8296b0
 r11         0xffffffff804fd2b0  xenconscn_putc
 r12         0x104
 r13         0xffffffff809f39b0
 r14         0x5
 r15         0xffffa0001e1c2e58
 rip         0xffffffff804bfddd  breakpoint+0x5
 cs          0xe030
 rflags      0x246
 rsp         0xffffa0001e829790
 ss          0xe02b
 netbsd:breakpoint+0x5:  leave
 db>

 -- 
 Jason White
 Systems Analyst
 Information Technology Services
 Iowa State University

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 17:19:12 -0400

 On Apr 17,  8:55pm, ad@netbsd.org (Andrew Doran) wrote:
 -- Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL

 | The following reply was made to PR kern/41189; it has been noted by GNATS.
 | 
 | From: Andrew Doran <ad@netbsd.org>
 | To: gnats-bugs@NetBSD.org
 | Cc: 
 | Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
 | Date: Fri, 17 Apr 2009 20:51:04 +0000
 | 
 |  I don't know exactly what is happening in this case but there are at least
 |  two issues that I see. Ok so ffs_full_fsync() is called for the vnode
 |  representing /dev/sd0m:
 |  
 |  (1) WAPBL processing should happen on this vnode, because the corresponding
 |  inode is from a logging file system. However it represents a block device,
 |  and so the call to wapbl_vptomp() ignores the important fact that it's from
 |  a logging FS and returns NULL. We skip the WAPBL block.
 |  
 |  (2) Since we are not doing the logging update in ffs_full_fsync(), we descend
 |  into the regular UFS case. This flushes any delayed writes resulting from
 |  block I/O from userspace (or a mounted file system) to the device. This
 |  happens correctly. Note that if (1) is fixed naively, this will cease to
 |  happen. In theory this should flush the buffers for spec_close(). Once this
 |  is done, we incorrectly do the inode update as if there was no logging.
 |  
 |  Some background information. ffs hangs its dirty buffers in two places:
 |  
 |  - vp->v_dirtyblkhd
 |   
 |    VREG/VDIR: indirect blocks, to track per-inode space allocation
 |    VBLK: dirty blocks for a block device
 |    Others: not used
 |  
 |  - VTOI(vp)->i_ump->um_devvp->v_dirtyblkhd
 |  
 |    All other metadata, e.g. on-disk inodes.
 |  
 |  Since block devices don't have indirect blocks, there shouldn't be a clash.
 |  The whole thing is a big mess. We could fix a slew of bugs once and for all
 |  with devfs. It would greatly simplify this crud. In the meantime I am going
 |  to crack open a beer. :-)

 I think that there are differences in the way that the wait or no wait flag
 is computed in many places and I believe that they should be consistent. Like:

 error = ffs_update(vp, NULL, NULL,
     (ap->a_flags & FSYNC_WAIT) ? UPDATE_WAIT : 0);

 vs.
 error = ffs_update(vp, NULL, NULL,
     ((ap->a_flags & (FSYNC_WAIT | FSYNC_DATAONLY)) == FSYNC_WAIT)
     ? UPDATE_WAIT : 0);

 christos

From: christos@zoulas.com (Christos Zoulas)
To: Andrew Doran <ad@netbsd.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 17:34:17 -0400

 On Apr 17,  8:37pm, ad@netbsd.org (Andrew Doran) wrote:
 -- Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL

 | It was my understanding that no ext2fs file systems are involved, other
 | than the one being created with mke2fs. Have I misread the PR?

 You are right, the second panic was my fault as I mentioned in the PR
 from unmounting ext2fs.

 christos

From: Jason White <jdwhite@iastate.edu>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Fri, 17 Apr 2009 16:59:34 -0500

 > On Apr 17,  8:37pm, ad@netbsd.org (Andrew Doran) wrote:
 > -- Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
 >
 > | It was my understanding that no ext2fs file systems are involved, other
 > | than the one being created with mke2fs. Have I misread the PR?
 >
 > You are right, the second panic was my fault as I mentioned in the PR
 > from unmounting ext2fs.

   The machine I've been testing on has three ext2 partitions on it.  In 
 the course of narrowing down what triggers this bug and testing patches,
 I'd successfully made three ext2 filesystems (when / (sd0a) was mounted 
 without 'log') on different partitions -- none of which was ever ffs 
 formatted.

 Typically I've been running mke2fs on just one of the ext2 partitions, 
 but I was in a situation where I had booted with log enabled on / and 
 had mounted one of the other ext2 partitions.  When finished, I 
 unmounted it and (to my surprise) caused a kernel panic.  Up until that 
 time I'd only been running mke2fs to trigger the panic.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sun, 26 Apr 2009 05:04:55 +0000

 On Fri, Apr 17, 2009 at 08:55:02PM +0000, Andrew Doran wrote:
  >  Some background information. ffs hangs its dirty buffers in two places:
  >  
  >  - vp->v_dirtyblkhd
  >   
  >    VREG/VDIR: indirect blocks, to track per-inode space allocation
  >    VBLK: dirty blocks for a block device
  >    Others: not used
  >  
  >  - VTOI(vp)->i_ump->um_devvp->v_dirtyblkhd
  >  
  >    All other metadata, e.g. on-disk inodes.
  >  
  >  Since block devices don't have indirect blocks, there shouldn't be a clash.
  >  The whole thing is a big mess. We could fix a slew of bugs once and for all
  >  with devfs. It would greatly simplify this crud.

 Realistically devfs won't change the real problem (confusing whether
 um_devvp belongs to the fs mounted on it or the fs it sits on) but
 just replace wapbl-related issues with comparable devfs-related
 issues.

 The real fix is to change the buffer cache indexing scheme. Right now
 the buffer cache is indexed by vnode and offset; it should be indexed
 by filesystem (that is, struct mount), vnode or inode number, and
 offset. This should use a reserved inode number or vnode pointer for
 whole-fs buffers; and whole-fs buffers should be getting queued in the
 mount structure, not any vnode.

 Then at the cost of what should be only a small amount of new code
 (but a lot of interface changes) the potential for confusion will go
 away permanently, and we can change the name of um_devvp so as to hunt
 down and kill all further misuses of it.

  >  In the meantime I am going to crack open a beer. :-)

 wise choice. :-)

 -- 
 David A. Holland
 dholland@netbsd.org

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Mon, 4 May 2009 23:05:08 +0000

 On Sun, Apr 26, 2009 at 05:05:02AM +0000, David Holland wrote:

 >  Realistically devfs won't change the real problem (confusing whether
 >  um_devvp belongs to the fs mounted on it or the fs it sits on) but
 >  just replace wapbl-related issues with comparable devfs-related
 >  issues.
 >  
 >  The real fix is to change the buffer cache indexing scheme. Right now
 >  the buffer cache is indexed by vnode and offset; it should be indexed
 >  by filesystem (that is, struct mount), vnode or inode number, and
 >  offset. This should use a reserved inode number or vnode pointer for
 >  whole-fs buffers; and whole-fs buffers should be getting queued in the
 >  mount structure, not any vnode.
 >  
 >  Then at the cost of what should be only a small amount of new code
 >  (but a lot of interface changes) the potential for confusion will go
 >  away permanently, and we can change the name of um_devvp so as to hunt
 >  down and kill all further misuses of it.

 I disagree. I think the solution is:

 - devfs
 - kill block devices in userspace
 - allow unaligned I/O to disk devices via the raw node

From: David Holland <dholland-bugs@netbsd.org>
To: Andrew Doran <ad@netbsd.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jdwhite@iastate.edu
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Wed, 6 May 2009 23:22:03 +0000

 On Mon, May 04, 2009 at 11:05:08PM +0000, Andrew Doran wrote:
  > >  Realistically devfs won't change the real problem (confusing whether
  > >  um_devvp belongs to the fs mounted on it or the fs it sits on) but
  > >  just replace wapbl-related issues with comparable devfs-related
  > >  issues.
  > >  
  > >  The real fix is [...]
  > 
  > I disagree. I think the solution is:
  > 
  > - devfs
  > - kill block devices in userspace
  > - allow unaligned I/O to disk devices via the raw node

 How/why? As I've already explained once (mostly quoted above), devfs
 will not solve the real problem, just move it around. What are you
 envisioning that devfs will provide that will avoid the confusion?

 And while unaligned I/O seems like a fine idea (as long as it doesn't
 make physio slower) I don't see how it's relevant.

 Please *explain* your reasoning.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Andrew Doran <ad@netbsd.org>
To: David Holland <dholland-bugs@netbsd.org>
Cc: tech-kern@netbsd.org, gnats-bugs@NetBSD.org
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sat, 9 May 2009 07:30:37 +0000

 On Wed, May 06, 2009 at 11:22:03PM +0000, David Holland wrote:

 >  > I disagree. I think the solution is:
 >  > 
 >  > - devfs
 >  > - kill block devices in userspace
 >  > - allow unaligned I/O to disk devices via the raw node
 > 
 > How/why? As I've already explained once (mostly quoted above), devfs
 > will not solve the real problem, just move it around. What are you
 > envisioning that devfs will provide that will avoid the confusion?
 > 
 > And while unaligned I/O seems like a fine idea (as long as it doesn't
 > make physio slower) I don't see how it's relevant.
 > 
 > Please *explain* your reasoning.

 I should have been clearer. Have a look at the bigger picture.

 - We have longstanding problems with device nodes showing up in multiple
   file systems. For 6.0 we have devfs coming along, which will at some point
   in its development likely eliminate the need to support device nodes on
   other file systems. So devfs will give us a 1:1 mapping between device
   instances and vnodes (or maybe devfs inodes).

 - We have longstanding problems providing block device semantics. Block
   devices are an interesting toy but they have no real application. Disk
   character devices suffice with one exception: on NetBSD, transfers on
   these devices must be aligned. So there is no need for physio to cache,
   it could simply buffer to allow misaligned transfers.

From: Jason Thorpe <thorpej@shagadelic.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 jdwhite@iastate.edu
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sat, 9 May 2009 12:59:09 -0700

 On May 9, 2009, at 12:35 AM, Andrew Doran wrote:


 > - We have longstanding problems providing block device semantics. Block
 >   devices are an interesting toy but they have no real application. Disk
 >   character devices suffice with one exception: on NetBSD, transfers on
 >   these devices must be aligned. So there is no need for physio to cache,
 >   it could simply buffer to allow misaligned transfers.

 Agreed, wholeheartedly.

From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
To: ad@netbsd.org
Cc: dholland-bugs@netbsd.org, tech-kern@netbsd.org, gnats-bugs@NetBSD.org
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sun, 10 May 2009 05:41:55 +0000 (UTC)

 hi,

 > I should have been clearer. Have a look at the bigger picture.
 > 
 > - We have longstanding problems with device nodes showing up in multiple
 >   file systems. For 6.0 we have devfs coming along, which will at some point
 >   in its development likely eliminate the need to support device nodes on
 >   other file systems. So devfs will give us a 1:1 mapping between device
 >   instances and vnodes (or maybe devfs inodes).

 i don't think devfs is necessary here.
 just giving up updating timestamps of mounted VBLK special files is enough.

 > - We have longstanding problems providing block device semantics. Block
 >   devices are an interesting toy but they have no real application. Disk
 >   character devices suffice with one exception: on NetBSD, transfers on
 >   these devices must be aligned. So there is no need for physio to cache,
 >   it could simply buffer to allow misaligned transfers.

 are there applications of the misaligned transfers?

 YAMAMOTO Takashi

From: Antti Kantee <pooka@cs.hut.fi>
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Cc: ad@netbsd.org, dholland-bugs@netbsd.org, tech-kern@netbsd.org,
	gnats-bugs@NetBSD.org
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sun, 10 May 2009 08:53:10 +0300

 On Sun May 10 2009 at 05:41:55 +0000, YAMAMOTO Takashi wrote:
 > hi,
 > 
 > > I should have been clearer. Have a look at the bigger picture.
 > > 
 > > - We have longstanding problems with device nodes showing up in multiple
 > >   file systems. For 6.0 we have devfs coming along, which will at some point
 > >   in its development likely eliminate the need to support device nodes on
 > >   other file systems. So devfs will give us a 1:1 mapping between device
 > >   instances and vnodes (or maybe devfs inodes).
 > 
 > i don't think devfs is necessary here.
 > just giving up updating timestamps of mounted VBLK special files is enough.
 > 
 > > - We have longstanding problems providing block device semantics. Block
 > >   devices are an interesting toy but they have no real application. Disk
 > >   character devices suffice with one exception: on NetBSD, transfers on
 > >   these devices must be aligned. So there is no need for physio to cache,
 > >   it could simply buffer to allow misaligned transfers.
 > 
 > are there applications of the misaligned transfers?

 iirc ntfs-3g does misaligned transfers and works only on block devices.
 IMHO it should be fixed (and probably not very hard).

 but there's the more general problem that we do not currently provide
 any good unbuffered method for a userspace process to access a raw disk
 device.  I would like to be able to say either "block until written" or
 "put this in the *device driver* queue and return immediately after the
 request is queued".  it might be solved with better aio, but we don't
 currently have it.

 (i don't see the relevance of devfs here either)

From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        jdwhite@iastate.edu, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sun, 13 Sep 2009 21:30:50 +0900

 It looks like there are two different problems.

 >> Another panic.  This time I was unmounting an ext2 filesystem, so the 
 >> problem isn't with mke2fs, but seemingly with ext2 filesystems in 
 >> general.
 >>
 >>  panic: kernel diagnostic assertion "LIST_EMPTY(&vp->v_dirtyblkhd)" 
 >> failed: file "/home/builds/ab/netbsd-5/src/sys/kern/vfs_subr.c", line 872

 This one seems to be the same problem as PR kern/39914,
 and I can't reproduce it on -current.

 >> Running 'mke2fs -v -I 128 /dev/sd0m':
 >> 
 >> panic: kernel diagnostic assertion "rw_lock_held(&wl->wl_rwlock)" 

 This one is the "block device node on wapbl" problem mentioned in
 the release note, and -current still has this problem:

 http://www.NetBSD.org/releases/formal-5/NetBSD-5.0.html#errata
 >> Using block device nodes directly for I/O may cause a kernel crash
 >> when the file system containing /dev is FFS and is mounted with -o log.
 >> Workaround: use raw disk devices, or remount the file system without -o log.
 (I wonder if there is another PR about this erratum.)

 ---
 Izumi Tsutsui

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41189: kernel panic xen dom0 using mke2fs & WAPBL
Date: Sun, 13 Sep 2009 18:37:07 +0000

 On Sat, May 09, 2009 at 07:35:01AM +0000, Andrew Doran wrote:
  >>> I disagree. I think the solution is:
  >>> 
  >>> - devfs
  >>> - kill block devices in userspace
  >>> - allow unaligned I/O to disk devices via the raw node
  >> 
  >> How/why? As I've already explained once (mostly quoted above), devfs
  >> will not solve the real problem, just move it around. What are you
  >> envisioning that devfs will provide that will avoid the confusion?
  >> [...]
  >> Please *explain* your reasoning.
  >  
  >  I should have been clearer. Have a look at the bigger picture.

 Thank you, but four months later I still don't understand.

  >  - We have longstanding problems with device nodes showing up in
  >    multiple file systems. For 6.0 we have devfs coming along, which
  >    will at some point in its development likely eliminate the need
  >    to support device nodes on other file systems. So devfs will
  >    give us a 1:1 mapping between device instances and vnodes (or
  >    maybe devfs inodes).

 More likely the latter, but even assuming all this happens, it won't
 solve the problem. We'll end up with devfs ops being called on
 arbitrary non-devfs vnodes, same as the previous round of problems
 gave us wapbl ops being called on arbitrary non-wapbl vnodes, and
 before that we had softupdate ops being called on non-softupdate
 vnodes, and so on ad infinitum.

 The problem is that as things stand, the device vnode half belongs to
 the FS it came from and half to the FS that's mounted on it. The
 dividing line isn't clear (I'm not entirely convinced it's even well
 defined) and therefore it's easy for mistakes to arise. When mistakes
 arise, the consequence is often calling the wrong vnode ops. There are
 probably cases where the consequence is more subtle than that, too.

 This is a structural problem, and to really make it go away for real
 it needs to be solved structurally.

 The device vnode should belong only to the FS it came from, whether
 that's devfs or wapbl or whatever.

  >  - We have longstanding problems providing block device semantics. Block
  >    devices are an interesting toy but they have no real application. Disk
  >    character devices suffice with one exception: on NetBSD, transfers on
  >    these devices must be aligned. So there is no need for physio to cache,
  >    it could simply buffer to allow misaligned transfers.

 This I have no disagreement with; it's just not relevant to what I'm
 concerned about.

 -- 
 David A. Holland
 dholland@netbsd.org

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 19 Dec 2015 02:32:24 +0000
State-Changed-Why:
The problem with calling other fses' vnode ops did eventually get worked
out. As for the problem that's the same as 39914, -5 is EOL.


>Unformatted: