NetBSD Problem Report #51377

From paul@whooppee.com  Fri Jul 29 20:44:04 2016
Return-Path: <paul@whooppee.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CF9067A1B7
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 29 Jul 2016 20:44:04 +0000 (UTC)
Message-Id: <20160729204402.3545B16E60@pokey.whooppee.com>
Date: Sat, 30 Jul 2016 04:44:02 +0800 (PHT)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: fss(4) panic if snapshot mounted read/write
X-Send-Pr-Version: 3.95

>Number:         51377
>Category:       kern
>Synopsis:       fss(4) panic if snapshot mounted read/write
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    hannken
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 29 20:45:00 +0000 2016
>Closed-Date:    Sat Aug 27 15:32:26 +0000 2016
>Last-Modified:  Sat Aug 27 15:32:26 +0000 2016
>Originator:     Paul Goyette
>Release:        NetBSD 7.99.33
>Organization:
+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
>Environment:


System: NetBSD pokey.whooppee.com 7.99.33 NetBSD 7.99.33 (POKEY 2016-07-04 08:34:20) #0: Mon Jul 4 20:24:24 PHT 2016 paul@pokey.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/POKEY amd64
Architecture: x86_64
Machine: amd64
>Description:
	If you attempt to mount an fss(4) snapshot for read-write access,
	you may get the following panic:

	/dev/fss0: file system not clean (fs_clean=0x4); please fsck(8)
	/dev/fss0: lost blocks 0 files 0
	fss0: snapshot invalid: forced unmount
	panic: kernel diagnostic assertion "LIST_FIRST(&fmi->fmi_cow_handler) == NULL" failed: file "/build/netbsd-local/src/sys/kern/vfs_trans.c", line 148 
	cpu0: Begin traceback...
	vpanic() at netbsd:vpanic+0x140
	cd_play_msf() at netbsd:cd_play_msf
	fstrans_mount_dtor() at netbsd:fstrans_mount_dtor+0x82
	fstrans_get_lwp_info() at netbsd:fstrans_get_lwp_info+0x76
	_fstrans_start.part.2() at netbsd:_fstrans_start.part.2+0x1e
	genfs_lock() at netbsd:genfs_lock+0x3d
	VOP_LOCK() at netbsd:VOP_LOCK+0x32
	vn_lock() at netbsd:vn_lock+0x90
	vrelel() at netbsd:vrelel+0x113
	do_sys_waitid() at netbsd:do_sys_waitid+0x960
	do_sys_wait() at netbsd:do_sys_wait+0x8f
	sys___wait450() at netbsd:sys___wait450+0x42
	syscall() at netbsd:syscall+0x15b
	--- syscall (number 449) ---
	75092f43d8ba:
	cpu0: End traceback...

	The panic does not happen immediately; it is delayed by a few
	seconds.  Note that mount complains about the file system
	being dirty.

	(Note that in the backtrace above, "cd_play_msf" is at the
	same address as "kern_assert + 0x48", the end of kern_assert,
	so that frame is really kern_assert.)
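
The symbol mix-up noted above is typical of how a backtrace resolves
return addresses: each address is attributed to the nearest symbol that
starts at or below it.  If kern_assert ends with a call that never
returns, the saved return address is the first byte past the function,
which is where the next symbol begins.  A minimal user-space sketch in C,
assuming a made-up three-entry symbol table (the addresses and the
sym_lookup() helper are hypothetical, not taken from the kernel):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical symbol table, sorted by start address. */
struct sym {
	uintptr_t   addr;
	const char *name;
};

/*
 * Return the last symbol whose start address is <= pc, the way a
 * simple backtrace printer attributes a return address to a function.
 * On success *off receives the distance from that symbol's start.
 * Returns NULL if pc lies below the first symbol.
 */
static const struct sym *
sym_lookup(const struct sym *tab, size_t n, uintptr_t pc, uintptr_t *off)
{
	const struct sym *best = NULL;

	assert(tab != NULL && off != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tab[i].addr > pc)
			break;
		best = &tab[i];
	}
	if (best != NULL)
		*off = pc - best->addr;
	return best;
}

/*
 * Made-up addresses: kern_assert is 0x48 bytes long and cd_play_msf
 * happens to be the next symbol in the image.
 */
static const struct sym demo_tab[] = {
	{ 0x1000, "kern_assert" },
	{ 0x1048, "cd_play_msf" },
	{ 0x1200, "vpanic" },
};
```

With this table, sym_lookup(demo_tab, 3, 0x1048, &off) resolves to
"cd_play_msf" with an offset of 0, even though 0x1048 is the return
address of a call made at the very end of kern_assert.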

>How-To-Repeat:
	Manually execute the body of the src/tests/dev/fss/t_fss atf
	test, but remove the '-o rdonly' from the 'mount fss0' command.

>Fix:
	Unknown


>Release-Note:

>Audit-Trail:
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/51377: fss(4) panic if snapshot mounted read/write
Date: Sun, 31 Jul 2016 12:48:48 +0800 (PHT)

 The traceback here is somewhat misleading.  The frame which references 
 do_sys_waitid()+0x960 is really in proc_exit() at line 1193

 (gdb) list *do_sys_waitid+0x960
 0xffffffff80819231 is in do_sys_waitid (/build/localcount/src/sys/kern/kern_exit.c:1193).
 1188             * Release reference to text vnode
 1189             */
 1190            if (p->p_textvp)
 1191                    vrele(p->p_textvp);
 1192
 1193            mutex_destroy(&p->p_auxlock);
 1194            mutex_obj_free(p->p_lock);
 1195            mutex_destroy(&p->p_stmutex);
 1196            cv_destroy(&p->p_waitcv);
 1197            cv_destroy(&p->p_lwpcv);


 It would appear that mount_ffs(8) detected an inconsistency with the
 snapshot (file system dirty) and unmounted it (which generated the
 "snapshot invalid: forced unmount" message).  Then, when mount_ffs tried
 to exit, it ran into a problem with its image/text file, resulting in
 the panic.

 It is unclear to me how the fss unmount hook could have caused the 
 corruption of the process's text vnode.






From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/51377: fss(4) panic if snapshot mounted read/write
Date: Sun, 31 Jul 2016 12:52:40 +0800 (PHT)

 On Sun, 31 Jul 2016, Paul Goyette wrote:

 > The traceback here is somewhat misleading.  The frame which references
 > do_sys_waitid()+0x960 is really in proc_exit() at line 1193

 This is actually proc_free() --------^^^^^^^^^^^ (and near the end)




From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: src/sys/dev
Date: Sun, 31 Jul 2016 12:17:36 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Sun Jul 31 12:17:36 UTC 2016

 Modified Files:
 	src/sys/dev: fss.c

 Log Message:
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.

 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
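
The assertion from the panic and this log message can be tied together
with a small model.  The sketch below is user-space C, not the code from
fss.c revision 1.95: it assumes that the failed snapshot left its
copy-on-write handler registered on the mount, and that the mount was
later torn down while that handler was still on the list.  All structure
and function names are hypothetical, except that the cow_handlers list
stands in for fmi_cow_handler from the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One registered copy-on-write callback (hypothetical layout). */
struct cow_handler {
	struct cow_handler *next;
	void *arg;
};

/* Per-mount state; cow_handlers stands in for fmi_cow_handler. */
struct mount_info {
	struct cow_handler *cow_handlers;
};

/* A snapshot device that hooks one handler into its mount. */
struct snapshot {
	struct mount_info *mi;
	struct cow_handler handler;
	bool established;
	bool failed;
};

/* Push a handler onto the mount's list. */
static void
cow_establish(struct mount_info *mi, struct cow_handler *h)
{
	assert(mi != NULL && h != NULL);
	h->next = mi->cow_handlers;
	mi->cow_handlers = h;
}

/* Unlink a handler from the mount's list, if it is present. */
static void
cow_disestablish(struct mount_info *mi, struct cow_handler *h)
{
	struct cow_handler **pp;

	for (pp = &mi->cow_handlers; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			h->next = NULL;
			return;
		}
	}
}

/* Models the failed check: the list must be empty at teardown. */
static bool
mount_can_destroy(const struct mount_info *mi)
{
	return mi->cow_handlers == NULL;
}

/* Attach the snapshot to a mount and register its handler. */
static void
snapshot_attach(struct snapshot *sn, struct mount_info *mi)
{
	sn->mi = mi;
	sn->handler.next = NULL;
	sn->handler.arg = sn;
	sn->failed = false;
	cow_establish(mi, &sn->handler);
	sn->established = true;
}

/*
 * Mark the snapshot as failed.  With disestablish_on_error set, the
 * handler is also removed, since no further copies are needed.
 */
static void
snapshot_error(struct snapshot *sn, bool disestablish_on_error)
{
	sn->failed = true;
	if (disestablish_on_error && sn->established) {
		cow_disestablish(sn->mi, &sn->handler);
		sn->established = false;
	}
}
```

Calling snapshot_error(&sn, false) models the assumed old behaviour and
leaves mount_can_destroy() false; calling it with true models the
behaviour described in the log message and empties the list.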


 To generate a diff of this commit:
 cvs rdiff -u -r1.94 -r1.95 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
Responsible-Changed-Why:
Take.


State-Changed-From-To: open->analyzed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
State-Changed-Why:
Committed a fix -- please confirm.


From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: hannken@NetBSD.org
Subject: Re: kern/51377 (fss(4) panic if snapshot mounted read/write)
Date: Sun, 31 Jul 2016 21:07:03 +0800 (PHT)

 On Sun, 31 Jul 2016, hannken@NetBSD.org wrote:

 > Synopsis: fss(4) panic if snapshot mounted read/write
 >
 > Responsible-Changed-From-To: kern-bug-people->hannken
 > Responsible-Changed-By: hannken@NetBSD.org
 > Responsible-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
 > Responsible-Changed-Why:
 > Take.
 >
 >
 > State-Changed-From-To: open->analyzed
 > State-Changed-By: hannken@NetBSD.org
 > State-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
 > State-Changed-Why:
 > Committed a fix -- please confirm.

 Fix confirmed!

 Please consider pulling this up to the netbsd-6 and -7 branches, and 
 possibly the -6-0, -6-1, and -7-0 releases.




State-Changed-From-To: analyzed->needs-pullups
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Sun, 31 Jul 2016 21:54:16 +0000
State-Changed-Why:
Please pull-up to -6 and -7


State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sat, 20 Aug 2016 16:53:31 +0000
State-Changed-Why:
Pullups requested:
Ticket #1399 on -6 and #1239 on -7.


From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: [netbsd-6] src/sys/dev
Date: Sat, 27 Aug 2016 14:47:48 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Aug 27 14:47:48 UTC 2016

 Modified Files:
 	src/sys/dev [netbsd-6]: fss.c

 Log Message:
 Pull up following revision(s) (requested by hannken in ticket #1399):
 	sys/dev/fss.c: revision 1.95
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.
 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write


 To generate a diff of this commit:
 cvs rdiff -u -r1.81.4.3 -r1.81.4.4 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: [netbsd-6-0] src/sys/dev
Date: Sat, 27 Aug 2016 14:48:09 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Aug 27 14:48:09 UTC 2016

 Modified Files:
 	src/sys/dev [netbsd-6-0]: fss.c

 Log Message:
 Pull up following revision(s) (requested by hannken in ticket #1399):
 	sys/dev/fss.c: revision 1.95
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.
 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write


 To generate a diff of this commit:
 cvs rdiff -u -r1.81.4.1.4.2 -r1.81.4.1.4.3 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: [netbsd-6-1] src/sys/dev
Date: Sat, 27 Aug 2016 14:48:50 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Aug 27 14:48:50 UTC 2016

 Modified Files:
 	src/sys/dev [netbsd-6-1]: fss.c

 Log Message:
 Pull up following revision(s) (requested by hannken in ticket #1399):
 	sys/dev/fss.c: revision 1.95
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.
 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write


 To generate a diff of this commit:
 cvs rdiff -u -r1.81.4.3 -r1.81.4.3.2.1 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: [netbsd-7] src/sys/dev
Date: Sat, 27 Aug 2016 15:09:22 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Aug 27 15:09:22 UTC 2016

 Modified Files:
 	src/sys/dev [netbsd-7]: fss.c

 Log Message:
 Pull up following revision(s) (requested by hannken in ticket #1239):
 	sys/dev/fss.c: revision 1.95
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.
 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write


 To generate a diff of this commit:
 cvs rdiff -u -r1.91 -r1.91.2.1 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51377 CVS commit: [netbsd-7-0] src/sys/dev
Date: Sat, 27 Aug 2016 15:09:48 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Aug 27 15:09:48 UTC 2016

 Modified Files:
 	src/sys/dev [netbsd-7-0]: fss.c

 Log Message:
 Pull up following revision(s) (requested by hannken in ticket #1239):
 	sys/dev/fss.c: revision 1.95
 Disestablish COW handler on error.  No need to do further copies after
 the snapshot device failed.
 Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write


 To generate a diff of this commit:
 cvs rdiff -u -r1.91 -r1.91.4.1 src/sys/dev/fss.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sat, 27 Aug 2016 15:32:26 +0000
State-Changed-Why:
Pullups to 6.* and 7.* done.


>Unformatted:
