NetBSD Problem Report #51377
From paul@whooppee.com Fri Jul 29 20:44:04 2016
Return-Path: <paul@whooppee.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id CF9067A1B7
for <gnats-bugs@gnats.NetBSD.org>; Fri, 29 Jul 2016 20:44:04 +0000 (UTC)
Message-Id: <20160729204402.3545B16E60@pokey.whooppee.com>
Date: Sat, 30 Jul 2016 04:44:02 +0800 (PHT)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: fss(4) panic if snapshot mounted read/write
X-Send-Pr-Version: 3.95
>Number: 51377
>Category: kern
>Synopsis: fss(4) panic if snapshot mounted read/write
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: hannken
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jul 29 20:45:00 +0000 2016
>Closed-Date: Sat Aug 27 15:32:26 +0000 2016
>Last-Modified: Sat Aug 27 15:32:26 +0000 2016
>Originator: Paul Goyette
>Release: NetBSD 7.99.33
>Organization:
+------------------+--------------------------+------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
>Environment:
System: NetBSD pokey.whooppee.com 7.99.33 NetBSD 7.99.33 (POKEY 2016-07-04 08:34:20) #0: Mon Jul 4 20:24:24 PHT 2016 paul@pokey.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/POKEY amd64
Architecture: x86_64
Machine: amd64
>Description:
If you attempt to mount an fss(4) snapshot for read-write access,
you may get the following panic:
/dev/fss0: file system not clean (fs_clean=0x4); please fsck(8)
/dev/fss0: lost blocks 0 files 0
fss0: snapshot invalid: forced unmount
panic: kernel diagnostic assertion "LIST_FIRST(&fmi->fmi_cow_handler) == NULL" failed: file "/build/netbsd-local/src/sys/kern/vfs_trans.c", line 148
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x140
cd_play_msf() at netbsd:cd_play_msf
fstrans_mount_dtor() at netbsd:fstrans_mount_dtor+0x82
fstrans_get_lwp_info() at netbsd:fstrans_get_lwp_info+0x76
_fstrans_start.part.2() at netbsd:_fstrans_start.part.2+0x1e
genfs_lock() at netbsd:genfs_lock+0x3d
VOP_LOCK() at netbsd:VOP_LOCK+0x32
vn_lock() at netbsd:vn_lock+0x90
vrelel() at netbsd:vrelel+0x113
do_sys_waitid() at netbsd:do_sys_waitid+0x960
do_sys_wait() at netbsd:do_sys_wait+0x8f
sys___wait450() at netbsd:sys___wait450+0x42
syscall() at netbsd:syscall+0x15b
--- syscall (number 449) ---
75092f43d8ba:
cpu0: End traceback...
The panic does not happen immediately; it is delayed by a few
seconds. Note that mount complains about the file system
being dirty.
(Note that in the backtrace above, "cd_play_msf" is at the
same address as "kern_assert + 0x48" - the end of kern_assert.)
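A likely explanation for the odd "cd_play_msf" frame: the call to
vpanic() is the last instruction of kern_assert, so the saved return
address is the first byte past that function, and a nearest-preceding
symbol lookup lands on whatever function follows it. The sketch below
illustrates this with a made-up symbol table. Only the 0x48 offset
comes from this report; every address, the table itself, and the
helper names lookup() and resolves_to() are assumptions for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol table entry; all addresses below are made up. */
struct sym {
	uintptr_t   addr;	/* start address of the function */
	const char *name;
};

/*
 * Sorted by address. In this toy layout kern_assert is 0x48 bytes
 * long, so cd_play_msf starts exactly at kern_assert + 0x48.
 */
static const struct sym symtab[] = {
	{ 0x1000, "vpanic" },
	{ 0x2000, "kern_assert" },
	{ 0x2048, "cd_play_msf" },
	{ 0x3000, "fstrans_mount_dtor" },
};
#define NSYMS (sizeof(symtab) / sizeof(symtab[0]))

/* Return the last symbol whose start address is <= addr. */
const struct sym *
lookup(uintptr_t addr)
{
	const struct sym *best = NULL;

	for (size_t i = 0; i < NSYMS; i++) {
		if (symtab[i].addr <= addr)
			best = &symtab[i];
	}
	assert(best != NULL);	/* toy table: addr must be >= 0x1000 */
	return best;
}

/* True if addr resolves to the function called name. */
int
resolves_to(uintptr_t addr, const char *name)
{
	return strcmp(lookup(addr)->name, name) == 0;
}
```

With this table, the return address 0x2000 + 0x48 resolves to
"cd_play_msf", while the same address minus one (a byte inside the
call instruction) resolves to "kern_assert".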
>How-To-Repeat:
Manually execute the body of the src/tests/dev/fss/t_fss atf
test, but remove the '-o rdonly' from the 'mount fss0' command.
>Fix:
Unknown
>Release-Note:
>Audit-Trail:
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51377: fss(4) panic if snapshot mounted read/write
Date: Sun, 31 Jul 2016 12:48:48 +0800 (PHT)
The traceback here is somewhat misleading. The frame which references
do_sys_waitid()+0x960 is really in proc_exit() at line 1193
(gdb) list *do_sys_waitid+0x960
0xffffffff80819231 is in do_sys_waitid
(/build/localcount/src/sys/kern/kern_exit.c:1193).
1188     * Release reference to text vnode
1189     */
1190    if (p->p_textvp)
1191            vrele(p->p_textvp);
1192
1193    mutex_destroy(&p->p_auxlock);
1194    mutex_obj_free(p->p_lock);
1195    mutex_destroy(&p->p_stmutex);
1196    cv_destroy(&p->p_waitcv);
1197    cv_destroy(&p->p_lwpcv);
It would appear that mount_ffs(8) detected an inconsistency with the
snapshot (file system dirty), and unmounted it (which generated the
"snapshot invalid: forced unmount" message). Then, when mount_ffs tried
to exit, it ran into a problem with its image/text file, resulting in
the panic.
It is unclear to me how the fss unmount hook could have caused the
corruption of the process's text vnode.
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51377: fss(4) panic if snapshot mounted read/write
Date: Sun, 31 Jul 2016 12:52:40 +0800 (PHT)
On Sun, 31 Jul 2016, Paul Goyette wrote:
> The following reply was made to PR kern/51377; it has been noted by GNATS.
>
> From: Paul Goyette <paul@whooppee.com>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/51377: fss(4) panic if snapshot mounted read/write
> Date: Sun, 31 Jul 2016 12:48:48 +0800 (PHT)
>
> The traceback here is somewhat misleading. The frame which references
> do_sys_waitid()+0x960 is really in proc_exit() at line 1193
Correction: this is actually proc_free(), not proc_exit() (and near
the end of that function).
> (gdb) list *do_sys_waitid+0x960
> 0xffffffff80819231 is in do_sys_waitid
> (/build/localcount/src/sys/kern/kern_exit.c:1193).
> 1188     * Release reference to text vnode
> 1189     */
> 1190    if (p->p_textvp)
> 1191            vrele(p->p_textvp);
> 1192
> 1193    mutex_destroy(&p->p_auxlock);
> 1194    mutex_obj_free(p->p_lock);
> 1195    mutex_destroy(&p->p_stmutex);
> 1196    cv_destroy(&p->p_waitcv);
> 1197    cv_destroy(&p->p_lwpcv);
>
>
> It would appear that mount_ffs(8) detected an inconsistency with the
> snapshot (file system dirty), and unmounted it (which generated the
> "snapshot invalid: forced unmount" message). Then, when mount_ffs tried
> to exit, it ran into a problem with its image/text file, resulting in
> the panic.
>
> It is unclear to me how the fss unmount hook could have caused the
> corruption of the process's text vnode.
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: src/sys/dev
Date: Sun, 31 Jul 2016 12:17:36 +0000
Module Name: src
Committed By: hannken
Date: Sun Jul 31 12:17:36 UTC 2016
Modified Files:
src/sys/dev: fss.c
Log Message:
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.94 -r1.95 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
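The pattern named in the log message can be sketched in user space as
follows. This is not the fss.c change itself, only a minimal model of
the idea: once the snapshot has failed, its copy-on-write handler is
removed from the mount, so the handler list is empty by the time the
mount's bookkeeping is torn down. That is the condition checked by the
assertion in the panic above (LIST_FIRST(&fmi->fmi_cow_handler) ==
NULL). All structure and function names here are hypothetical
stand-ins, and a hand-rolled list replaces the kernel's LIST macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
typedef int (*cow_fn)(void *cookie, long blkno);

struct cow_handler {
	cow_fn		    func;
	void		   *cookie;
	struct cow_handler *next;
};

struct mount_info {
	struct cow_handler *cow_handlers; /* role of fmi_cow_handler */
};

struct snapshot {
	struct mount_info *mnt;
	struct cow_handler handler;
	bool		   established;
	bool		   failed;
};

/* Register a handler on the mount. */
void
cow_establish(struct mount_info *mi, struct cow_handler *h)
{
	h->next = mi->cow_handlers;
	mi->cow_handlers = h;
}

/* Unregister a handler; does nothing if it is not on the list. */
void
cow_disestablish(struct mount_info *mi, struct cow_handler *h)
{
	struct cow_handler **pp;

	for (pp = &mi->cow_handlers; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			h->next = NULL;
			return;
		}
	}
}

/* Placeholder copy-on-write callback; a real one would copy blkno. */
int
snapshot_cow(void *cookie, long blkno)
{
	(void)cookie;
	(void)blkno;
	return 0;
}

void
snapshot_attach(struct snapshot *s, struct mount_info *mi)
{
	s->mnt = mi;
	s->failed = false;
	s->handler.func = snapshot_cow;
	s->handler.cookie = s;
	s->handler.next = NULL;
	cow_establish(mi, &s->handler);
	s->established = true;
}

/* Error path: mark the snapshot invalid and drop its handler. */
void
snapshot_error(struct snapshot *s)
{
	s->failed = true;
	if (s->established) {
		cow_disestablish(s->mnt, &s->handler);
		s->established = false;
	}
}

/* Mirrors the asserted condition: no handlers may remain. */
bool
mount_info_destroyable(const struct mount_info *mi)
{
	return mi->cow_handlers == NULL;
}

void
mount_info_destroy(struct mount_info *mi)
{
	assert(mi->cow_handlers == NULL);
}
```

In this model, calling snapshot_attach() and then mount_info_destroy()
without an intervening snapshot_error() trips the assert, which is the
user-space analogue of the panic reported here. The established flag
makes the error path safe to run more than once.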
Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
Responsible-Changed-Why:
Take.
State-Changed-From-To: open->analyzed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
State-Changed-Why:
Committed a fix -- please confirm.
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: hannken@NetBSD.org
Subject: Re: kern/51377 (fss(4) panic if snapshot mounted read/write)
Date: Sun, 31 Jul 2016 21:07:03 +0800 (PHT)
On Sun, 31 Jul 2016, hannken@NetBSD.org wrote:
> Synopsis: fss(4) panic if snapshot mounted read/write
>
> Responsible-Changed-From-To: kern-bug-people->hannken
> Responsible-Changed-By: hannken@NetBSD.org
> Responsible-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
> Responsible-Changed-Why:
> Take.
>
>
> State-Changed-From-To: open->analyzed
> State-Changed-By: hannken@NetBSD.org
> State-Changed-When: Sun, 31 Jul 2016 12:21:38 +0000
> State-Changed-Why:
> Committed a fix -- please confirm.
Fix confirmed!
Please consider pulling this up to the netbsd-6 and -7 branches, and
possibly the -6-0, -6-1, and -7-0 releases.
State-Changed-From-To: analyzed->needs-pullups
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Sun, 31 Jul 2016 21:54:16 +0000
State-Changed-Why:
Please pull-up to -6 and -7
State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sat, 20 Aug 2016 16:53:31 +0000
State-Changed-Why:
Pullups requested:
Ticket #1399 on -6 and #1239 on -7.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: [netbsd-6] src/sys/dev
Date: Sat, 27 Aug 2016 14:47:48 +0000
Module Name: src
Committed By: bouyer
Date: Sat Aug 27 14:47:48 UTC 2016
Modified Files:
src/sys/dev [netbsd-6]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #1399):
sys/dev/fss.c: revision 1.95
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.81.4.3 -r1.81.4.4 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: [netbsd-6-0] src/sys/dev
Date: Sat, 27 Aug 2016 14:48:09 +0000
Module Name: src
Committed By: bouyer
Date: Sat Aug 27 14:48:09 UTC 2016
Modified Files:
src/sys/dev [netbsd-6-0]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #1399):
sys/dev/fss.c: revision 1.95
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.81.4.1.4.2 -r1.81.4.1.4.3 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: [netbsd-6-1] src/sys/dev
Date: Sat, 27 Aug 2016 14:48:50 +0000
Module Name: src
Committed By: bouyer
Date: Sat Aug 27 14:48:50 UTC 2016
Modified Files:
src/sys/dev [netbsd-6-1]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #1399):
sys/dev/fss.c: revision 1.95
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.81.4.3 -r1.81.4.3.2.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: [netbsd-7] src/sys/dev
Date: Sat, 27 Aug 2016 15:09:22 +0000
Module Name: src
Committed By: bouyer
Date: Sat Aug 27 15:09:22 UTC 2016
Modified Files:
src/sys/dev [netbsd-7]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #1239):
sys/dev/fss.c: revision 1.95
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.91 -r1.91.2.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51377 CVS commit: [netbsd-7-0] src/sys/dev
Date: Sat, 27 Aug 2016 15:09:48 +0000
Module Name: src
Committed By: bouyer
Date: Sat Aug 27 15:09:48 UTC 2016
Modified Files:
src/sys/dev [netbsd-7-0]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #1239):
sys/dev/fss.c: revision 1.95
Disestablish COW handler on error. No need to do further copies after
the snapshot device failed.
Should fix PR kern/51377: fss(4) panic if snapshot mounted read/write
To generate a diff of this commit:
cvs rdiff -u -r1.91 -r1.91.4.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sat, 27 Aug 2016 15:32:26 +0000
State-Changed-Why:
Pullups to 6.* and 7.* done.
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.