NetBSD Problem Report #48849
From www@NetBSD.org Thu May 29 17:53:42 2014
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id DA41CA6515
for <gnats-bugs@gnats.NetBSD.org>; Thu, 29 May 2014 17:53:41 +0000 (UTC)
Message-Id: <20140529175340.5CF55A651E@mollari.NetBSD.org>
Date: Thu, 29 May 2014 17:53:40 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: root mirror raid fails on shutdown
X-Send-Pr-Version: www-1.0
>Number: 48849
>Category: kern
>Synopsis: root mirror raid fails on shutdown
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: hannken
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu May 29 17:55:00 +0000 2014
>Closed-Date: Mon Jun 16 08:07:44 +0000 2014
>Last-Modified: Mon Jun 16 08:07:44 +0000 2014
>Originator: Patrick Welche
>Release: -current/amd64 6.99.43
>Organization:
>Environment:
-current/amd64 6.99.43
>Description:
The bootable root partition is a raidframe mirror made from dk wedge components, i.e., the disks were set up with gpt rather than disklabel.
The problem is that the raid appears to work perfectly, but on shutdown it is marked as failed with an "IO Error". On boot, rebuilding the failed component to itself is always successful, and appears to work correctly again until shutdown, when once again it is marked as failed.
wd0,1
64 14680192 1 GPT part - NetBSD RAIDFrame component
raid
:dt=RAID:se#512:ns#128:nt#8:sc#1024:nc#14336:\
:pa#2097152:oa#0:ta=4.2BSD:ba#0:fa#0:\
:pc#14680064:oc#0:\
:pd#14680064:od#0:\
:pe#12582912:oe#2097152:te=4.2BSD:be#0:fe#0:
It is just the root partition which exhibits the problem
sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 14680064
RAID Level: 1
Autoconfig: Yes
Root partition: Force
The other raid partitions are fine.
>How-To-Repeat:
Boot a -current/amd64 box with root on raid 1 built from wedges.
Note: I have seen this on the other box with wedges (e.g. a /dev/dk0 component), but not on -current/amd64 boxen with ordinary disklabel components, e.g. /dev/wd0e.
>Fix:
>Release-Note:
>Audit-Trail:
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Thu, 29 May 2014 13:58:24 -0400
On May 29, 5:55pm, prlw1@cam.ac.uk (prlw1@cam.ac.uk) wrote:
-- Subject: kern/48849: root mirror raid fails on shutdown
| The bootable root partition is a raidframe mirror made from dk wedge components, i.e., the disks were set up with gpt rather than disklabel.
|
| The problem is that the raid appears to work perfectly, but on shutdown it is marked as failed with an "IO Error". On boot, rebuilding the failed component to itself is always successful, and appears to work correctly again until shutdown, when once again it is marked as failed.
|
Do you have the messages of a DEBUG/DIAGNOSTIC kernel on shutdown?
christos
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Thu, 29 May 2014 19:18:56 +0100
Root is on raid7a, made from /dev/dk0 and /dev/dk7
...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
...
raid5: detached
dk13 at wd1 (df25cd9b-6326-11e3-8f70-10bf48bd3389) deleted
dk13: detached
dk12 at wd1 (df25cd92-6326-11e3-8f70-10bf48bd3389) deleted
dk12: detached
dk10 at wd1 (df25cd7f-6326-11e3-8f70-10bf48bd3389) deleted
dk10: detached
dk9 at wd1 (df25cd76-6326-11e3-8f70-10bf48bd3389) deleted
dk9: detached
dk8 at wd1 (df25cd6b-6326-11e3-8f70-10bf48bd3389) deleted
dk8: detached
dk6 at wd0 (80706d9e-e1f8-11e3-9080-10bf48bd3389) deleted
dk6: detached
dk5 at wd0 (80706d9b-e1f8-11e3-9080-10bf48bd3389) deleted
dk5: detached
dk3 at wd0 (80706d94-e1f8-11e3-9080-10bf48bd3389) deleted
dk3: detached
dk2 at wd0 (80706d90-e1f8-11e3-9080-10bf48bd3389) deleted
dk2: detached
dk1 at wd0 (80706d8c-e1f8-11e3-9080-10bf48bd3389) deleted
dk1: detached
unmounting 0xfffffe811b89b008 /home (/dev/cgd0a)...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
forcefully unmounting /home (/dev/cgd0a)...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
cgd0: detached
raid4: detached
raid4: detached
dk11 at wd1 (df25cd88-6326-11e3-8f70-10bf48bd3389) deleted
dk11: detached
dk4 at wd0 (80706d97-e1f8-11e3-9080-10bf48bd3389) deleted
dk4: detached
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
forcefully unmounting / (/dev/raid7a)...
raid7: IO Error. Marking /dev/dk0 as failed.
dk0 at wd0 (80706d87-e1f8-11e3-9080-10bf48bd3389) deleted
wd0: detached
atabus4: detached
raid7: detached
raid7: detached
dk7 at wd1 (df25cd61-6326-11e3-8f70-10bf48bd3389) deleted
dk7: detached
wd1: detached
From: christos@zoulas.com (Christos Zoulas)
To: Patrick Welche <prlw1@cam.ac.uk>, gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Thu, 29 May 2014 15:46:17 -0400
On May 29, 7:18pm, prlw1@cam.ac.uk (Patrick Welche) wrote:
-- Subject: Re: kern/48849: root mirror raid fails on shutdown
| Root is on raid7a, made from /dev/dk0 and /dev/dk7
| ...
| unmounting 0xfffffe811e883008 / (/dev/raid7a)...
These:
unmounting 0xfffffe811b89b008 /home (/dev/cgd0a)...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
forcefully unmounting /home (/dev/cgd0a)...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
cgd0: detached
raid4: detached
raid4: detached
dk11 at wd1 (df25cd88-6326-11e3-8f70-10bf48bd3389) deleted
dk11: detached
dk4 at wd0 (80706d97-e1f8-11e3-9080-10bf48bd3389) deleted
dk4: detached
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
forcefully unmounting / (/dev/raid7a)...
raid7: IO Error. Marking /dev/dk0 as failed.
I don't like these forcefully unmounting. We should figure out why they happen.
christos
From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Fri, 30 May 2014 12:26:09 +0200
On 29 May 2014, at 21:50, Christos Zoulas <christos@zoulas.com> wrote:
<snip>
>
> unmounting 0xfffffe811e883008 / (/dev/raid7a)...
> cgd0: detached
> raid4: detached
> raid4: detached
> dk11 at wd1 (df25cd88-6326-11e3-8f70-10bf48bd3389) deleted
> dk11: detached
> dk4 at wd0 (80706d97-e1f8-11e3-9080-10bf48bd3389) deleted
> dk4: detached
> unmounting 0xfffffe811e883008 / (/dev/raid7a)...
> forcefully unmounting / (/dev/raid7a)...
> raid7: IO Error. Marking /dev/dk0 as failed.
>
> I don't like these forcefully unmounting. We should figure out why they happen.
Some files/directories are open during shutdown, while running
shutdown for example:
/lib/libc.so.12.190
/lib/libcrypt.so.1.0
/lib/libgcc_s.so.1.0
/lib/libutil.so.7.21
/libexec/ld.elf_so
/root
/sbin/halt
/sbin/init
Looks like the device node of a component of raid7 gets closed.
Please try:
RCS file: /cvsroot/src/sys/kern/vfs_vnode.c,v
diff -p -u -2 -r1.36 vfs_vnode.c
--- vfs_vnode.c 8 May 2014 08:21:53 -0000 1.36
+++ vfs_vnode.c 30 May 2014 10:23:26 -0000
@@ -979,6 +979,5 @@ vclean(vnode_t *vp)
active = (vp->v_usecount > 1);
- doclose = ! (active && vp->v_type == VBLK &&
- spec_node_getmountedfs(vp) != NULL);
+ doclose = ! (active && vp->v_type == VBLK);
mutex_exit(vp->v_interlock);
--
J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Fri, 30 May 2014 19:33:11 +0100
With your patch, I no longer see any forceful unmounts, however the
computer doesn't shut down either. I sprinkled a few printfs in dounmount(),
and see:
Entering dounmount /var (/dev/raid7e)...
VFS_SYNC(/var) = 0
VFS_UNMOUNT(/var) = 0
called vfs_hooks_unmount(/var)
Successfully exiting dounmount /var (/dev/raid7e)...
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
unmounting 0xfffffe811e883008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
cd0: detached
atapibus0: detached
[other detached snipped]
raid6: detached
raid5: detached
raid5: detached
-- and this is where we hang
Note there is no "exiting dounmount" for raid7a
Breaking into ddb at this point
Stopped in pid 0.5 (system) at netbsd:breakpoint+0x5: leave
db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x529
Xintr_ioapic_edge1() at netbsd:Xintr_ioapic_edge1+0xea
--- interrupt ---
_kernel_lock() at netbsd:_kernel_lock+0x165
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x12
Xintr_ioapic_level3() at netbsd:Xintr_ioapic_level3+0xf2
--- interrupt ---
_kernel_lock() at netbsd:_kernel_lock+0x165
frag6_fasttimo() at netbsd:frag6_fasttimo+0x1a
pffasttimo() at netbsd:pffasttimo+0x31
callout_softclock() at netbsd:callout_softclock+0x1d0
softint_dispatch() at netbsd:softint_dispatch+0xd3
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe80cd845ff0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
0:
From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Wed, 4 Jun 2014 11:18:40 +0200
Please try the attached diff. It changes dk_lookup() to return an
anonymous vnode not associated with any file system, and changes all
consumers of dk_lookup() to get the device from "v_rdev" instead of
VOP_GETATTR(), as specfs does not support VOP_GETATTR(). Devices
obtained with dk_lookup() will no longer disappear on forced unmounts.
Please make sure you have a full backup as this diff changes
ccd, cgd, dm/lvm and raid, not all covered by anita tests.
--
J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)
Index: sys/dev/ccd.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ccd.c,v
retrieving revision 1.148
diff -p -u -4 -r1.148 ccd.c
--- sys/dev/ccd.c 6 Apr 2014 00:56:39 -0000 1.148
+++ sys/dev/ccd.c 4 Jun 2014 08:39:37 -0000
@@ -120,8 +120,10 @@ __KERNEL_RCSID(0, "$NetBSD: ccd.c,v 1.14
#include <dev/ccdvar.h>
#include <dev/dkvar.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
+
#if defined(CCDDEBUG) && !defined(DEBUG)
#define DEBUG
#endif
@@ -291,9 +293,8 @@ ccdinit(struct ccd_softc *cs, char **cpa
struct lwp *l)
{
struct ccdcinfo *ci = NULL;
int ix;
- struct vattr va;
struct ccdgeom *ccg = &cs->sc_geom;
char *tmppath;
int error, path_alloced;
uint64_t psize, minsize;
@@ -343,21 +344,9 @@ ccdinit(struct ccd_softc *cs, char **cpa
/*
* XXX: Cache the component's dev_t.
*/
- vn_lock(vpp[ix], LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(vpp[ix], &va, l->l_cred);
- VOP_UNLOCK(vpp[ix]);
- if (error != 0) {
-#ifdef DEBUG
- if (ccddebug & (CCDB_FOLLOW|CCDB_INIT))
- printf("%s: %s: getattr failed %s = %d\n",
- cs->sc_xname, ci->ci_path,
- "error", error);
-#endif
- goto out;
- }
- ci->ci_dev = va.va_rdev;
+ ci->ci_dev = vpp[ix]->v_rdev;
/*
* Get partition information for the component.
*/
Index: sys/dev/cgd.c
===================================================================
RCS file: /cvsroot/src/sys/dev/cgd.c,v
retrieving revision 1.87
diff -p -u -4 -r1.87 cgd.c
--- sys/dev/cgd.c 25 May 2014 19:23:49 -0000 1.87
+++ sys/dev/cgd.c 4 Jun 2014 08:39:37 -0000
@@ -54,8 +54,10 @@ __KERNEL_RCSID(0, "$NetBSD: cgd.c,v 1.87
#include <dev/dkvar.h>
#include <dev/cgdvar.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
+
/* Entry Point Functions */
void cgdattach(int);
@@ -808,9 +810,8 @@ static int
cgdinit(struct cgd_softc *cs, const char *cpath, struct vnode *vp,
struct lwp *l)
{
struct disk_geom *dg;
- struct vattr va;
int ret;
char *tmppath;
uint64_t psize;
unsigned secsize;
@@ -825,15 +826,9 @@ cgdinit(struct cgd_softc *cs, const char
goto bail;
cs->sc_tpath = malloc(cs->sc_tpathlen, M_DEVBUF, M_WAITOK);
memcpy(cs->sc_tpath, tmppath, cs->sc_tpathlen);
- vn_lock(vp, LK_SHARED | LK_RETRY);
- ret = VOP_GETATTR(vp, &va, l->l_cred);
- VOP_UNLOCK(vp);
- if (ret != 0)
- goto bail;
-
- cs->sc_tdev = va.va_rdev;
+ cs->sc_tdev = vp->v_rdev;
if ((ret = getdisksize(vp, &psize, &secsize)) != 0)
goto bail;
Index: sys/dev/dksubr.c
===================================================================
RCS file: /cvsroot/src/sys/dev/dksubr.c,v
retrieving revision 1.50
diff -p -u -4 -r1.50 dksubr.c
--- sys/dev/dksubr.c 25 May 2014 19:23:49 -0000 1.50
+++ sys/dev/dksubr.c 4 Jun 2014 08:39:37 -0000
@@ -47,8 +47,9 @@ __KERNEL_RCSID(0, "$NetBSD: dksubr.c,v 1
#include <sys/namei.h>
#include <sys/module.h>
#include <dev/dkvar.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
int dkdebug = 0;
#ifdef DEBUG
@@ -620,9 +621,8 @@ int
dk_lookup(struct pathbuf *pb, struct lwp *l, struct vnode **vpp)
{
struct nameidata nd;
struct vnode *vp;
- struct vattr va;
int error;
if (l == NULL)
return ESRCH; /* Is ESRCH the best choice? */
@@ -634,24 +634,31 @@ dk_lookup(struct pathbuf *pb, struct lwp
return error;
}
vp = nd.ni_vp;
- if ((error = VOP_GETATTR(vp, &va, l->l_cred)) != 0) {
- DPRINTF((DKDB_FOLLOW|DKDB_INIT),
- ("dk_lookup: getattr error = %d\n", error));
+ if (vp->v_type != VBLK) {
+ error = ENOTBLK;
goto out;
}
- /* XXX: eventually we should handle VREG, too. */
- if (va.va_type != VBLK) {
- error = ENOTBLK;
+ /* Reopen as anonymous vnode to protect against forced unmount. */
+ if ((error = bdevvp(vp->v_rdev, vpp)) != 0)
goto out;
+ VOP_UNLOCK(vp);
+ if ((error = vn_close(vp, FREAD | FWRITE, l->l_cred)) != 0) {
+ vrele(*vpp);
+ return error;
+ }
+ if ((error = VOP_OPEN(*vpp, FREAD | FWRITE, l->l_cred)) != 0) {
+ vrele(*vpp);
+ return error;
}
+ mutex_enter((*vpp)->v_interlock);
+ (*vpp)->v_writecount++;
+ mutex_exit((*vpp)->v_interlock);
- IFDEBUG(DKDB_VNODE, vprint("dk_lookup: vnode info", vp));
+ IFDEBUG(DKDB_VNODE, vprint("dk_lookup: vnode info", *vpp));
- VOP_UNLOCK(vp);
- *vpp = vp;
return 0;
out:
VOP_UNLOCK(vp);
(void) vn_close(vp, FREAD | FWRITE, l->l_cred);
Index: sys/dev/dm/dm.h
===================================================================
RCS file: /cvsroot/src/sys/dev/dm/dm.h,v
retrieving revision 1.25
diff -p -u -4 -r1.25 dm.h
--- sys/dev/dm/dm.h 9 Dec 2013 09:35:16 -0000 1.25
+++ sys/dev/dm/dm.h 4 Jun 2014 08:39:38 -0000
@@ -48,8 +48,10 @@
#include <sys/device.h>
#include <sys/disk.h>
#include <sys/disklabel.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
+
#include <prop/proplib.h>
#define DM_MAX_TYPE_NAME 16
#define DM_NAME_LEN 128
Index: sys/dev/dm/dm_target_linear.c
===================================================================
RCS file: /cvsroot/src/sys/dev/dm/dm_target_linear.c,v
retrieving revision 1.13
diff -p -u -4 -r1.13 dm_target_linear.c
--- sys/dev/dm/dm_target_linear.c 14 Oct 2011 09:23:30 -0000 1.13
+++ sys/dev/dm/dm_target_linear.c 4 Jun 2014 08:39:38 -0000
@@ -191,24 +191,16 @@ dm_target_linear_destroy(dm_table_entry_
int
dm_target_linear_deps(dm_table_entry_t * table_en, prop_array_t prop_array)
{
dm_target_linear_config_t *tlc;
- struct vattr va;
-
- int error;
if (table_en->target_config == NULL)
return ENOENT;
tlc = table_en->target_config;
- vn_lock(tlc->pdev->pdev_vnode, LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(tlc->pdev->pdev_vnode, &va, curlwp->l_cred);
- VOP_UNLOCK(tlc->pdev->pdev_vnode);
- if (error != 0)
- return error;
-
- prop_array_add_uint64(prop_array, (uint64_t) va.va_rdev);
+ prop_array_add_uint64(prop_array,
+ (uint64_t) tlc->pdev->pdev_vnode->v_rdev);
return 0;
}
/*
Index: sys/dev/dm/dm_target_snapshot.c
===================================================================
RCS file: /cvsroot/src/sys/dev/dm/dm_target_snapshot.c,v
retrieving revision 1.15
diff -p -u -4 -r1.15 dm_target_snapshot.c
--- sys/dev/dm/dm_target_snapshot.c 14 Oct 2011 09:23:30 -0000 1.15
+++ sys/dev/dm/dm_target_snapshot.c 4 Jun 2014 08:39:38 -0000
@@ -347,35 +347,20 @@ int
dm_target_snapshot_deps(dm_table_entry_t * table_en,
prop_array_t prop_array)
{
dm_target_snapshot_config_t *tsc;
- struct vattr va;
-
- int error;
if (table_en->target_config == NULL)
return 0;
tsc = table_en->target_config;
- vn_lock(tsc->tsc_snap_dev->pdev_vnode, LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(tsc->tsc_snap_dev->pdev_vnode, &va, curlwp->l_cred);
- VOP_UNLOCK(tsc->tsc_snap_dev->pdev_vnode);
- if (error != 0)
- return error;
-
- prop_array_add_uint64(prop_array, (uint64_t) va.va_rdev);
+ prop_array_add_uint64(prop_array,
+ (uint64_t) tsc->tsc_snap_dev->pdev_vnode->v_rdev);
if (tsc->tsc_persistent_dev) {
-
- vn_lock(tsc->tsc_cow_dev->pdev_vnode, LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(tsc->tsc_cow_dev->pdev_vnode, &va,
- curlwp->l_cred);
- VOP_UNLOCK(tsc->tsc_cow_dev->pdev_vnode);
- if (error != 0)
- return error;
-
- prop_array_add_uint64(prop_array, (uint64_t) va.va_rdev);
+ prop_array_add_uint64(prop_array,
+ (uint64_t) tsc->tsc_cow_dev->pdev_vnode->v_rdev);
}
return 0;
}
Index: sys/dev/dm/dm_target_stripe.c
===================================================================
RCS file: /cvsroot/src/sys/dev/dm/dm_target_stripe.c,v
retrieving revision 1.18
diff -p -u -4 -r1.18 dm_target_stripe.c
--- sys/dev/dm/dm_target_stripe.c 7 Aug 2012 16:11:11 -0000 1.18
+++ sys/dev/dm/dm_target_stripe.c 4 Jun 2014 08:39:38 -0000
@@ -318,25 +318,17 @@ int
dm_target_stripe_deps(dm_table_entry_t * table_en, prop_array_t prop_array)
{
dm_target_stripe_config_t *tsc;
dm_target_linear_config_t *tlc;
- struct vattr va;
-
- int error;
if (table_en->target_config == NULL)
return ENOENT;
tsc = table_en->target_config;
TAILQ_FOREACH(tlc, &tsc->stripe_devs, entries) {
- vn_lock(tlc->pdev->pdev_vnode, LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(tlc->pdev->pdev_vnode, &va, curlwp->l_cred);
- VOP_UNLOCK(tlc->pdev->pdev_vnode);
- if (error != 0)
- return error;
-
- prop_array_add_uint64(prop_array, (uint64_t) va.va_rdev);
+ prop_array_add_uint64(prop_array,
+ (uint64_t) tlc->pdev->pdev_vnode->v_rdev);
}
return 0;
}
Index: sys/dev/raidframe/rf_copyback.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_copyback.c,v
retrieving revision 1.49
diff -p -u -4 -r1.49 rf_copyback.c
--- sys/dev/raidframe/rf_copyback.c 14 Oct 2011 09:23:30 -0000 1.49
+++ sys/dev/raidframe/rf_copyback.c 4 Jun 2014 08:39:39 -0000
@@ -82,8 +82,10 @@ rf_ConfigureCopyback(RF_ShutdownList_t *
#include <sys/fcntl.h>
#include <sys/vnode.h>
#include <sys/namei.h> /* for pathbuf */
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
+
/* do a complete copyback */
void
rf_CopybackReconstructedData(RF_Raid_t *raidPtr)
{
@@ -95,9 +97,8 @@ rf_CopybackReconstructedData(RF_Raid_t *
char *databuf;
struct pathbuf *dev_pb;
struct vnode *vp;
- struct vattr va;
int ac;
fcol = 0;
@@ -159,22 +160,17 @@ rf_CopybackReconstructedData(RF_Raid_t *
/* Ok, so we can at least do a lookup... How about actually
* getting a vp for it? */
- vn_lock(vp, LK_SHARED | LK_RETRY);
- retcode = VOP_GETATTR(vp, &va, curlwp->l_cred);
- VOP_UNLOCK(vp);
- if (retcode != 0)
- return;
retcode = rf_getdisksize(vp, &raidPtr->Disks[fcol]);
if (retcode) {
return;
}
raidPtr->raid_cinfo[fcol].ci_vp = vp;
- raidPtr->raid_cinfo[fcol].ci_dev = va.va_rdev;
+ raidPtr->raid_cinfo[fcol].ci_dev = vp->v_rdev;
- raidPtr->Disks[fcol].dev = va.va_rdev; /* XXX or the above? */
+ raidPtr->Disks[fcol].dev = vp->v_rdev; /* XXX or the above? */
/* we allow the user to specify that only a fraction of the
* disks should be used this is just for debug: it speeds up
* the parity scan */
Index: sys/dev/raidframe/rf_disks.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_disks.c,v
retrieving revision 1.85
diff -p -u -4 -r1.85 rf_disks.c
--- sys/dev/raidframe/rf_disks.c 25 Mar 2014 16:19:14 -0000 1.85
+++ sys/dev/raidframe/rf_disks.c 4 Jun 2014 08:39:39 -0000
@@ -79,8 +79,9 @@ __KERNEL_RCSID(0, "$NetBSD: rf_disks.c,v
#include <sys/fcntl.h>
#include <sys/vnode.h>
#include <sys/namei.h> /* for pathbuf */
#include <sys/kauth.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
static int rf_AllocDiskStructures(RF_Raid_t *, RF_Config_t *);
static void rf_print_label_status( RF_Raid_t *, int, char *,
RF_ComponentLabel_t *);
@@ -575,9 +576,8 @@ rf_ConfigureDisk(RF_Raid_t *raidPtr, cha
{
char *p;
struct pathbuf *pb;
struct vnode *vp;
- struct vattr va;
int error;
p = rf_find_non_white(bf);
if (p[strlen(p) - 1] == '\n') {
@@ -630,20 +630,14 @@ rf_ConfigureDisk(RF_Raid_t *raidPtr, cha
if (raidPtr->bytesPerSector == 0)
raidPtr->bytesPerSector = diskPtr->blockSize;
if (diskPtr->status == rf_ds_optimal) {
- vn_lock(vp, LK_SHARED | LK_RETRY);
- error = VOP_GETATTR(vp, &va, curlwp->l_cred);
- VOP_UNLOCK(vp);
- if (error != 0)
- return (error);
-
raidPtr->raid_cinfo[col].ci_vp = vp;
- raidPtr->raid_cinfo[col].ci_dev = va.va_rdev;
+ raidPtr->raid_cinfo[col].ci_dev = vp->v_rdev;
/* This component was not automatically configured */
diskPtr->auto_configured = 0;
- diskPtr->dev = va.va_rdev;
+ diskPtr->dev = vp->v_rdev;
/* we allow the user to specify that only a fraction of the
* disks should be used this is just for debug: it speeds up
* the parity scan */
Index: sys/dev/raidframe/rf_reconstruct.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_reconstruct.c,v
retrieving revision 1.119
diff -p -u -4 -r1.119 rf_reconstruct.c
--- sys/dev/raidframe/rf_reconstruct.c 6 Mar 2013 11:38:15 -0000 1.119
+++ sys/dev/raidframe/rf_reconstruct.c 4 Jun 2014 08:39:39 -0000
@@ -46,8 +46,10 @@ __KERNEL_RCSID(0, "$NetBSD: rf_reconstru
#include <sys/vnode.h>
#include <sys/namei.h> /* for pathbuf */
#include <dev/raidframe/raidframevar.h>
+#include <miscfs/specfs/specdev.h> /* for v_rdev */
+
#include "rf_raid.h"
#include "rf_reconutil.h"
#include "rf_revent.h"
#include "rf_reconbuffer.h"
@@ -351,9 +353,8 @@ rf_ReconstructInPlace(RF_Raid_t *raidPtr
uint64_t numsec;
unsigned int secsize;
struct pathbuf *pb;
struct vnode *vp;
- struct vattr va;
int retcode;
int ac;
rf_lock_mutex2(raidPtr->mutex);
@@ -455,20 +456,8 @@ rf_ReconstructInPlace(RF_Raid_t *raidPtr
/* Ok, so we can at least do a lookup...
How about actually getting a vp for it? */
- vn_lock(vp, LK_SHARED | LK_RETRY);
- retcode = VOP_GETATTR(vp, &va, curlwp->l_cred);
- VOP_UNLOCK(vp);
- if (retcode != 0) {
- vn_close(vp, FREAD | FWRITE, kauth_cred_get());
- rf_lock_mutex2(raidPtr->mutex);
- raidPtr->reconInProgress--;
- rf_signal_cond2(raidPtr->waitForReconCond);
- rf_unlock_mutex2(raidPtr->mutex);
- return(retcode);
- }
-
retcode = getdisksize(vp, &numsec, &secsize);
if (retcode) {
vn_close(vp, FREAD | FWRITE, kauth_cred_get());
rf_lock_mutex2(raidPtr->mutex);
@@ -481,11 +470,11 @@ rf_ReconstructInPlace(RF_Raid_t *raidPtr
raidPtr->Disks[col].blockSize = secsize;
raidPtr->Disks[col].numBlocks = numsec - rf_protectedSectors;
raidPtr->raid_cinfo[col].ci_vp = vp;
- raidPtr->raid_cinfo[col].ci_dev = va.va_rdev;
+ raidPtr->raid_cinfo[col].ci_dev = vp->v_rdev;
- raidPtr->Disks[col].dev = va.va_rdev;
+ raidPtr->Disks[col].dev = vp->v_rdev;
/* we allow the user to specify that only a fraction
of the disks should be used this is just for debug:
it speeds up * the parity scan */
Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Mon, 09 Jun 2014 14:21:57 +0000
Responsible-Changed-Why:
Take.
State-Changed-From-To: open->analyzed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Mon, 09 Jun 2014 14:21:57 +0000
State-Changed-Why:
Added a diff that should solve this problem.
Please test and report back.
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/48849: root mirror raid fails on shutdown
Date: Thu, 12 Jun 2014 19:03:03 +0100
Your patch cures the problem: raid7 is no longer marked as failed on shutdown.
With your patch, all raidframe and cgd are successfully found and configured.
raidctl -R /dev/dk0 raid7 is successful
On shutdown:
unmounting 0xfffffe810e750008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
unmounting 0xfffffe821cbc8008 /home (/dev/cgd0a)...
Entering dounmount /home (/dev/cgd0a)...
VFS_SYNC(/home) = 0
VFS_UNMOUNT(/home) = 16
unmounting 0xfffffe810e750008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
cd0: detached
...
dk1: detached
unmounting 0xfffffe821cbc8008 /home (/dev/cgd0a)...
Entering dounmount /home (/dev/cgd0a)...
VFS_SYNC(/home) = 0
VFS_UNMOUNT(/home) = 16
unmounting 0xfffffe810e750008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
forcefully unmounting /home (/dev/cgd0a)...
Entering dounmount /home (/dev/cgd0a)...
VFS_SYNC(/home) = 0
force: tag VT_UFS, ino 367616, on dev 20, 0 flags 0x0, nlink 62
mode 040755, owner 2171, group 0, size 9216
VFS_UNMOUNT(/home) = 0 forced
called vfs_hooks_unmount(/home)
Successfully exiting dounmount /home (/dev/cgd0a)...
unmounting 0xfffffe810e750008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
cgd0: detached
...
dk4: detached
unmounting 0xfffffe810e750008 / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
VFS_UNMOUNT(/) = 16
forcefully unmounting / (/dev/raid7a)...
Entering dounmount / (/dev/raid7a)...
VFS_SYNC(/) = 0
force: tag VT_UFS, ino 2, on dev 18, 112 flags 0x0, nlink 31
mode 040755, owner 0, group 0, size 1024
force: tag VT_UFS, ino 107681, on dev 18, 112 flags 0x0, nlink 1
mode 0100555, owner 0, group 0, size 30818
force: tag VT_UFS, ino 107611, on dev 18, 112 flags 0x0, nlink 1
mode 0100555, owner 0, group 0, size 92163
force: tag VT_UFS, ino 43209, on dev 18, 112 flags 0x0, nlink 1
mode 0100444, owner 0, group 0, size 105508
force: tag VT_UFS, ino 43185, on dev 18, 112 flags 0x0, nlink 1
mode 0100444, owner 0, group 0, size 31017
force: tag VT_UFS, ino 43053, on dev 18, 112 flags 0x0, nlink 1
mode 0100444, owner 0, group 0, size 55263
force: tag VT_UFS, ino 43180, on dev 18, 112 flags 0x0, nlink 1
mode 0100444, owner 0, group 0, size 1611473
force: tag VT_UFS, ino 107676, on dev 18, 112 flags 0x0, nlink 3
mode 0100555, owner 0, group 0, size 13022
VFS_UNMOUNT(/) = 0 forced
called vfs_hooks_unmount(/)
Successfully exiting dounmount / (/dev/raid7a)...
raid7: detached
raid7: detached
dk7 at wd1 (df25cd61-6326-11e3-8f70-10bf48bd3389) deleted
dk7: detached
dk0 at wd0 (80706d87-e1f8-11e3-9080-10bf48bd3389) deleted
dk0: detached
wd1: detached
wd0: detached
atabus5: detached
atabus4: detached
acpi0: entering state S5
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/48849 CVS commit: src/sys
Date: Sat, 14 Jun 2014 07:39:01 +0000
Module Name: src
Committed By: hannken
Date: Sat Jun 14 07:39:01 UTC 2014
Modified Files:
src/sys/dev: ccd.c cgd.c dksubr.c
src/sys/dev/dm: dm.h dm_target_linear.c dm_target_snapshot.c
dm_target_stripe.c
src/sys/dev/raidframe: rf_copyback.c rf_disks.c rf_reconstruct.c
src/sys/sys: param.h
Log Message:
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.
Fix for PR kern/48849 (root mirror raid fails on shutdown)
Welcome to 6.99.44
To generate a diff of this commit:
cvs rdiff -u -r1.148 -r1.149 src/sys/dev/ccd.c
cvs rdiff -u -r1.87 -r1.88 src/sys/dev/cgd.c
cvs rdiff -u -r1.50 -r1.51 src/sys/dev/dksubr.c
cvs rdiff -u -r1.25 -r1.26 src/sys/dev/dm/dm.h
cvs rdiff -u -r1.13 -r1.14 src/sys/dev/dm/dm_target_linear.c
cvs rdiff -u -r1.15 -r1.16 src/sys/dev/dm/dm_target_snapshot.c
cvs rdiff -u -r1.18 -r1.19 src/sys/dev/dm/dm_target_stripe.c
cvs rdiff -u -r1.49 -r1.50 src/sys/dev/raidframe/rf_copyback.c
cvs rdiff -u -r1.85 -r1.86 src/sys/dev/raidframe/rf_disks.c
cvs rdiff -u -r1.119 -r1.120 src/sys/dev/raidframe/rf_reconstruct.c
cvs rdiff -u -r1.453 -r1.454 src/sys/sys/param.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: analyzed->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Mon, 16 Jun 2014 08:07:44 +0000
State-Changed-Why:
Fix committed.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.