NetBSD Problem Report #45948
From sborrill@precedence.co.uk Wed Feb 8 08:19:05 2012
Return-Path: <sborrill@precedence.co.uk>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 4837563BD87
for <gnats-bugs@gnats.NetBSD.org>; Wed, 8 Feb 2012 08:19:05 +0000 (UTC)
Message-Id: <201202080819.q188J0Ej011913@precedence.co.uk>
Date: Wed, 8 Feb 2012 08:19:00 GMT
From: netbsd@precedence.co.uk
Reply-To: netbsd@precedence.co.uk
To: gnats-bugs@gnats.NetBSD.org
Subject: dk(4) on raid(4) panic on halt with netbsd-5
X-Send-Pr-Version: 3.95
>Number: 45948
>Category: kern
>Synopsis: dk(4) on raid(4) will panic with lock error if mounted when halting (netbsd-5)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Feb 08 08:20:00 +0000 2012
>Closed-Date: Sat Dec 19 02:33:35 +0000 2015
>Last-Modified: Sat Dec 19 02:33:35 +0000 2015
>Originator: Stephen Borrill
>Release: NetBSD 5.1_STABLE
>Organization:
>Environment:
System: NetBSD 5.1_STABLE NetBSD 5.1_STABLE (DEBUG) #0: Thu Feb 2 17:13:56 GMT 2012 root@builder.internal.precedence.co.uk:/usr/obj/5.0/i386/sys/arch/i386/compile/DEBUG i386
Architecture: i386
Machine: i386
>Description:
On a netbsd-5 system, create a GPT partition table on a RAIDframe device (the type
does not matter) and add a wedge for the partition. If the wedge is mounted when the
machine is halted, the kernel panics. It does not panic if the wedge is not mounted.
raid0: RAID Level 1
raid0: Components: /dev/wd1a component1[**FAILED**]
raid0: Total Sectors: 10485632 (5119 MB)
dk0 at raid0: 077b35b2-4d9c-11e1-9d54-525400123456
dk0: 10485535 blocks at 64, type: ffs
# mount /dev/dk0 /mnt
# mount
/dev/wd0a on / type ffs (local)
kernfs on /kern type kernfs (local)
ptyfs on /dev/pts type ptyfs (local)
procfs on /proc type procfs (local)
/dev/dk0 on /mnt type ffs (local)
# sysctl -w ddb.onpanic=1
ddb.onpanic: 0 -> 1
# halt -p
Feb 2 16:21:39 halt: halted by root
Feb 2 16:21:39 syslogd: Exiting on signal 15
syncing disks... done
unmounting /mnt (/dev/dk0)...Mutex error: lockdebug_wantlock: locking
against myself
lock address : 0x00000000c12dcd1c type : sleep/adaptive
initialized : 0x00000000c05066de
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 1
current cpu : 0 last held: 0
current lwp : 0x00000000cb292320 last held: 0x00000000cb292320
last locked : 0x00000000c04a4eb5 unlocked : 0x00000000c04a50b8
owner field : 0x00000000cb292320 wait/spin: 0/0
Turnstile chain at 0xc0c62b20.
=> No active turnstile for this lock.
panic: LOCKDEBUG
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c05dc2dc cs 8 eflags 246 cr2 bb91c004 ilevel 0
Stopped in pid 284.1 (halt) at netbsd:breakpoint+0x4: popl %ebp
db{0}> bt
breakpoint(c0b5f79e,cb3bd7a8,c0b9a580,c051205f,0,1,0,0,cb3bd7a8,8) at netbsd:breakpoint+0x4
panic(c0b16344,c0b6b914,c0901f97,c0afdb1d,9340,1292320,0,c12dcd1c,0,0) at netbsd:panic+0x1b0
lockdebug_abort1(c0afdb1d,1,0,0,ca70a180,c0c6ba20,0,c0b9a698,c0c6ba20,0) at netbsd:lockdebug_abort1+0xbb
mutex_vector_enter(c12dcd1c,0,0,4,0,c1209828,c1209864,c04dec5a,a8,a8) at netbsd:mutex_vector_enter+0x394
dkwedge_del(cb3bd8b0,cb33eb00,10,c050aaef,c0c640cc,c0c60c20,cb3bd8dc,306b64,c0c640cc,c0c60c20) at netbsd:dkwedge_del+0x198
dkwedge_delall(c1209828,c0b84280,0,c08d41c0,1203,cb292320,cb3bd9dc,c0504f74,1203,3) at netbsd:dkwedge_delall+0x61
raidclose(1203,3,6000,cb292320,6000,3,6,3,cb624234,0) at netbsd:raidclose+0x12f
bdev_close(1203,3,6000,cb292320,0,0,cb3bda2c,1203,6000,0) at netbsd:bdev_close+0x84
spec_close(cb3bda38,20002,cb3bda4c,c055e5c8,cb624234,c09031a0,cb624234,3,ffffffff,3) at netbsd:spec_close+0x24b
VOP_CLOSE(cb624234,3,ffffffff,c12dcc00,c12dcc00,0,cb3bda9c,c04a4f67,cb624234,3) at netbsd:VOP_CLOSE+0x6c
vn_close(cb624234,3,ffffffff,c043cd8f,0,cb292320,0,c0900240,a800,cb292320) at netbsd:vn_close+0x4e
dkclose(a800,3,6000,cb292320,6000,3,6,3,cb4925d0,0) at netbsd:dkclose+0xe7
bdev_close(a800,3,6000,cb292320,0,0,cb3bdb1c,a800,6000,0) at netbsd:bdev_close+0x84
spec_close(cb3bdb28,20002,cb3bdb3c,c055e5c8,cb4925d0,c09031a0,cb4925d0,3,ffffffff,c12cf000) at netbsd:spec_close+0x24b
VOP_CLOSE(cb4925d0,3,ffffffff,0,0,c12cd7cc,c12cd780,cb37f6d4,cb37f6d4,cb37f6f8) at netbsd:VOP_CLOSE+0x6c
ffs_unmount(cb37f6d4,80000,0,0,0,0,cb3bdbbc,c055c49f,cb37f6d4,80000) at netbsd:ffs_unmount+0x1f4
VFS_UNMOUNT(cb37f6d4,80000,ca3a6cc0,0,1000,c0549cba,1,cb37f6d4,cb37f7cc,cb2ae000) at netbsd:VFS_UNMOUNT+0x26
dounmount(cb37f6d4,80000,cb292320,0,cb3bdbf8,cb292320,0,cb292320,cb3bdd00,c0b95fe0) at netbsd:dounmount+0x13f
vfs_unmountall(cb292320,0,0,c04c530d,ca38a63c,808,cb3bdc2c,c05e2dfb,0,cb292320) at netbsd:vfs_unmountall+0x86
vfs_shutdown(0,cb292320,0,0,cb3bdd00,0,cb3bdcdc,c04fed94,808,0) at netbsd:vfs_shutdown+0x8d
cpu_reboot(808,0,0,0,0,0,cb3bdc9c,c05c9d52,23,cb3bdcc0) at netbsd:cpu_reboot+0x13b
sys_reboot(cb292320,cb3bdd00,cb3bdd28,cb3bdd40,c05c9d00,ca3bcf60,1,808,0,bfbfeb28) at netbsd:sys_reboot+0x74
syscall(cb3bdd48,b3,ab,1f,1f,1,d,bfbfeb28,2,256) at netbsd:syscall+0xc8
db{0}> x 0x00000000c12dcd1c
0xc12dcd1c: cb292324
(gdb) list *(0x00000000c05066de)
0xc05066de is in disk_init (/usr/src/5.0/sys/kern/subr_disk.c:195).
190 mutex_init(&diskp->dk_rawlock, MUTEX_DEFAULT, IPL_NONE);
191 mutex_init(&diskp->dk_openlock, MUTEX_DEFAULT, IPL_NONE);
192 LIST_INIT(&diskp->dk_wedges);
193 diskp->dk_nwedges = 0;
194 diskp->dk_labelsector = LABELSECTOR;
195 disk_blocksize(diskp, DEV_BSIZE);
196 diskp->dk_name = name;
197 diskp->dk_driver = driver;
(gdb) list *(0x00000000c04a4eb5)
0xc04a4eb5 is in dkclose (/usr/src/5.0/sys/dev/dkwedge/dk.c:973).
968
969 KASSERT(sc->sc_dk.dk_openmask != 0);
970
971 mutex_enter(&sc->sc_dk.dk_openlock);
972
973 if (fmt == S_IFCHR)
974 sc->sc_dk.dk_copenmask &= ~1;
975 else
976 sc->sc_dk.dk_bopenmask &= ~1;
977 sc->sc_dk.dk_openmask =
(gdb) list *(0x00000000c04a50b8)
0xc04a50b8 is in dkopen (/usr/src/5.0/sys/dev/dkwedge/dk.c:954).
949 sc->sc_dk.dk_openmask =
950 sc->sc_dk.dk_copenmask | sc->sc_dk.dk_bopenmask;
951
952 popen_fail:
953 mutex_exit(&sc->sc_parent->dk_rawlock);
954 mutex_exit(&sc->sc_dk.dk_openlock);
955 return (error);
956 }
957
958 /*
>How-To-Repeat:
# qemu-img create -f qcow wd0.fs 5G
# qemu-img create -f qcow wd1.fs 5G
# qemu -hda wd0.fs -cdrom i386cd.iso
*install minimal NetBSD and halt*
# qemu -hda wd0.fs -hdb wd1.fs -boot c
*Login as root*
# cat > raid0.conf
START array
1 2 0
START disks
/dev/wd1a
absent
START layout
128 1 1 1
START queue
fifo 100
^D
# raidctl -C raid0.conf raid0
# raidctl -i raid0
# gpt create raid0
# gpt add -t ufs -b 64 raid0
Partition added, use:
dkctl raid0 addwedge <wedgename> 64 10485535 <type>
to create a wedge for it
# dkctl raid0d addwedge dk0 64 10485535 ufs
dk0 created successfully.
# newfs -O2 -f 4096 -b 32768 -I /dev/rdk0
/dev/rdk0: 5119.9MB (10485528 sectors) block size 32768, fragment size 4096
using 7 cylinder groups of 731.44MB, 23406 blks, 46080 inodes.
super-block backups (for fsck_ffs -b #) at:
192, 1498176, 2996160, 4494144, 5992128, 7490112, 8988096,
# mount /dev/dk0 /mnt
# halt -p
>Fix:
>Release-Note:
>Audit-Trail:
From: Stephen Borrill <netbsd@precedence.co.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/45948: dk(4) on raid(4) panic on halt with netbsd-5
Date: Mon, 5 Mar 2012 16:41:52 +0000 (GMT)
The workaround is to ensure that filesystems on wedges are
explicitly unmounted before the machine is halted.
For example, something like the following in /etc/rc.shutdown.local:
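# Collect the mount points of all filesystems that live on dk(4) wedges.
fs=`mount | awk '{if ($1 ~ "^/dev/dk[0-9]") print $3}'`
for f in $fs
do
	# Use the loop variable, not the whole list.
	echo "Unmounting $f"
	umount $f
done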
There may be processes with open files on the filesystem, which will prevent
it from being unmounted. You may want to add "# KEYWORD: shutdown" to the
relevant rc.d scripts.
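To see which rc.d scripts already carry the keyword, something like the
following could be used. This is only a sketch: has_shutdown_keyword is a
hypothetical helper, not an existing NetBSD tool, and it assumes the keyword
is declared on a "# KEYWORD:" comment line. The two script paths are the
services named in this report.

```shell
#!/bin/sh
# has_shutdown_keyword: hypothetical helper, not part of NetBSD.
# Succeeds if the given rc.d script has a "# KEYWORD:" line that lists
# the word "shutdown" (and not, for example, "noshutdown").
has_shutdown_keyword() {
	grep -Eq '^# KEYWORD:.*[[:space:]]shutdown([[:space:]]|$)' "$1"
}

# Report on the services mentioned in this PR; missing scripts are skipped.
for script in /etc/rc.d/smbd /etc/rc.d/istgt
do
	[ -f "$script" ] || continue
	if has_shutdown_keyword "$script"; then
		echo "$script: stopped at shutdown"
	else
		echo "$script: no shutdown keyword"
	fi
done
```

A script without the keyword is a candidate for either adding it or stopping
the service by hand in rc.shutdown.local.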
In my case, this is only likely to be samba and istgt,
so I stop those in the rc.shutdown.local script:
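# Collect the mount points of all filesystems that live on dk(4) wedges.
fs=`mount | awk '{if ($1 ~ "^/dev/dk[0-9]") print $3}'`
if [ -n "$fs" ]; then
	# Stop the services that may hold files open on those filesystems.
	for srv in smbd istgt
	do
		/etc/rc.d/$srv status > /dev/null
		if [ $? = 0 ]; then
			/etc/rc.d/$srv stop
		fi
	done
	for f in $fs
	do
		# Use the loop variable, not the whole list.
		echo "Unmounting $f"
		umount $f
	done
fi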
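The mount-point selection can be checked without halting the machine. The
following is a minimal sketch, assuming the mount output format shown in the
Description; wedge_mounts is a hypothetical helper name, and the sample text
is copied from that output.

```shell
#!/bin/sh
# wedge_mounts: hypothetical helper, not part of NetBSD.
# Reads mount(8) output on stdin and prints the mount point of every
# filesystem whose device is a dk(4) wedge, one per line.
wedge_mounts() {
	awk '$1 ~ /^\/dev\/dk[0-9]/ { print $3 }'
}

# Sample lines taken from the mount output in the Description.
sample='/dev/wd0a on / type ffs (local)
kernfs on /kern type kernfs (local)
ptyfs on /dev/pts type ptyfs (local)
procfs on /proc type procfs (local)
/dev/dk0 on /mnt type ffs (local)'

# Prints "/mnt": the only filesystem in the sample that sits on a wedge.
printf '%s\n' "$sample" | wedge_mounts
```

On a live system the same function would be fed with "mount | wedge_mounts".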
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/45948: dk(4) on raid(4) panic on halt with netbsd-5
Date: Mon, 5 Mar 2012 17:32:55 +0000
On Mon, Mar 05, 2012 at 04:45:02PM +0000, Stephen Borrill wrote:
> Work around is to ensure that filesystems on wedges are
> explicitly unmounted.
The code that -6 and -current have to unmount things in order isn't in
-5, right? So this problem only affects -5?
--
David A. Holland
dholland@netbsd.org
From: Stephen Borrill <netbsd@precedence.co.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/45948: dk(4) on raid(4) panic on halt with netbsd-5
Date: Mon, 5 Mar 2012 19:24:07 +0000 (GMT)
On Mon, 5 Mar 2012, David Holland wrote:
> On Mon, Mar 05, 2012 at 04:45:02PM +0000, Stephen Borrill wrote:
> > Work around is to ensure that filesystems on wedges are
> > explicitly unmounted.
>
> The code that -6 and -current have to unmount things in order isn't in
> -5, right? So this problem only affects -5?
Right - it is tested as working correctly in -6 and -current.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/45948: dk(4) on raid(4) panic on halt with netbsd-5
Date: Mon, 5 Mar 2012 19:36:16 +0000
On Mon, Mar 05, 2012 at 07:25:04PM +0000, Stephen Borrill wrote:
> > The code that -6 and -current have to unmount things in order isn't in
> > -5, right? So this problem only affects -5?
>
> Right - it is tested as working correctly in -6 and -current.
I have so tagged it, thanks.
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 19 Dec 2015 02:33:35 +0000
State-Changed-Why:
Problem only affected -5 and -5 is now EOL.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.