NetBSD Problem Report #42532

From tron@zhadum.org.uk  Sun Dec 27 23:20:21 2009
Return-Path: <tron@zhadum.org.uk>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 121C063B844
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 27 Dec 2009 23:20:21 +0000 (UTC)
Message-Id: <20091227232017.4CB16FA09C@lyssa.zhadum.org.uk>
Date: Sun, 27 Dec 2009 23:20:17 +0000 (GMT)
From: tron@zhadum.org.uk
Reply-To: tron@zhadum.org.uk
To: gnats-bugs@gnats.NetBSD.org
Subject: "dm" driver crashes on concurrent access
X-Send-Pr-Version: 3.95

>Number:         42532
>Category:       kern
>Synopsis:       "dm" driver crashes on concurrent access
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    haad
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 27 23:25:00 +0000 2009
>Closed-Date:    Wed Dec 30 00:12:23 +0000 2009
>Last-Modified:  Wed Dec 30 20:55:01 +0000 2009
>Originator:     tron@zhadum.org.uk
>Release:        NetBSD 5.99.22 2009-12-20 sources
>Organization:
Matthias Scheler                                  http://zhadum.org.uk/
>Environment:
System: NetBSD lyssa.zhadum.org.uk 5.99.22 NetBSD 5.99.22 (LYSSA) #0: Sun Dec 20 13:57:52 GMT 2009 tron@lyssa.zhadum.org.uk:/src/sys/compile/LYSSA i386
Architecture: i386
Machine: i386
>Description:
The code which is supposed to report iostat(8) data for the "dm" driver
can lead to kernel panics:

vg0-lv1: busy < 0
panic: iostat_unbusy
cpu2: Begin traceback...
vg0-lv1: busy < 0?
(0,0,ce8a0acc,0,ffffffff,1a,0,d09e,0,0) at 0
cpu2: End traceback...

dumping to dev 0,1 offset 4195364
dump succeeded

Here is the back-trace:

#0  0xc02488ab in cpu_reboot ()
#1  0xc031cba4 in panic ()
#2  0xc0312e5a in iostat_unbusy ()
#3  0xce9a7694 in dmstrategy ()
#4  0xc0301717 in spec_strategy ()
#5  0xc03db050 in VOP_STRATEGY ()
#6  0xc0193857 in genfs_do_io ()
#7  0xc019401e in genfs_gop_write ()
#8  0xc0194c5f in genfs_do_putpages ()
#9  0xc01954e8 in genfs_putpages ()
#10 0xc03dacd0 in VOP_PUTPAGES ()
#11 0xc05d5c27 in ffs_full_fsync ()
#12 0xc05d5f30 in ffs_fsync ()
#13 0xc03db74d in VOP_FSYNC ()
#14 0xc0321d31 in sched_sync ()
#15 0xc01002e1 in lwp_trampoline ()

Kernel with debugging symbols:

#0  cpu_reboot (howto=260, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:861
#1  0xc031cba4 in panic (fmt=0xc04a82e2 "iostat_unbusy")
    at /usr/src/sys/kern/subr_prf.c:299
#2  0xc0312e5a in iostat_unbusy (stats=0xce6fea80, bcount=65536, read=0)
    at /usr/src/sys/kern/subr_iostat.c:208
#3  0xce9a7694 in ?? ()
#4  0xce6fea80 in ?? ()
#5  0x00010000 in ?? ()
#6  0x00000000 in ?? ()
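
One way the counter can underflow is a lost update. The following is a simplified user-space model, not the code from subr_iostat.c. It replays by hand one interleaving in which two unsynchronized increments of a busy counter collapse into one, so that the second completion finds the counter already at zero. The names io_stats_model, model_unbusy and replay_lost_update are invented for this sketch, and the model returns -1 where the kernel would panic.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's per-device I/O statistics. */
struct io_stats_model {
	int busy;	/* number of transfers currently in flight */
};

/* Model of the unbusy check: returns -1 where the kernel would panic. */
int
model_unbusy(struct io_stats_model *s)
{
	if (s->busy-- == 0) {
		printf("model: busy < 0\n");
		return -1;
	}
	return 0;
}

/*
 * Replay one unlucky interleaving of two unsynchronized "busy" calls
 * followed by their two "unbusy" calls. Returns the number of failed
 * completions.
 */
int
replay_lost_update(void)
{
	struct io_stats_model s = { 0 };
	int failures = 0;

	int seen_by_a = s.busy;		/* CPU A loads 0 */
	int seen_by_b = s.busy;		/* CPU B loads 0 before A stores */
	s.busy = seen_by_a + 1;		/* A stores 1 */
	s.busy = seen_by_b + 1;		/* B also stores 1: one increment is lost */

	if (model_unbusy(&s) != 0)	/* first completion: busy 1 -> 0 */
		failures++;
	if (model_unbusy(&s) != 0)	/* second completion: busy already 0 */
		failures++;
	return failures;
}
```

In this model replay_lost_update() reports exactly one failed completion, which corresponds to the "busy < 0" message printed before the panic.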

>How-To-Repeat:
1.) Create a volume group consisting of two physical devices.
2.) Create two logical volumes on the volume group.
3.) Create and mount file-systems on the two logical volumes.
4.) Extract the NetBSD sources onto both file-systems in parallel.

>Fix:
None provided.

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->haad
Responsible-Changed-By: tron@NetBSD.org
Responsible-Changed-When: Mon, 28 Dec 2009 09:34:03 +0000
Responsible-Changed-Why:
Adam wants to handle this PR.


From: Adam Hamsik <haaaad@gmail.com>
To: Matthias Scheler <tron@NetBSD.org>
Cc: 
Subject: Re: kern/42532: "dm" driver crashes on concurrent access (fwd)
Date: Mon, 28 Dec 2009 21:00:12 +0100

 Hi,

 Can you test something like this ? I think that it will make dm working =
 in your case.

 -       /* FIXME: have to be called with IPL_BIO*/
 +       /* FIXME: There is a problem with disk(9) api because disk_busy and
 +          disk_unbusy are not protected against concurrent usage and needs
 +          explicit locking from caller.*/
 +       mutex_enter(&dmv->diskp->dk_openlock);
         disk_busy(dmv->diskp);
 +       mutex_exit(&dmv->diskp->dk_openlock);
         
         /* Select active table */
         tbl = dm_table_get_entry(&dmv->table_head, DM_TABLE_ACTIVE);
 @@ -459,8 +565,10 @@ dmstrategy(struct buf *bp)
         if (issued_len < buf_len)
                 nestiobuf_done(bp, buf_len - issued_len, EINVAL);
  
 -       /* FIXME have to be called with SPL_BIO*/
 +       /* FIXME disk_unbusy need to manage locking by itself. */
 +       mutex_enter(&dmv->diskp->dk_openlock);
         disk_unbusy(dmv->diskp, buf_len, bp != NULL ? bp->b_flags & B_READ : 0);
 +       mutex_exit(&dmv->diskp->dk_openlock);

 Regards

 Adam.
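
The locking pattern in the patch above can be tried outside the kernel. What follows is a user-space sketch under stated assumptions: a POSIX pthread_mutex_t stands in for the kernel mutex taken with mutex_enter()/mutex_exit(), and disk_model, locked_busy, locked_unbusy and run_locked_stress are hypothetical names that do not appear in the dm driver. Several threads bracket each simulated transfer with a busy/unbusy pair, and every update of the counter happens with the lock held.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Hypothetical model of a disk whose busy counter needs external locking. */
struct disk_model {
	pthread_mutex_t lock;	/* stands in for the kernel mutex */
	int busy;		/* transfers currently in flight */
	int underflows;		/* times unbusy found busy == 0 */
};

struct worker_arg {
	struct disk_model *dk;
	int iterations;
};

void
locked_busy(struct disk_model *dk)
{
	pthread_mutex_lock(&dk->lock);
	dk->busy++;
	pthread_mutex_unlock(&dk->lock);
}

void
locked_unbusy(struct disk_model *dk)
{
	pthread_mutex_lock(&dk->lock);
	if (dk->busy-- == 0)
		dk->underflows++;	/* the kernel would panic here */
	pthread_mutex_unlock(&dk->lock);
}

static void *
worker(void *p)
{
	struct worker_arg *arg = p;
	int i;

	for (i = 0; i < arg->iterations; i++) {
		locked_busy(arg->dk);	/* transfer starts */
		locked_unbusy(arg->dk);	/* transfer completes */
	}
	return NULL;
}

/* Returns 0 if the counter stayed consistent, -1 otherwise. */
int
run_locked_stress(int nthreads, int iterations)
{
	struct disk_model dk = { .busy = 0, .underflows = 0 };
	struct worker_arg arg = { &dk, iterations };
	pthread_t tids[MAX_WORKERS];
	int i, started = 0;

	if (nthreads > MAX_WORKERS)
		nthreads = MAX_WORKERS;
	pthread_mutex_init(&dk.lock, NULL);
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tids[started], NULL, worker, &arg) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&dk.lock);
	return (dk.busy == 0 && dk.underflows == 0) ? 0 : -1;
}
```

A call such as run_locked_stress(4, 100000) should return 0, because each increment and decrement is serialized by the lock. The sketch only models the counter; it says nothing about interrupt context or the other fields that disk_busy() and disk_unbusy() maintain.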

From: Matthias Scheler <tron@NetBSD.org>
To: Adam Hamsik <haaaad@gmail.com>
Cc: NetBSD GNATS <gnats-bugs@NetBSD.org>
Subject: Re: kern/42532: "dm" driver crashes on concurrent access (fwd)
Date: Tue, 29 Dec 2009 12:40:50 +0000

 On Mon, Dec 28, 2009 at 09:00:12PM +0100, Adam Hamsik wrote:
 > Can you test something like this ? I think that it will make dm
 > working in your case.

 Yes, this patch prevents any panics during my stress test.

 	Thanks a lot

 -- 
 Matthias Scheler                                  http://zhadum.org.uk/

From: Adam Hamsik <haad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42532 CVS commit: src/sys/dev/dm
Date: Tue, 29 Dec 2009 23:37:48 +0000

 Module Name:	src
 Committed By:	haad
 Date:		Tue Dec 29 23:37:48 UTC 2009

 Modified Files:
 	src/sys/dev/dm: device-mapper.c dm.h dm_dev.c dm_ioctl.c

 Log Message:
 Add private lock to dm_dev_t used for mutual exclusion for disk(9) api
 routines. This change fixes PR kern/42532.
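
As a rough illustration of what a private per-device lock looks like, here is a user-space sketch. It is not the committed change: the structure dev_model, its fields and the dev_model_* functions are hypothetical, and pthread_mutex_t replaces the kernel mutex type. The point it tries to show is that the lock lives inside the per-device structure, is created and destroyed together with it, and guards nothing but the in-flight counter, so it does not have to be shared with unrelated code paths.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-device state; field names are illustrative only. */
struct dev_model {
	pthread_mutex_t stats_lock;	/* private: guards only in_flight */
	int in_flight;			/* transfers currently in flight */
};

/* Allocate a device together with its private lock; NULL on failure. */
struct dev_model *
dev_model_create(void)
{
	struct dev_model *dev = calloc(1, sizeof(*dev));

	if (dev == NULL)
		return NULL;
	if (pthread_mutex_init(&dev->stats_lock, NULL) != 0) {
		free(dev);
		return NULL;
	}
	return dev;
}

/* Tear down the lock before releasing the device that contains it. */
void
dev_model_destroy(struct dev_model *dev)
{
	pthread_mutex_destroy(&dev->stats_lock);
	free(dev);
}

void
dev_model_io_start(struct dev_model *dev)
{
	pthread_mutex_lock(&dev->stats_lock);
	dev->in_flight++;
	pthread_mutex_unlock(&dev->stats_lock);
}

/* Returns 0 on success, -1 if no transfer was in flight. */
int
dev_model_io_done(struct dev_model *dev)
{
	int ok;

	pthread_mutex_lock(&dev->stats_lock);
	ok = dev->in_flight > 0;
	if (ok)
		dev->in_flight--;
	pthread_mutex_unlock(&dev->stats_lock);
	return ok ? 0 : -1;
}
```

Because each device carries its own lock, I/O accounting on one logical volume does not contend with accounting on another.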


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 src/sys/dev/dm/device-mapper.c
 cvs rdiff -u -r1.16 -r1.17 src/sys/dev/dm/dm.h
 cvs rdiff -u -r1.6 -r1.7 src/sys/dev/dm/dm_dev.c
 cvs rdiff -u -r1.17 -r1.18 src/sys/dev/dm/dm_ioctl.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: haad@NetBSD.org
State-Changed-When: Wed, 30 Dec 2009 00:12:23 +0000
State-Changed-Why:
PR was fixed by last commit in device-mapper driver.


From: Matthias Scheler <tron@zhadum.org.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/42532 ("dm" driver crashes on concurrent access)
Date: Wed, 30 Dec 2009 20:51:58 +0000

 On Wed, Dec 30, 2009 at 12:12:23AM +0000, Adam Hamsik wrote:
 > Synopsis: "dm" driver crashes on concurrent access
 > 
 > State-Changed-From-To: open->closed
 > State-Changed-By: haad@NetBSD.org
 > State-Changed-When: Wed, 30 Dec 2009 00:12:23 +0000
 > State-Changed-Why:
 > PR was fixed by last commit in device-mapper driver.

 Yes, the change fixed the problem for me.

 	Thanks a lot

 -- 
 Matthias Scheler                                  http://zhadum.org.uk/

>Unformatted:
