NetBSD Problem Report #41051

From www@NetBSD.org  Sat Mar 21 07:22:34 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 1D9A863B9E6
	for <gnats-bugs@gnats.netbsd.org>; Sat, 21 Mar 2009 07:22:34 +0000 (UTC)
Message-Id: <20090321072233.D9E1263B8C5@www.NetBSD.org>
Date: Sat, 21 Mar 2009 07:22:33 +0000 (UTC)
From: pooka@iki.fi
Reply-To: pooka@iki.fi
To: gnats-bugs@NetBSD.org
Subject: do_sys_rename mp resource race
X-Send-Pr-Version: www-1.0

>Number:         41051
>Category:       kern
>Synopsis:       do_sys_rename mp resource race
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 21 07:25:00 +0000 2009
>Closed-Date:    Sat Sep 05 20:28:55 +0000 2015
>Last-Modified:  Sat Sep 05 20:28:55 +0000 2015
>Originator:     Antti Kantee
>Release:        
>Organization:
>Environment:
>Description:
As reported by Uebayasi-san, a MNT_FORCE unmount during rename may
cause the renamelock to be destroyed while it is still locked.

db{1}> bt
breakpoint() at netbsd:breakpoint+0x5
panic() at netbsd:panic+0x249
lockdebug_abort() at netbsd:lockdebug_abort+0x42
mutex_destroy() at netbsd:mutex_destroy+0x38
vfs_destroy() at netbsd:vfs_destroy+0x58
puffs_msgif_close() at netbsd:puffs_msgif_close+0x67
putter_fop_close() at netbsd:putter_fop_close+0x63
closef() at netbsd:closef+0x70
fd_close() at netbsd:fd_close+0x137
fd_free() at netbsd:fd_free+0xb7
exit1() at netbsd:exit1+0x12f
sigexit() at netbsd:sigexit+0x1ab
postsig() at netbsd:postsig+0x13b
lwp_userret() at netbsd:lwp_userret+0x177
syscall() at netbsd:syscall+0x17c

> Hmm, can you verify that vfs_destroy+0x58 is the line where it destroys
> mnt_renamelock?

Yes, it is.
>How-To-Repeat:
run a puffs file server which crashes in rename.  get lucky with
timing.
>Fix:
take reference to mp around rename

>Release-Note:

>Audit-Trail:
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sat, 21 Mar 2009 08:05:39 +0000

 On Sat, Mar 21, 2009 at 07:25:00AM +0000, pooka@iki.fi wrote:

 > >Fix:
 > take reference to mp around rename

 Disagree. I think VOPs should not, in general, be dropping refs to vnodes
 that the caller provides.

From: Antti Kantee <pooka@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sat, 21 Mar 2009 10:36:09 +0200

 On Sat Mar 21 2009 at 08:10:06 +0000, Andrew Doran wrote:
 > The following reply was made to PR kern/41051; it has been noted by GNATS.
 > 
 > From: Andrew Doran <ad@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: kern/41051: do_sys_rename mp resource race
 > Date: Sat, 21 Mar 2009 08:05:39 +0000
 > 
 >  On Sat, Mar 21, 2009 at 07:25:00AM +0000, pooka@iki.fi wrote:
 >  
 >  > >Fix:
 >  > take reference to mp around rename
 >  
 >  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
 >  that the caller provides.

 do_sys_rename() is not a VOP, neither is vfs_destroy().  What do you
 mean by a VOP here?

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sat, 21 Mar 2009 14:34:36 +0000

 On Sat, Mar 21, 2009 at 08:40:03AM +0000, Antti Kantee wrote:

 >  >  > take reference to mp around rename
 >  >  
 >  >  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
 >  >  that the caller provides.
 >  
 >  do_sys_rename() is not a VOP, neither is vfs_destroy().  What do you
 >  mean by a VOP here?

 Well in general the model is that references propagate. So if you hold a
 reference at some point, say a a file descriptor, everything down the chain
 from that persists, be it a file, vnode, mount, socket - whatever. I'm
 guessing you probably already grok that.

 VOP_RENAME() is a pariah here because it tosses the caller's dvp references.
 On the way in both mount structures involved are valid but after it returns,
 either can be gone. So the operation is not logically complete from the
 driver's perspective (do_sys_rename) but the objects involved are already
 gone.

 (Incidentally, it looks like there is also a bug in vflush, where it is
 doing insmntque() on block/char device vnodes for some reason. I have
 no idea why.)

From: Antti Kantee <pooka@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sun, 22 Mar 2009 06:20:44 +0200

 On Sat Mar 21 2009 at 14:35:01 +0000, Andrew Doran wrote:
 > The following reply was made to PR kern/41051; it has been noted by GNATS.
 > 
 > From: Andrew Doran <ad@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: kern/41051: do_sys_rename mp resource race
 > Date: Sat, 21 Mar 2009 14:34:36 +0000
 > 
 >  On Sat, Mar 21, 2009 at 08:40:03AM +0000, Antti Kantee wrote:
 >  
 >  >  >  > take reference to mp around rename
 >  >  >  
 >  >  >  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
 >  >  >  that the caller provides.
 >  >  
 >  >  do_sys_rename() is not a VOP, neither is vfs_destroy().  What do you
 >  >  mean by a VOP here?
 >  
 >  Well in general the model is that references propagate. So if you hold a
 >  reference at some point, say a a file descriptor, everything down the chain
 >  from that persists, be it a file, vnode, mount, socket - whatever. I'm
 >  guessing you probably already grok that.
 >  
 >  VOP_RENAME() is a pariah here because it tosses the caller's dvp references.
 >  On the way in both mount structures involved are valid but after it returns,
 >  either can be gone. So the operation is not logically complete from the
 >  driver's perspective (do_sys_rename) but the objects involved are already
 >  gone.

 Could you explain what you want to do about the problem.  I have a
 hard time interpreting the above in any other way except that you
 are disagreeing with the suggested fix because you think file system
 locking should be completely reworked so that this problem with rename
 can be fixed.  I do agree that improvements in that area are necessary,
 but I think that is way out of scope here, especially considering the
 release branch.

 (also, do_sys_rename handles only one mountpoint, the rest is up to the
 file system, but that's a minor detail)

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sun, 22 Mar 2009 20:02:42 +0000

 On Sun, Mar 22, 2009 at 04:25:01AM +0000, Antti Kantee wrote:

 >  Could you explain what you want to do about the problem.  I have a
 >  hard time interpreting the above in any other way except that you
 >  are disagreeing with the suggested fix because you think file system
 >  locking should be completely reworked so that this problem with rename
 >  can be fixed.

 I don't think we'd need to go that far to fix this one, but there would be a
 good chunk of work involved in it.

 >  I do agree that improvements in that area are necessary, but I think that
 >  is way out of scope here, especially considering the release branch.

 For an immediate remedy a ref to the mount and a comment about it would do.

 >  (also, do_sys_rename handles only one mountpoint, the rest is up to the
 >  file system, but that's a minor detail)

 Right.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Sun, 22 Mar 2009 22:30:21 +0000

 On Sat, Mar 21, 2009 at 08:10:06AM +0000, Andrew Doran wrote:
  >> take reference to mp around rename
  >  
  >  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
  >  that the caller provides.

 Of course they shouldn't... but you know they *all* do that, right?
 Cleaning this up is not exactly a trivial undertaking...

 -- 
 David A. Holland
 dholland@netbsd.org

From: ad@netbsd.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Mon, 23 Mar 2009 09:15:03 +0000

 On Sun, Mar 22, 2009 at 10:35:01PM +0000, David Holland wrote:

 >  On Sat, Mar 21, 2009 at 08:10:06AM +0000, Andrew Doran wrote:
 >   >> take reference to mp around rename
 >   >  
 >   >  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
 >   >  that the caller provides.
 >  
 >  Of course they shouldn't... but you know they *all* do that, right?

 Wrong.

 >  Cleaning this up is not exactly a trivial undertaking...

 Right.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, pooka@iki.fi
Subject: Re: kern/41051: do_sys_rename mp resource race
Date: Tue, 24 Mar 2009 20:42:05 +0000

 On Mon, Mar 23, 2009 at 09:20:04AM +0000, ad@netbsd.org wrote:
  >>>  Disagree. I think VOPs should not, in general, be dropping refs to vnodes
  >>>  that the caller provides.
  >>  
  >>  Of course they shouldn't... but you know they *all* do that, right?
  >  
  >  Wrong.

 Have you been changing them around, or are you just being difficult
 because "all" isn't quite exactly precise?

 -- 
 David A. Holland
 dholland@netbsd.org

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 05 Sep 2015 20:28:55 +0000
State-Changed-Why:
fixed a long time ago, and was left open because it was being used as a 
political football.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.