NetBSD Problem Report #39206

From simonb@thistledown.com.au  Fri Jul 25 14:12:15 2008
Return-Path: <simonb@thistledown.com.au>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 4D57B63B99A
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 25 Jul 2008 14:12:15 +0000 (UTC)
Message-Id: <20080725141213.6F8F1AFD05@thoreau.thistledown.com.au>
Date: Sat, 26 Jul 2008 00:12:13 +1000 (EST)
From: Simon Burge <simonb@NetBSD.org>
Reply-To: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@gnats.NetBSD.org
Subject: ffs um_lock handling isn't great
X-Send-Pr-Version: 3.95

>Number:         39206
>Category:       kern
>Synopsis:       ffs um_lock handling isn't great
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 25 14:15:00 +0000 2008
>Last-Modified:  Sun Jul 27 15:45:01 +0000 2008
>Originator:     Simon Burge
>Release:        NetBSD -current, mid 2008
>Organization:
>Environment:
	Architecture: any
	Machine: any
>Description:
        The handling of (struct ufs_mount *)->um_lock is suspect: the
        lock is released and reacquired fairly freely, which leaves
        room for race conditions.

        One example pointed out by pooka@ is at the top of
        ffs_alloccg().  It appears that once the free block check at
        the top of this function succeeds, this function isn't allowed
        to fail.  This is noted in the "XXX fvdl mapsearch ..." comment
        further down.  This function is entered with um_lock held, and
        once the free block check has passed um_lock is dropped.  This
        then allows another thread to reach the same point, and could
        lead to problems if there was only one block free in the CG
        before the first thread got there.
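
        The interleaving described above can be replayed in a small
        userland model.  This is only a sketch, not the ffs code:
        struct cg_model, check_free(), take_block() and replay_race()
        are hypothetical names invented for illustration, and the
        locking is implied by the order of the calls rather than
        performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one cylinder group's free block count. */
struct cg_model {
	int nbfree;		/* free blocks left in this CG */
};

/* Step 1: the free block check, made while um_lock is held. */
static bool
check_free(const struct cg_model *cg)
{
	return cg->nbfree > 0;
}

/*
 * Step 2: the allocation proper, made after um_lock was dropped.
 * Returns false when nothing is left, which is the case the caller
 * assumes cannot happen once the check has passed.
 */
static bool
take_block(struct cg_model *cg)
{
	assert(cg->nbfree >= 0);
	if (cg->nbfree == 0)
		return false;
	cg->nbfree--;
	return true;
}

/*
 * Replay the suspect ordering for two threads A and B: both pass
 * the check before either one allocates.  Returns how many threads
 * passed the check but then found no block to take.
 */
static int
replay_race(int nbfree)
{
	struct cg_model cg = { .nbfree = nbfree };
	int failed = 0;
	bool a_ok, b_ok;

	a_ok = check_free(&cg);	/* A: check, then lock dropped */
	b_ok = check_free(&cg);	/* B: reaches the same point */

	if (a_ok && !take_block(&cg))
		failed++;
	if (b_ok && !take_block(&cg))
		failed++;
	return failed;
}
```

        In this model replay_race(1) reports one thread that passed
        the check and then had nothing to allocate, while
        replay_race(2) reports none.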

        This PR is entered as priority "medium" and not "high" since no
        actual problems have been observed in practice yet.

>How-To-Repeat:
        Code inspection.  Pooka suggested introducing a sleep after
        um_lock is dropped at the top of ffs_alloccg() and having more
        than one thread try to allocate blocks in the filesystem.
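
        A userland sketch of that experiment, using POSIX threads.
        Nothing here is kernel code: um_lock_model, cg_nbfree,
        alloc_thread() and run_repro() are hypothetical names, and a
        barrier stands in for the suggested sleep so that the outcome
        does not depend on timing.  All threads are assumed to
        allocate from the same cylinder group.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userland model; none of these are NetBSD kernel names. */
static pthread_mutex_t um_lock_model = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t window;	/* stands in for the sleep */
static int cg_nbfree;			/* free blocks in the shared CG */
static int late_failures;		/* passed check, found no block */

static void *
alloc_thread(void *arg)
{
	bool passed;

	(void)arg;

	/* Free block check with the lock held, as on function entry. */
	pthread_mutex_lock(&um_lock_model);
	passed = cg_nbfree > 0;
	pthread_mutex_unlock(&um_lock_model);	/* dropped after check */

	/* Keep the window open until every thread has done its check. */
	pthread_barrier_wait(&window);

	if (!passed)
		return NULL;

	/* The allocation proper: by now the count may have changed. */
	pthread_mutex_lock(&um_lock_model);
	if (cg_nbfree > 0)
		cg_nbfree--;
	else
		late_failures++;
	pthread_mutex_unlock(&um_lock_model);
	return NULL;
}

/*
 * Run nthreads allocators against one CG holding nbfree blocks.
 * Returns how many threads passed the check and then found nothing.
 */
static int
run_repro(int nthreads, int nbfree)
{
	pthread_t tid[8];
	int i;

	assert(nthreads > 0 && nthreads <= 8);
	cg_nbfree = nbfree;
	late_failures = 0;
	if (pthread_barrier_init(&window, NULL, (unsigned)nthreads) != 0)
		abort();
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[i], NULL, alloc_thread, NULL) != 0)
			abort();
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	pthread_barrier_destroy(&window);
	return late_failures;
}
```

        With two threads and a single free block, run_repro(2, 1)
        counts one thread that passed the check but could not
        allocate.  With two free blocks, run_repro(2, 2) counts none.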

>Fix:
	None given.

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: Simon Burge <simonb@NetBSD.org>, gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/39206: ffs um_lock handling isn't great
Date: Fri, 25 Jul 2008 15:24:44 +0000

 On Fri, Jul 25, 2008 at 02:15:00PM +0000, Simon Burge wrote:
  >         One example pointed out by pooka@ is at the top of
  >         ffs_alloccg().  It appears that once the free block check at
  >         the top of this function succeeds, this function isn't allowed
  >         to fail.  This is noted in the "XXX fvdl mapsearch ..." comment
  >         further down.  This function is entered with um_lock held, and
  >         once the free block check has passed um_lock is dropped.  This
  >         then allows another thread to reach the same point, and could
  >         lead to problems if there was only one block free in the CG
  >         before the first thread got there.
  > 
  >         This PR is entered as priority "medium" and not "high" since no
  >         actual problems have been observed in practice yet.

 Unless this is the source of those occasional "ffs_alloccg: map
 corrupted" panics...

 -- 
 David A. Holland
 dholland@netbsd.org

From: Antti Kantee <pooka@cs.hut.fi>
To: David Holland <dholland-bugs@netbsd.org>
Cc: Simon Burge <simonb@NetBSD.org>, gnats-bugs@netbsd.org,
	kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/39206: ffs um_lock handling isn't great
Date: Fri, 25 Jul 2008 19:44:18 +0300

 On Fri Jul 25 2008 at 15:24:44 +0000, David Holland wrote:
 > On Fri, Jul 25, 2008 at 02:15:00PM +0000, Simon Burge wrote:
 >  >         One example pointed out by pooka@ is at the top of
 >  >         ffs_alloccg().  It appears that once the free block check at
 >  >         the top of this function succeeds, this function isn't allowed
 >  >         to fail.  This is noted in the "XXX fvdl mapsearch ..." comment
 >  >         further down.  This function is entered with um_lock held, and
 >  >         once the free block check has passed um_lock is dropped.  This
 >  >         then allows another thread to reach the same point, and could
 >  >         lead to problems if there was only one block free in the CG
 >  >         before the first thread got there.
 >  > 
 >  >         This PR is entered as priority "medium" and not "high" since no
 >  >         actual problems have been observed in practice yet.
 > 
 > Unless this is the source of those occasional "ffs_alloccg: map
 > corrupted" panics...

 If we are talking about pre-vmlocking corruption, I doubt it.  Unless,
 of course, this code is entered from interrupt context with the help of
 softdep (I am not sure that it is, and I won't read the code right now).

From: Antti Kantee <pooka@cs.hut.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/39206: ffs um_lock handling isn't great
Date: Sun, 27 Jul 2008 18:43:41 +0300

 On Fri Jul 25 2008 at 14:15:00 +0000, Simon Burge wrote:
 >         Code inspection.  Pooka suggested introducing a sleep after
 >         um_lock is dropped at the top of ffs_alloccg() and have more
 >         than one thread trying to allocate blocks in the filesystem.

 Just wanted to state the obvious in case someone wants to try to prove
 this is a problem: the threads need to want to allocate a block from the
 same cylinder group, which must have room to satisfy only one allocation.
