NetBSD Problem Report #40681

From dholland@netbsd.org  Wed Feb 18 20:29:20 2009
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 224A663B8C3
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 18 Feb 2009 20:29:20 +0000 (UTC)
Message-Id: <20090218202920.0B13C63B284@mail.netbsd.org>
Date: Wed, 18 Feb 2009 20:29:20 +0000 (UTC)
From: dholland@netbsd.org
Reply-To: dholland@netbsd.org
To: gnats-bugs@gnats.NetBSD.org
Subject: possible data loss path in the buffer cache code
X-Send-Pr-Version: 3.95

>Number:         40681
>Category:       kern
>Synopsis:       possible data loss path in the buffer cache code
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Feb 18 20:30:00 +0000 2009
>Last-Modified:  Wed Feb 18 22:00:04 +0000 2009
>Originator:     David A. Holland
>Release:        NetBSD 5.99.7 (20090218) (vfs_bio.c 1.215)
>Organization:
>Environment:
  -
>Description:

It appears that if you call getblk(), it finds an existing buffer, and
it attempts to resize it, and the allocation for the resize fails, the
buffer is unceremoniously invalidated, even if it previously contained
valid and dirty data.

getblk calls allocbuf; allocbuf returns ENOMEM; getblk then
unconditionally removes the buffer from b_hash and then calls
brelse(bp, BC_INVAL).

>How-To-Repeat:

code reading.

>Fix:

If I understand things correctly, BC_INVAL should already be set on
the buffer if was new, so we should remove from b_hash only if
BC_INVAL is already set, and call brelse(bp, 0) instead of brelase(bp,
BC_INVAL).

If I don't understand things correctly then I have no idea how callers
of getblk are supposed to tell if they got an already-valid block back
or not and would appreciate enlightenment. :-/

>Audit-Trail:
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/40681: possible data loss path in the buffer cache code
Date: Wed, 18 Feb 2009 21:59:12 +0000

 > It appears that if you call getblk(), it finds an existing buffer, and
 > it attempts to resize it, and the allocation for the resize fails, the
 > buffer is unceremoniously invalidated, even if it previously contained
 > valid and dirty data.

 There was an assertion in the code that implied requests for disk address
 `blkno' always had the same size `size'. See line 1184. It was changed with
 revision 1.78 so that it did not fire for block devices.

 You could legitmately take the position that requests should always have the
 same size because the code can't handle overlap. I guess that could break
 heuristics like the disk partition code mentioned in the log message for
 1.78.

 > If I don't understand things correctly then I have no idea how callers
 > of getblk are supposed to tell if they got an already-valid block back
 > or not and would appreciate enlightenment. :-/

 I don't see how they can, unless (a) they always request the same size or
 (b) are stacked with a pile of fragile assumptions. Even without
 invalidating the buffer on failure, the expanded segment contains garbage
 until initialized.

 Andrew
Home
PR Database Search
(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.