NetBSD Problem Report #30816
From blymn@blymn.cust.internode.on.net Sat Jul 23 14:20:25 2005
Return-Path: <blymn@blymn.cust.internode.on.net>
Received: from smtp3.adl2.internode.on.net (smtp3.adl2.internode.on.net [203.16.214.203])
by narn.netbsd.org (Postfix) with ESMTP id 1F33F63B117
for <gnats-bugs@gnats.NetBSD.org>; Sat, 23 Jul 2005 14:20:25 +0000 (UTC)
Message-Id: <200507231420.j6NEKH9x025625@blymn.cust.internode.on.net>
Date: Sat, 23 Jul 2005 23:50:17 +0930 (CST)
From: blymn@baea.com.au
Reply-To: blymn@baea.com.au
To: gnats-bugs@netbsd.org
Subject: dump(8) broken for larger values of blocking
X-Send-Pr-Version: 3.95
>Number: 30816
>Category: bin
>Synopsis: large blocking factors cannot be used with dump
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 23 14:21:01 +0000 2005
>Originator: Brett Lymn (Master of the Siren)
>Release: NetBSD 3.99.6
>Organization:
Brett Lymn
>Environment:
System: NetBSD siren 3.99.6 NetBSD 3.99.6 (SIREN.ACPI.MP) #10: Sun Jul 17 19:29:12 CST 2005 toor@siren:/usr/src/sys/arch/amd64/compile/SIREN.ACPI.MP amd64
Architecture: x86_64
Machine: amd64
>Description:
The b option of dump(8) accepts a value between 1 and 1000 according
to the usage message from dump. If a blocksize above about 200 is
used, dump misbehaves in one of two ways: it either loops indefinitely
or quits with "master/slave protocol botched" while pass III is being
done. The larger b is, the more likely the "master/slave protocol
botched" message seems to be. Values near 256 result in a hang due to
an infinite loop in tape.c:doslave(): for some reason p->count is
zero, which causes the first for loop in doslave() to never terminate.
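
The hang can be illustrated with a simplified model. The sketch below
is not the code from tape.c; struct req, walk_requests() and the
iteration cap are hypothetical names introduced for illustration. It
assumes only what is described above: a loop that steps through the
request array by each entry's count.

```c
/* Hypothetical stand-in for a request entry; only the count matters here. */
struct req {
	int count;	/* number of tape records this entry covers */
};

/*
 * Walk ntrec records, advancing by each entry's count.
 * Returns the number of iterations taken, or -1 if max_iters is reached
 * before all ntrec records are covered (the loop is not making progress).
 */
int
walk_requests(const struct req *p, int ntrec, int max_iters)
{
	int trecno, iters = 0;

	for (trecno = 0; trecno < ntrec; trecno += p->count, p += p->count) {
		if (iters++ >= max_iters)
			return -1;	/* without this cap the loop would spin forever */
	}
	return iters;
}
```

With every count non-zero the walk finishes. As soon as one entry
reached by the walk has a count of zero, neither trecno nor p changes
and only the artificial cap ends the loop.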
>How-To-Repeat:
I was dumping a 40GB partition to a DLT40 tape drive using a
blocksize of 512. This resulted in dump hanging during pass III of the
dump. The machine was up multi-user, but the filesystem in question
passes fsck cleanly (i.e. this problem is not due to attempting to back
up a corrupt fs).
>Fix:
The problem can be worked around by using a lower blocking size, at
the expense of the tape drive not streaming. A blocksize of 128 appears
to work reliably. I had a look at the code and there is only one place
where the request count could be zero: tape.c:flushtape(), where it is
deliberately zeroed, with a comment of "Sentinel" next to the statement.
This "sentinel" state does not seem to be checked anywhere in the code.
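
If the zero count really is meant as a sentinel, a consumer would be
expected to test for it before advancing. The following is a minimal
sketch of such a check, not a tested patch for dump. struct req_entry,
advance() and the return convention are hypothetical and do not come
from tape.c.

```c
/* Hypothetical request entry; a count of zero is treated as the sentinel. */
struct req_entry {
	int count;
};

/*
 * Compute the next record index after consuming entry p.
 * Returns -1 when p carries the sentinel (count <= 0), so the caller can
 * stop or report an error instead of looping without progress.
 */
int
advance(const struct req_entry *p, int trecno)
{
	if (p->count <= 0)
		return -1;
	return trecno + p->count;
}
```

Whether stopping, skipping or quitting is the right reaction to the
sentinel depends on what flushtape() intends by it, which I have not
determined.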