NetBSD Problem Report #41717

From apb@cequrux.com  Tue Jul 14 06:48:00 2009
Return-Path: <apb@cequrux.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 4806763BBD1
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 14 Jul 2009 06:48:00 +0000 (UTC)
Message-Id: <20090714063646.6A100100ADCA@apb-laptoy.apb.alt.za>
Date: Tue, 14 Jul 2009 06:36:46 +0000 (UTC)
From: apb@cequrux.com
Reply-To: apb@cequrux.com
To: gnats-bugs@gnats.NetBSD.org
Subject: graceful recovery from USB disk error
X-Send-Pr-Version: 3.95

>Number:         41717
>Category:       kern
>Synopsis:       graceful recovery from USB disk error
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Tue Jul 14 06:50:00 +0000 2009
>Originator:     Alan Barrett
>Release:        NetBSD 5.99.15
>Organization:
Not much
>Environment:
System: NetBSD 5.99.15 i386
Architecture: i386
Machine: i386
>Description:
        When a USB disk is disconnected, that may be a transient error
        caused by something as simple as a cable being bumped.  It would
        be nice if the OS could recover gracefully, by retaining any
        buffered information, attempting to reset the device (or its
        parent), reconnecting the device, and continuing.

        The same applies to any removable device.

>How-To-Repeat:
    # sd0 is an external USB disk:
	umass0 at uhub2 port 6 configuration 1 interface 0
	umass0: Western Digital My Book, rev 2.00/1.65, addr 5
	umass0: using SCSI over Bulk-Only
	scsibus0 at umass0: 2 targets, 1 lun per target
	sd0 at scsibus0 target 0 lun 0: <WD, 10EACS External, 1.65> disk
		fixed version 4
	sd0: 931 GB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 1953525168
		sectors
	rnd: sd0 attached as an entropy source (collecting and estimating)

    # cgd2 is a cgd using sd0e as the backing store.  sd0f is a
    # one-sector partition that contains the configuration for the cgd.
	cgdconfig cgd2 /dev/sd0e /dev/rsd0f

    # /mnt is an ffs file system on cgd2a
	mount -t ffs -o nolog /dev/cgd2a /mnt

    # While rsync was copying files to /mnt, it printed many
    # "Input/output error" messages.  I assume that it received EIO
    # errors from file system operations.  Meanwhile, the following
    # messages appeared in the syslog:

Jul 13 22:00:57 apb-laptoy /netbsd: umass0: at uhub2 port 6 (addr 5) disconnected
Jul 13 22:00:57 apb-laptoy /netbsd: sd0(umass0:0:0:0): generic HBA error
Jul 13 22:00:57 apb-laptoy /netbsd: sd0: cache synchronization failed
Jul 13 22:00:57 apb-laptoy /netbsd: rnd: sd0 detached as an entropy source
Jul 13 22:00:57 apb-laptoy /netbsd: sd0: detached
Jul 13 22:00:57 apb-laptoy /netbsd: scsibus0: detached
Jul 13 22:00:57 apb-laptoy /netbsd: umass0: detached
Jul 13 22:00:57 apb-laptoy /netbsd: cgd2: error 5
Jul 13 22:01:13 apb-laptoy syslogd[151]: last message repeated 14392 times
Jul 13 22:01:13 apb-laptoy /netbsd: cgd2: error
Jul 13 22:01:13 apb-laptoy /netbsd: 5
Jul 13 22:01:13 apb-laptoy /netbsd: cgd2: error 5
Jul 13 22:01:17 apb-laptoy syslogd[151]: last message repeated 5816 times
Jul 13 22:01:17 apb-laptoy /netbsd: cgd2: error
Jul 13 22:01:17 apb-laptoy /netbsd: 5
Jul 13 22:01:17 apb-laptoy /netbsd: cgd2: error 5
Jul 13 22:01:21 apb-laptoy syslogd[151]: last message repeated 7583 times
Jul 13 22:01:21 apb-laptoy /netbsd:
Jul 13 22:01:21 apb-laptoy /netbsd: cgd2: error 5
Jul 13 22:01:21 apb-laptoy syslogd[151]: last message repeated 290 times
Jul 13 22:01:21 apb-laptoy /netbsd: cgd2: err
Jul 13 22:01:21 apb-laptoy /netbsd: or 5
Jul 13 22:01:21 apb-laptoy /netbsd: cgd2: error 5

>Fix:
        It is quite likely that the USB cable was bumped, and that the
        device really was disconnected for a fraction of a second.

        It is strange that there is no syslog message about the device
        being reconnected; I suspect that this is a bug.

        It is not surprising that the cgd device and the file system
        layered on top of the problematic USB disk did not recover;
        making them recover is the new feature requested here.
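
        The recovery requested in the Description (retain buffered
        data, reset the device, continue) can be modelled in user
        space.  The Python sketch below is only an illustration and
        not kernel code: FlakyDisk, RetainingQueue, reattach() and
        max_resets are hypothetical names invented for this example,
        and the simulated reset always succeeds, which a real device
        may not.

```python
import errno


class FlakyDisk:
    """Hypothetical stand-in for a removable disk such as sd0."""

    def __init__(self, fail_on_writes=()):
        self.blocks = {}          # blkno -> data that reached the "disk"
        self.attached = True
        self.write_count = 0
        self._fail_on = set(fail_on_writes)

    def write(self, blkno, data):
        self.write_count += 1
        if self.write_count in self._fail_on:
            self.attached = False  # simulate the cable being bumped
        if not self.attached:
            raise OSError(errno.EIO, "Input/output error")
        self.blocks[blkno] = data

    def reattach(self):
        """Simulate a reset of the device (or its parent) that succeeds."""
        self.attached = True
        return True


class RetainingQueue:
    """Keeps writes buffered until the disk has accepted them."""

    def __init__(self, disk, max_resets=3):
        self.disk = disk
        self.max_resets = max_resets
        self.pending = []  # (blkno, data) pairs, oldest first

    def submit(self, blkno, data):
        self.pending.append((blkno, data))

    def flush(self):
        """Write out pending blocks in order; return the number of resets."""
        resets = 0
        while self.pending:
            blkno, data = self.pending[0]
            try:
                self.disk.write(blkno, data)
            except OSError as exc:
                if exc.errno != errno.EIO or resets >= self.max_resets:
                    raise  # give up; unwritten blocks stay in self.pending
                resets += 1
                if not self.disk.reattach():
                    raise
                continue  # retry the same block after the reset
            self.pending.pop(0)  # discard only after a successful write
        return resets


# Usage: the second write hits a simulated disconnect.
disk = FlakyDisk(fail_on_writes={2})
queue = RetainingQueue(disk)
for n in range(3):
    queue.submit(n, b"block %d" % n)
resets = queue.flush()
```

        In this model a block leaves the queue only after the write
        succeeds, so a transient disconnect costs one reset and no
        data.  When the reset limit is exceeded the error is passed
        up, as happens today, but the unwritten blocks are still held
        in the queue.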
