NetBSD Problem Report #43169
From leo@marco.de Fri Apr 16 12:25:00 2010
Return-Path: <leo@marco.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 1EB8263B8BC
for <gnats-bugs@gnats.NetBSD.org>; Fri, 16 Apr 2010 12:25:00 +0000 (UTC)
Message-Id: <20100416110540.1CCAF73DB@zork.local>
Date: Fri, 16 Apr 2010 13:05:40 +0200 (CEST)
From: leo@marco.de
Reply-To: leo@marco.de
To: gnats-bugs@gnats.NetBSD.org
Subject: sys/dev/ata/wd.c doesn't call disk_unbusy in case of errors/retries
X-Send-Pr-Version: 3.95
>Number: 43169
>Category: kern
>Synopsis: sys/dev/ata/wd.c doesn't call disk_unbusy in case of errors/retries
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Apr 16 12:25:00 +0000 2010
>Closed-Date: Sat Oct 07 20:04:55 +0000 2017
>Last-Modified: Sat Oct 07 20:04:55 +0000 2017
>Originator: leo@marco.de
>Release: NetBSD 5.99.26
>Organization:
Matthias Pfaller Software Entwicklung
marco Systemanalyse und Entwicklung GmbH Tel +49 8131 5161 41
Hans-Böckler-Str. 2, D 85221 Dachau Fax +49 8131 5161 66
http://www.marco.de/ Email leo@marco.de
>Environment:
System: NetBSD zork 5.99.26 NetBSD 5.99.26 (ZORK) #0: Sat Apr 10 14:30:43 CEST 2010 leo@zork:/usr/src-current/sys/arch/amd64/compile/ZORK amd64
Architecture: x86_64
Machine: amd64
>Description:
When wd.c retries a request (e.g. after a timeout because the drive
is powered down), it will issue a callout to wdrestart and return
without doing an disk_unbusy. wdrestart will call __wdstart which
will call disk_busy a second time while wddone will call disk_unbusy
only one time.
The same problem applies to ATACMD_TRY_AGAIN.
>How-To-Repeat:
Provoke a disk errror and watch the disk beeing 100% busy all the time
afterwards.
>Fix:
Maybe it would be best to move disk_busy from __wdstart to wdstart?
In that case it would be called only one time per request, but the
disk still would be marked busy until the callout is serviced
(I'm not sure if this should be considered a bug or a feature...).
>Release-Note:
>Audit-Trail:
From: Matthias Pfaller <leo@marco.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/43169: sys/dev/ata/wd.c doesn't call disk_unbusy in case
of errors/retries
Date: Thu, 22 Apr 2010 19:04:45 +0200
This is a multi-part message in MIME format.
--------------000506070409080104070201
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
I'm running for a couple of days with the attached patch. After
readerrors the drive is no longer reported 100% busy.
Matthias
--------------000506070409080104070201
Content-Type: text/plain;
name="P.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="P.txt"
Index: wd.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ata/wd.c,v
retrieving revision 1.384
diff -u -u -r1.384 wd.c
--- wd.c 24 Feb 2010 22:37:57 -0000 1.384
+++ wd.c 22 Apr 2010 16:55:20 -0000
@@ -585,6 +585,8 @@
wd->openings--;
wd->retries = 0;
+ /* Instrumentation. */
+ disk_busy(&wd->sc_dk);
wdstart1(wd, bp);
}
}
@@ -664,6 +666,7 @@
/* No memory -- fail the iop. */
bp->b_error = ENOMEM;
bp->b_resid = bp->b_bcount;
+ disk_unbusy(&wd->sc_dk, 0, (bp->b_flags & B_READ));
biodone(bp);
wd->openings++;
return;
@@ -716,8 +719,6 @@
wd->sc_wdc_bio.flags |= ATA_LBA;
if (bp->b_flags & B_READ)
wd->sc_wdc_bio.flags |= ATA_READ;
- /* Instrumentation. */
- disk_busy(&wd->sc_dk);
switch (wd->atabus->ata_bio(wd->drvp, &wd->sc_wdc_bio)) {
case ATACMD_TRY_AGAIN:
callout_reset(&wd->sc_restart_ch, hz, wdrestart, wd);
@@ -943,7 +944,7 @@
return error;
}
-/*
+/*
* Caller must hold wd->sc_dk.dk_openlock.
*/
static int
@@ -1461,7 +1462,7 @@
return 0;
}
-
+
case DIOCSSTRATEGY:
{
struct disk_strategy *dks = (void *)addr;
@@ -1630,7 +1631,7 @@
err = 0;
break;
default:
- panic("wddump: unknown error type %d", err);
+ panic("wddump: unknown error type %d", err);
}
if (err != 0) {
printf("\n");
--------------000506070409080104070201--
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sun, 23 Jul 2017 13:53:02 +0000
Responsible-Changed-Why:
I've noticed this bug also while working on jdolecek-ncq branch, fixed it there.
State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 23 Jul 2017 13:53:02 +0000
State-Changed-Why:
Fix committed on dev branch, will close once the branch is merged.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/43169 CVS commit: [jdolecek-ncq] src/sys/dev/ata
Date: Sun, 23 Jul 2017 13:50:43 +0000
Module Name: src
Committed By: jdolecek
Date: Sun Jul 23 13:50:43 UTC 2017
Modified Files:
src/sys/dev/ata [jdolecek-ncq]: wd.c
Log Message:
for wd, only call disk_busy() on the first try, do not call it on retries,
as unbusy is called just once when the xfer is finished
also noticed in PR kern/43169 by Matthias Pfaller, but contrary to suggested
fix done in way to keep the disk marked busy during the timeouts, as I think
it's more correct
To generate a diff of this commit:
cvs rdiff -u -r1.428.2.28 -r1.428.2.29 src/sys/dev/ata/wd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: analyzed->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sat, 07 Oct 2017 17:44:37 +0000
State-Changed-Why:
Change committed on HEAD (-current). Can you confirm the fix?
State-Changed-From-To: feedback->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sat, 07 Oct 2017 20:04:55 +0000
State-Changed-Why:
Fix confirmed in NCQ branch, problem fixed there. Submitter no longer can
confirm.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.