NetBSD Problem Report #52614
From gson@gson.org Thu Oct 12 19:45:29 2017
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 633B57A16F
for <gnats-bugs@gnats.NetBSD.org>; Thu, 12 Oct 2017 19:45:29 +0000 (UTC)
Message-Id: <20171012194522.C5939989E68@guava.gson.org>
Date: Thu, 12 Oct 2017 22:45:22 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: qemu virtual CD-ROM reports read errors since recent wdc changes
X-Send-Pr-Version: 3.95
>Number: 52614
>Category: kern
>Synopsis: qemu virtual CD-ROM reports read errors since recent wdc changes
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 12 19:50:00 +0000 2017
>Closed-Date: Wed Oct 24 06:56:05 +0000 2018
>Last-Modified: Wed Oct 24 09:20:00 +0000 2018
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2017.10.07.20.02.07
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:
The testbed on babylon5.netbsd.org has done more than 16,000
i386 installs in qemu over the last six years without printing
the string "cd0a: error reading" to the console even once.
This changed on source date 2017.10.07.20.02.07, which happens
to be immediately after a bunch of wdc related commits. Since
then, the string "cd0a: error reading" has been printed many
times:
babylon5.netbsd.org$ zgrep -c 'cd0a: error reading' ./2017.10*/install.log.gz
[44 lines showing zero matches omitted]
./2017.10.07.20.02.07/install.log.gz:5
./2017.10.07.20.32.20/install.log.gz:2
./2017.10.07.21.53.16/install.log.gz:1
./2017.10.08.00.45.25/install.log.gz:3
./2017.10.08.01.05.13/install.log.gz:8
./2017.10.08.03.39.50/install.log.gz:2
./2017.10.08.08.29.57/install.log.gz:5
./2017.10.08.09.10.11/install.log.gz:4
./2017.10.08.14.03.46/install.log.gz:0
./2017.10.08.15.00.40/install.log.gz:0
./2017.10.08.15.29.33/install.log.gz:2
./2017.10.08.18.46.10/install.log.gz:0
./2017.10.08.20.44.19/install.log.gz:1
./2017.10.08.21.18.14/install.log.gz:3
./2017.10.08.21.33.38/install.log.gz:1
./2017.10.09.05.24.26/install.log.gz:2
./2017.10.09.10.31.50/install.log.gz:0
./2017.10.09.12.07.03/install.log.gz:0
./2017.10.09.14.28.01/install.log.gz:6
./2017.10.09.17.49.28/install.log.gz:2
./2017.10.09.23.42.40/install.log.gz:2
./2017.10.10.03.11.01/install.log.gz:1
./2017.10.10.05.35.15/install.log.gz:3
./2017.10.10.09.29.14/install.log.gz:2
./2017.10.10.11.52.51/install.log.gz:3
./2017.10.10.13.47.27/install.log.gz:4
./2017.10.10.16.04.59/install.log.gz:2
./2017.10.10.16.44.24/install.log.gz:0
./2017.10.10.17.20.42/install.log.gz:1
./2017.10.10.19.31.57/install.log.gz:3
./2017.10.10.21.37.49/install.log.gz:1
./2017.10.11.00.17.03/install.log.gz:1
./2017.10.11.06.49.03/install.log.gz:9
./2017.10.11.08.29.17/install.log.gz:7
./2017.10.11.10.53.25/install.log.gz:1
./2017.10.11.12.27.49/install.log.gz:4
./2017.10.11.17.08.32/install.log.gz:0
./2017.10.12.09.53.55/install.log.gz:0
>How-To-Repeat:
Install NetBSD-current/i386 in qemu using a virtual CD-ROM.
Observe the console output.
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Thu, 12 Oct 2017 19:53:50 +0000
Responsible-Changed-Why:
Over to committer.
State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 22 Oct 2017 13:23:53 +0000
State-Changed-Why:
Does this still happen after the recent atapi fixes?
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent wdc changes)
Date: Sun, 22 Oct 2017 23:29:41 +0300
On Sun, 22 Oct 2017 13:23:53 +0000 (UTC), jdolecek@NetBSD.org wrote:
> Does this still happen after the recent atapi fixes?
Yes - the install attempt from 2017.10.22.14.25.33 sources failed
after multiple errors reading from cd0a:
http://releng.netbsd.org/b5reports/i386/build/2017.10.22.14.25.33/install.log
--
Andreas Gustafsson, gson@gson.org
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: jdolecek@NetBSD.org, Andreas Gustafsson <gson@gson.org>
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent
wdc changes)
Date: Mon, 23 Oct 2017 05:20:44 +0800 (+08)
On Sun, 22 Oct 2017, jdolecek@NetBSD.org wrote:
> Does this still happen after the recent atapi fixes?
In addition to the read errors, I am regularly seeing reports of "driver
resource shortage" from qemu's cd0@piixide. In some cases, it is able
to recover and continue, in some cases it seems fatal. About 50% of the
time I have to repeat the qemu install procedure.
This is with sources updated on 2017-10-22 at 1:11:56 UTC
+------------------+--------------------------+----------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+
State-Changed-From-To: feedback->open
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Mon, 23 Oct 2017 07:34:07 +0000
State-Changed-Why:
Still happens, need to investigate.
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent wdc changes)
Date: Mon, 18 Dec 2017 20:25:11 +0200
On October 23, jdolecek@NetBSD.org wrote:
> Still happens, need to investigate.
Any progress on this? It has now been causing random installation
failures on the testbed for almost two months. Here's the log
output from a recent one:
http://releng.netbsd.org/b5reports/i386/build/2017.12.18.05.35.36/install.log
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, jaromir.dolecek@gmail.com
Cc:
Subject: Re: kern/52614: qemu virtual CD-ROM report read errors since recent wdc changes
Date: Fri, 22 Jun 2018 17:10:06 +0300
In private email, Jaromir suggested I try a qemu configuration with
ahcisata instead of wdc.
I tried this by adding "-machine q35" to the qemu command line, which
should cause qemu to emulate a more modern PC. The hard disk and
CD-ROM(s) were now detected as SATA devices:
[ 1.0205182] ahcisata0 at pci0 dev 31 function 2: vendor 8086 product 2922 (rev. 0x02)
[ 1.0205182] ahcisata0: interrupting at ioapic0 pin 16
[ 1.0205182] ahcisata0: AHCI revision 1.0, 6 ports, 32 slots, CAP 0xc0141f05<SAM,ISS=0x1=Gen1,SNCQ,S64A>
[ 1.0205182] atabus0 at ahcisata0 channel 0
[ 1.0205182] atabus1 at ahcisata0 channel 1
[ 1.0205182] atabus2 at ahcisata0 channel 2
[ 1.0205182] atabus3 at ahcisata0 channel 3
[ 1.0205182] atabus4 at ahcisata0 channel 4
[ 1.0205182] atabus5 at ahcisata0 channel 5
(...)
[ 1.3749109] ahcisata0 port 0: device present, speed: 1.5Gb/s
[ 1.3749109] ahcisata0 port 1: device present, speed: 1.5Gb/s
[ 1.3749109] ahcisata0 port 2: device present, speed: 1.5Gb/s
[ 4.3760485] wd0 at atabus0 drive 0
[ 4.3810162] wd0: <QEMU HARDDISK>
[ 4.3810162] wd0: 1536 MB, 3120 cyl, 16 head, 63 sec, 512 bytes/sect x 3145728 sectors
[ 4.4214675] atapibus0 at atabus1: 1 targets
[ 4.4340258] cd0 at atapibus0 drive 0: <QEMU DVD-ROM, QM00003, 2.5+> cdrom removable
[ 4.4340258] atapibus1 at atabus2: 1 targets
[ 4.4463895] cd1 at atapibus1 drive 0: <QEMU DVD-ROM, QM00005, 2.5+> cdrom removable
[ 4.4711559] WARNING: 2 errors while detecting hardware; check system log.
but when sysinst tried to mount the CD, it failed with the following errors:
[ 39.5120853] cd0(ahcisata0:1:0): request sense for a request sense ?
[ 39.5120853] cd0(ahcisata0:1:0): request sense failed with error 22
[ 39.5120853] cd0(ahcisata0:1:0): generic HBA error
[ 39.5120853] cd0: secperunit and ncylinders are zero
[ 39.5211147] cd0(ahcisata0:1:0): request sense for a request sense ?
[ 39.5211147] cd0(ahcisata0:1:0): request sense failed with error 22
[ 39.5211147] cd0(ahcisata0:1:0): generic HBA error
[ 39.5211147] WARNING: cd0: total sector size in disklabel (536870911) != the size of cd0 (0)
[ 39.5211147] WARNING: cd0: end of partition `a' exceeds the size of cd0 (0)
[ 39.5211147] WARNING: cd0: end of partition `d' exceeds the size of cd0 (0)
[ 39.5286895] cd0(ahcisata0:1:0): request sense for a request sense ?
[ 39.5286895] cd0(ahcisata0:1:0): request sense failed with error 22
[ 39.5286895] cd0(ahcisata0:1:0): generic HBA error
[ 39.5286895] cd1(ahcisata0:2:0): request sense for a request sense ?
[ 39.5286895] cd1(ahcisata0:2:0): request sense failed with error 22
[ 39.5286895] cd1(ahcisata0:2:0): generic HBA error
[ 39.5286895] cd1(ahcisata0:2:0): request sense for a request sense ?
[ 39.5286895] cd1(ahcisata0:2:0): request sense failed with error 22
[ 39.5286895] cd1(ahcisata0:2:0): generic HBA error
[ 39.5286895] cd1: secperunit and ncylinders are zero
[ 39.5286895] cd1(ahcisata0:2:0): request sense for a request sense ?
[ 39.5286895] cd1(ahcisata0:2:0): request sense failed with error 22
[ 39.5286895] cd1(ahcisata0:2:0): generic HBA error
[ 39.5286895] WARNING: cd1: total sector size in disklabel (536870911) != the size of cd1 (0)
[ 39.5286895] WARNING: cd1: end of partition `a' exceeds the size of cd1 (0)
[ 39.5494644] WARNING: cd1: end of partition `d' exceeds the size of cd1 (0)
[ 39.5494644] cd1(ahcisata0:2:0): request sense for a request sense ?
[ 39.5494644] cd1(ahcisata0:2:0): request sense failed with error 22
[ 39.5494644] cd1(ahcisata0:2:0): generic HBA error
This is with qemu 2.12.0. The full console log from the install
attempt, including qemu command line, is available at
http://www.gson.org/netbsd/bugs/build/i386/2018/2018.06.22.10.17.04/install.log
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, jaromir.dolecek@gmail.com
Cc:
Subject: Re: kern/52614: qemu virtual CD-ROM report read errors since recent wdc changes
Date: Fri, 22 Jun 2018 20:01:19 +0300
I repeated the "-machine q35" test with sources from the beginning of October,
before the SATA-NCQ merge, and it failed the same way as with today's sources:
http://www.gson.org/netbsd/bugs/build/i386/2017/2017.10.01.01.45.02/install.log
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52614: qemu virtual CD-ROM report read errors since recent wdc changes
Date: Tue, 21 Aug 2018 11:29:54 +0300
Jaromir,
This bug is still causing large numbers of random installation
failures on the testbed. For example, the i386 install has failed
more than 200 times this year.
I added a bunch of debug printfs to the kernel to try to figure out
what's happening. Here's a summary of what I have found so far.
If there are other tests I can run to help debug this, please let me
know.
During a typical failed sysinst run, the "if (avail == 0)" branch
in ata_get_xfer_ext() was entered more than 9000 times, always with
flags == 0. From reading the code, this condition results in
ata_get_xfer_ext() returning NULL.
53 of the NULL-returning ata_get_xfer_ext() calls were from
wdc_atapi_scsipi_request(), causing sc_xfer->error to be
set to XS_RESOURCE_SHORTAGE.
One read from cd0a failed, with bp->b_error == 16 (EBUSY).
I'm not sure how to interpret these results - are these frequent NULL
returns from ata_get_xfer_ext() themselves the problem, or are they
expected and the bug is the scsipi code not recovering from them?
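The NULL return described above can be modelled in user space. What follows is only a minimal sketch under stated assumptions, not the kernel code: toy_queue, toy_ffs32, toy_get_slot and toy_free_slot are hypothetical names, and the bitmap merely stands in for the queue_xfers_avail mask that appears in the ata_subr.c patch context further down. It shows why a caller that cannot sleep (flags == 0) gets a failure as soon as every slot is taken.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; these are not the NetBSD structures. */
struct toy_queue {
	uint32_t xfers_avail;	/* bit n set => slot n is free */
};

/* 1-based index of the lowest set bit, or 0 if no bit is set. */
static int
toy_ffs32(uint32_t v)
{
	int n;

	if (v == 0)
		return 0;
	for (n = 1; (v & 1) == 0; n++)
		v >>= 1;
	return n;
}

/*
 * Claim a free slot without sleeping.  Returns the slot number, or -1
 * when every slot is busy; -1 models the NULL return discussed above.
 */
static int
toy_get_slot(struct toy_queue *q)
{
	int avail = toy_ffs32(q->xfers_avail);

	if (avail == 0)
		return -1;
	q->xfers_avail &= ~(UINT32_C(1) << (avail - 1));
	return avail - 1;
}

/* Give a slot back so that a later caller can claim it. */
static void
toy_free_slot(struct toy_queue *q, int slot)
{
	assert(slot >= 0 && slot < 32);
	q->xfers_avail |= UINT32_C(1) << slot;
}
```

In this model, a queue initialised with a single free slot hands that slot to the first caller and returns -1 to any other caller until toy_free_slot() is called. Whether the real failures come from the shortage itself or from the lack of a retry afterwards is exactly the open question above.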
Here are the kernel log messages from the last few seconds leading up
to the cd0a read error, with the debug printfs in place:
[ 4066.9440596] ata_get_xfer_ext avail 0, flags 00000000
[ 4066.9440596] ata_get_xfer_ext() returned NULL
[ 4066.9440596] cd0(piixide0:0:1): adapter resource shortage
[ 4067.1440971] ata_get_xfer_ext avail 0, flags 00000000
[ 4068.9066925] ata_get_xfer_ext avail 0, flags 00000000
[ 4068.9066925] ata_get_xfer_ext() returned NULL
[ 4068.9066925] cd0(piixide0:0:1): adapter resource shortage
[ 4070.8755630] ata_get_xfer_ext avail 0, flags 00000000
[ 4070.8755630] ata_get_xfer_ext() returned NULL
[ 4070.8755630] cd0(piixide0:0:1): adapter resource shortage
[ 4071.0664209] ata_get_xfer_ext avail 0, flags 00000000
[ 4071.0664209] ata_get_xfer_ext avail 0, flags 00000000
[ 4071.0664209] ata_get_xfer_ext avail 0, flags 00000000
[ 4071.0664209] ata_get_xfer_ext avail 0, flags 00000000
[ 4072.8642938] ata_get_xfer_ext avail 0, flags 00000000
[ 4072.8642938] ata_get_xfer_ext() returned NULL
[ 4072.8642938] cd0(piixide0:0:1): adapter resource shortage
[ 4072.8642938] cddone error=16
[ 4072.8642938] cd0a: error (errno=16) reading fsbn 3565164 of 3565164-3565179 (cd0 bn 3565164; cn 35651 tn 0 sn 64)
This was produced with the following patches applied. The first
patch is to stop sysinst from intercepting the console output,
which will otherwise cause parts of the kernel messages to be lost.
Index: src/usr.sbin/sysinst/run.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/sysinst/run.c,v
retrieving revision 1.5
diff -u -r1.5 run.c
--- src/usr.sbin/sysinst/run.c 30 Dec 2014 10:10:22 -0000 1.5
+++ src/usr.sbin/sysinst/run.c 15 Aug 2018 12:36:42 -0000
@@ -387,7 +387,9 @@
char *cp, *ncp;
struct termios rtt, tt;
struct timeval tmo;
+#if 0
static int do_tioccons = 2;
+#endif
(void)tcgetattr(STDIN_FILENO, &tt);
if (openpty(&master, &slave, NULL, &tt, win) == -1) {
@@ -401,6 +403,7 @@
ttysig_ignore = 1;
ioctl(master, TIOCPKT, &ttysig_ignore);
+#if 0
/* Try to get console output into our pipe */
if (do_tioccons) {
if (ioctl(slave, TIOCCONS, &do_tioccons) == 0
@@ -415,6 +418,7 @@
do_tioccons = 1;
}
}
+#endif
if (logfp)
fflush(logfp);
Index: src/sys/dev/ata/ata_subr.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ata/ata_subr.c,v
retrieving revision 1.4
diff -u -r1.4 ata_subr.c
--- src/sys/dev/ata/ata_subr.c 20 Oct 2017 07:06:07 -0000 1.4
+++ src/sys/dev/ata/ata_subr.c 15 Aug 2018 15:31:48 -0000
@@ -288,6 +288,7 @@
retry:
avail = ffs32(chq->queue_xfers_avail & mask);
if (avail == 0) {
+ printf("ata_get_xfer_ext avail 0, flags %08x\n", flags);
/*
* Catch code which tries to get another recovery xfer while
* already holding one (wrong recursion).
@@ -299,6 +300,7 @@
if (flags & C_WAIT) {
chq->queue_flags |= QF_NEED_XFER;
error = cv_wait_sig(&chq->queue_busy, &chp->ch_lock);
+ printf("ata_get_xfer_ext cv_wait_sig error=%d\n", error);
if (error == 0)
goto retry;
}
Index: src/sys/dev/scsipi/atapi_wdc.c
===================================================================
RCS file: /cvsroot/src/sys/dev/scsipi/atapi_wdc.c,v
retrieving revision 1.129
diff -u -r1.129 atapi_wdc.c
--- src/sys/dev/scsipi/atapi_wdc.c 17 Oct 2017 18:52:51 -0000 1.129
+++ src/sys/dev/scsipi/atapi_wdc.c 15 Aug 2018 15:26:24 -0000
@@ -389,6 +389,7 @@
xfer = ata_get_xfer_ext(atac->atac_channels[channel], false, 0);
if (xfer == NULL) {
+ printf("ata_get_xfer_ext() returned NULL\n");
sc_xfer->error = XS_RESOURCE_SHORTAGE;
scsipi_done(sc_xfer);
return;
Index: src/sys/dev/scsipi/cd.c
===================================================================
RCS file: /cvsroot/src/sys/dev/scsipi/cd.c,v
retrieving revision 1.341
diff -u -r1.341 cd.c
--- src/sys/dev/scsipi/cd.c 17 Jun 2017 22:35:50 -0000 1.341
+++ src/sys/dev/scsipi/cd.c 14 Aug 2018 17:47:47 -0000
@@ -591,6 +591,9 @@
if (obp->b_error)
obp->b_resid = obp->b_bcount;
+ if (obp->b_error)
+ printf("cd bounce error=%d\n", obp->b_error);
+
free(bounce, M_DEVBUF);
biodone(obp);
}
@@ -743,6 +746,7 @@
return;
bad:
+ printf("cdstrategy bad error=%d\n", error);
bp->b_error = error;
bp->b_resid = bp->b_bcount;
biodone(bp);
@@ -915,6 +919,8 @@
struct buf *bp = xs->bp;
if (bp) {
+ if (error)
+ printf("cddone error=%d\n", error);
bp->b_error = error;
bp->b_resid = xs->resid;
if (error) {
Index: src/sys/kern/subr_disk.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_disk.c,v
retrieving revision 1.122
diff -u -r1.122 subr_disk.c
--- src/sys/kern/subr_disk.c 7 Mar 2018 21:13:24 -0000 1.122
+++ src/sys/kern/subr_disk.c 14 Aug 2018 10:00:34 -0000
@@ -139,7 +139,7 @@
pr = addlog;
} else
pr = printf;
- (*pr)("%s%d%c: %s %sing fsbn ", dname, unit, partname, what,
+ (*pr)("%s%d%c: %s (errno=%d) %sing fsbn ", dname, unit, partname, what, bp->b_error,
bp->b_flags & B_READ ? "read" : "writ");
sn = bp->b_blkno;
if (bp->b_bcount <= DEV_BSIZE)
--
Andreas Gustafsson, gson@gson.org
From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Andreas Gustafsson <gson@gson.org>
Cc: Jaromir Dolecek <jdolecek@netbsd.org>, "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/52614: qemu virtual CD-ROM report read errors since recent
wdc changes
Date: Tue, 21 Aug 2018 11:21:26 +0200
2018-08-21 10:29 GMT+02:00 Andreas Gustafsson <gson@gson.org>:
> During a typical failed sysinst run, the "if (avail == 0)" branch
> in ata_get_xfer_ext() was entered more than 9000 times, always with
> flags == 0. From reading the code, this condition results in
> ata_get_xfer_ext() returning NULL.
Thanks, this does help.
It shows there is definitely a code path where the new code is wrong: if
SCSIPI fails to get the xfer and returns with EAGAIN, nothing ever
re-triggers the SCSI code to retry that transfer, and it
eventually times out. I'll look at this in more detail; maybe I can
fix this separately.
The EAGAIN is more likely to happen on the legacy IDE interfaces,
where the controller now has just one preallocated xfer for all I/O (disk
and ATAPI). I plan to rewrite the xfer handling code to avoid this
and to stop artificially limiting the number of middle-layer
CCBs, e.g. by switching to pool allocation or just having more of them. Stay
tuned, I should have something soon.
Jaromir
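The missing retry described in the message above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the approach taken in the NetBSD tree: toy_channel, toy_submit and toy_complete are hypothetical names, and the fixed-size array is only a stand-in for a real queue. The point it shows is that a request deferred for lack of an xfer only makes progress if the completion path restarts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXPEND 8

/* Hypothetical model: one xfer shared by all I/O on the channel. */
struct toy_channel {
	bool xfer_busy;			/* the single xfer is in use */
	int pending[TOY_MAXPEND];	/* ids of deferred requests */
	size_t npending;
};

/* Start a request; true if it got the xfer, false if it was deferred. */
static bool
toy_submit(struct toy_channel *ch, int req)
{
	if (!ch->xfer_busy) {
		ch->xfer_busy = true;
		return true;
	}
	assert(ch->npending < TOY_MAXPEND);
	ch->pending[ch->npending++] = req;
	return false;
}

/*
 * Complete the running request.  Returns the id of the deferred request
 * that was restarted, or -1 if none was waiting.  Without this restart
 * step, a deferred request would sit in the queue until it timed out.
 */
static int
toy_complete(struct toy_channel *ch)
{
	int req;
	size_t i;

	ch->xfer_busy = false;
	if (ch->npending == 0)
		return -1;
	req = ch->pending[0];
	for (i = 1; i < ch->npending; i++)
		ch->pending[i - 1] = ch->pending[i];
	ch->npending--;
	ch->xfer_busy = true;	/* the restarted request now owns the xfer */
	return req;
}
```

With one request running, a second toy_submit() is deferred, and the next toy_complete() hands the xfer to it in FIFO order. Dropping the restart logic from toy_complete() reproduces the stall in miniature: the deferred request is never started again.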
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52614 CVS commit: [jdolecek-ncqfixes] src/sys/dev
Date: Sat, 22 Sep 2018 09:23:00 +0000
Module Name: src
Committed By: jdolecek
Date: Sat Sep 22 09:23:00 UTC 2018
Modified Files:
src/sys/dev/ata [jdolecek-ncqfixes]: TODO.ncq ata.c ata_subr.c atavar.h
satapmp_subr.c wd.c wdvar.h
src/sys/dev/ic [jdolecek-ncqfixes]: ahcisata_core.c mvsata.c siisata.c
src/sys/dev/scsipi [jdolecek-ncqfixes]: atapi_wdc.c
src/sys/dev/usb [jdolecek-ncqfixes]: umass_isdata.c
Log Message:
separate ata_xfer slot allocation and the memory allocation, so that
there can be more queued xfers than number of supported slots by controller,
and use a pool instead of custom pre-allocation
primarily to help PR kern/52614
remove no longer needed custom wd(4) logic for flush cache
switch also wd(4) trim/suspend/setcache/wdioctlstrategy to sleep waiting
for the memory, they are all called from process context and this
avoids spurious failures
To generate a diff of this commit:
cvs rdiff -u -r1.4.2.3 -r1.4.2.4 src/sys/dev/ata/TODO.ncq
cvs rdiff -u -r1.141.6.5 -r1.141.6.6 src/sys/dev/ata/ata.c
cvs rdiff -u -r1.6.2.4 -r1.6.2.5 src/sys/dev/ata/ata_subr.c
cvs rdiff -u -r1.99.2.4 -r1.99.2.5 src/sys/dev/ata/atavar.h
cvs rdiff -u -r1.14 -r1.14.2.1 src/sys/dev/ata/satapmp_subr.c
cvs rdiff -u -r1.441.2.3 -r1.441.2.4 src/sys/dev/ata/wd.c
cvs rdiff -u -r1.46.6.1 -r1.46.6.2 src/sys/dev/ata/wdvar.h
cvs rdiff -u -r1.62.2.4 -r1.62.2.5 src/sys/dev/ic/ahcisata_core.c
cvs rdiff -u -r1.41.2.3 -r1.41.2.4 src/sys/dev/ic/mvsata.c
cvs rdiff -u -r1.35.6.4 -r1.35.6.5 src/sys/dev/ic/siisata.c
cvs rdiff -u -r1.129.6.3 -r1.129.6.4 src/sys/dev/scsipi/atapi_wdc.c
cvs rdiff -u -r1.36 -r1.36.6.1 src/sys/dev/usb/umass_isdata.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Mon, 22 Oct 2018 20:24:44 +0000
State-Changed-Why:
Can you test after the jdolecek-ncqfixes branch merge?
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent wdc changes)
Date: Wed, 24 Oct 2018 09:50:00 +0300
jdolecek@NetBSD.org wrote:
> Can you test after the jdolecek-ncqfixes branch merge?
Looks like the bug is fixed. The b5 i386 testbed has now done 13
successful installs in a row since the jdolecek-ncqfixes merge, when
the previous record since the sata-ncq merge was four in a row. Also,
the "cd0a: error reading" message is absent from the install logs.
Thank you!
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: feedback->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Wed, 24 Oct 2018 06:56:05 +0000
State-Changed-Why:
Reported fixed. Thanks for the report.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent wdc changes)
Date: Wed, 24 Oct 2018 15:15:34 +0700
Date: Wed, 24 Oct 2018 06:55:01 +0000 (UTC)
From: Andreas Gustafsson <gson@gson.org>
Message-ID: <20181024065501.083BA7A237@mollari.NetBSD.org>
| the previous record since the sata-ncq merge was four in a row.
That's just this month, unless you're counting the read error messages.
At the end of Aug there was a sequence of 12 in a row with no install
fail, which continued for 8 more at the beginning of Sep, for a run of
20 build/install/test with no build or install failures.
Just before that run in Aug there was another longish sequence of
successful installs (though this one broken by a lot of build failures
as well - which are irrelevant for this purpose.) I'm sure there have
been others.
So while things are certainly looking promising, I would not claim
success quite yet (though this PR need not be reopened, or not
unless another failure of the same kind occurs).
| Also,
| the "cd0a: error reading" message is absent from the install logs.
That, if anything, is a more promising sign I think.
kre
From: Andreas Gustafsson <gson@gson.org>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: jdolecek@NetBSD.org, gnats-bugs@NetBSD.org
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent wdc changes)
Date: Wed, 24 Oct 2018 11:35:35 +0300
Robert Elz wrote:
> | the previous record since the sata-ncq merge was four in a row.
>
> That's just this month, unless you're counting the read error messages.
Mea culpa. I was in fact counting installs with no "cd0a: error
reading" message, not successful installs. But in any case, going
from at most 4 in a row of those to at least 13 does look promising.
--
Andreas Gustafsson, gson@gson.org
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: jdolecek@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
Andreas Gustafsson <gson@gson.org>
Subject: Re: kern/52614 (qemu virtual CD-ROM reports read errors since recent
wdc changes)
Date: Wed, 24 Oct 2018 17:17:49 +0800 (+08)
> | Also,
> | the "cd0a: error reading" message is absent from the install logs.
>
> That, if anything, is a more promising sign I think.
Yes, this is, I think, the most important item. There were many times
when "only a few" (or even only 1) read errors occurred and the driver
was able to retry and recover.
Prior to this issue arising, I had never seen the cd0a: error message,
so now that it has disappeared I would strongly feel that we've "fixed
the bug".
+------------------+--------------------------+----------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.