NetBSD Problem Report #49054

From www@NetBSD.org  Thu Jul 31 12:04:39 2014
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2D886A86BD
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 31 Jul 2014 12:04:39 +0000 (UTC)
Message-Id: <20140731120437.96FD3A86EE@mollari.NetBSD.org>
Date: Thu, 31 Jul 2014 12:04:37 +0000 (UTC)
From: 6bone@6bone.informatik.uni-leipzig.de
Reply-To: 6bone@6bone.informatik.uni-leipzig.de
To: gnats-bugs@NetBSD.org
Subject: lsi 1020/1030 issue
X-Send-Pr-Version: www-1.0

>Number:         49054
>Category:       kern
>Synopsis:       lsi 1020/1030 issue
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jul 31 12:05:01 +0000 2014
>Closed-Date:    Tue Jan 20 17:13:05 +0000 2015
>Last-Modified:  Thu Mar 26 16:10:12 +0000 2015
>Originator:     Uwe Toenjes
>Release:        NetBSD-6.99.47
>Organization:
University of Leipzig
>Environment:
NetBSD augate.ipv6.uni-leipzig.de 6.99.47 NetBSD 6.99.47 (amd64)
>Description:
Hello,

I am trying to use an external SCSI RAID with an LSI 1020 SCSI adapter.

With Linux everything works well. NetBSD (current, 6.99.47) reports:

mpt1: mpt_done: IOC overrun!
probe(mpt1:0:0:0): generic HBA error

dmesg only reports:

mpt1 at pci11 dev 8 function 0: vendor 0x1000 product 0x0030 (rev. 0xc1)
mpt1: interrupting at ioapic0 pin 18
scsibus2 at mpt1: 16 targets, 8 luns per target
mpt1: mpt_done: IOC overrun!
probe(mpt1:0:0:0): generic HBA error

I compiled the kernel with SCSIDEBUG, SCSIVERBOSE and SCSIPI_DEBUG.
Unfortunately scsictl still provides no further information.

With Linux (Knoppix) the system works well. Linux detects the hardware as follows:

[   31.204255] Fusion MPT SPI Host driver 3.04.20
[   31.204479] mptbase: ioc0: Initiating bringup
[   31.616683] ioc0: LSI53C1020A A1: Capabilities={Initiator,Target}
[   32.034250] scsi6 : ioc0: LSI53C1020A A1, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=18
[   32.654889] scsi 6:0:0:0: Direct-Access     ES-6600  12 Bay Volume    R001 PQ: 0 ANSI: 5
[   32.654905] scsi target6:0:0: Beginning Domain Validation
[   32.661633] scsi target6:0:0: Ending Domain Validation
[   32.661716] scsi target6:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 127)
[   32.661984] sd 6:0:0:0: Attached scsi generic sg5 type 0
[   32.662267] sd 6:0:0:0: [sdc] 1953121280 512-byte logical blocks: (999 GB/931 GiB)
[   32.912791] sd 6:0:0:0: [sdc] Write Protect is off
[   32.912796] sd 6:0:0:0: [sdc] Mode Sense: cb 00 00 08
[   33.163118] sd 6:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   34.915241]  sdc: unknown partition table
[   36.166788] sd 6:0:0:0: [sdc] Attached SCSI disk

>How-To-Repeat:

>Fix:

The following change fixes the problem by skipping the SCSI3 inquiry probe
if the device reports an additional_length larger than the SCSI3 inquiry
packet can hold. The device in question reports 91 bytes for
additional_length and is:

sd1 at scsibus2 target 0 lun 0: <ES-6600, 12 Bay Volume, R001> disk fixed

If this change is acceptable, it should be pulled up to 7. I left a print
to say when the SCSI3 probe is skipped.

Index: scsipi_base.c
===================================================================
RCS file: /cvsroot/src/sys/dev/scsipi/scsipi_base.c,v
retrieving revision 1.159
diff -u -u -r1.159 scsipi_base.c
--- scsipi_base.c	20 Apr 2012 20:23:21 -0000	1.159
+++ scsipi_base.c	11 Sep 2014 12:05:06 -0000
@@ -1064,7 +1064,7 @@

 	/*
 	 * If we request more data than the device can provide, it SHOULD just
-	 * return a short reponse.  However, some devices error with an
+	 * return a short response.  However, some devices error with an
 	 * ILLEGAL REQUEST sense code, and yet others have even more special
 	 * failture modes (such as the GL641USB flash adapter, which goes loony
 	 * and sends corrupted CRCs).  To work around this, and to bring our
@@ -1081,6 +1081,7 @@
 	    10000, NULL, flags | XS_CTL_DATA_IN);
 	if (!error &&
 	    inqbuf->additional_length > SCSIPI_INQUIRY_LENGTH_SCSI2 - 4) {
+	    if (inqbuf->additional_length <= SCSIPI_INQUIRY_LENGTH_SCSI3 - 4) {
 #if 0
 printf("inquire: addlen=%d, retrying\n", inqbuf->additional_length);
 #endif
@@ -1091,6 +1092,11 @@
 #if 0
 printf("inquire: error=%d\n", error);
 #endif
+#if 1
+	    } else {
+printf("inquire: addlen=%d, not retrying\n", inqbuf->additional_length);
+#endif
+	    }
 	}

 #ifdef SCSI_OLD_NOINQUIRY
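
The length check in the patch above can be sketched in isolation as follows.
This is a stand-alone illustration only, not the kernel code: the constant
values 36 and 74 are assumptions standing in for SCSIPI_INQUIRY_LENGTH_SCSI2
and SCSIPI_INQUIRY_LENGTH_SCSI3, and the helper name should_retry_scsi3 is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values standing in for the kernel's SCSIPI_INQUIRY_LENGTH_* constants. */
#define INQUIRY_LENGTH_SCSI2 36
#define INQUIRY_LENGTH_SCSI3 74

/*
 * Hypothetical helper: decide whether INQUIRY should be reissued with the
 * longer SCSI3 allocation length. The "- 4" offsets mirror the comparisons
 * used in the patch.
 */
bool should_retry_scsi3(uint8_t additional_length)
{
	/* The SCSI2-sized reply already holds everything the device has. */
	if (additional_length <= INQUIRY_LENGTH_SCSI2 - 4)
		return false;

	/* Retry only if the claimed length still fits the SCSI3-sized buffer. */
	return additional_length <= INQUIRY_LENGTH_SCSI3 - 4;
}
```

Under these assumed constants, should_retry_scsi3(91) returns false, so a
device reporting additional_length 91, as the ES-6600 does, is not probed a
second time.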
>Release-Note:

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49054 CVS commit: src/sys/dev/scsipi
Date: Mon, 6 Oct 2014 10:42:08 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Mon Oct  6 14:42:08 UTC 2014

 Modified Files:
 	src/sys/dev/scsipi: scsipi_base.c

 Log Message:
 PR/49054: Uwe Toenjes: Some RAID controllers return more bytes in the
 scsi 3 inquiry command than expected by the size of the scsi 3 inquiry
 packet. This can be detected by looking at the additional_length field
 returned by the scsi 2 inquiry. If that's the case, avoid doing the
 scsi 3 inquiry because we can't handle the extra bytes later.
 XXX: Pullup -7


 To generate a diff of this commit:
 cvs rdiff -u -r1.160 -r1.161 src/sys/dev/scsipi/scsipi_base.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49054: lsi 1020/1030 issue
Date: Fri, 17 Oct 2014 12:48:29 +0200

 This change causes fallout (originally noted in PR kern/49289), with a
 variety of SCSI controllers and disks:

 Example 1:

 mpt0 at pci1 dev 4 function 0: Symbios Logic 53c1020/53c1030 (rev. 0x07)
 mpt0: applying 1030 quirk
 mpt0: interrupting at ivec 1f69
 scsibus0 at mpt0: 16 targets, 8 luns per target
 [..]
 inquire: addlen=139, not retrying
 sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST373207LSUN72G, 045A> disk fixed
 sd0: 70007 MB, 14089 cyl, 24 head, 424 sec, 512 bytes/sect x 143374738 sectors
 dk0 at sd0: sb2k5Root/a
 dk0: 93008640 blocks at 0, type: ffs 

 Example 2:

 esiop0 at pci2 dev 2 function 0: Symbios Logic 53c1010-66 (ultra3-wide scsi)
 esiop0: using on-board RAM
 esiop0: interrupting at ivec 1f29
 scsibus0 at esiop0: 16 targets, 8 luns per target
 [..]
 inquire: addlen=139, not retrying
 sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST336607LSUN36G, 0307> disk fixed
 sd0: 34732 MB, 24622 cyl, 27 head, 107 sec, 512 bytes/sect x 71132959 sectors
 inquire: addlen=139, not retrying
 sd0: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 inquire: addlen=139, not retrying
 sd1 at scsibus0 target 1 lun 0: <SEAGATE, ST336607LSUN36G, 0307> disk fixed
 sd1: 34732 MB, 24622 cyl, 27 head, 107 sec, 512 bytes/sect x 71132959 sectors
 inquire: addlen=139, not retrying
 sd1: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing


 Example 3:

 esp0 at sbus0 slot 14 offset 0x8800000 vector 20 ipl 3: FAS366/HME, 40MHz, SCSI ID 7
 scsibus0 at esp0: 16 targets, 8 luns per target
 [..]
 inquire: addlen=139, not retrying
 sd1 at scsibus0 target 0 lun 0: <SEAGATE, ST39175LC, 0001> disk fixed
 sd1: 8683 MB, 11721 cyl, 5 head, 303 sec, 512 bytes/sect x 17783240 sectors
 inquire: addlen=139, not retrying
 sd1: sync (100.00ns offset 15), 16-bit (20.000MB/s) transfers, tagged queueing
 sd2 at scsibus0 target 1 lun 0: <FUJITSU, MAE3091L SUN9.0G, 0706> disk fixed
 sd2: 8637 MB, 4926 cyl, 27 head, 133 sec, 512 bytes/sect x 17689267 sectors
 sd2: sync (100.00ns offset 15), 16-bit (20.000MB/s) transfers, tagged queueing
 [..]
 inquire: addlen=91, not retrying
 cd0 at scsibus0 target 6 lun 0: <TOSHIBA, XM-5401TASUN4XCD, 3485> cdrom removable
 inquire: addlen=91, not retrying
 cd0: sync (248.00ns offset 15), 8-bit (4.032MB/s) transfers


 I can probably find a lot more examples. I can also easily test patches...

 Martin

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49054 CVS commit: src/sys/dev/scsipi
Date: Mon, 17 Nov 2014 13:43:48 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Mon Nov 17 18:43:48 UTC 2014

 Modified Files:
 	src/sys/dev/scsipi: scsipi_base.c

 Log Message:
 PR/49054: Add a quirk for the ES-6600 RAID controller which does not do
 INQUIRY3 properly. Unfortunately looking at the length does not solve
 the problem since other devices send greater lengths too.


 To generate a diff of this commit:
 cvs rdiff -u -r1.162 -r1.163 src/sys/dev/scsipi/scsipi_base.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
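
A quirk of the kind this commit message describes could look roughly like
the sketch below. It is not the committed code: the struct layouts, the
names inquiry_id, inquiry3_quirk and inquiry3_ok, and the choice to match
on the vendor string "ES-6600" (taken from the sd1 attach line in this PR)
are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Fixed-width, space-padded identification fields of an INQUIRY reply. */
struct inquiry_id {
	char vendor[8];
	char product[16];
	char revision[4];
};

/* Hypothetical quirk entry; an empty string acts as a wildcard. */
struct inquiry3_quirk {
	const char *vendor;
	const char *product;
	const char *revision;
};

static const struct inquiry3_quirk quirks[] = {
	{ "ES-6600", "", "" },
};

/* Compare a space-padded field against a pattern; "" matches anything. */
static bool field_matches(const char *field, size_t fieldlen, const char *pat)
{
	size_t patlen = strlen(pat);

	if (patlen == 0)
		return true;
	if (patlen > fieldlen || memcmp(field, pat, patlen) != 0)
		return false;
	/* Whatever follows the pattern must be space padding. */
	for (size_t i = patlen; i < fieldlen; i++)
		if (field[i] != ' ')
			return false;
	return true;
}

/* Returns false if the device is listed as mishandling the longer INQUIRY. */
bool inquiry3_ok(const struct inquiry_id *id)
{
	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct inquiry3_quirk *q = &quirks[i];

		if (field_matches(id->vendor, sizeof(id->vendor), q->vendor) &&
		    field_matches(id->product, sizeof(id->product), q->product) &&
		    field_matches(id->revision, sizeof(id->revision), q->revision))
			return false;
	}
	return true;
}
```

Matching on the identification strings rather than on additional_length
keeps the longer inquiry enabled for devices that merely report a large
length, such as the disks that print addlen=139 in the fallout report.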

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Fri, 16 Jan 2015 05:43:07 +0000
State-Changed-Why:
Is this fixed?


From: 6bone@6bone.informatik.uni-leipzig.de
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, 
    dholland@NetBSD.org
Subject: Re: kern/49054 (lsi 1020/1030 issue)
Date: Fri, 16 Jan 2015 14:32:50 +0100 (CET)

 It is fixed in netbsd-7.99

State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 20 Jan 2015 17:13:05 +0000
State-Changed-Why:
confirmed fixed, thanks.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49054 CVS commit: [netbsd-7] src/sys/dev/scsipi
Date: Thu, 26 Mar 2015 16:09:52 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Thu Mar 26 16:09:52 UTC 2015

 Modified Files:
 	src/sys/dev/scsipi [netbsd-7]: scsipi_base.c

 Log Message:
 Pull up the following revisions, requested by christos in #644:

 	sys/dev/scsipi/scsipi_base.c	1.161 - 1.164

 Use size for the size argument of memcmp, not the result of a compare.

 PR/49054: Add a quirk for the ES-6600 RAID controller which does not do
 INQUIRY3 properly. Unfortunately looking at the length does not solve
 the problem since other devices send greater lengths too.

 src is too big these days to tolerate superfluous apostrophes.  It's
 "its", people!

 PR/49054: Uwe Toenjes: Some RAID controllers return more bytes in the
 scsi 3 inquiry command than expected by the size of the scsi 3 inquiry
 packet. This can be detected by looking at the additional_length field
 returned by the scsi 2 inquiry. If that's the case, avoid doing the
 scsi 3 inquiry because we can't handle the extra bytes later.


 To generate a diff of this commit:
 cvs rdiff -u -r1.160 -r1.160.2.1 src/sys/dev/scsipi/scsipi_base.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
