NetBSD Problem Report #51761

From www@NetBSD.org  Mon Jan  2 16:00:04 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 045847A2D2
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  2 Jan 2017 16:00:04 +0000 (UTC)
Message-Id: <20170102160002.6ACCA7A322@mollari.NetBSD.org>
Date: Mon,  2 Jan 2017 16:00:02 +0000 (UTC)
From: flxd@NetBSD.org
Reply-To: flxd@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: SCSI gets stuck on VAXstation 4000
X-Send-Pr-Version: www-1.0

>Number:         51761
>Category:       port-vax
>Synopsis:       SCSI gets stuck on VAXstation 4000
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    mlelstv
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jan 02 16:05:00 +0000 2017
>Closed-Date:    Thu Aug 17 18:54:21 +0000 2017
>Last-Modified:  Thu Aug 17 18:54:21 +0000 2017
>Originator:     Felix Deichmann
>Release:        7.99.53
>Organization:
>Environment:
NetBSD 7.99.53 (INSTALL.201612310510Z) vax
>Description:
SCSI gets stuck when unpacking sets in sysinst (or otherwise accessing a file system) on a VAXstation 4000 (asc at vsbus) running -current. NetBSD-7 is fine.
This happens with different disks and on different machines (VAXstation 4000/96 and 4000/60).
On the VAXstation 4000/96 the problem occurs within seconds of starting disk access. It is harder to reproduce on the 4000/60, where it can take much longer to appear, possibly because concurrent requests are harder to produce there.
A tested workaround is to set adapt->adapt_openings = 1 in src/sys/dev/ic/ncr53c9x.c.


VAXstation 4000/96:

NetBSD 7.99.53 (INSTALL.201612310510Z)
MicroVAX 4000/{90,90A,96}
total memory = 127 MB
avail memory = 119 MB
mainbus0 (root)
cpu0 at mainbus0: KA49, NVAX, 10KB L1 cache, 256KB L2 cache
ze0 at mainbus0
ze0: hardware address 00:00:f8:xx:xx:xx
vsbus0 at mainbus0
vsbus0: 8K entry DMA SGMAP at PA 0x27000000 (VA 0x8af91000)
vsbus0: interrupt mask 0
dz0 at vsbus0 csr 0x25000000 vec 524 ipl 17 maskbit 3
dz0: 4 lines
lkkbd0 at dz0
lkkbd0: no keyboard
wskbd0 at lkkbd0 (mux ignored)
asc0 at vsbus0 csr 0x26000080 vec 510 ipl 17 maskbit 1
asc0: NCR53C94, 25MHz, SCSI ID 6
scsibus0 at asc0: 8 targets, 8 luns per target
spx0 at vsbus0 csr 0x38000000 vec 514 ipl 15 maskbit 2
spx0: Using Boldface 8x16 font
spx0: RAMDAC ID: 0x4a, Bt459 (SPX/LCSPX) RAMDAC type
wsdisplay0 at spx0 (kbdmux ignored)
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <DEC, RZ28M    (C) DEC, 0616> disk fixed
sd0: 2007 MB, 3045 cyl, 16 head, 84 sec, 512 bytes/sect x 4110480 sectors
sd0: sync (160.00ns offset 15), 8-bit (6.250MB/s) transfers, tagged queueing
boot device: ze0
root on md0a dumps on md0b
root file system type: ffs

# df -k
Filesystem    1K-blocks       Used      Avail %Cap Mounted on
/dev/md0a          1899       1794        105  94% /
/dev/sd0a       1764206         20    1675976   0% /mnt
# dd if=/dev/zero of=/mnt/test bs=64k count=1024
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
 0053+0 recorsd0(asc0:0:0:0): dcommand aborteds, data =  00i 00n 00
    4e1 005 012 00+ 000 00
records out
9961472 bytes transferred in 1.946 secs (5118947 bytes/sec)

# sd0(asc0:0:0:0): command aborted, data = 00 00 00 00 4e 00 01 00 00 00
sd0(asc0:0:0:0): asc0: timed out [ecb 0x87eccf88 (flags 0x1, dleft 10000, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
sd0(asc0:0:0:0): asc0: timed out [ecb 0x87eccb60 (flags 0x1, dleft 10000, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
sd0(asc0:0:0:0): asc0: timed out [ecb 0x87eccd58 (flags 0x1, dleft 10000, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
[many more "timed out" messages like the above follow]


VAXstation 4000/60:

NetBSD 7.99.53 (INSTALL.201612310510Z)
VAXstation 4000/60
total memory = 81500 KB
avail memory = 74484 KB
mainbus0 (root)
cpu0 at mainbus0: KA46, Mariah, 2KB L1 cache, 256KB L2 cache
vsbus0 at mainbus0
vsbus0: 32K entry DMA SGMAP at PA 0x6e0000 (VA 0x806e0000)
vsbus0: interrupt mask 0
le0 at vsbus0 csr 0x200e0000 vec 770 ipl 17 maskbit 1 buf 0x0-0xffff
le0: address 08:00:2b:xx:xx:xx
le0: 32 receive buffers, 8 transmit buffers
dz0 at vsbus0 csr 0x200a0000 vec 124 ipl 17 maskbit 4
dz0: 4 lines
lkkbd0 at dz0
lkkbd0: no keyboard
wskbd0 at lkkbd0 (mux ignored)
asc0 at vsbus0 csr 0x200c0080 vec 774 ipl 17 maskbit 0
asc0: NCR53C94, 25MHz, SCSI ID 7
scsibus0 at asc0: 8 targets, 8 luns per target
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <DEC, RZ28M    (C) DEC, 0616> disk fixed
sd0: 2007 MB, 3045 cyl, 16 head, 84 sec, 512 bytes/sect x 4110480 sectors
sd0: sync (160.00ns offset 15), 8-bit (6.250MB/s) transfers, tagged queueing
boot device: le0
root on md0a dumps on md0b
root file system type: ffs

# sync
asc0: reselect from target 0 lun 0 tag 20:38 with no nexus; sending ABORT
asc0: target didn't send tag: 0 bytes in fifo
sd0: async, 8-bit transfers
# sd0(asc0:0:0:0): asc0: timed out [ecb 0x84ed6888 (flags 0x1, dleft a800, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
sd0(asc0:0:0:0): asc0: timed out [ecb 0x84ed6850 (flags 0x1, dleft 2000, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
sd0(asc0:0:0:0): asc0: timed out [ecb 0x84ed6818 (flags 0x1, dleft 3000, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 3), resid 0, msg(q 0,o 0) >
[many more "timed out" messages like the above follow]
>How-To-Repeat:
Try to install NetBSD-current to disk using sysinst (boot install.ram over network) on a VAXstation 4000.
>Fix:
Workaround: set adapt->adapt_openings = 1 in src/sys/dev/ic/ncr53c9x.c

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: port-vax-maintainer->mlelstv
Responsible-Changed-By: mlelstv@NetBSD.org
Responsible-Changed-When: Tue, 03 Jan 2017 07:18:27 +0000
Responsible-Changed-Why:
probably fallout from the MP scsipi changes


From: Felix Deichmann <flxd@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-vax/51761: SCSI gets stuck on VAXstation 4000
Date: Fri, 6 Jan 2017 08:58:50 +0100

 The SCSI issues are caused by a gcc bug in its built-in ffs function
 for vax.

 scsipi_get_tag() in src/sys/dev/scsipi/scsipi_base.c uses ffs to obtain
 a free tag from the pool. It will erroneously return tag id 0x38 (word
 1, bit 24) after 0x1f has been reached (word 0, bit 31), instead of the
 expected 0x20 (word 1, bit 0).

 This is because the statement

 		bit = ffs(periph->periph_freetags[word]);

 is compiled as

     15b0:	ea 00 20 48 	ffs $0x0,$0x20,0x50(r10)[r8],r9
     15b4:	aa 50 59

 r8 is evidently used as a byte index here, whereas it should be a
 longword index, just as the variable "word" is a longword index into
 u_int32_t periph_freetags[].

 A tested workaround is to use a generic function like inlined ffs32 from
 <sys/bitops.h> instead of the gcc built-in ffs function.

From: "Felix Deichmann" <flxd@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51761 CVS commit: src/external/gpl3/gcc/dist/gcc/config/vax
Date: Thu, 8 Jun 2017 15:28:27 +0000

 Module Name:	src
 Committed By:	flxd
 Date:		Thu Jun  8 15:28:27 UTC 2017

 Modified Files:
 	src/external/gpl3/gcc/dist/gcc/config/vax: builtins.md

 Log Message:
 Fix PR port-vax/51761 as suggested by Paul Koning on port-vax list.
 Installation (install.ram, -Os) on my VS4000 is possible without SCSI timeouts
 again.
 Other variable-length bit field instructions should be checked for correct
 constraints, too!


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.6 \
     src/external/gpl3/gcc/dist/gcc/config/vax/builtins.md

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->pending-pullups
State-Changed-By: flxd@NetBSD.org
State-Changed-When: Sat, 10 Jun 2017 17:04:54 +0000
State-Changed-Why:
[pullup-8 #29] gcc vax __builtin_ffs code generation fix


From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51761 CVS commit: [netbsd-8] src/external/gpl3/gcc/dist/gcc/config/vax
Date: Wed, 14 Jun 2017 04:49:32 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Wed Jun 14 04:49:32 UTC 2017

 Modified Files:
 	src/external/gpl3/gcc/dist/gcc/config/vax [netbsd-8]: builtins.md

 Log Message:
 Pull up following revision(s) (requested by flxd in ticket #29):
 	external/gpl3/gcc/dist/gcc/config/vax/builtins.md: revision 1.6
 Fix PR port-vax/51761 as suggested by Paul Koning on port-vax list.
 Installation (install.ram, -Os) on my VS4000 is possible without SCSI
 timeouts again.
 Other variable-length bit field instructions should be checked for correct
 constraints, too!


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.5.8.1 \
     src/external/gpl3/gcc/dist/gcc/config/vax/builtins.md

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Thu, 17 Aug 2017 18:54:21 +0000
State-Changed-Why:
Pullup to netbsd-8 done. Thank you.


>Unformatted:
