NetBSD Problem Report #56131

From www@netbsd.org  Mon Apr 26 14:38:15 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 988D01A9272
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 26 Apr 2021 14:38:15 +0000 (UTC)
Message-Id: <20210426143813.AF6991A9273@mollari.NetBSD.org>
Date: Mon, 26 Apr 2021 14:38:13 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: mount WAPBL SCSI disk causes panic
X-Send-Pr-Version: www-1.0

>Number:         56131
>Category:       port-mac68k
>Synopsis:       mount WAPBL SCSI disk causes panic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Apr 26 14:40:01 +0000 2021
>Last-Modified:  Mon Aug 15 12:20:01 +0000 2022
>Originator:     Rin Okuyama
>Release:        9.99.82
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD  9.99.82 NetBSD 9.99.82 (Q840AV) #123: Mon Apr 26 22:10:37 JST 2021  rin@latipes:/sys/arch/mac68k/compile/Q840AV mac68k
>Description:
On my Quadra 840AV (mac68k with the esp(4) driver), mounting a SCSI disk
with WAPBL enabled causes a panic as follows:

| Enter pathname of shell or RETURN for /bin/sh:
| We recommend that you create a non-root account and use su(1) for root access.
| # mount /
| panic: LIST_INSERT_HEAD 0x9f401c ../../../../kern/subr_pool.c:494
| cpu0: Begin traceback...
| ?(?)
| db_panic(2053d0,31284,0,418fa8,91e5b34) at 0
| vpanic(337b6b,91e5b40,91e5b8c,23de96,337b6b) + 162
| panic(337b6b,9f401c,38266f,1ee,1020) + c
| pool_put(418fa8,9f5910,9f5920,0,9c23e0) + 2e4
| scsipi_put_xs(9f5910,6,9f59a4,91e5ccc,91e5cd0) + e8
| scsipi_execute_xs(9f5910,9c23d8,9f59a4,91e5c42,6) + 2f2
| scsipi_command(c2ad04,91e5c42,6,91e5ccc,17) + 78
| scsipi_mode_sense(c2ad04,8,8,91e5ccc,17,20,4,1770) + 4c
| sd_mode_sense(c18dc0,8,91e5ccc,13,8,20,91e5cc8) + 80
| sdioctl(?)
| bdev_ioctl(0,408,40046474,c749e0,2,c50680) + 1a2
| spec_ioctl(91e5d40,38e684,c0a1e4,40046474,c749e0) + 10a
| VOP_IOCTL(c0a1e4,40046474,c749e0,2,fffffffe) + 38
| wapbl_start(c2703c,c27000,c0a1e4,0,f02e20,7440,200,0,180e5c,180f5e) + 2fc
| ffs_wapbl_start(c27000) + 296
| ffs_mount(?)
| VFS_MOUNT(c27000,ffff9200,9b26a0,91e5efc) + 402
| do_sys_mount(c50680,319f,0,ffff9200,2010000) + 3b4
| sys___mount50(c50680,91e5f38,91e5f30,b,0) + 26
| syscall_plain(19a,c50680,91e5fb4,ffff9200,ffff91f8) + d2
| syscall(19a) + 70
| trap0() + e
| cpu0: End traceback...

This is almost always reproducible when fsck is not run before mount
(even if the file system is clean). Running fsck seems to reduce the
probability of the panic somehow, but does not eliminate it. The panic
does not occur if WAPBL is disabled (``log'' is removed from fstab or the
command line).

Also, on another mac68k machine (Quadra 800) with a different variant of
esp(4), the panic does not occur even if WAPBL is enabled.

Full dmesg for two machines:

(Quadra 840AV: panic)
https://dmesgd.nycbug.org/index.cgi?do=view&id=5968

(Quadra 800: no panic)
https://dmesgd.nycbug.org/index.cgi?do=view&id=6026
>How-To-Repeat:
Mount a WAPBL-enabled SCSI disk on a Quadra 840AV.
>Fix:
N/A

>Release-Note:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56131: mount WAPBL SCSI disk causes panic
Date: Wed, 9 Jun 2021 01:04:40 +0000

 On Mon, Apr 26, 2021 at 02:40:01PM +0000, rokuyama.rk@gmail.com wrote:
  > | panic: LIST_INSERT_HEAD 0x9f401c ../../../../kern/subr_pool.c:494
  > | [...]
  > | panic(337b6b,9f401c,38266f,1ee,1020) + c
  > | pool_put(418fa8,9f5910,9f5920,0,9c23e0) + 2e4
  > | scsipi_put_xs(9f5910,6,9f59a4,91e5ccc,91e5cd0) + e8
  > | scsipi_execute_xs(9f5910,9c23d8,9f59a4,91e5c42,6) + 2f2
  > | scsipi_command(c2ad04,91e5c42,6,91e5ccc,17) + 78
  > | scsipi_mode_sense(c2ad04,8,8,91e5ccc,17,20,4,1770) + 4c
  > | sd_mode_sense(c18dc0,8,91e5ccc,13,8,20,91e5cc8) + 80
  > | sdioctl(?)
  > | bdev_ioctl(0,408,40046474,c749e0,2,c50680) + 1a2
  > | spec_ioctl(91e5d40,38e684,c0a1e4,40046474,c749e0) + 10a
  > | VOP_IOCTL(c0a1e4,40046474,c749e0,2,fffffffe) + 38
  > | wapbl_start(c2703c,c27000,c0a1e4,0,f02e20,7440,200,0,180e5c,180f5e) + 2fc
  > | ffs_wapbl_start(c27000) + 296
  > | ffs_mount(?)
  > | VFS_MOUNT(c27000,ffff9200,9b26a0,91e5efc) + 402
  >
  > [...]
  > 
  > This is almost reproducible when fsck is not executed before mount
  > (even if file system is clean). Running fsck seems to reduce probability
  > of panic somehow, but not 100%. This panic does not take place if WAPBL
  > is disabled (``log'' is discarded from fstab/command line).
  > 
  > Also, other mac68k machine (Quadra 800) with other variant of esp(4),
  > panic does not occur even if WAPBL enabled.

 So it must be the driver (rather than wapbl) that's wrong. My guess is
 that there's some pattern of I/Os or maybe a timing issue that makes
 it blow up, and that because running fsck flushes the journal, it
 reduces the probability of seeing the pattern.

 It looks like the apparently missing step between wapbl_start and
 VOP_IOCTL is wapbl_dkcache_init() being inlined, in which case the
 ioctl is DIOCGCACHE. That might or might not be useful info though,
 since it looks like the problem is that the pool is corrupt...

 -- 
 David A. Holland
 dholland@netbsd.org

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56131 CVS commit: src/sys/arch/mac68k/obio
Date: Mon, 15 Aug 2022 12:16:25 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Mon Aug 15 12:16:25 UTC 2022

 Modified Files:
 	src/sys/arch/mac68k/obio: esp.c espvar.h

 Log Message:
 Rework avdma to fix PR port-mac68k/56131 as well as add synchronous
 transfer support.

 According to analysis by Michael Zucca, PSC (DMAC for Quadra/Centris AV)
 seems to require that the DMA buffer be
   (1) aligned to a 16-byte boundary, and
   (2) a multiple of 16 bytes in size.
 If the buffer does not satisfy these constraints, esp.c rev 1.63 and
 prior carry out partial PIO to align it or to shave off the remainder.

 However, partial PIO does not always work correctly for the combination
 of NCR53C94 and PSC, which results in the failures observed as
 port-mac68k/56131.

 Also, PIO spoils synchronous transfer, which is timing critical.

 Therefore, for buffers that do not satisfy the boundary conditions,
 completely stop using PIO and use DMA with ``bounce'' buffers.

 This fixes port-mac68k/56131 and enables sync transfer as a big bonus.

 Note that bounce DMA does not hurt performance at all. For filesystem
 and swap I/O, buffers always satisfy the constraints above, and bounce
 DMA is necessary only
   (a) when a disk is attached, or
   (b) for special utilities like fsck(8) or fdisk(8),
 as far as I can tell.

 Also:

 - Stop providing ``DMA-friendly'' sc_imess and sc_omess; a transfer for
   MSGIN or MSGOUT almost certainly does not satisfy boundary condition
   (2). Again, this does not affect performance at all.

 - SCSI bus frequency is 20MHz (i.e., 5MB/s for sync transfer) for AV
   models, according to ``Quadra 840AV Service Source''.
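
 One way to arrive at the 5MB/s figure above is sketched below. It
 ASSUMES that the chip needs at least 4 clock cycles per byte in
 synchronous mode and that the bus is 8 bits wide; neither is stated in
 this log message, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION (not from the log message): minimum clock cycles per byte. */
#define MIN_CLOCKS_PER_BYTE 4u

/* Shortest synchronous transfer period in nanoseconds for a given clock. */
static uint32_t
sync_period_ns(uint32_t freq_mhz)
{
	assert(freq_mhz != 0);
	return MIN_CLOCKS_PER_BYTE * 1000u / freq_mhz;
}

/* Peak synchronous rate in kB/s, one byte per period on an 8-bit bus. */
static uint32_t
sync_rate_kbps(uint32_t freq_mhz)
{
	assert(freq_mhz != 0);
	return freq_mhz * 1000u / MIN_CLOCKS_PER_BYTE;
}
```

 Under these assumptions a 20MHz clock gives a 200ns period, i.e.
 5000kB/s.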


 To generate a diff of this commit:
 cvs rdiff -u -r1.63 -r1.64 src/sys/arch/mac68k/obio/esp.c
 cvs rdiff -u -r1.9 -r1.10 src/sys/arch/mac68k/obio/espvar.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
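
 The two buffer constraints from the log message above, and the idea of
 falling back to a bounce buffer, can be sketched roughly as follows.
 This is not the code from esp.c rev 1.64: every name here (DMA_ALIGN,
 BOUNCE_SIZE, struct bounce, dma_prepare_out) is hypothetical, and
 padding the staged copy up to a multiple of 16 bytes is an assumption
 made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMA_ALIGN	16	/* constraint quoted in the log message */
#define BOUNCE_SIZE	4096	/* hypothetical bounce buffer size */

/* Hypothetical bounce buffer, itself aligned to DMA_ALIGN bytes. */
struct bounce {
	_Alignas(DMA_ALIGN) uint8_t data[BOUNCE_SIZE];
};

/* Nonzero if buf/len satisfy both constraints (1) and (2). */
static int
dma_buf_ok(const void *buf, size_t len)
{
	return ((uintptr_t)buf % DMA_ALIGN) == 0 && (len % DMA_ALIGN) == 0;
}

/* Round len up to the next multiple of DMA_ALIGN. */
static size_t
dma_roundup(size_t len)
{
	return (len + DMA_ALIGN - 1) & ~(size_t)(DMA_ALIGN - 1);
}

/*
 * Choose the buffer to hand to the DMA engine for a host-to-device
 * transfer.  A conforming buffer is used directly; anything else is
 * copied into the bounce buffer and zero-padded (an assumption of this
 * sketch).  Returns NULL if the bounce buffer is too small; otherwise
 * *dmalen receives the length to program into the DMA engine.
 */
static const void *
dma_prepare_out(struct bounce *b, const void *buf, size_t len,
    size_t *dmalen)
{
	assert(b != NULL && buf != NULL && dmalen != NULL);

	if (dma_buf_ok(buf, len)) {
		*dmalen = len;
		return buf;
	}

	size_t padded = dma_roundup(len);
	if (padded > sizeof(b->data))
		return NULL;

	memset(b->data, 0, padded);
	memcpy(b->data, buf, len);
	*dmalen = padded;
	return b->data;
}
```

 For a device-to-host transfer the copy would go the other way, out of
 the bounce buffer once the DMA has completed.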

>Unformatted:

$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.