NetBSD Problem Report #38758
From hf@spg.tu-darmstadt.de Mon May 26 11:35:16 2008
Return-Path: <hf@spg.tu-darmstadt.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id BBF7263B8E3
for <gnats-bugs@gnats.NetBSD.org>; Mon, 26 May 2008 11:35:16 +0000 (UTC)
Message-Id: <200805261134.m4QBYvYG003499@Gstoder.nt.e-technik.tu-darmstadt.de>
Date: Mon, 26 May 2008 13:34:57 +0200 (CEST)
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Reply-To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@gnats.NetBSD.org
Cc: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Subject: LOCKDEBUG kernel hangs in esp(4) driver
X-Send-Pr-Version: 3.95
>Number: 38758
>Category: port-mac68k
>Synopsis: LOCKDEBUG kernel hangs in esp(4) driver
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: hauke
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 26 11:40:00 +0000 2008
>Closed-Date: Mon May 11 20:03:15 +0000 2009
>Last-Modified: Mon May 11 20:03:15 +0000 2009
>Originator: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
>Release: NetBSD 4.99.63
>Organization:
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
>Environment:
System: NetBSD 4.99.63 (DEBUG) #0: Mon May 26 13:07:58 CEST 2008
hf@Hochstuhl:/var/obj/netbsd-builds/developer/mac68k/sys/arch/mac68k/compile/DEBUG
Architecture: m68k
Machine: mac68k
>Description:
A GENERIC kernel built with DIAGNOSTIC, DEBUG and LOCKDEBUG
options hangs when mounting root. Breaking into the debugger
shows it is in esp scsi code. Attempting a 'reboot' from the
debugger results in a LOCKDEBUG panic.
>How-To-Repeat:
Build and boot the following kernel:
include "arch/mac68k/conf/GENERIC"
options DIAGNOSTIC
options DEBUG
options LOCKDEBUG
Bootstrapping NetBSD/mac68k.
Getting mapping from MMU.
Loaded at 0x0
System RAM: 142606336 bytes in 34816 pages.
Low = 0x0, high = 0x8800000
On-board video at addr 0x0xf9000080 (phys 0x0xf9000080), len 0xfff80.
Done.
Bootstrapping the pmap system.
Pmap bootstrapped.
Moving ROMBase from 0x40800000 to 0x4eb000.
Video address 0x0xf9000080 -> 0x0x6eb080.
Loaded initial symtab at 0x354ec0, strtab at 0x3a9448, # entries 20752
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 4.99.63 (DEBUG) #0: Mon May 26 13:07:58 CEST 2008
hf@Hochstuhl:/var/obj/netbsd-builds/developer/mac68k/sys/arch/mac68k/compile/DEBUG
Apple Macintosh Quadra 650 (68040)
cpu: delay factor 1280
fpu: mc68040
total memory = 136 MB
avail memory = 127 MB
mrg: 'Quadra/Centris ROMs' ROM glue, tracing off, debug off, silent traps
mrg: I/O map kludge for ROMs that use hardware addresses directly.
mainbus0 (root)
obio0 at mainbus0
esp0 at obio0 addr 0 (quick): address 0x3fb000: NCR53C96, 16MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
adb0 at obio0
asc0 at obio0: Apple Sound Chip
intvid0 at obio0 @ f9000080: DAFB video subsystem, monitor sense 7
intvid0: 1152 x 870, 256 color
macfb0 at intvid0
wsdisplay0 at macfb0 (kbdmux ignored)
sn0 at obio0: integrated SONIC Ethernet adapter
sn0: Ethernet address 08:00:07:ce:67:59
iwm0 at obio0: Apple GCR floppy disk controller
zsc0 at obio0 chip type 0
zsc0 channel 0: d_speed 9600 DCD clk 0 CTS clk 0
zstty0 at zsc0 channel 0 (console i/o)
zsc0 channel 1: d_speed 9600 DCD clk 0 CTS clk 0
zstty1 at zsc0 channel 1
nubus0 at mainbus0
sm0 at nubus0 slot e: AsanteFAST 10/100 NB
sm0: SMC91C100, revision 0, buffer size: 128 KB
sm0: MAC address 00:00:94:75:93:d9, default media MII
nsphy0 at sm0 phy 5: DP83840 10/100 media interface, rev. 0
nsphy0: 10baseT, 100baseTX, auto
scsibus0: waiting 2 seconds for devices to settle...
adb0 (direct, II series): 2 targets
aed0 at adb0 addr 0: ADB Event device
akbd0 at adb0 addr 2: standard keyboard (ISO layout)
wskbd0 at akbd0 (mux ignored)
ams0 at adb0 addr 3: 1-button, 100 dpi mouse
wsmouse0 at ams0 (mux ignored)
sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST336706LW, 010A> disk fixed
sd0: 35003 MB, 26302 cyl, 4 head, 681 sec, 512 bytes/sect x 71687370 sectors
sd0: async, 8-bit transfers, tagged queueing
boot device: sd0
root on sd0a dumps on sd0b <<=== stalled here
Panic switch: PC is 0x79276.
Stopped in pid 0.2 (system) at netbsd:cpu_Debugger+0x6: unlk a6
db> t
cpu_Debugger(2d9100,79276,0,a06fe64,358e) + 6
nmihand(80,80,0,4,91) + 2c
lev7intr(?)
ncr53c9x_intr(d00c00,9,193196,d00c00,cc2f88) + 12
esp_quick_dma_go(d00c00,d00c00,3,90) + 100
ncr53c9x_intr(d00c00) + 10b4
esp_intr(d00c00) + 1a
via2_intr(0,a06ffa8,356a,68,1) + 50
intr_dispatch(68) + 4e
intrhand(a087be0) + a
lwp_trampoline() + e
db> reboot
syncing disks... Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context
lock address : 0x000000000a084fc0 type : sleep/adaptive
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x000000000a087be0 last held: 000000000000000000
last locked : 0x000000000018e60e unlocked : 0x000000000018e7ba
initialized : 0x000000000017e338
owner field : 000000000000000000 wait/spin: 0/0
Turnstile chain at 0x348400.
=> No active turnstile for this lock.
panic: LOCKDEBUG
Stopped in pid 0.2 (system) at netbsd:cpu_Debugger+0x6: unlk a6
db>
>Fix:
None.
>Release-Note:
>Audit-Trail:
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
tsutsui@ceres.dti.ne.jp
Subject: Re: kern/38758: LOCKDEBUG kernel hangs in esp(4) driver
Date: Tue, 27 May 2008 20:52:21 +0900
> >Synopsis: LOCKDEBUG kernel hangs in esp(4) driver
At least esp(4) on sun3/80 works fine with DEBUG+LOCKDEBUG+DIAGNOSTIC
(though it's too slow) so maybe your problem is mac68k specific.
---
Izumi Tsutsui
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>, kern-bug-people@NetBSD.org,
gnats-admin@NetBSD.org
Subject: Re: kern/38758: LOCKDEBUG kernel hangs in esp(4) driver
Date: Tue, 27 May 2008 16:37:27 +0200
Izumi Tsutsui wrote:
>>> Synopsis: LOCKDEBUG kernel hangs in esp(4) driver
>
> At least esp(4) on sun3/80 works fine with DEBUG+LOCKDEBUG+DIAGNOSTIC
> (though it's too slow) so maybe your problem is mac68k specific.
sun3 uses real DMA, doesn't it? Different code paths, probably. The
mac68k MD code doesn't use simple_lock() anywhere, though.
Here's the output from a LOCKDEBUG netbsd-4 kernel:
NetBSD 4.0_STABLE (DEBUG) #0: Tue May 27 15:59:40 CEST 2008
hf@Hochstuhl:/var/obj/netbsd-builds/4/mac68k/sys/arch/mac68k/compile/DEBUG
Apple Macintosh Quadra 650 (68040)
cpu: delay factor 1280
total memory = 136 MB
avail memory = 128 MB
[...]
esp0 at obio0 addr 0 (quick): address 0x3a6000: NCR53C96, 16MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
[...]
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST336706LW, 010A> disk fixed
sd0: 35003 MB, 26302 cyl, 4 head, 681 sec, 512 bytes/sect x 71687370 sectors
sd0: async, 8-bit transfers, tagged queueing
boot device: sd0
root on sd0a dumps on sd0b
simple_lock: lock held
lock: 0xcb11c4, currently at: /public/netbsd-4/sys/dev/ic/ncr53c9x.c:2084
last locked: /public/netbsd-4/sys/dev/ic/ncr53c9x.c:2084
last unlocked: /public/netbsd-4/sys/dev/ic/ncr53c9x.c:2799
?(?)
_simple_lock(cb11c4,27ccca,824) at 0
ncr53c9x_intr(cb1000,1,17d8e2,cb1000,c82f88) + 38
esp_quick_dma_go(cb1000,cb1000,3) + 100
ncr53c9x_intr(cb1000) + 10d8
esp_intr(cb1000) + 1a
via2_intr(0) + 50
intr_dispatch(68) + 48
intrhand(?)
mi_switch(30700c,0) + a
ltsleep(a9aef00,11,29cabc,0,a9aef08) + 340
biowait(a9aef00,a9aef00,4200) + 60
readdisklabel(2,236444,d62800,c68be0,d62a00,d62a00) + 7e
sdopen(401,0,6000,0) + 2ea
sdsize(401,12000,cc70,80060492,f9000080) + ac
cpu_dumpconf(10d30,ffffcffc,480,0,1000) + 36
main(392ff4,251526,78e000,1000,30b80c) + 1dc
low() + 2
Stopped at netbsd:cpu_Debugger+0x6: unlk a6
db> ps
PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
2 0 0 0 2 0x20200 1 scsibus0 sccomp
1 0 0 0 2 0x20000 1 init initexe
0 -1 0 0 2 0x20200 1 swapper biowait
db>
Like the -current LOCKDEBUG kernel, it doesn't even make it to a
single-user prompt, but panics instead of locking up.
hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
hauke@Espresso.Rhein-Neckar.DE, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/38758: LOCKDEBUG kernel hangs in esp(4) driver
Date: Tue, 27 May 2008 23:55:26 +0900
> sun3 uses real DMA, doesn't it? Different code paths, probably. The
> mac68k MD code doesn't use simple_lock() anywhere, though.
I think the problem is esp interrupts are implicitly enabled
by spl2() in esp_quick_dma_go().
I have no idea if there is a safe way to enable serial
interrupts without enabling esp interrupts.
---
Izumi Tsutsui
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/38758: LOCKDEBUG kernel hangs in esp(4) driver
Date: Tue, 27 May 2008 17:26:49 +0200
At 23:55 +0900 on 27.05.2008, Izumi Tsutsui wrote:
>> sun3 uses real DMA, doesn't it? Different code paths, probably. The
>> mac68k MD code doesn't use simple_lock() anywhere, though.
>
>I think the problem is esp interrupts are implicitly enabled
>by spl2() in esp_quick_dma_go().
Isn't spl2() supposed to block level 2 interrupts (esp here)? With
the A/UX interrupt mapping of the Quadras, we could in addition to
serial interrupts (lvl 4) take interrupts from the onboard SONIC
ethernet (lvl 3), the sound chip (lvl 5) and the VIA1 (ADB, clock -
lvl 6).
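
The level masking discussed above can be sketched in C. This is a minimal
model, not kernel code: the enum names and both helper functions are
hypothetical, and the only rule assumed is the m68k one, where a request is
accepted when its level is strictly above the priority mask and level 7
cannot be masked. The device levels are the ones listed in this message.

```c
#include <stdbool.h>

/* Interrupt levels as listed in the message above (A/UX mapping on the Quadras). */
enum irq_level {
    LVL_ESP    = 2,  /* SCSI */
    LVL_SONIC  = 3,  /* onboard Ethernet */
    LVL_SERIAL = 4,
    LVL_SOUND  = 5,
    LVL_VIA1   = 6,  /* ADB, clock */
    LVL_NMI    = 7   /* non-maskable on m68k */
};

/*
 * m68k rule: a request is accepted only if its level is strictly
 * above the priority mask in the status register; level 7 always is.
 */
static bool irq_accepted(int mask, int level)
{
    return level == LVL_NMI || level > mask;
}

/* Count how many of the listed device levels get through a given mask. */
static int devices_accepted(int mask)
{
    static const int levels[] = {
        LVL_ESP, LVL_SONIC, LVL_SERIAL, LVL_SOUND, LVL_VIA1
    };
    int n = 0;

    for (unsigned i = 0; i < sizeof(levels) / sizeof(levels[0]); i++)
        if (irq_accepted(mask, levels[i]))
            n++;
    return n;
}
```

With a mask of 2, as spl2() would set, irq_accepted(2, LVL_ESP) is false
while the serial, Ethernet, sound and VIA1 levels are all still accepted.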
>I have no idea if there is a safe way to enable serial
>interrupts without enabling esp interrupts.
Commenting out the spl2() .. splhigh() sequence above does not make a
difference.
hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: hf@spg.tu-darmstadt.de
Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
tsutsui@ceres.dti.ne.jp
Subject: Re: kern/38758: LOCKDEBUG kernel hangs in esp(4) driver
Date: Wed, 28 May 2008 00:37:24 +0900
> Commenting out the spl2() .. splhigh() sequence above does not make a
> difference.
Ah, esp_quick_dma_go() itself calls ncr53c9x_intr()
and that may cause the recursive lock problem.
(though I'm not sure what the code intends)
Anyway, it's an MD problem, not in MI ncr53c9x.
---
Izumi Tsutsui
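
The recursion suspected above can be modelled outside the kernel. The sketch
below is an assumption-laden illustration, not the driver source: model_lock,
model_softc, model_intr() and model_dma_go() are hypothetical stand-ins for
the simple_lock, the softc, ncr53c9x_intr() and esp_quick_dma_go(). It only
shows why a handler that guards itself with a non-recursive lock trips over
a callee that calls the handler again.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock; not NetBSD's simplelock. */
struct model_lock {
    bool held;
};

struct model_softc {
    struct model_lock lock;
    bool irq_pending;   /* a second interrupt arrived during the transfer */
    int  lock_errors;   /* times the lock was requested while already held */
    int  serviced;      /* handler invocations that ran to completion */
};

/* Returns false where a LOCKDEBUG kernel would report "lock held". */
static bool model_lock_enter(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void model_lock_exit(struct model_lock *l)
{
    l->held = false;
}

static void model_intr(struct model_softc *sc);

/* Models the MD DMA start routine calling back into the handler. */
static void model_dma_go(struct model_softc *sc)
{
    if (sc->irq_pending) {
        sc->irq_pending = false;
        model_intr(sc);     /* re-enters while the caller holds the lock */
    }
}

/* Models the MI handler, which guards itself with the lock. */
static void model_intr(struct model_softc *sc)
{
    if (!model_lock_enter(&sc->lock)) {
        sc->lock_errors++;  /* the nested call ends up here */
        return;
    }
    model_dma_go(sc);
    sc->serviced++;
    model_lock_exit(&sc->lock);
}
```

Calling model_intr() on a softc with irq_pending set records one lock error:
the nested call finds the lock already held by its own caller, which matches
the shape of the netbsd-4 backtrace earlier in this report.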
Responsible-Changed-From-To: kern-bug-people->port-mac68k-maintainer
Responsible-Changed-By: hauke@NetBSD.org
Responsible-Changed-When: Tue, 27 May 2008 16:01:13 +0000
Responsible-Changed-Why:
Izumi Tsutsui says it's mac68k MD, not MI.
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org, Hauke Fath <hf@spg.tu-darmstadt.de>
Cc:
Subject: Re: port-mac68k/38758
Date: Wed, 28 May 2008 14:07:49 +0200
The following patch fixes the problem for HEAD as well as netbsd-4:
Index: esp.c
===================================================================
RCS file: /cvsroot/src/sys/arch/mac68k/obio/esp.c,v
retrieving revision 1.50
diff -u -u -r1.50 esp.c
--- esp.c 13 Apr 2008 04:55:52 -0000 1.50
+++ esp.c 28 May 2008 12:02:40 -0000
@@ -866,7 +866,9 @@
printf("g!\n");
}
#endif
+#if 0 /* XXX hf */
ncr53c9x_intr(sc);
+#endif
if (espspl != -1)
splx(espspl);
espspl = -1;
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: Re: port-mac68k/38758
Date: Wed, 28 May 2008 16:55:57 +0200
The netbsd-3 branch has the same problem.
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
Responsible-Changed-From-To: port-mac68k-maintainer->hauke
Responsible-Changed-By: hauke@NetBSD.org
Responsible-Changed-When: Mon, 02 Jun 2008 10:12:32 +0000
Responsible-Changed-Why:
I think I understand the situation; might as well take it myself.
State-Changed-From-To: open->analyzed
State-Changed-By: hauke@NetBSD.org
State-Changed-When: Mon, 02 Jun 2008 10:16:32 +0000
State-Changed-Why:
After we've been called from the MI interrupt handler,
we detect an interrupt while running at splhigh(), and
handle it by calling the MI interrupt handler.
Since that works just fine without LOCKDEBUG, we
explicitly unlock the MI handler for this call.
From: Hauke Fath <hauke@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38758 CVS commit: src/sys/arch/mac68k/obio
Date: Mon, 2 Jun 2008 12:01:11 +0000 (UTC)
Module Name: src
Committed By: hauke
Date: Mon Jun 2 12:01:11 UTC 2008
Modified Files:
src/sys/arch/mac68k/obio: esp.c
Log Message:
esp_quick_dma_go() gets called from the MI ncr53c9x_intr() handler,
which protects itself against multiple invocation with a
simple_lock. Follow the example of ncr53c9x_poll() for servicing an
interrupt that came while we run in splhigh(), and 'manually' unlock
the MI handler for calling ncr53c9x_intr().
Fixes PR mac68k/38758.
To generate a diff of this commit:
cvs rdiff -r1.50 -r1.51 src/sys/arch/mac68k/obio/esp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
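
The approach named in the log message, unlocking the MI handler around the
nested call, can be sketched as follows. This is a hypothetical model and
not the contents of revision 1.51: model_lock, model_softc, model_intr() and
fixed_dma_go() are illustrative names, and the real change should be read
from the cvs rdiff above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock; not NetBSD's simplelock. */
struct model_lock {
    bool held;
};

struct model_softc {
    struct model_lock lock;
    bool irq_pending;   /* a second interrupt arrived during the transfer */
    int  lock_errors;   /* times the lock was requested while already held */
    int  serviced;      /* handler invocations that ran to completion */
};

static bool model_lock_enter(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void model_lock_exit(struct model_lock *l)
{
    l->held = false;
}

static void model_intr(struct model_softc *sc);

/* DMA start routine that releases the caller's lock around the nested call. */
static void fixed_dma_go(struct model_softc *sc)
{
    if (sc->irq_pending) {
        sc->irq_pending = false;
        model_lock_exit(&sc->lock);         /* drop the caller's lock */
        model_intr(sc);                     /* nested call takes and frees it */
        (void)model_lock_enter(&sc->lock);  /* hand it back to the caller */
    }
}

/* Models the MI handler, which guards itself with the lock. */
static void model_intr(struct model_softc *sc)
{
    if (!model_lock_enter(&sc->lock)) {
        sc->lock_errors++;
        return;
    }
    fixed_dma_go(sc);
    sc->serviced++;
    model_lock_exit(&sc->lock);
}
```

In this model both the outer and the nested invocation complete and no lock
error is recorded. The sketch is only valid if nothing else can take the lock
in the window where it is dropped, which the log message ties to running at
splhigh().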
State-Changed-From-To: analyzed->pending-pullups
State-Changed-By: hauke@NetBSD.org
State-Changed-When: Mon, 11 May 2009 06:22:17 +0000
State-Changed-Why:
pullup to netbsd-4 requested.
From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38758 CVS commit: [netbsd-4] src/sys/arch/mac68k
Date: Mon, 11 May 2009 19:31:40 +0000
Module Name: src
Committed By: bouyer
Date: Mon May 11 19:31:40 UTC 2009
Modified Files:
src/sys/arch/mac68k/conf [netbsd-4]: GENERIC
src/sys/arch/mac68k/obio [netbsd-4]: esp.c
Log Message:
Pull up following revision(s) (requested by hauke in ticket #1315):
sys/arch/mac68k/obio/esp.c: revision 1.51 via patch
sys/arch/mac68k/conf/GENERIC: revision 1.188
Add LOCKDEBUG option, commented out, so that people know it's there.
esp_quick_dma_go() gets called from the MI ncr53c9x_intr() handler,
which protects itself against multiple invocation with a
simple_lock. Follow the example of ncr53c9x_poll() for servicing an
interrupt that came while we run in splhigh(), and 'manually' unlock
the MI handler for calling ncr53c9x_intr().
Fixes PR mac68k/38758.
To generate a diff of this commit:
cvs rdiff -u -r1.177.2.2 -r1.177.2.3 src/sys/arch/mac68k/conf/GENERIC
cvs rdiff -u -r1.44 -r1.44.14.1 src/sys/arch/mac68k/obio/esp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: hauke@NetBSD.org
State-Changed-When: Mon, 11 May 2009 20:03:15 +0000
State-Changed-Why:
Fix pulled up to netbsd-4.
>Unformatted: