NetBSD Problem Report #40099

From mhitch@net3.msu.montana.edu  Wed Dec  3 23:20:19 2008
Return-Path: <mhitch@net3.msu.montana.edu>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id C61DE63B8BD
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  3 Dec 2008 23:20:18 +0000 (UTC)
Message-Id: <20081203212254.A8058224B3@net3.msu.montana.edu>
Date: Wed,  3 Dec 2008 14:22:54 -0700 (MST)
From: mhitch@lightning.msu.montana.edu
Reply-To: mhitch@lightning.msu.montana.edu
To: gnats-bugs@gnats.NetBSD.org
Subject: device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy
X-Send-Pr-Version: 3.95

>Number:         40099
>Category:       kern
>Synopsis:       device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 03 23:25:00 +0000 2008
>Closed-Date:    Thu Dec 04 11:53:54 +0000 2008
>Last-Modified:  Sat Dec 13 21:50:01 +0000 2008
>Originator:     Michael L. Hitch
>Release:        NetBSD 4.99.72
>Organization:
	Montana State University
>Environment:


System: NetBSD net3.msu.montana.edu 4.99.72 NetBSD 4.99.72 (GENERIC) #0: Tue Dec 2 23:18:05 MST 2008 mhitch@net3.msu.montana.edu:/home/mhitch/NetBSD-current/OBJ/i386/home/mhitch/NetBSD-current/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
	The device_t/softc changes made to ld(4) and cac(4) cause a 'panic:
	iostat_unbusy' when booting on an HP server (DL360 in my case).
>How-To-Repeat:
	The usual:  build a kernel from sources dated after September 9 (the
	date the device_t/softc changes were made) and try to boot it.

cac0 at pci0 dev 1 function 0: Compaq Integrated Array
cac0: interrupting at ioapic0 pin 3     
cac0: 2 channels, firmware <1.50>   
ld0 at cac0 unit 0: RAID1 array
ld0: 17359 MB, 8817 cyl, 64 head, 63 sec, 512 bytes/sect x 35553120 sectors
...
scsibus0: waiting 2 seconds for devices to settle...
fd0 at fdc0: busy < 0
panic: iostat_unbusy
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c052e42c cs 8 eflags 246 cr2 0 ilevel 6
Stopped in pid 0.5 (system) at  netbsd:breakpoint+0x4:  popl    %ebp
db{0}> bt
breakpoint(c0a344c4,cccace78,c0a5c140,c0479f35,6,5,0,0,cccace7c,0) at netbsd:breakpoint+0x4
panic(c09f8b47,cccfaeb0,0,0,0,0,0,0,c3107ec8,cccefd10) at netbsd:panic+0x1b8
iostat_unbusy(cccfaeb0,200,100000,c0450007,cccfaf36,0,0,0,0,c3107ec8) at netbsd:iostat_unbusy+0xc4
lddone(cccefd10,c3107ec8,0,c05160f8,c0a59d14,0,cccefd10,c3107ec8,cccefd10,cdecd000) at netbsd:lddone+0x3c
ld_cac_done(cccefd10,c3107ec8,0,200,2,cccfaeb0,0,cccfaeb0,0,cccfaf34) at netbsd:ld_cac_done+0x6a
cac_ccb_done(cccfaeb0,cc3c74c0,c0a5c140,c316ea40,2,6,cccacf6c,c051acbf,cccfaeb0,0) at netbsd:cac_ccb_done+0xb1
cac_intr(cccfaeb0,0,0,0,c0100cbf,0,c31b1f00,c010734d,c316ea40,cc835cf0) at netbsd:cac_intr+0x39
intr_biglock_wrapper(c316ea40,cc835cf0,0,0,0,0,0,0,0,0) at netbsd:intr_biglock_wrapper+0x1f
DDB lost frame for netbsd:Xintr_ioapic_level3+0xad, trying 0xcccacf74
Xintr_ioapic_level3() at netbsd:Xintr_ioapic_level3+0xad
--- interrupt ---
--- switch to interrupt stack ---
Xspllower(2,0,0,0,0,0,0,3,0,0) at netbsd:Xspllower+0xf
softint_dispatch(cc3d5520,2,0,0,0,0,cc835d90,cc835d28,cc3c74c0,28) at netbsd:softint_dispatch+0x67
DDB lost frame for netbsd:Xsoftintr+0x3d, trying 0xcc835d88
Xsoftintr() at netbsd:Xsoftintr+0x3d
--- interrupt ---
fatal page fault in supervisor mode
trap type 6 code 0 eip c0530907 cs 8 eflags 10206 cr2 3a ilevel 8
kernel: supervisor trap page fault, code=0
Faulted in DDB; continuing...

>Fix:
	My workaround was to boot -ca and disable 'cac' and specify a root on
	a different controller;  the DL360 also has a Smart Array 5300.

	The fix will be to identify where the device_t/softc split went wrong
	[probably in sys/dev/ic/cac.c or sys/dev/ic/ld_cac.c] and fix it.

	I've started looking at this, but I quickly get lost trying to follow
	which softc and which device_t each piece belongs to.  I'm going to
	add some debug code that prints the addresses of the associated
	structures, so I can make sense of them and see which structure is
	actually being passed to iostat_unbusy(), unless someone more
	familiar with the device_t/softc split can easily spot the error.

>Release-Note:

>Audit-Trail:
From: "Juan Romero Pardines" <xtraeme@gmail.com>
To: "NetBSD GNATS" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/40099: device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy
Date: Thu, 4 Dec 2008 11:25:48 +0100

 Hi,

 I think that is due to a wrong pointer passed as first arg to lddone() in line
 219 of sys/dev/ic/ld_cac.c.

 The line should be: lddone(&sc->sc_ld, bp).

State-Changed-From-To: open->closed
State-Changed-By: ad@NetBSD.org
State-Changed-When: Thu, 04 Dec 2008 11:53:54 +0000
State-Changed-Why:
I applied the patch, thanks.


From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/40099 CVS commit: src/sys/dev/ic
Date: Thu,  4 Dec 2008 11:48:14 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Thu Dec  4 11:48:14 UTC 2008

 Modified Files:
 	src/sys/dev/ic: ld_cac.c

 Log Message:
 PR kern/40099 device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy


 To generate a diff of this commit:
 cvs rdiff -r1.22 -r1.23 src/sys/dev/ic/ld_cac.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Michael L. Hitch" <mhitch@lightning.msu.montana.edu>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/40099: device_t/softc split broke cac(4)/ld(4): panic:
 iostat_unbusy
Date: Thu, 4 Dec 2008 06:56:33 -0700 (MST)

 > The line should be: lddone(&sc->sc_ld, bp).

    That does indeed fix it, thanks.  I was looking in that area last night,
 and I might have eventually gotten there, but I was getting tired.  The
 fix has already been applied, so this PR can be closed.

 --
 Michael L. Hitch			mhitch@montana.edu
 Computer Consultant
 Information Technology Center
 Montana State University	Bozeman, MT	USA

From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/40099 CVS commit: [netbsd-5] src/sys/dev/ic
Date: Sat, 13 Dec 2008 21:45:36 +0000 (UTC)

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Dec 13 21:45:36 UTC 2008

 Modified Files:
 	src/sys/dev/ic [netbsd-5]: ld_cac.c

 Log Message:
 Pull up following revision(s) (requested by mhitch in ticket #186):
 	sys/dev/ic/ld_cac.c: revision 1.23
 PR kern/40099 device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy


 To generate a diff of this commit:
 cvs rdiff -r1.22 -r1.22.4.1 src/sys/dev/ic/ld_cac.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
 		Source date October 7
