NetBSD Problem Report #46291
From cheusov@tut.by Tue Apr 3 17:30:59 2012
Return-Path: <cheusov@tut.by>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id DDE6363CD2A
for <gnats-bugs@gnats.netbsd.org>; Tue, 3 Apr 2012 17:30:58 +0000 (UTC)
Message-Id: <s93hax0wwyd.fsf@work.imb.invention.com>
Date: Tue, 03 Apr 2012 20:23:54 +0300
From: cheusov@tut.by
To: gnats-bugs@gnats.NetBSD.org
Subject: 6.0_BETA kernel crash: intr_biglock_wrapper -> fxp_intr
X-Send-Pr-Version: 3.95
>Number: 46291
>Category: kern
>Synopsis: 6.0_BETA kernel crash: intr_biglock_wrapper -> fxp_intr
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Apr 03 17:35:02 +0000 2012
>Last-Modified: Tue May 08 10:05:02 +0000 2012
>Originator: Aleksey Cheusov
>Release: NetBSD 6.0_BETA
>Organization:
>Environment:
System: NetBSD work.imb.invention.com 6.0_BETA NetBSD 6.0_BETA (GENERIC) #5: Fri Mar 30 13:24:57 FET 2012 cheusov@work.imb.invention.com:/srv/obj-current/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
My system crashed immediately after reboot(8).
The stack trace is below.
#0 0xc05c5be8 in maybe_dump (howto=260) at /srv/src_netbsd6/sys/arch/i386/i386/machdep.c:878
#1 cpu_reboot (howto=260, bootstr=0x0) at /srv/src_netbsd6/sys/arch/i386/i386/machdep.c:899
#2 0xc07c320a in vpanic (fmt=0xc0c217c3 "trap", ap=0xda920dc4 "garbagehere") at /srv/src_netbsd6/sys/kern/subr_prf.c:308
#3 0xc07c32af in panic (fmt=0xc0c217c3 "trap") at /srv/src_netbsd6/sys/kern/subr_prf.c:205
#4 0xc081fe60 in trap (frame=0xda920e64) at /srv/src_netbsd6/sys/arch/i386/i386/trap.c:396
#5 0xc010d08f in ?? ()
#6 0xc037317d in fxp_intr (arg=0xc3776000) at /srv/src_netbsd6/sys/dev/ic/i82557.c:1123
#7 0xc04d2740 in intr_biglock_wrapper (vp=0xc34ffb60) at /srv/src_netbsd6/sys/arch/x86/x86/intr.c:605
#8 0xc0107af5 in ?? ()
>How-To-Repeat:
I have seen this crash only once.
>Fix:
Unknown
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 11:27:30 +0200
--ZGiS0Q5IWpPtfppv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Do you remember how early in the boot process this crashed?
Could it have been during fxp attaching, i.e. after the ethernet
address had been printed, but before phy attached?
Note that fxp_pci_attach establishes the interrupt handler before
calling fxp_attach (and also sets sc->sc_enabled = 1), so we could
accidentally end up in fxp_intr() before sc->sc_ethercom.ec_if.if_softc
is set correctly.
This would also explain why you see it rarely. Does fxp share the interrupt
with something else on your machine?
I wonder if the attached patch would be the proper fix (but I don't understand
anything about this hardware).
Martin
--ZGiS0Q5IWpPtfppv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=patch
Index: if_fxp_pci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/if_fxp_pci.c,v
retrieving revision 1.79
diff -c -u -r1.79 if_fxp_pci.c
--- if_fxp_pci.c 2 Feb 2012 19:43:05 -0000 1.79
+++ if_fxp_pci.c 8 May 2012 09:28:42 -0000
@@ -498,8 +498,6 @@
/* Restore PCI configuration registers. */
fxp_pci_confreg_restore(psc);
- sc->sc_enabled = 1;
-
/*
* Map and establish our interrupt.
*/
@@ -520,6 +518,7 @@
/* Finish off the attach. */
fxp_attach(sc);
+ sc->sc_enabled = 1;
if (sc->sc_disable != NULL)
fxp_disable(sc);
--ZGiS0Q5IWpPtfppv--
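The attach-time ordering problem described in the message above can be illustrated with a small userland model. This is only a sketch, not driver code: struct sketch_softc, sketch_intr(), attach_flag_first(), attach_flag_last() and the SKETCH_MAIN macro are all hypothetical names, and the model assumes the handler returns early while the enabled flag is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver softc; not the real struct fxp_softc. */
struct sketch_softc {
	bool enabled;	/* plays the role of sc->sc_enabled */
	int *state;	/* stands for data that fxp_attach() would initialise */
};

/*
 * Model of the interrupt handler. Returns 0 if it bails because the device
 * is not enabled, 1 if it safely touched device state, and -1 where a real
 * kernel would dereference uninitialised state and trap.
 */
static int
sketch_intr(struct sketch_softc *sc)
{
	if (!sc->enabled)
		return 0;
	if (sc->state == NULL)
		return -1;
	(*sc->state)++;
	return 1;
}

/* Original ordering: flag set before the state is initialised. */
static int
attach_flag_first(struct sketch_softc *sc, int *state)
{
	int r;

	sc->enabled = true;
	r = sketch_intr(sc);	/* interrupt arrives in the middle of attach */
	sc->state = state;
	return r;
}

/* Patched ordering: state initialised first, flag set last. */
static int
attach_flag_last(struct sketch_softc *sc, int *state)
{
	int r;

	r = sketch_intr(sc);	/* interrupt arrives in the middle of attach */
	sc->state = state;
	sc->enabled = true;
	return r;
}

#ifdef SKETCH_MAIN	/* hypothetical macro: build with -DSKETCH_MAIN to run */
int
main(void)
{
	int counter = 0;
	struct sketch_softc a = { false, NULL };
	struct sketch_softc b = { false, NULL };

	assert(attach_flag_first(&a, &counter) == -1);	/* would have trapped */
	assert(attach_flag_last(&b, &counter) == 0);	/* handler bailed */
	assert(sketch_intr(&b) == 1);			/* safe once attached */
	return 0;
}
#endif
```

In this model an interrupt that arrives mid-attach reaches uninitialised state only when the flag is set first; with the flag set last, the handler simply returns.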
From: Aleksey Cheusov <cheusov@tut.by>
To: gnats-bugs@netbsd.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 12:49:54 +0300
> Do you remember how early in the boot process this crashed?
This crash happened during the reboot process, immediately after reboot(8),
not while booting.
> Does fxp share the interrupt with something else on your machine?
How can I check this? What command? BIOS settings?
> I wonder if the attached patch would be the proper fix (but I don't understand
> anything about this hardware).
Normally I don't reboot this machine at all because it is used for
building 5.1 packages,
but I can try to reproduce the problem and then try your patch.
From: Martin Husemann <martin@duskware.de>
To: Aleksey Cheusov <cheusov@tut.by>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 12:00:45 +0200
--PmA2V3Z32TCmWXqI
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
On Tue, May 08, 2012 at 12:49:54PM +0300, Aleksey Cheusov wrote:
> This crash happened during the reboot process, immediately after reboot(8),
> not while booting.
Ok, so the patch would not help.
> > Does fxp share the interrupt with something else on your machine?
>
> How can I check this? What command? BIOS settings?
dmesg output would be a good start.
Here is another patch for a similar race in case of shared interrupts
during detach.
Martin
--PmA2V3Z32TCmWXqI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=patch
Index: i82557.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/i82557.c,v
retrieving revision 1.139
diff -u -r1.139 i82557.c
--- i82557.c 2 Feb 2012 19:43:03 -0000 1.139
+++ i82557.c 8 May 2012 10:01:52 -0000
@@ -2507,6 +2507,9 @@
fxp_stop(ifp, 1);
splx(s);
+ /* make sure the interrupt handler bails quickly */
+ sc->sc_enabled = 0;
+
/* Destroy our callout. */
callout_destroy(&sc->sc_callout);
--PmA2V3Z32TCmWXqI--
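The detach-time race that the second patch targets can be modelled the same way. Again this is a userland sketch under stated assumptions, not the driver: struct det_softc, det_intr(), detach_keep_flag(), detach_clear_flag() and the SKETCH_MAIN macro are hypothetical names, and the "ring" pointer merely stands for whatever resources detach releases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver softc; not the real struct fxp_softc. */
struct det_softc {
	bool enabled;	/* plays the role of sc->sc_enabled */
	int *ring;	/* stands for resources released during detach */
};

/*
 * Model of the interrupt handler. Returns 0 if it bails because the device
 * is not enabled, 1 if it safely touched the ring, and -1 where a real
 * kernel would touch released resources.
 */
static int
det_intr(struct det_softc *sc)
{
	if (!sc->enabled)
		return 0;
	if (sc->ring == NULL)
		return -1;
	(*sc->ring)++;
	return 1;
}

/* Detach that releases resources but leaves the flag set. */
static void
detach_keep_flag(struct det_softc *sc)
{
	sc->ring = NULL;
}

/* Detach that clears the flag before releasing resources. */
static void
detach_clear_flag(struct det_softc *sc)
{
	sc->enabled = false;
	sc->ring = NULL;
}

#ifdef SKETCH_MAIN	/* hypothetical macro: build with -DSKETCH_MAIN to run */
int
main(void)
{
	int ring = 0;
	struct det_softc a = { true, &ring };
	struct det_softc b = { true, &ring };

	detach_keep_flag(&a);
	assert(det_intr(&a) == -1);	/* late shared interrupt hits freed state */
	detach_clear_flag(&b);
	assert(det_intr(&b) == 0);	/* late shared interrupt bails quickly */
	return 0;
}
#endif
```

The point of the model is only the ordering: if another device on the same interrupt line fires after detach has started, a cleared flag lets the handler return before it looks at anything that has been torn down.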
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.