NetBSD Problem Report #46291

From cheusov@tut.by  Tue Apr  3 17:30:59 2012
Return-Path: <cheusov@tut.by>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id DDE6363CD2A
	for <gnats-bugs@gnats.netbsd.org>; Tue,  3 Apr 2012 17:30:58 +0000 (UTC)
Message-Id: <s93hax0wwyd.fsf@work.imb.invention.com>
Date: Tue, 03 Apr 2012 20:23:54 +0300
From: cheusov@tut.by
To: gnats-bugs@gnats.NetBSD.org
Subject: 6.0_BETA kernel crash: intr_biglock_wrapper -> fxp_intr
X-Send-Pr-Version: 3.95

>Number:         46291
>Category:       kern
>Synopsis:       6.0_BETA kernel crash: intr_biglock_wrapper -> fxp_intr
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Apr 03 17:35:02 +0000 2012
>Last-Modified:  Tue May 08 10:05:02 +0000 2012
>Originator:     Aleksey Cheusov
>Release:        NetBSD 6.0_BETA
>Organization:
>Environment:
System: NetBSD work.imb.invention.com 6.0_BETA NetBSD 6.0_BETA (GENERIC) #5: Fri Mar 30 13:24:57 FET 2012 cheusov@work.imb.invention.com:/srv/obj-current/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
My system crashed immediately after I ran reboot(8), i.e. during shutdown
rather than during boot. The stack trace is below.

#0  0xc05c5be8 in maybe_dump (howto=260) at /srv/src_netbsd6/sys/arch/i386/i386/machdep.c:878
#1  cpu_reboot (howto=260, bootstr=0x0) at /srv/src_netbsd6/sys/arch/i386/i386/machdep.c:899
#2  0xc07c320a in vpanic (fmt=0xc0c217c3 "trap", ap=0xda920dc4 "garbagehere") at /srv/src_netbsd6/sys/kern/subr_prf.c:308
#3  0xc07c32af in panic (fmt=0xc0c217c3 "trap") at /srv/src_netbsd6/sys/kern/subr_prf.c:205
#4  0xc081fe60 in trap (frame=0xda920e64) at /srv/src_netbsd6/sys/arch/i386/i386/trap.c:396
#5  0xc010d08f in ?? ()
#6  0xc037317d in fxp_intr (arg=0xc3776000) at /srv/src_netbsd6/sys/dev/ic/i82557.c:1123
#7  0xc04d2740 in intr_biglock_wrapper (vp=0xc34ffb60) at /srv/src_netbsd6/sys/arch/x86/x86/intr.c:605
#8  0xc0107af5 in ?? ()
>How-To-Repeat:
I have seen this crash only once.

>Fix:

Unknown
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 11:27:30 +0200

 --ZGiS0Q5IWpPtfppv
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 Do you remember how early in the boot process this crashed?
 Could it have been during fxp attaching, i.e. after the ethernet
 address had been printed, but before phy attached?

 Note that fxp_pci_attach establishes the interrupt handler before
 calling fxp_attach (and also sets sc->sc_enabled = 1), so we could
 accidentally end up in fxp_intr() before sc->sc_ethercom.ec_if.if_softc
 is set correctly.

 This would also explain why you see it rarely. Does fxp share the interrupt
 with something else on your machine?

 I wonder if the attached patch would be the proper fix (but I don't understand
 anything about this hardware).

 Martin

 --ZGiS0Q5IWpPtfppv
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 Index: if_fxp_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/if_fxp_pci.c,v
 retrieving revision 1.79
 diff -c -u -r1.79 if_fxp_pci.c
 --- if_fxp_pci.c	2 Feb 2012 19:43:05 -0000	1.79
 +++ if_fxp_pci.c	8 May 2012 09:28:42 -0000
 @@ -498,8 +498,6 @@
  	/* Restore PCI configuration registers. */
  	fxp_pci_confreg_restore(psc);

 -	sc->sc_enabled = 1;
 -
  	/*
  	 * Map and establish our interrupt.
  	 */
 @@ -520,6 +518,7 @@

  	/* Finish off the attach. */
  	fxp_attach(sc);
 +	sc->sc_enabled = 1;
  	if (sc->sc_disable != NULL)
  		fxp_disable(sc);


 --ZGiS0Q5IWpPtfppv--
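
The attach-time ordering described in the message above can be modelled in a few lines of user-space C. This is only a sketch: it assumes the interrupt handler returns early while sc_enabled is zero, and every name in it (attach_softc_model, attach_model_intr, and so on) is a hypothetical stand-in, not code from i82557.c or if_fxp_pci.c. A return value of -1 marks the point where a real handler would have used an unset pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the interface structure; not the real ifnet. */
struct attach_ifnet_model {
	void *if_softc;		/* back pointer, filled in by attach */
};

/* Hypothetical stand-in for the driver softc. */
struct attach_softc_model {
	int sc_enabled;		/* handler ignores interrupts while zero */
	struct attach_ifnet_model sc_if;
};

/*
 * Model of an interrupt handler on a shared line.
 * Returns 1 if handled, 0 if the interrupt is not claimed, and -1 if
 * the handler would have dereferenced an unset pointer (the crash case).
 */
static int
attach_model_intr(struct attach_softc_model *sc)
{
	if (sc->sc_enabled == 0)
		return 0;
	if (sc->sc_if.if_softc == NULL)
		return -1;
	return 1;
}

/* Model of the bus-independent attach step: sets the back pointer. */
static void
attach_model_attach(struct attach_softc_model *sc)
{
	assert(sc != NULL);
	sc->sc_if.if_softc = sc;
}

/*
 * Simulates an interrupt that arrives after the handler is established
 * but before attach has run, and returns what the handler did.
 */
static int
attach_early_interrupt(int enable_before_attach)
{
	struct attach_softc_model sc = { 0, { NULL } };
	int result;

	if (enable_before_attach)
		sc.sc_enabled = 1;	/* ordering before the proposed patch */

	/* Handler is established here; a shared line may fire right away. */
	result = attach_model_intr(&sc);

	attach_model_attach(&sc);
	sc.sc_enabled = 1;		/* ordering after the proposed patch */
	return result;
}
```

In this model, attach_early_interrupt(1) reaches the crash case, while attach_early_interrupt(0) leaves the early interrupt unclaimed, which is the effect the patch aims for.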

From: Aleksey Cheusov <cheusov@tut.by>
To: gnats-bugs@netbsd.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 12:49:54 +0300

 > Do you remember how early in the boot process this crashed?

 This crash happened during the reboot process, immediately after reboot(8),
 not while booting.

 > Does fxp share the interrupt with something else on your machine?

 How can I check this? What command? BIOS settings?

 > I wonder if the attached patch would be the proper fix (but I don't understand
 > anything about this hardware).

 Normally I don't reboot this machine at all because it is used for
 building 5.1 packages, but I can try to reproduce the problem and then
 try your patch.

From: Martin Husemann <martin@duskware.de>
To: Aleksey Cheusov <cheusov@tut.by>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/46291: fxp crash
Date: Tue, 8 May 2012 12:00:45 +0200

 --PmA2V3Z32TCmWXqI
 Content-Type: text/plain; charset=iso-8859-1
 Content-Disposition: inline
 Content-Transfer-Encoding: 8bit

 On Tue, May 08, 2012 at 12:49:54PM +0300, Aleksey Cheusov wrote:
 > This crash happened during reboot process, immediately after reboot(8),
 > not while booting.

 Ok, so the patch would not help.

 > > Does fxp share the interrupt with something else on your machine?
 > 
 > How can I check this? What command? BIOS settings?

 dmesg output would be a good start.

 Here is another patch for a similar race in case of shared interrupts
 during detach.

 Martin

 --PmA2V3Z32TCmWXqI
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 Index: i82557.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ic/i82557.c,v
 retrieving revision 1.139
 diff -u -r1.139 i82557.c
 --- i82557.c	2 Feb 2012 19:43:03 -0000	1.139
 +++ i82557.c	8 May 2012 10:01:52 -0000
 @@ -2507,6 +2507,9 @@
  	fxp_stop(ifp, 1);
  	splx(s);

 +	/* make sure the interrupt handler bails quickly */
 +	sc->sc_enabled = 0;
 +
  	/* Destroy our callout. */
  	callout_destroy(&sc->sc_callout);


 --PmA2V3Z32TCmWXqI--
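
The detach-time race that the second patch targets can be sketched the same way. Again this is a user-space model under stated assumptions: the handler is assumed to bail out when sc_enabled is zero, sc_ring merely stands in for whatever resources detach tears down, and all names (detach_softc_model, detach_model_intr, and so on) are hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver softc. */
struct detach_softc_model {
	int  sc_enabled;	/* handler ignores interrupts while zero */
	int *sc_ring;		/* stands in for resources freed in detach */
};

/*
 * Model of an interrupt handler on a shared line.
 * Returns 1 if handled, 0 if the interrupt is not claimed, and -1 if
 * the handler would have touched resources that are already gone.
 */
static int
detach_model_intr(struct detach_softc_model *sc)
{
	if (sc->sc_enabled == 0)
		return 0;
	if (sc->sc_ring == NULL)
		return -1;
	return 1;
}

/*
 * Model of detach. Returns what the handler did when another device
 * on the shared line raised an interrupt after teardown had started.
 */
static int
detach_model_detach(struct detach_softc_model *sc, int clear_enabled_first)
{
	int late;

	if (clear_enabled_first)
		sc->sc_enabled = 0;	/* what the patch adds */

	free(sc->sc_ring);
	sc->sc_ring = NULL;

	/* A neighbour on the shared interrupt line fires now. */
	late = detach_model_intr(sc);

	sc->sc_enabled = 0;
	return late;
}

/* Builds an enabled device model and runs detach on it. */
static int
detach_late_interrupt(int clear_enabled_first)
{
	struct detach_softc_model sc;

	sc.sc_enabled = 1;
	sc.sc_ring = malloc(4 * sizeof(*sc.sc_ring));
	assert(sc.sc_ring != NULL);
	return detach_model_detach(&sc, clear_enabled_first);
}
```

Here detach_late_interrupt(0) hits the crash case and detach_late_interrupt(1) does not. Whether this matches the crash in the report depends on the fxp interrupt actually being shared, which the thread has not yet established.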
